A Deep Dive into Vector Databases

Souloss

2025-07-29

1. Vector Database Overview

1.1 Why Vector Databases Are Needed

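flowchart TB
    subgraph Traditional["Traditional database"]
        direction TB
        T1["Structured data: tables and relations"]
        T2["Exact matching: SQL WHERE with ="]
        T3["Fixed schema: predefined fields"]
        T4["Limitation: no understanding of semantics"]
    end
    subgraph Vector["Vector database"]
        direction TB
        V1["High-dimensional vectors: embeddings"]
        V2["Similarity search: KNN/ANN"]
        V3["Flexible schema: dynamic metadata"]
        V4["Strength: semantic understanding"]
    end
    Traditional -->|"evolves into"| Vector
    style Traditional fill:#ffe6e6,stroke:#ff6666
    style Vector fill:#e6ffe6,stroke:#66cc66

graph TB
    subgraph "Limits of traditional databases"
        A["Structured data"]
        B["Exact matching"]
        C["SQL queries"]
    end
    subgraph "Strengths of vector databases"
        D["Unstructured data"]
        E["Similarity search"]
        F["Semantic understanding"]
    end
    A --> D
    B --> E
    C --> F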

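Dimension	Traditional database	Vector database
Data type	Structured	High-dimensional vectors
Query style	Exact match	Similarity search
Typical use	Business data	AI semantic retrieval
Recall	100% (exact)	Tunable (recall vs. speed trade-off)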
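
To make the "similarity search" column concrete, here is a minimal sketch of exact (brute-force) nearest-neighbour search by cosine similarity. The toy 3-dimensional vectors and the function name `cosine_top_k` are illustrative assumptions, not part of any database API; ANN indexes exist to approximate this result without scanning every vector.

```python
import numpy as np


def cosine_top_k(query, vectors, k=3):
    """Exact k-nearest-neighbour search by cosine similarity (full scan)."""
    q = query / np.linalg.norm(query)
    v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    scores = v @ q                      # cosine similarity to every stored vector
    order = np.argsort(-scores)[:k]     # indices of the k highest scores
    return [(int(i), float(scores[i])) for i in order]


# Toy "embeddings": rows 0 and 1 point in nearly the same direction as the query
docs = np.array([
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])
query = np.array([1.0, 0.05, 0.0])

top = cosine_top_k(query, docs, k=2)
print(top)
```

Unlike a SQL equality predicate, nothing here has to match exactly: results are ranked by how close they are to the query.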

1.2 Core Use Cases

flowchart TB
    subgraph ImageSearch["Image retrieval"]
        direction LR
        I1["User uploads an image"] --> I2["CNN extracts features"]
        I2 --> I3["Feature vector"]
        I3 --> I4["Vector search"]
        I4 --> I5["Similar images"]
    end
    subgraph VoiceSearch["Voice search"]
        direction LR
        V1["Voice input"] --> V2["ASR to text"]
        V2 --> V3["Embedding"]
        V3 --> V4["Vector search"]
        V4 --> V5["Search results"]
    end
    subgraph RAGSearch["RAG knowledge base"]
        direction LR
        R1["User question"] --> R2["Embed the question"]
        R2 --> R3["Knowledge-base retrieval"]
        R3 --> R4["Relevant documents"]
        R4 --> R5["LLM generates the answer"]
    end
    style ImageSearch fill:#e3f2fd
    style VoiceSearch fill:#fff8e1
    style RAGSearch fill:#e8f5e9

graph LR
    A["User uploads an image"] --> B["Extract feature vector"]
    B --> C["Vector database search"]
    C --> D["Return similar images"]
    E["Voice input"] --> F["ASR to text"]
    F --> G["Embedding"]
    G --> H["Vector search"]
    H --> I["Voice search results"]
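
The RAG flow (question, embedding, retrieval, relevant documents, LLM) can be sketched end to end with a toy embedding. The bag-of-words `embed` function, the sample documents, and the prompt template below are all assumptions made for illustration; a real system would use a trained embedding model and a vector database instead of a NumPy matrix.

```python
import numpy as np


def build_vocab(texts):
    """Assign each distinct lowercase token a column index."""
    vocab = {}
    for text in texts:
        for tok in text.lower().split():
            vocab.setdefault(tok, len(vocab))
    return vocab


def embed(text, vocab):
    """Toy embedding: L2-normalised bag-of-words counts (unknown tokens are ignored)."""
    vec = np.zeros(len(vocab))
    for tok in text.lower().split():
        if tok in vocab:
            vec[vocab[tok]] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


def retrieve(question, docs, doc_matrix, vocab, k=1):
    """Return the k documents most similar to the question."""
    scores = doc_matrix @ embed(question, vocab)
    best = np.argsort(-scores)[:k]
    return [docs[i] for i in best]


docs = [
    "HNSW builds a layered graph index",
    "IVF clusters vectors with k-means",
    "Pinecone is a managed cloud service",
]
vocab = build_vocab(docs)
doc_matrix = np.stack([embed(d, vocab) for d in docs])

question = "HNSW graph index"
context = retrieve(question, docs, doc_matrix, vocab, k=1)

# The retrieved text is then placed in the prompt that is sent to the LLM
prompt = f"Answer using this context:\n{context[0]}\n\nQuestion: {question}"
print(prompt)
```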

2. Comparing Mainstream Vector Databases

2.0 Architecture Overview

flowchart TB
    subgraph MilvusArch["Milvus architecture"]
        direction TB
        M1["SDK/API"] --> M2["Proxy"]
        M2 --> M3["Coordinator cluster"]
        M3 --> M4["Worker nodes: query/index/data"]
        M4 --> M5["Object storage: MinIO/S3"]
    end
    subgraph QdrantArch["Qdrant architecture"]
        direction TB
        Q1["REST/gRPC API"] --> Q2["Collection Manager"]
        Q2 --> Q3["HNSW index (Rust implementation)"]
        Q3 --> Q4["Storage: RocksDB/in-memory"]
    end
    subgraph PineconeArch["Pinecone architecture"]
        direction TB
        P1["Client SDK"] --> P2["Pinecone Cloud"]
        P2 --> P3["Managed index (fully managed)"]
        P3 --> P4["Auto scaling"]
    end
    subgraph ChromaArch["Chroma architecture"]
        direction TB
        C1["Python SDK"] --> C2["Embedded storage: SQLite (DuckDB before 0.4)"]
        C2 --> C3["HNSW index (hnswlib)"]
    end
    style MilvusArch fill:#e3f2fd
    style QdrantArch fill:#fff8e1
    style PineconeArch fill:#e8f5e9
    style ChromaArch fill:#fce4ec

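2.1 Feature Comparison

Feature	Milvus	Qdrant	Pinecone	Chroma	Weaviate
Deployment	Self-hosted	Self-hosted	Cloud service	Embedded	Self-hosted
Index algorithms	HNSW/IVF	HNSW	HNSW	HNSW	HNSW
Hybrid search
Full-text search
Distributed
Metadata filtering
Python SDK
Rust core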

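2.2 Performance Comparison

# Vector database benchmark scaffold
class VectorDBBenchmark:
    def __init__(self, db_type: str):
        self.db_type = db_type

    def benchmark(self, dataset_size: int = 1_000_000) -> dict:
        """Return QPS, recall, and p99 latency for the configured database."""
        # Illustrative placeholder figures, not measured results. Real numbers
        # depend on hardware, dataset, index parameters, and software version,
        # and dataset_size is not yet used by this scaffold.
        results = {
            "milvus": {"qps": 5000, "recall": 0.95, "p99_latency": "15ms"},
            "qdrant": {"qps": 8000, "recall": 0.97, "p99_latency": "8ms"},
            "weaviate": {"qps": 3000, "recall": 0.93, "p99_latency": "20ms"},
        }
        return results.get(self.db_type, {})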
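
QPS and recall only mean something when measured against exact ground truth on the same data. The sketch below shows one way to do that with NumPy; the random dataset, the helper names, and the `half_scan` stand-in "index" are assumptions for illustration, and a real benchmark would call the database client inside `search_fn`.

```python
import time

import numpy as np


def exact_top_k(queries, base, k):
    """Ground truth: ids of the k nearest base vectors by squared L2 distance."""
    d2 = (
        (queries ** 2).sum(axis=1)[:, None]
        - 2.0 * queries @ base.T
        + (base ** 2).sum(axis=1)[None, :]
    )
    return np.argsort(d2, axis=1)[:, :k]


def recall_at_k(approx_ids, exact_ids):
    """Fraction of true neighbours that the approximate search returned."""
    hits = sum(len(set(a) & set(e)) for a, e in zip(approx_ids, exact_ids))
    return hits / exact_ids.size


def measure(search_fn, queries, base, k=10):
    """Time search_fn over all queries and score it against exact results."""
    truth = exact_top_k(queries, base, k)
    start = time.perf_counter()
    found = search_fn(queries, k)
    elapsed = time.perf_counter() - start
    return {"qps": len(queries) / elapsed, "recall": recall_at_k(found, truth)}


rng = np.random.default_rng(0)
base = rng.normal(size=(2000, 32)).astype(np.float32)
queries = rng.normal(size=(50, 32)).astype(np.float32)


def half_scan(qs, k):
    """Stand-in for a lossy index: only looks at the first half of the data."""
    return exact_top_k(qs, base[:1000], k)


exact_report = measure(lambda qs, k: exact_top_k(qs, base, k), queries, base)
lossy_report = measure(half_scan, queries, base)
print(exact_report, lossy_report)
```

Reporting the two numbers together matters: an index can always be made faster by letting recall drop.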

3. Index Algorithms in Detail

3.1 How HNSW Works

flowchart TB
    subgraph L2["Layer 2: fast long-range hops"]
        direction LR
        A2["A"] --- B2["B"]
        B2 --- C2["C"]
    end
    subgraph L1["Layer 1: intermediate"]
        direction LR
        A1["A"] --- B1["D"]
        B1 --- C1["E"]
        D1["D"] --- E1["F"]
    end
    subgraph L0["Layer 0: fine-grained search"]
        direction LR
        A0["A"] --- B0["G"]
        B0 --- C0["H"]
        C0 --- D0["I"]
        D0 --- E0["J"]
        E0 --- F0["K"]
        F0 --- G0["L"]
    end
    L2 -.->|"descend"| L1
    L1 -.->|"descend"| L0
    Search["Search: enter at the top layer → greedy search → descend layer by layer"]
    style L2 fill:#e3f2fd,stroke:#1976d2
    style L1 fill:#fff8e1,stroke:#f57c00
    style L0 fill:#e8f5e9,stroke:#388e3c
    style Search fill:#f3e5f5

graph TB
    subgraph "Layer 2"
        A2["Node A"]
        B2["Node B"]
        C2["Node C"]
    end
    subgraph "Layer 1"
        A1["Node A"] --> B1["Node D"]
        B1 --> C1["Node E"]
        D1 --> E1["Node F"]
    end
    subgraph "Layer 0"
        A0["Node A"] --> B0["Node G"]
        B0 --> C0["Node H"]
        C0 --> D0["Node I"]
        D0 --> E0["Node J"]
    end

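# HNSW search, sketched as a skeleton (graph traversal itself is omitted)
class HNSWIndex:
    def __init__(self, m: int = 16, ef_construction: int = 200):
        self.m = m  # maximum number of links per node on each layer
        self.ef_construction = ef_construction  # candidate-list size used while building
        self.max_layer = 0       # index of the current top layer
        self.entry_point = None  # node id where every search starts

    def search(self, query_vector: list, k: int = 10, ef: int = 100):
        """
        Search procedure:
        1. Start from the entry point on the top layer.
        2. Descend greedily layer by layer, keeping only the closest node.
        3. Run a wider search (candidate list of size ef) on layer 0 and
           return the k closest candidates. The result is approximate.
        """
        entry = [self.entry_point]

        # Upper layers: greedy descent with a candidate list of size 1
        for layer in range(self.max_layer, 0, -1):
            entry = self._search_layer(query_vector, entry, ef=1, layer=layer)

        # Layer 0: ef (not ef_construction) sets the recall/speed trade-off
        candidates = self._search_layer(query_vector, entry, ef=max(ef, k), layer=0)
        return candidates[:k]

    def _search_layer(self, query_vector, entry_points, ef, layer):
        """Best-first search within one layer; returns up to ef nodes sorted by distance."""
        raise NotImplementedError("graph traversal omitted in this sketch")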
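
The layered structure comes from how each inserted node is assigned a top layer. In the original HNSW formulation the level is drawn as floor(-ln(U) * mL) with U uniform on (0, 1) and mL = 1/ln(M), so the share of nodes reaching layer 1 or above is about 1/M. The snippet below is a small simulation of that rule; the function name and the sample size are arbitrary choices.

```python
import math
import random
from collections import Counter


def random_level(m, rng):
    """Draw the top layer for a new node: floor(-ln(U) * mL), mL = 1 / ln(M)."""
    m_l = 1.0 / math.log(m)
    u = 1.0 - rng.random()  # uniform on (0, 1], which avoids log(0)
    return int(-math.log(u) * m_l)


rng = random.Random(42)
levels = Counter(random_level(16, rng) for _ in range(100_000))

for level in sorted(levels):
    print(f"layer {level}: {levels[level] / 100_000:.4%} of nodes")
```

With M = 16, roughly 15 out of 16 nodes live only on layer 0, which is why the upper layers are sparse enough to act as express lanes.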

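3.2 IVF (Inverted File) Index

# IVF index: a minimal NumPy sketch (small data only; no production tuning)
import numpy as np


def kmeans(vectors: np.ndarray, k: int, iters: int = 10, seed: int = 0):
    """Plain Lloyd's k-means; returns (centers, assignments). Needs len(vectors) >= k."""
    rng = np.random.default_rng(seed)
    centers = vectors[rng.choice(len(vectors), size=k, replace=False)].copy()
    for _ in range(iters):
        dists = np.linalg.norm(vectors[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for c in range(k):
            members = vectors[labels == c]
            if len(members):
                centers[c] = members.mean(axis=0)
    # Final assignment against the converged centers
    dists = np.linalg.norm(vectors[:, None, :] - centers[None, :, :], axis=2)
    return centers, dists.argmin(axis=1)


class IVFIndex:
    def __init__(self, nlist: int = 1024):
        self.nlist = nlist  # number of cluster centers

    def build(self, vectors):
        """Cluster the vectors, then record which vectors fall in each cluster."""
        self.vectors = np.asarray(vectors, dtype=np.float32)
        # 1. k-means clustering
        self.centers, assignments = kmeans(self.vectors, self.nlist)
        # 2. Build the inverted lists: cluster id -> vector ids
        self.inverted_index = {}
        for i, cluster_id in enumerate(assignments):
            self.inverted_index.setdefault(int(cluster_id), []).append(i)

    def search(self, query, k: int = 10, nprobe: int = 10):
        """Scan only the nprobe clusters whose centers are closest to the query."""
        query = np.asarray(query, dtype=np.float32)
        # 1. Find the nprobe nearest cluster centers
        center_dists = np.linalg.norm(self.centers - query, axis=1)
        nearest_centers = np.argsort(center_dists)[:nprobe]
        # 2. Brute-force search inside those clusters
        candidates = []
        for center_id in nearest_centers:
            candidates.extend(self.inverted_index.get(int(center_id), []))
        candidates = np.asarray(candidates, dtype=np.int64)
        dists = np.linalg.norm(self.vectors[candidates] - query, axis=1)
        return candidates[np.argsort(dists)[:k]].tolist()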
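
A rough cost model shows why probing a few clusters pays off. Assuming clusters of equal size (an idealisation; real k-means clusters are uneven), a query compares against all `nlist` centers and then against about `n_vectors * nprobe / nlist` candidates. The function name and the one-million-vector figure are illustrative assumptions.

```python
def ivf_scan_cost(n_vectors, nlist, nprobe):
    """Approximate number of distance computations per query with balanced clusters."""
    centroid_comparisons = nlist
    candidate_comparisons = n_vectors * nprobe // nlist
    return centroid_comparisons + candidate_comparisons


n = 1_000_000
cost = ivf_scan_cost(n, nlist=1024, nprobe=10)
print(f"{cost} distance computations, {cost / n:.2%} of a full scan")
```

Raising `nprobe` moves the cost back toward a full scan and the recall back toward 100%, which is the tuning knob IVF exposes.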

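3.3 PQ (Product Quantization) Compression

# Product Quantization: a minimal NumPy sketch (small data only)
import numpy as np


def kmeans(vectors: np.ndarray, k: int, iters: int = 10, seed: int = 0) -> np.ndarray:
    """Plain Lloyd's k-means; returns the k centers. Needs len(vectors) >= k."""
    rng = np.random.default_rng(seed)
    centers = vectors[rng.choice(len(vectors), size=k, replace=False)].copy()
    for _ in range(iters):
        dists = np.linalg.norm(vectors[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for c in range(k):
            members = vectors[labels == c]
            if len(members):
                centers[c] = members.mean(axis=0)
    return centers


class PQIndex:
    def __init__(self, m: int = 8, ks: int = 256):
        """
        m: number of sub-vectors (segments), commonly 8-16
        ks: codebook size per subspace, commonly 256 so each code fits in one byte
        """
        self.m = m
        self.ks = ks
        self.codebooks = []
        self.codes = None

    def train(self, vectors):
        """Train one codebook per subspace with k-means."""
        vectors = np.asarray(vectors, dtype=np.float32)
        dim = vectors.shape[1]
        assert dim % self.m == 0, "dim must be divisible by m"
        self.sub_dim = dim // self.m
        self.codebooks = [
            kmeans(vectors[:, i * self.sub_dim:(i + 1) * self.sub_dim], self.ks)
            for i in range(self.m)
        ]

    def encode(self, vector):
        """Encode: pick the nearest codeword in each subspace."""
        vector = np.asarray(vector, dtype=np.float32)
        codes = []
        for i, codebook in enumerate(self.codebooks):
            sub_vec = vector[i * self.sub_dim:(i + 1) * self.sub_dim]
            codes.append(int(np.argmin(np.linalg.norm(codebook - sub_vec, axis=1))))
        return codes

    def add(self, vectors):
        """Store the PQ codes of the database vectors."""
        self.codes = np.array([self.encode(v) for v in vectors], dtype=np.int32)

    def search(self, query, candidates, k: int = 10):
        """Approximate distances with table lookups instead of full computations."""
        query = np.asarray(query, dtype=np.float32)
        # Precompute squared distances from each query sub-vector to every codeword
        distance_table = [
            np.sum((cb - query[i * self.sub_dim:(i + 1) * self.sub_dim]) ** 2, axis=1)
            for i, cb in enumerate(self.codebooks)
        ]
        # Sum the table entries to get each candidate's approximate squared distance
        scored = []
        for vector_id in candidates:
            total_dist = sum(
                distance_table[i][self.codes[vector_id][i]]
                for i in range(self.m)
            )
            scored.append((float(total_dist), vector_id))
        return sorted(scored)[:k]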
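
The storage arithmetic behind PQ is worth working through once. Each vector is replaced by `m` codes of ceil(log2(ks)) bits each, while the codebooks are a fixed cost shared by the whole collection. The helper below is a sketch of that calculation; the 768-dimension float32 example is an assumption chosen to match the embedding size used elsewhere in this post.

```python
import math


def pq_footprint(dim, m, ks, float_bytes=4):
    """Return (raw bytes per vector, PQ bytes per vector, total codebook bytes)."""
    bits_per_code = math.ceil(math.log2(ks))
    raw = dim * float_bytes
    code = math.ceil(m * bits_per_code / 8)
    # m codebooks, each with ks codewords of dim/m floats
    codebook = ks * dim * float_bytes
    return raw, code, codebook


raw, code, codebook = pq_footprint(dim=768, m=8, ks=256)
print(f"raw: {raw} B, PQ code: {code} B, ratio: {raw // code}x")
print(f"codebooks (shared): {codebook / 1024:.0f} KiB")
```

Fewer segments mean stronger compression but coarser distance estimates, so `m` is usually raised until recall is acceptable.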

4. Milvus in Depth

4.0 Milvus Architecture in Detail

flowchart TB
    subgraph Client["Client"]
        SDK["SDK: Python/Go/Java/Node"]
    end
    subgraph Access["Access layer"]
        direction LR
        P1["Proxy 1"]
        P2["Proxy 2"]
        P3["Proxy N"]
        P1 --- P2 --- P3
    end
    subgraph Coordinator["Coordination layer"]
        direction TB
        RC["Root Coordinator: metadata management"]
        QC["Query Coordinator: query scheduling"]
        DC["Data Coordinator: data management"]
        IC["Index Coordinator: index management"]
    end
    subgraph Worker["Worker nodes"]
        direction TB
        QN["Query Node: query execution"]
        DN["Data Node: data ingestion"]
        IN["Index Node: index building"]
    end
    subgraph Storage["Storage layer"]
        direction TB
        MS["Message storage: Kafka/Pulsar"]
        OS["Object storage: MinIO/S3"]
        MD["Meta storage: etcd"]
    end
    SDK --> Access
    Access --> Coordinator
    Coordinator --> Worker
    Worker --> Storage
    style Client fill:#e8eaf6
    style Access fill:#e3f2fd
    style Coordinator fill:#fff8e1
    style Worker fill:#f3e5f5
    style Storage fill:#e8f5e9

4.1 Architecture Design

graph TB
    subgraph "Access layer"
        A["SDK"] --> B["Proxy"]
        B --> C["RBAC"]
    end
    subgraph "Coordination layer"
        C --> D["Root Coordinator"]
        C --> E["Query Coordinator"]
        C --> F["Index Coordinator"]
        C --> L["Data Coordinator"]
    end
    subgraph "Execution layer"
        E --> G["Query Node"]
        F --> H["Index Node"]
        L --> I["Data Node"]
    end
    subgraph "Storage layer"
        G --> J["Object Storage"]
        H --> J
        I --> J
        J --> K["MinIO/S3"]
    end

4.2 Milvus Usage Example

from pymilvus import connections, Collection, FieldSchema, CollectionSchema, DataType

# 1. Connect to Milvus
connections.connect(host='localhost', port='19530')

# 2. Define the schema
fields = [
    FieldSchema(name="id", dtype=DataType.INT64, is_primary=True),
    FieldSchema(name="embedding", dtype=DataType.FLOAT_VECTOR, dim=768),
    FieldSchema(name="category", dtype=DataType.VARCHAR, max_length=100),
]
schema = CollectionSchema(fields=fields, description="Vector search collection")

# 3. Create the collection
collection = Collection(name="documents", schema=schema)

# 4. Create the index
index_params = {
    "index_type": "HNSW",
    "params": {"M": 16, "efConstruction": 200},
    "metric_type": "L2"
}
collection.create_index(field_name="embedding", index_params=index_params)

# 5. Insert data (column-oriented: every column must have the same length)
entities = [
    [1, 2],  # id
    [[0.1]*768, [0.2]*768],  # embedding
    ["tech", "entertainment"]  # category
]
collection.insert(entities)
collection.flush()  # persist the inserted rows

# 6. Load the collection into memory, then search
collection.load()
search_params = {"metric_type": "L2", "params": {"ef": 100}}
results = collection.search(
    data=[[0.1]*768],
    anns_field="embedding",
    param=search_params,
    limit=10,
    expr="category == 'tech'"
)

# One result list per query vector; each hit carries an id and a distance
for hit in results[0]:
    print(hit.id, hit.distance)

5. Qdrant in Depth

5.0 Qdrant Architecture in Detail

flowchart TB
    subgraph Client["Client"]
        direction LR
        C1["Python SDK"]
        C2["Rust SDK"]
        C3["REST API"]
    end
    subgraph API["API layer"]
        direction TB
        REST["API: HTTP (REST) and gRPC"]
    end
    subgraph Core["Core engine (Rust)"]
        direction TB
        CM["Collection Manager: collection management"]
        SEG["Segment: data sharding"]
        HNSW["HNSW index: high-performance graph index"]
        QC["Quantization: vector quantization"]
    end
    subgraph Storage["Storage layer"]
        direction TB
        MEM["Memory: hot data"]
        ROCK["RocksDB: persistent storage"]
        SNAP["Snapshot: backup and restore"]
    end
    Client --> API
    API --> Core
    Core --> Storage
    style Client fill:#e8eaf6
    style API fill:#e3f2fd
    style Core fill:#fff8e1
    style Storage fill:#e8f5e9

flowchart LR
    subgraph QdrantFeatures["Qdrant highlights"]
        direction TB
        F1["Written in Rust: memory safety plus high performance"]
        F2["Single-binary deployment: simple operations"]
        F3["Native filtering: metadata and vectors combined"]
        F4["Dynamic load: automatic balancing"]
    end
    style QdrantFeatures fill:#f3e5f5

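5.1 Core Features

# Qdrant configuration file (illustrative fragment)
storage:
  # Storage path
  storage_path: /qdrant/storage

  # HNSW parameters
  hnsw_index:
    m: 16 # number of links per node
    ef_construct: 100 # search depth while building the graph
    full_scan_threshold: 10000 # below this threshold, a full scan is preferred over the index

  # Quantization settings (quantization is commonly set per collection through the API)
  quantization:
    binary: false
    product:
      compression: 8 # compression ratio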

5.2 Qdrant Usage Example

from qdrant_client import QdrantClient
from qdrant_client.models import (
    Distance,
    FieldCondition,
    Filter,
    MatchValue,
    PointStruct,
    VectorParams,
)

client = QdrantClient(host="localhost", port=6333)

# 1. Create the collection
client.create_collection(
    collection_name="articles",
    vectors_config=VectorParams(size=768, distance=Distance.COSINE)
)

# 2. Insert vectors
client.upsert(
    collection_name="articles",
    points=[
        PointStruct(
            id=1,
            vector=[0.1]*768,
            payload={"title": "Python tutorial", "category": "programming"}
        ),
    ]
)

# 3. Search with a payload filter
# Note: newer qdrant-client releases deprecate search() in favour of query_points()
results = client.search(
    collection_name="articles",
    query_vector=[0.1]*768,
    query_filter=Filter(
        must=[
            FieldCondition(key="category", match=MatchValue(value="programming"))
        ]
    ),
    limit=5
)

# 4. Threshold search (filter by score)
results = client.search(
    collection_name="articles",
    query_vector=[0.1]*768,
    score_threshold=0.8,  # drop results whose similarity score is below 0.8
    limit=10
)

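6. Selection Guide

6.0 Comparing the Four Databases

flowchart TB
    subgraph Milvus["Milvus"]
        direction TB
        M_Pros["Distributed architecture; very large scale; broadest feature set"]
        M_Cons["Complex deployment; high operations cost"]
        M_Use["Fits: large enterprises, hundreds of millions of vectors"]
    end
    subgraph Qdrant["Qdrant"]
        direction TB
        Q_Pros["Strong performance; simple deployment; native filtering"]
        Q_Cons["Smaller community; cloud offering still young"]
        Q_Use["Fits: performance-critical workloads, millions to tens of millions"]
    end
    subgraph Pinecone["Pinecone"]
        direction TB
        P_Pros["Fully managed; zero operations; quick to start"]
        P_Cons["Higher cost; data sovereignty concerns"]
        P_Use["Fits: fast iteration, SaaS products"]
    end
    subgraph Chroma["Chroma"]
        direction TB
        C_Pros["Embedded; starts with one line of code; free and open source"]
        C_Cons["No distributed mode; limited performance"]
        C_Use["Fits: prototyping, small-scale use"]
    end
    style Milvus fill:#e3f2fd
    style Qdrant fill:#fff8e1
    style Pinecone fill:#e8f5e9
    style Chroma fill:#fce4ec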

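6.1 Choosing by Scenario

Scenario	Recommended	Why
Rapid prototyping	Chroma	Embedded, starts with one line of code
Self-hosted production	Qdrant/Milvus	Good performance, complete feature set
Cloud-native SaaS	Pinecone	Fully managed, no operations work
Hybrid retrieval needed	Qdrant/Weaviate	Supports combined vector and full-text retrieval
Very large scale (>100 million)	Milvus	Mature distributed architecture
Low-latency real-time systems	Qdrant	Rust implementation, strong performance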

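6.2 Performance Tuning Suggestions

# Vector database tuning checklist
optimization_checklist = {
    "Index parameters": {
        "HNSW_M": "16-64; larger values raise recall but cost more memory and build time",
        "HNSW_ef": "100-500; set per query, larger is more accurate but slower",
        "HNSW_ef_construction": "200-400; controls graph quality at build time"
    },
    "Quantization": {
        "binary": "1 bit per dimension (about 32x smaller than float32); fastest, largest accuracy loss",
        "product": "roughly 4-16x compression, more with fewer segments; accuracy is tunable",
        "scalar": "float32 to int8 (about 4x smaller); small accuracy loss, modest speed-up"
    },
    "Hardware": {
        "Memory": "ideally holds all vectors plus the index; otherwise aim for a cache hit rate above 80%",
        "CPU": "drives index build speed; HNSW construction is CPU-bound",
        "SSD": "needed for large collections; backs the index when memory is insufficient"
    }
}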
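
The memory item can be sized with a back-of-the-envelope estimate. The sketch below assumes float32 vectors, 4-byte neighbour ids, and up to 2*M links per node on layer 0 (the usual HNSW default); it ignores upper-layer links, payloads, and allocator overhead, so treat the result as a lower bound rather than a capacity plan.

```python
def hnsw_memory_bytes(n_vectors, dim, m, float_bytes=4, id_bytes=4):
    """Rough lower bound: raw vectors plus layer-0 neighbour lists."""
    vector_bytes = n_vectors * dim * float_bytes
    link_bytes = n_vectors * 2 * m * id_bytes  # layer 0 allows up to 2*M links
    return vector_bytes + link_bytes


total = hnsw_memory_bytes(10_000_000, dim=768, m=16)
print(f"{total / 2**30:.1f} GiB for 10 million 768-dimensional vectors")
```

At this dimensionality the vectors dominate the total, which is why quantization usually saves more memory than lowering M.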

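7. Summary

mindmap
  root((Vector databases))
    Milvus
      Distributed architecture
      Cloud-native design
      Most complete feature set
      Suited to large scale
    Qdrant
      High performance in Rust
      Simple single-node deployment
      Strong native filtering
      Suited to high concurrency
    Pinecone
      Fully managed SaaS
      Zero operations
      Fast to launch
      Suited to startups
    Chroma
      Lightweight and embedded
      Convenient for development
      First choice for prototypes
      Suited to small scale
    Selection factors
      Data scale
      Performance requirements
      Operations capability
      Cost budget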

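Database	Strengths	Best suited for
Milvus	Comprehensive features, mature distributed design	Very large production deployments
Qdrant	Strong performance, Rust implementation	High-concurrency, low-latency workloads
Pinecone	Fully managed, simple operations	Cloud-native products that need to ship quickly
Chroma	Lightweight, fast prototyping	Small scale, local development
Weaviate	Hybrid search, native GraphQL	Combining knowledge graphs with vectors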

flowchart TB
    Start["Start selection"] --> Q1{"Data scale?"}
    Q1 -->|"≤ 1 million"| Q2{"Need full-text search?"}
    Q1 -->|"1 million to 100 million"| Q3{"Performance first or features first?"}
    Q1 -->|"≥ 100 million"| Milvus["Milvus: distributed architecture"]
    Q2 -->|"Yes"| Weaviate["Weaviate"]
    Q2 -->|"No"| Q4{"Development stage?"}
    Q4 -->|"Prototype/development"| Chroma["Chroma"]
    Q4 -->|"Production"| Qdrant["Qdrant"]
    Q3 -->|"Performance first"| Qdrant
    Q3 -->|"Features first"| Milvus
    Q5{"Operations capability?"} -->|"Strong"| Qdrant
    Q5 -->|"Weak"| Pinecone["Pinecone"]
    style Start fill:#e8eaf6
    style Milvus fill:#e3f2fd
    style Qdrant fill:#fff8e1
    style Pinecone fill:#e8f5e9
    style Chroma fill:#fce4ec
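
The decision flow can also be written as a small function, which makes the thresholds explicit and easy to adjust. This is a sketch of the connected branches only: the function name and parameter names are assumptions, and the "operations capability" question is left out because the flowchart does not link it to the main path.

```python
def recommend(n_vectors, needs_full_text=False, stage="production", priority="performance"):
    """Map the flowchart's questions to a database name.

    stage: "prototype" or "production"; priority: "performance" or "features".
    """
    if n_vectors >= 100_000_000:
        return "Milvus"
    if n_vectors <= 1_000_000:
        if needs_full_text:
            return "Weaviate"
        return "Chroma" if stage == "prototype" else "Qdrant"
    # Between 1 million and 100 million vectors
    return "Qdrant" if priority == "performance" else "Milvus"


print(recommend(500_000, stage="prototype"))
print(recommend(50_000_000, priority="features"))
print(recommend(200_000_000))
```

Treat the output as a starting shortlist: cost, team experience, and data-residency rules can reasonably override it.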