Agent Memory Systems: Short-Term, Long-Term, and Vector Databases

Souloss

2024-12-20

AI / Agent / RAG

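"What's your name?" "I'm Xiaozhi."

"What's my name?" "Sorry, I don't know your name."

This is what many AI systems really look like: they have no memory.

An Agent without memory is like the protagonist of Memento. Every conversation starts from scratch, so it cannot accumulate experience, build a user profile, or offer personalized service.

A memory system is the key infrastructure that turns an Agent from a "one-off tool" into a "long-term partner".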

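Key Points

The three-layer architecture of Agent memory
Short-term memory: context window management
Long-term memory: choosing and using a vector database
Episodic memory: tracking task execution state
Memory retrieval strategies
Memory compression and forgetting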

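1. Why Do Agents Need Memory?

1.1 The Problem Without Memory

User: "Help me analyze Apple's stock."

AI: "Sure, here is the analysis of Apple's stock..."

(Three days later)

User: "That analysis from last time, could you go into more detail?"

AI: "Sorry, I don't know which analysis you mean..."

Problems:
1. Cannot remember past conversations
2. Cannot resolve what the user is referring to
3. Cannot accumulate user preferences
4. Has to start from scratch every time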

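1.2 The Value of Memory

flowchart TD
    A[Memory system] --> B[User profile]
    A --> C[Task context]
    A --> D[Knowledge accumulation]
    A --> E[Learning from experience]
    B --> B1[Personalized service]
    C --> C1[Multi-turn task support]
    D --> D1[Knowledge base building]
    E --> E1[Continuous capability improvement]

The core value memory brings:

Value	Description	Example
Continuity	Supports conversations across sessions	"How is the project we discussed last time going?"
Personalization	Remembers user preferences	"Generate the report in your usual format"
Learning	Improves from experience	Avoids repeating the same mistake
Efficiency	Reduces repeated communication	No need to re-explain the background every time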

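2. The Three-Layer Architecture of Agent Memory

2.1 Architecture Overview

flowchart TB
    subgraph Perception["Perception layer"]
        A[User input]
    end
    subgraph Memory["Memory layer"]
        B[Short-term memory<br/>Working Memory]
        C[Long-term memory<br/>Long-term Memory]
        D[Episodic memory<br/>Episodic Memory]
    end
    subgraph Storage["Storage layer"]
        E[Memory/cache]
        F[Vector database]
        G[Relational database]
    end
    A --> B
    B <--> C
    B <--> D
    C --> F
    D --> G
    B --> E

2.2 Comparing the Three Layers

Type	Storage	Lifetime	Capacity	Purpose
Short-term memory	Memory/cache	Per session	Limited	Context of the current conversation
Long-term memory	Vector database	Persistent	Large	User profile, knowledge base
Episodic memory	Relational database	Per task	Medium	Task execution state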

3. Short-Term Memory (Working Memory)

3.1 What Is Short-Term Memory?

Short-term memory is the Agent's "workbench" for the current task. It holds the temporary information that is actively in use.

┌────────────────────────────────────────────────────────────┐
│                Short-Term Memory Structure                 │
├────────────────────────────────────────────────────────────┤
│                                                            │
│  System Prompt                                             │
│  ├── Role definition                                       │
│  ├── Capability description                                │
│  └── Output format rules                                   │
│                                                            │
│  Conversation History                                      │
│  ├── User message 1                                        │
│  ├── Assistant reply 1                                     │
│  ├── User message 2                                        │
│  ├── Assistant reply 2                                     │
│  └── ...                                                   │
│                                                            │
│  Tool Calls                                                │
│  ├── Call: search("weather")                               │
│  ├── Result: Beijing, sunny, 25°C                          │
│  └── ...                                                   │
│                                                            │
│  Task State                                                │
│  ├── Completed steps                                       │
│  ├── Pending steps                                         │
│  └── Intermediate results                                  │
│                                                            │
└────────────────────────────────────────────────────────────┘

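3.2 Context Window Management

A large model's context window is limited (for example, 128K tokens), so it has to be managed deliberately.

flowchart LR
    A[New message] --> B{Is the window full?}
    B -->|No| C[Append directly]
    B -->|Yes| D[Compress/summarize]
    D --> E[Replace old content]
    E --> C
    C --> F[Update window]

Management Strategies

Strategy 1: Sliding Window

class SlidingWindowMemory:
    """Sliding-window memory management"""

    def __init__(self, max_messages: int = 20):
        self.max_messages = max_messages
        self.messages = []

    def add_message(self, role: str, content: str):
        self.messages.append({"role": role, "content": content})
        # Keep the window at its maximum size by dropping the oldest messages
        if len(self.messages) > self.max_messages:
            self.messages = self.messages[-self.max_messages:]

    def get_context(self) -> list:
        return self.messages

Problem: this is crude. It counts messages rather than tokens, and anything that falls out of the window, important or not, is lost.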
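
One way to soften both weaknesses is to trim by an estimated token budget and to pin system messages so they are never evicted. The sketch below is only an illustration: `TokenBudgetWindow` and `estimate_tokens` are hypothetical names, and the "about 4 characters per token" rule is a rough assumption that a real tokenizer should replace.

```python
from typing import Dict, List


def estimate_tokens(text: str) -> int:
    """Very rough token estimate (assumption: about 4 characters per token)."""
    return max(1, len(text) // 4)


class TokenBudgetWindow:
    """Sliding window that trims by token budget and never drops system messages."""

    def __init__(self, max_tokens: int = 1000):
        self.max_tokens = max_tokens
        self.messages: List[Dict[str, str]] = []

    def _total_tokens(self) -> int:
        return sum(estimate_tokens(m["content"]) for m in self.messages)

    def add_message(self, role: str, content: str) -> None:
        self.messages.append({"role": role, "content": content})
        # Evict the oldest non-system message until the budget is met.
        while self._total_tokens() > self.max_tokens:
            for i, m in enumerate(self.messages):
                if m["role"] != "system":
                    del self.messages[i]
                    break
            else:
                break  # only system messages remain; nothing left to evict

    def get_context(self) -> List[Dict[str, str]]:
        return list(self.messages)


window = TokenBudgetWindow(max_tokens=30)
window.add_message("system", "You are a helpful assistant.")
for n in range(5):
    window.add_message("user", f"message number {n} " + "x" * 40)
context = window.get_context()
```

With this tiny budget only the system prompt and the newest user message survive, which shows the eviction order. Important user messages are still lost, which is what the next two strategies address.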

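Strategy 2: Summarization

class SummaryMemory:
    """Summary-compression memory management.

    Assumes `llm` is any object exposing generate(prompt: str) -> str.
    """

    def __init__(self, llm, max_tokens: int = 4000, keep_recent: int = 5):
        self.llm = llm
        self.max_tokens = max_tokens
        self.keep_recent = keep_recent  # number of latest messages kept verbatim
        self.summary = ""
        self.recent_messages = []

    def add_message(self, role: str, content: str):
        self.recent_messages.append({"role": role, "content": content})

        # Compress once the estimated size exceeds the threshold
        if self._estimate_tokens() > self.max_tokens:
            self._compress()

    def _estimate_tokens(self) -> int:
        """Rough estimate (about 4 characters per token); use a real tokenizer in production"""
        text = self.summary + "".join(m["content"] for m in self.recent_messages)
        return len(text) // 4

    def _format_messages(self, messages: list) -> str:
        return "\n".join(f"{m['role']}: {m['content']}" for m in messages)

    def _compress(self):
        """Fold the older messages into the summary using the LLM"""
        # Only summarize what will be dropped, so the retained tail is not duplicated
        to_summarize = self.recent_messages[:-self.keep_recent]
        if not to_summarize:
            return

        prompt = f"""Summarize the key information in the following conversation:

{self._format_messages(to_summarize)}

Current summary: {self.summary}

Produce a new consolidated summary that keeps the important information."""

        self.summary = self.llm.generate(prompt)
        # Keep the most recent messages verbatim
        self.recent_messages = self.recent_messages[-self.keep_recent:]

    def get_context(self) -> list:
        context = []
        if self.summary:
            context.append({
                "role": "system",
                "content": f"Summary of earlier conversation: {self.summary}"
            })
        context.extend(self.recent_messages)
        return context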

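Strategy 3: Hybrid

class HybridMemory:
    """Hybrid strategy: keep important messages verbatim, compress regular ones.

    Assumes `llm` is any object exposing generate(prompt: str) -> str.
    """

    def __init__(self, llm, max_tokens: int = 8000):
        self.llm = llm
        self.max_tokens = max_tokens
        self.important_messages = []  # important messages are kept permanently
        self.regular_messages = []    # regular messages may be compressed
        self.summary = ""

    def add_message(self, role: str, content: str, important: bool = False):
        message = {"role": role, "content": content, "important": important}

        if important:
            self.important_messages.append(message)
        else:
            self.regular_messages.append(message)

        if self._estimate_tokens() > self.max_tokens:
            self._compress_regular_messages()

    def _estimate_tokens(self) -> int:
        """Rough estimate (about 4 characters per token); use a real tokenizer in production"""
        messages = self.important_messages + self.regular_messages
        return len(self.summary + "".join(m["content"] for m in messages)) // 4

    def _format_messages(self, messages: list) -> str:
        return "\n".join(f"{m['role']}: {m['content']}" for m in messages)

    def _compress_regular_messages(self):
        """Compress all but the last three regular messages"""
        if len(self.regular_messages) > 3:
            to_compress = self.regular_messages[:-3]
            self.regular_messages = self.regular_messages[-3:]

            prompt = f"Summarize the following conversation:\n{self._format_messages(to_compress)}"
            new_summary = self.llm.generate(prompt)

            self.summary = f"{self.summary}\n{new_summary}" if self.summary else new_summary

    def get_context(self) -> list:
        """Summary first, then important messages, then recent regular ones.

        Note: this grouping does not preserve the original interleaving of messages.
        """
        context = []
        if self.summary:
            context.append({"role": "system",
                            "content": f"Summary of earlier conversation: {self.summary}"})
        for m in self.important_messages + self.regular_messages:
            context.append({"role": m["role"], "content": m["content"]})
        return context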

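4. Long-Term Memory

4.1 What Is Long-Term Memory?

Long-term memory persists user information, knowledge bases, and past interactions so that they can be retrieved across sessions.

flowchart TD
    A[Long-term memory] --> B[User profile]
    A --> C[Knowledge base]
    A --> D[Interaction history]
    B --> B1[Preference settings]
    B --> B2[Personal information]
    B --> B3[Behavior patterns]
    C --> C1[Document knowledge]
    C --> C2[Domain knowledge]
    C --> C3[Operation manuals]
    D --> D1[Past conversations]
    D --> D2[Task records]
    D --> D3[Feedback results]

4.2 Choosing a Vector Database

Long-term memory is usually stored in a vector database, which supports semantic retrieval: text is converted to an embedding vector, and a query returns the stored items whose vectors are closest to the query's vector.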
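
To make "semantic retrieval" concrete, here is a brute-force sketch of what a vector database does conceptually: score every stored vector against the query by cosine similarity and return the top k. The 3-dimensional "embeddings" are made up for illustration, and real systems use approximate indexes rather than a full scan.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float],
          items: List[Tuple[str, Sequence[float]]],
          k: int = 2) -> List[Tuple[str, float]]:
    """Return the k (text, similarity) pairs most similar to the query."""
    scored = [(text, cosine_similarity(query, vec)) for text, vec in items]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional "embeddings" (made up for illustration).
memories = [
    ("User prefers reports in PDF", [0.9, 0.1, 0.0]),
    ("User asked about Apple stock", [0.1, 0.9, 0.1]),
    ("User lives in Beijing", [0.0, 0.2, 0.9]),
]
results = top_k([0.2, 0.8, 0.0], memories, k=2)
```

The query vector points mostly along the second dimension, so the "Apple stock" memory ranks first. A full scan like this is linear in the number of memories, which is the cost that ANN indexes are designed to avoid.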

┌────────────────────────────────────────────────────────────┐
│             Popular Vector Databases Compared              │
├────────────────────────────────────────────────────────────┤
│                                                            │
│  ChromaDB                                                  │
│  ├── Traits: lightweight, easy to start, open source       │
│  ├── Best for: small to medium scale, rapid prototyping    │
│  └── Deployment: local / Docker                            │
│                                                            │
│  FAISS (Meta)                                              │
│  ├── Traits: high performance, pure vector search          │
│  ├── Best for: large-scale vector retrieval                │
│  └── Deployment: in-process library                        │
│                                                            │
│  Milvus                                                    │
│  ├── Traits: distributed, enterprise-grade, feature-rich   │
│  ├── Best for: large-scale production                      │
│  └── Deployment: cloud native / Kubernetes                 │
│                                                            │
│  Pinecone                                                  │
│  ├── Traits: managed service, zero ops                     │
│  ├── Best for: fast launch, no infrastructure to run       │
│  └── Deployment: cloud service                             │
│                                                            │
│  Weaviate                                                  │
│  ├── Traits: GraphQL API, hybrid search                    │
│  ├── Best for: structured + vector hybrid queries          │
│  └── Deployment: Docker / cloud service                    │
│                                                            │
└────────────────────────────────────────────────────────────┘

4.3 Example Implementation: ChromaDB

import uuid
from typing import Dict, List, Optional

import chromadb
from openai import OpenAI

class LongTermMemory:
    """Long-term memory system backed by ChromaDB"""

    def __init__(self, collection_name: str = "agent_memory",
                 persist_directory: str = "./chroma_db"):
        # PersistentClient is the Chroma >= 0.4 API; the older
        # Settings(chroma_db_impl="duckdb+parquet") configuration is no longer accepted.
        self.client = chromadb.PersistentClient(path=persist_directory)
        self.collection = self.client.get_or_create_collection(
            name=collection_name,
            metadata={"hnsw:space": "cosine"}  # use cosine distance
        )
        self.embedder = OpenAI()  # generates embeddings; reads OPENAI_API_KEY from the environment

    def _get_embedding(self, text: str) -> List[float]:
        """Generate the vector representation of a text"""
        response = self.embedder.embeddings.create(
            model="text-embedding-3-small",
            input=text
        )
        return response.data[0].embedding

    def store(self, content: str, metadata: Optional[Dict] = None) -> str:
        """Store a memory and return its id"""
        memory_id = str(uuid.uuid4())
        embedding = self._get_embedding(content)

        self.collection.add(
            ids=[memory_id],
            embeddings=[embedding],
            documents=[content],
            # Some Chroma versions reject empty metadata dicts, so pass None instead
            metadatas=[metadata] if metadata else None
        )

        return memory_id

    def recall(self, query: str, n_results: int = 5) -> List[Dict]:
        """Retrieve the memories most relevant to the query"""
        query_embedding = self._get_embedding(query)

        results = self.collection.query(
            query_embeddings=[query_embedding],
            n_results=n_results,
            include=["documents", "metadatas", "distances"]
        )

        # Results are nested per query; index 0 is our single query
        memories = []
        for i in range(len(results["ids"][0])):
            memories.append({
                "id": results["ids"][0][i],
                "content": results["documents"][0][i],
                "metadata": results["metadatas"][0][i] or {},
                "distance": results["distances"][0][i]
            })

        return memories

    def forget(self, memory_id: str):
        """Delete a memory"""
        self.collection.delete(ids=[memory_id])

    def update(self, memory_id: str, new_content: str,
               new_metadata: Optional[Dict] = None):
        """Update a memory's content (and metadata, if given)"""
        embedding = self._get_embedding(new_content)

        self.collection.update(
            ids=[memory_id],
            embeddings=[embedding],
            documents=[new_content],
            metadatas=[new_metadata] if new_metadata else None
        )

4.4 Storing User Profiles

class UserProfileMemory:
    """User-profile memory system"""

    def __init__(self, long_term_memory: LongTermMemory):
        self.memory = long_term_memory

    def learn_preference(self, user_id: str, preference: str, value: str):
        """Record a user preference"""
        content = f"User preference: {preference} = {value}"
        metadata = {
            "type": "preference",
            "user_id": user_id,
            "preference_key": preference
        }
        self.memory.store(content, metadata)

    def get_user_preferences(self, user_id: str) -> Dict[str, str]:
        """Get the preferences recorded for one user"""
        # Semantic search only approximates "all preferences of this user";
        # a metadata filter in the vector store would be more reliable.
        results = self.memory.recall(f"user {user_id} preferences", n_results=20)

        preferences = {}
        for item in results:
            metadata = item["metadata"]
            # Keep only preference entries that belong to this user
            if metadata.get("type") != "preference" or metadata.get("user_id") != user_id:
                continue
            key = metadata.get("preference_key")
            # Parse "key = value"; split once so values containing "=" stay intact
            if "=" in item["content"]:
                value = item["content"].split("=", 1)[1].strip()
                preferences[key] = value

        return preferences

5. Episodic Memory

5.1 What Is Episodic Memory?

Episodic memory records the complete course of a specific task: its steps, decisions, and results.

flowchart TD
    A[Episodic memory] --> B[Task goal]
    A --> C[Execution steps]
    A --> D[Decision process]
    A --> E[Execution result]
    A --> F[Lessons learned]
    C --> C1[Step 1: Search]
    C --> C2[Step 2: Analyze]
    C --> C3[Step 3: Generate]
    D --> D1[Why this tool was chosen]
    D --> D2[Why this decision was made]

5.2 Structure of Episodic Memory

from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Dict, Any
import json

@dataclass
class ExecutionStep:
    """A single execution step"""
    step_id: int
    action: str           # action performed
    tool: str             # tool used
    input: Dict           # input parameters
    output: Any           # output result
    success: bool         # whether the step succeeded
    timestamp: datetime = field(default_factory=datetime.now)

@dataclass
class EpisodicMemory:
    """Episodic memory"""
    episode_id: str
    task: str                    # task description
    goal: str                    # task goal
    steps: List[ExecutionStep]   # execution steps
    decisions: List[Dict]        # decision process
    outcome: str                 # final result
    success: bool                # whether the task succeeded
    lessons_learned: str         # lessons learned
    created_at: datetime = field(default_factory=datetime.now)

    def to_dict(self) -> Dict:
        """Convert to a JSON-friendly dict (datetimes as ISO strings, outputs as str)"""
        return {
            "episode_id": self.episode_id,
            "task": self.task,
            "goal": self.goal,
            "steps": [
                {
                    "step_id": s.step_id,
                    "action": s.action,
                    "tool": s.tool,
                    "input": s.input,
                    "output": str(s.output),
                    "success": s.success,
                    "timestamp": s.timestamp.isoformat()
                }
                for s in self.steps
            ],
            "decisions": self.decisions,
            "outcome": self.outcome,
            "success": self.success,
            "lessons_learned": self.lessons_learned,
            "created_at": self.created_at.isoformat()
        }

5.3 Episodic Memory Manager

import json
import sqlite3
from typing import List, Optional
from datetime import datetime

class EpisodicMemoryManager:
    """Episodic memory manager backed by SQLite.

    Uses the ExecutionStep and EpisodicMemory dataclasses from section 5.2.
    """

    def __init__(self, db_path: str = "./episodes.db"):
        self.conn = sqlite3.connect(db_path)
        self._init_db()

    def _init_db(self):
        """Create the table if it does not exist"""
        self.conn.execute("""
            CREATE TABLE IF NOT EXISTS episodes (
                episode_id TEXT PRIMARY KEY,
                task TEXT,
                goal TEXT,
                steps_json TEXT,
                decisions_json TEXT,
                outcome TEXT,
                success BOOLEAN,
                lessons_learned TEXT,
                created_at TIMESTAMP
            )
        """)
        self.conn.commit()

    def save_episode(self, episode: EpisodicMemory):
        """Save an episode (replaces any episode with the same id)"""
        self.conn.execute("""
            INSERT OR REPLACE INTO episodes
            VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)
        """, (
            episode.episode_id,
            episode.task,
            episode.goal,
            # default=str turns datetimes and other non-JSON values into strings
            json.dumps([s.__dict__ for s in episode.steps], default=str),
            json.dumps(episode.decisions),
            episode.outcome,
            episode.success,
            episode.lessons_learned,
            # Store an ISO string; sqlite3's implicit datetime adapter is deprecated in Python 3.12
            episode.created_at.isoformat()
        ))
        self.conn.commit()

    def _row_to_episode(self, row) -> EpisodicMemory:
        """Rebuild an EpisodicMemory from a database row"""
        steps = []
        for s in json.loads(row[3]):
            s["timestamp"] = datetime.fromisoformat(s["timestamp"])
            steps.append(ExecutionStep(**s))
        return EpisodicMemory(
            episode_id=row[0], task=row[1], goal=row[2], steps=steps,
            decisions=json.loads(row[4]), outcome=row[5], success=bool(row[6]),
            lessons_learned=row[7], created_at=datetime.fromisoformat(row[8])
        )

    def get_episode(self, episode_id: str) -> Optional[EpisodicMemory]:
        """Get one episode by id"""
        cursor = self.conn.execute(
            "SELECT * FROM episodes WHERE episode_id = ?", (episode_id,)
        )
        row = cursor.fetchone()
        if row:
            return self._row_to_episode(row)
        return None

    def search_episodes(self, query: str, limit: int = 10) -> List[EpisodicMemory]:
        """Search related episodes by substring match, newest first"""
        cursor = self.conn.execute("""
            SELECT * FROM episodes
            WHERE task LIKE ? OR outcome LIKE ? OR lessons_learned LIKE ?
            ORDER BY created_at DESC
            LIMIT ?
        """, (f"%{query}%", f"%{query}%", f"%{query}%", limit))

        return [self._row_to_episode(row) for row in cursor.fetchall()]

    def get_successful_episodes(self, task_type: Optional[str] = None) -> List[EpisodicMemory]:
        """Get successful episodes (to learn from)"""
        if task_type:
            cursor = self.conn.execute(
                "SELECT * FROM episodes WHERE success = 1 AND task LIKE ?",
                (f"%{task_type}%",)
            )
        else:
            cursor = self.conn.execute(
                "SELECT * FROM episodes WHERE success = 1"
            )
        return [self._row_to_episode(row) for row in cursor.fetchall()]
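
One detail worth knowing about the `LIKE` search above: `%` and `_` inside the user's query act as wildcards, so a query such as "100%" matches more than the literal text. A possible fix, sketched here with a hypothetical `escape_like` helper and a simplified two-column table, is to escape those characters and declare the escape character with SQLite's `ESCAPE` clause.

```python
import sqlite3
from typing import List


def escape_like(term: str, escape_char: str = "\\") -> str:
    """Escape LIKE wildcards so the term is matched literally."""
    return (
        term.replace(escape_char, escape_char * 2)
        .replace("%", escape_char + "%")
        .replace("_", escape_char + "_")
    )


# Simplified table for the demo; the real schema has more columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE episodes (episode_id TEXT PRIMARY KEY, task TEXT)")
conn.executemany(
    "INSERT INTO episodes VALUES (?, ?)",
    [("e1", "Improve recall by 100%"), ("e2", "Improve recall by 1000 items")],
)


def search_tasks(query: str) -> List[str]:
    """Return ids of episodes whose task contains the query literally."""
    pattern = f"%{escape_like(query)}%"
    cursor = conn.execute(
        "SELECT episode_id FROM episodes WHERE task LIKE ? ESCAPE '\\' ORDER BY episode_id",
        (pattern,),
    )
    return [row[0] for row in cursor.fetchall()]


literal_hits = search_tasks("100%")
```

Here `literal_hits` contains only `e1`, whereas the unescaped pattern `%100%%` would also match the "1000 items" episode.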

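6. Memory Retrieval Strategies

6.1 When to Retrieve

flowchart TD
    A[User request] --> B{Memory needed?}
    B -->|New task| C[Retrieve long-term memory]
    B -->|Continuing conversation| D[Use short-term memory]
    B -->|Similar task| E[Retrieve episodic memory]
    C --> F[Semantic retrieval]
    E --> G[Keyword + semantic]
    F --> H[Merge into context]
    G --> H
    D --> H
    H --> I[Generate response]

6.2 Hybrid Retrieval

from datetime import datetime
from typing import Dict, List, Optional

class HybridRetrieval:
    """Hybrid retrieval: semantic similarity + recency + user relevance.

    Keyword matching is not implemented in this version.
    """

    def __init__(self, long_term_memory: LongTermMemory):
        self.memory = long_term_memory

    def retrieve(self, query: str, user_id: Optional[str] = None,
                 top_k: int = 5, recency_weight: float = 0.3) -> List[Dict]:
        """Retrieve and re-rank memories"""

        # 1. Semantic retrieval (over-fetch so re-ranking has candidates)
        semantic_results = self.memory.recall(query, n_results=top_k * 2)

        # 2. Filter and re-rank
        scored_results = []
        for item in semantic_results:
            score = 1 - item["distance"]  # cosine distance -> similarity

            # Time decay; accept either "timestamp" or "created_at" in the metadata
            stamp = item["metadata"].get("timestamp") or item["metadata"].get("created_at")
            if stamp:
                age_days = (datetime.now() - datetime.fromisoformat(stamp)).days
                # Hyperbolic decay: 1.0 when new, 0.5 at 30 days, 0.25 at 90 days
                recency_score = 1 / (1 + age_days / 30)
                score = (1 - recency_weight) * score + recency_weight * recency_score

            # User relevance
            if user_id and item["metadata"].get("user_id") == user_id:
                score *= 1.2  # boost the user's own memories (score may exceed 1)

            scored_results.append({**item, "score": score})

        # 3. Sort by combined score
        scored_results.sort(key=lambda x: x["score"], reverse=True)

        return scored_results[:top_k]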
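
A quick worked example helps show how the recency weight changes the ranking. The sketch below reproduces the scoring formula from `retrieve` as a pure function, and adds an exponential variant for comparison. The function names are hypothetical, and the exponential form is an alternative, not what the code above uses.

```python
def hyperbolic_recency(age_days: float, scale_days: float = 30.0) -> float:
    """The decay used in retrieve(): 1.0 when new, 0.5 at scale_days."""
    return 1.0 / (1.0 + age_days / scale_days)


def exponential_recency(age_days: float, half_life_days: float = 30.0) -> float:
    """True half-life decay: the score halves every half_life_days."""
    return 0.5 ** (age_days / half_life_days)


def combined_score(distance: float, age_days: float,
                   recency_weight: float = 0.3) -> float:
    """Blend semantic similarity with recency, as in retrieve()."""
    similarity = 1.0 - distance
    recency = hyperbolic_recency(age_days)
    return (1.0 - recency_weight) * similarity + recency_weight * recency


# A fresh but less similar memory vs. an older but more similar one.
fresh = combined_score(distance=0.30, age_days=0)    # 0.7 * 0.70 + 0.3 * 1.00
stale = combined_score(distance=0.20, age_days=90)   # 0.7 * 0.80 + 0.3 * 0.25
```

With `recency_weight=0.3`, the fresh memory scores about 0.79 and the 90-day-old one about 0.635, so the fresher item wins despite being less similar. At 90 days the hyperbolic decay gives 0.25 while a true 30-day half-life would give 0.125, so the hyperbolic form forgets more slowly in the long tail.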

6.3 Assessing Memory Importance

from typing import Dict

class MemoryImportance:
    """Assess how important a memory is"""

    @staticmethod
    def calculate_importance(memory_item: Dict) -> float:
        """Compute an importance score in [0, 1]"""

        score = 0.5  # base score

        # 1. Access frequency (capped at +0.2)
        access_count = memory_item.get("access_count", 0)
        score += min(access_count * 0.05, 0.2)

        # 2. Type weight
        memory_type = memory_item.get("type", "")
        type_weights = {
            "preference": 0.3,      # user preferences matter most
            "fact": 0.2,            # factual information
            "experience": 0.25,     # lessons learned
            "temporary": -0.1       # temporary information is down-weighted
        }
        score += type_weights.get(memory_type, 0)

        # 3. Feedback signals
        if memory_item.get("positive_feedback"):
            score += 0.15
        if memory_item.get("negative_feedback"):
            score -= 0.1

        # 4. Confidence (shifts the score by at most 0.1 either way)
        confidence = memory_item.get("confidence", 0.5)
        score += (confidence - 0.5) * 0.2

        return max(0, min(1, score))  # clamp to [0, 1]

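7. Memory Compression and Forgetting

7.1 Why Compress and Forget?

Problems:
1. Storage cost: unbounded memory growth consumes a lot of storage
2. Retrieval efficiency: the more memories, the slower the retrieval
3. Noise: too many low-value memories degrade decisions
4. Stale information: old memories may no longer be relevant

Solutions:
1. Compress: merge multiple memories into a summary
2. Forget: delete low-value or outdated memories
3. Archive: move cold data to low-cost storage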

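7.2 Compression Strategies

flowchart TD
    A[Original memories] --> B{Compression decision}
    B -->|High similarity| C[Merge]
    B -->|Long time span| D[Summarize]
    B -->|Many of the same type| E[Cluster and compress]
    C --> F[New memory entry]
    D --> F
    E --> F
    F --> G[Delete original memories]

Example compression implementation:

from typing import Dict, List, Optional

class MemoryCompressor:
    """Memory compressor.

    Assumes `llm` is any object exposing generate(prompt: str) -> str.
    """

    def __init__(self, llm):
        self.llm = llm
        # Intended cutoff for deciding which memories count as "similar";
        # the grouping step itself is not implemented in this class.
        self.similarity_threshold = 0.85

    def compress_similar_memories(self, memories: List[Dict]) -> Optional[Dict]:
        """Compress a group of similar memories into one entry"""

        # Nothing to merge: return the single memory, or None for an empty list
        if len(memories) < 2:
            return memories[0] if memories else None

        # Combine the contents
        combined_content = "\n".join([
            f"- {m['content']}" for m in memories
        ])

        # Ask the LLM for a compressed summary
        prompt = f"""Compress the following similar memories into one, keeping the key information:

{combined_content}

Requirements:
1. Keep all unique information
2. Remove duplicated content
3. Stay concise
4. Summarize in one sentence"""

        compressed = self.llm.generate(prompt)

        # Merge the metadata
        merged_metadata = self._merge_metadata(memories)

        return {
            "content": compressed,
            "metadata": merged_metadata,
            "compressed_from": len(memories)
        }

    def _merge_metadata(self, memories: List[Dict]) -> Dict:
        """Merge metadata; conflicting values for a key are collected into a list.

        Note: some vector stores only accept scalar metadata values, so list
        values may need to be serialized before storing.
        """
        merged = {}
        for m in memories:
            metadata = m.get("metadata", {})
            for key, value in metadata.items():
                if key not in merged:
                    merged[key] = value
                elif isinstance(merged[key], list):
                    if value not in merged[key]:
                        merged[key].append(value)
                elif merged[key] != value:
                    merged[key] = [merged[key], value]
        return merged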
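
The compressor above takes a list of memories that are already known to be similar, but it does not show how such a group is found. One simple possibility, sketched below, is a greedy pass that compares each memory's embedding with the first member of each existing group and joins the first group that clears the 0.85 threshold. The `embedding` field and the `group_similar` function are assumptions made for this illustration, and proper clustering would be more robust.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def group_similar(memories: List[Dict], threshold: float = 0.85) -> List[List[Dict]]:
    """Greedily group memories whose embeddings are close to a group's first member."""
    groups: List[List[Dict]] = []
    for memory in memories:
        for group in groups:
            # Compare against the first member, used as the group's representative.
            if cosine(memory["embedding"], group[0]["embedding"]) >= threshold:
                group.append(memory)
                break
        else:
            groups.append([memory])  # no close group found: start a new one
    return groups


# Toy 2-dimensional embeddings (made up for illustration).
memories = [
    {"content": "Likes PDF reports", "embedding": [1.0, 0.0]},
    {"content": "Wants reports as PDF", "embedding": [0.98, 0.2]},
    {"content": "Lives in Beijing", "embedding": [0.0, 1.0]},
]
groups = group_similar(memories)
```

The two PDF-related memories land in one group and could then be passed to `compress_similar_memories`, while the unrelated memory stays on its own.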

7.3 Forgetting Mechanism

from datetime import datetime, timedelta
from typing import Dict, List

class ForgettingMechanism:
    """Forgetting mechanism.

    Assumes `memory_manager` exposes get_all_memories(limit), delete(memory_id)
    and move_to_archive(condition), and that memory items are dicts carrying
    "id", "created_at" and "last_accessed" (ISO strings) at the top level.
    """

    def __init__(self, memory_manager):
        self.memory_manager = memory_manager

        # Forgetting policy parameters
        self.max_age_days = 365        # maximum retention in days
        self.min_importance = 0.1      # minimum importance threshold
        self.max_access_interval = 90  # maximum days without access

    def should_forget(self, memory_item: Dict) -> bool:
        """Decide whether a memory should be forgotten"""

        # 1. Age (a missing created_at is treated as "just created")
        created_at = datetime.fromisoformat(
            memory_item.get("created_at", datetime.now().isoformat())
        )
        age_days = (datetime.now() - created_at).days

        if age_days > self.max_age_days:
            return True

        # 2. Importance
        # With the weights in MemoryImportance, the score cannot drop below 0.2,
        # so this check only fires if those weights or this threshold change.
        importance = MemoryImportance.calculate_importance(memory_item)
        if importance < self.min_importance:
            return True

        # 3. Access recency
        last_accessed = memory_item.get("last_accessed")
        if last_accessed:
            last_access = datetime.fromisoformat(last_accessed)
            days_since_access = (datetime.now() - last_access).days

            # Not accessed for a long time and of low importance
            if days_since_access > self.max_access_interval and importance < 0.3:
                return True

        # 4. Explicit deletion flag
        if memory_item.get("marked_for_deletion"):
            return True

        return False

    def apply_forgetting(self, batch_size: int = 100) -> int:
        """Apply the forgetting policy, deleting at most batch_size memories"""
        forgotten_count = 0

        # Fetch memories (up to 10,000 per run)
        all_memories = self.memory_manager.get_all_memories(limit=10000)

        for memory in all_memories:
            if self.should_forget(memory):
                self.memory_manager.delete(memory["id"])
                forgotten_count += 1

                if forgotten_count >= batch_size:
                    break

        return forgotten_count

    def archive_old_memories(self, days: int = 180) -> int:
        """Archive memories older than the given number of days"""
        threshold = datetime.now() - timedelta(days=days)

        archived = self.memory_manager.move_to_archive(
            condition={"created_before": threshold.isoformat()}
        )

        return archived
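
The forgetting code relies on a store that offers `get_all_memories`, `delete`, and `move_to_archive`, none of which are defined elsewhere in this post. As a rough sketch of what such a store might look like, here is a toy in-memory version. `InMemoryStore` is a hypothetical class for illustration, not part of any library, and a real implementation would sit on top of the vector database.

```python
from datetime import datetime, timedelta
from typing import Dict, List


class InMemoryStore:
    """Toy store exposing the three methods the forgetting code calls."""

    def __init__(self) -> None:
        self.active: Dict[str, Dict] = {}
        self.archive: Dict[str, Dict] = {}

    def add(self, memory: Dict) -> None:
        self.active[memory["id"]] = memory

    def get_all_memories(self, limit: int = 10000) -> List[Dict]:
        return list(self.active.values())[:limit]

    def delete(self, memory_id: str) -> None:
        self.active.pop(memory_id, None)

    def move_to_archive(self, condition: Dict) -> int:
        """Move memories created before condition["created_before"]; return the count."""
        cutoff = datetime.fromisoformat(condition["created_before"])
        old_ids = [
            memory_id
            for memory_id, memory in self.active.items()
            if datetime.fromisoformat(memory["created_at"]) < cutoff
        ]
        for memory_id in old_ids:
            self.archive[memory_id] = self.active.pop(memory_id)
        return len(old_ids)


# Fixed "now" so the example is deterministic.
now = datetime(2024, 12, 20)
store = InMemoryStore()
store.add({"id": "old", "created_at": (now - timedelta(days=200)).isoformat()})
store.add({"id": "new", "created_at": (now - timedelta(days=10)).isoformat()})
archived = store.move_to_archive(
    {"created_before": (now - timedelta(days=180)).isoformat()}
)
```

Only the 200-day-old memory is moved, so `archived` is 1 and the recent memory stays active.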

7.4 The Ebbinghaus Forgetting Curve

xychart-beta
    title "Ebbinghaus forgetting curve and review strategy"
    x-axis ["Day 1", "Day 2", "Day 6", "Day 14", "Day 30", "Day 60"]
    y-axis "Retention %" 0 --> 100
    line [33, 25, 20, 15, 10, 5]
    line [100, 100, 100, 100, 100, 100]

from datetime import datetime, timedelta
from typing import Dict, Optional

class SpacedRepetition:
    """Spaced repetition: counteracting forgetting.

    Assumes `memory_manager` exposes get(memory_id) -> dict and update(memory).
    """

    def __init__(self, memory_manager):
        self.memory_manager = memory_manager
        # Review intervals in days
        self.intervals = [1, 2, 6, 14, 30, 60]

    def get_next_review_date(self, memory_item: Dict) -> Optional[datetime]:
        """Compute the next review date, or None when no review is left"""
        review_count = memory_item.get("review_count", 0)

        if review_count >= len(self.intervals):
            # All reviews completed; treat as consolidated long-term memory
            return None

        interval = self.intervals[review_count]
        return datetime.now() + timedelta(days=interval)

    def review_memory(self, memory_id: str, success: bool):
        """Record the outcome of a review"""
        memory = self.memory_manager.get(memory_id)

        if success:
            # Successful review: advance to the next interval
            memory["review_count"] = memory.get("review_count", 0) + 1
            memory["next_review"] = self.get_next_review_date(memory)
            memory["importance"] = min(1.0, memory.get("importance", 0.5) + 0.1)
        else:
            # Failed review: reset the schedule
            memory["review_count"] = 0
            memory["next_review"] = datetime.now() + timedelta(days=1)

        self.memory_manager.update(memory)
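
To see what the interval list means in calendar terms, the sketch below expands it into concrete review dates. It assumes every review succeeds and happens exactly on its scheduled day, so each interval is counted from the previous review. `review_schedule` is a hypothetical helper, not part of the class above.

```python
from datetime import date, timedelta
from typing import List


def review_schedule(start: date, intervals: List[int]) -> List[date]:
    """Dates of successive reviews, assuming every review succeeds on time."""
    schedule = []
    current = start
    for interval in intervals:
        # Each interval is measured from the previous review date.
        current = current + timedelta(days=interval)
        schedule.append(current)
    return schedule


schedule = review_schedule(date(2024, 12, 20), [1, 2, 6, 14, 30, 60])
```

Starting from 2024-12-20, the reviews fall on 2024-12-21, 12-23, 12-29, then 2025-01-12, 02-11, and 04-12. The whole schedule spans 113 days, after which the memory is no longer scheduled for review.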

8. The Complete Memory System

from datetime import datetime
from typing import Dict, List, Optional

class AgentMemorySystem:
    """Complete Agent memory system"""

    def __init__(self, llm, config: Optional[Dict] = None):
        self.config = config or {}

        # Three memory layers
        self.working_memory = HybridMemory(llm)
        self.long_term_memory = LongTermMemory()
        self.episodic_memory = EpisodicMemoryManager()

        # Supporting components
        self.retriever = HybridRetrieval(self.long_term_memory)
        self.compressor = MemoryCompressor(llm)
        # Note: ForgettingMechanism calls get_all_memories(), delete() and
        # move_to_archive(). LongTermMemory as defined in section 4.3 does not
        # provide them, so maintain() needs an adapter or those methods added.
        self.forgetting = ForgettingMechanism(self.long_term_memory)

    def remember(self, content: str, memory_type: str = "short",
                 metadata: Optional[Dict] = None):
        """Store a memory"""
        # Copy so the caller's dict is not mutated; also handles metadata=None
        metadata = dict(metadata or {})

        if memory_type == "short":
            self.working_memory.add_message(
                "system", content, important=metadata.get("important", False)
            )
        elif memory_type == "long":
            metadata["created_at"] = datetime.now().isoformat()
            self.long_term_memory.store(content, metadata)

    def recall(self, query: str, include_types: Optional[List[str]] = None) -> Dict:
        """Retrieve memories from the requested layers"""
        include_types = include_types or ["short", "long", "episodic"]

        results = {}

        if "short" in include_types:
            results["working"] = self.working_memory.get_context()

        if "long" in include_types:
            results["long_term"] = self.retriever.retrieve(query)

        if "episodic" in include_types:
            results["episodes"] = self.episodic_memory.search_episodes(query)

        return results

    def save_episode(self, episode: EpisodicMemory):
        """Save an episode"""
        self.episodic_memory.save_episode(episode)

        # Extract the lessons learned into long-term memory
        if episode.lessons_learned:
            self.remember(
                f"Lesson: {episode.lessons_learned}",
                memory_type="long",
                metadata={"type": "experience", "task": episode.task}
            )

    def maintain(self):
        """Memory maintenance (forgetting and archiving); run periodically"""
        forgotten = self.forgetting.apply_forgetting()
        archived = self.forgetting.archive_old_memories()

        return {"forgotten": forgotten, "archived": archived}
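
The dict returned by `recall` still has to be turned into something the model can read. The sketch below shows one possible way to merge it into a chat-style message list. `build_messages` and the "Relevant memories" wording are assumptions for illustration, and the input mirrors the `working` and `long_term` keys used above.

```python
from typing import Dict, List


def build_messages(user_input: str, recalled: Dict,
                   max_long_term: int = 3) -> List[Dict[str, str]]:
    """Merge recalled memories into a chat-style message list."""
    messages: List[Dict[str, str]] = []

    # Long-term memories go first, as a single system message.
    long_term = recalled.get("long_term", [])[:max_long_term]
    if long_term:
        facts = "\n".join(f"- {item['content']}" for item in long_term)
        messages.append({"role": "system", "content": f"Relevant memories:\n{facts}"})

    # Working memory already holds chat-formatted messages; keep role and content only.
    for m in recalled.get("working", []):
        messages.append({"role": m["role"], "content": m["content"]})

    messages.append({"role": "user", "content": user_input})
    return messages


recalled = {
    "working": [{"role": "user", "content": "Analyze Apple stock", "important": False}],
    "long_term": [{"content": "User prefers concise reports", "score": 0.9}],
}
messages = build_messages("Can you expand on that analysis?", recalled)
```

Capping the number of long-term items keeps the prompt within budget. How episodic results are injected (for example, as past lessons) is left out here and would follow the same pattern.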

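FAQ

Q1: How do short-term and long-term memory work together?

A:

Short-term memory handles the current conversation and is updated in real time
Important information is extracted from short-term memory and stored in long-term memory
Long-term memory supplements short-term memory through retrieval

Q2: Is vector database retrieval slow?

A:

Modern vector databases use ANN (approximate nearest neighbor) algorithms, so retrieval is fast
Top-K retrieval typically completes within milliseconds
Index tuning and partitioning strategies can improve it further

Q3: How do you avoid storing sensitive information in memory?

A:

Detect and redact sensitive information before storing
Apply access control to limit who can read which memories
Let users delete specific memories themselves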
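
As a minimal sketch of the first point, the snippet below masks email addresses and 11-digit mainland China mobile numbers before the text is stored. The two patterns are illustrative assumptions and far from complete PII detection, so a real system would likely use a dedicated detection library or service.

```python
import re

# Illustrative patterns only; they will miss many kinds of sensitive data.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"(?<!\d)1[3-9]\d{9}(?!\d)")  # 11-digit mainland China mobile


def redact(text: str) -> str:
    """Replace matched emails and phone numbers with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text


cleaned = redact("Reach me at alice@example.com or 13812345678.")
```

Calling `redact` right before a memory is stored means the raw values never reach the vector database or the embedding API.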

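Q4: How much cost does a memory system add?

A:

Vector storage: roughly $0.1/GB/month
Embedding: on the order of $0.0001 per 1K tokens, and less with some newer models
For an individual user, the monthly cost is typically in the $1-10 range
These are rough figures; actual prices vary by provider and change over time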
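
The arithmetic behind such an estimate is simple enough to write down. The sketch below uses the rough unit prices quoted above as defaults. The usage numbers (1 GB stored, 20 million tokens embedded per month) are made-up assumptions, and the function ignores other costs such as LLM calls for summarization.

```python
def monthly_cost(
    stored_gb: float,
    embedded_tokens: int,
    storage_price_per_gb: float = 0.10,     # rough figure from the answer above
    embedding_price_per_1k: float = 0.0001, # rough figure from the answer above
) -> float:
    """Estimated monthly cost in USD: storage plus embedding."""
    storage = stored_gb * storage_price_per_gb
    embedding = embedded_tokens / 1000 * embedding_price_per_1k
    return storage + embedding


# Hypothetical usage: 1 GB stored, 20 million tokens embedded in a month.
cost = monthly_cost(stored_gb=1.0, embedded_tokens=20_000_000)
```

Under these assumptions the estimate is about $2.10 per month ($0.10 for storage plus $2.00 for embeddings), so embedding volume rather than storage dominates.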

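Q5: How do you evaluate a memory system?

A:

Retrieval accuracy: how well the retrieved results match what was actually needed
Memory hit rate: how often the Agent actually uses memory
User satisfaction: feedback on the personalized service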
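
Retrieval accuracy can be measured offline once you have a small labeled set of queries with the memory ids that should have been retrieved. The sketch below shows one possible formalization, recall@k plus a per-query hit rate. The function names and the evaluation data are made up for illustration, and this hit rate measures retrieval, not how often the Agent uses memory.

```python
from typing import Dict, List, Sequence, Set


def recall_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant memories that appear in the top k results."""
    if not relevant:
        return 0.0
    hits = sum(1 for memory_id in retrieved[:k] if memory_id in relevant)
    return hits / len(relevant)


def hit_rate(queries: List[Dict], k: int) -> float:
    """Fraction of queries with at least one relevant memory in the top k."""
    if not queries:
        return 0.0
    hits = sum(
        1 for q in queries
        if any(memory_id in q["relevant"] for memory_id in q["retrieved"][:k])
    )
    return hits / len(queries)


# Hypothetical labeled evaluation set.
eval_set = [
    {"retrieved": ["m1", "m7", "m3"], "relevant": {"m1", "m3"}},
    {"retrieved": ["m9", "m8", "m2"], "relevant": {"m4"}},
]
r = recall_at_k(eval_set[0]["retrieved"], eval_set[0]["relevant"], k=2)
h = hit_rate(eval_set, k=3)
```

For the first query, one of the two relevant memories is in the top 2, so recall@2 is 0.5. Across both queries only the first finds anything relevant, so the hit rate at k=3 is also 0.5.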

Summary

The memory system is the Agent's "brain storage", and it determines whether the Agent can keep improving over time.

Key takeaways:

┌────────────────────────────────────────────────────────────┐
│                Memory System: Key Takeaways                │
├────────────────────────────────────────────────────────────┤
│                                                            │
│  Three layers: short-term + long-term + episodic           │
│                                                            │
│  Short-term: context window (sliding / summary / hybrid)   │
│                                                            │
│  Long-term: vector database (ChromaDB / Milvus / FAISS)    │
│                                                            │
│  Episodic: full record of task execution                   │
│                                                            │
│  Retrieval: semantic + keyword + recency + importance      │
│                                                            │
│  Compression and forgetting: control cost, keep quality    │
│                                                            │
└────────────────────────────────────────────────────────────┘