Agent Architecture Patterns: Handoffs, Fan-out, and Supervisor

Souloss

2025-03-22

AI / Agent / Engineering Practice

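Preface

When building a complex agent system, choosing the right architecture pattern is critical. This chapter covers three core patterns in depth: Handoffs, Fan-out, and Supervisor. It then extends the discussion to the choice between Single Agent and Multi-Agent designs, the architectural differences between tool augmentation and knowledge augmentation, and architecture patterns for memory systems.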

1. The Handoffs Pattern

1.1 What Are Handoffs

A handoff lets an agent transfer a task to a more specialized agent:

# Handoffs in the OpenAI Agents SDK
from agents import Agent, handoff

# Specialist agents (instructions elided)
research_agent = Agent(name="researcher", instructions="...")
coding_agent = Agent(name="coder", instructions="...")
review_agent = Agent(name="reviewer", instructions="...")

# Declare the handoff targets. The SDK expects Agent objects or
# handoff(...) wrappers in `handoffs`.
orchestrator = Agent(
    name="orchestrator",
    handoffs=[handoff(research_agent), handoff(coding_agent)],
)

1.2 Handoff Execution Flow

sequenceDiagram
    participant User
    participant Orchestrator
    participant ResearchAgent
    participant CodingAgent
    User->>Orchestrator: Ask to analyze a code problem
    Orchestrator->>ResearchAgent: Hand off research task
    ResearchAgent-->>Orchestrator: Return research findings
    Orchestrator->>CodingAgent: Hand off coding task
    CodingAgent-->>User: Return code solution

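1.3 When to Use Handoffs

Scenario	Description
Specialization	Coding agent vs. review agent
Routing	Pick a different agent based on intent
Task decomposition	Pass parts of a complex task to specialists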

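1.4 Handoff Caveats

# A handoff needs to carry context with it
def create_handoff_with_context(agent, context):
    """Build a handoff function that attaches key context to the request."""
    def handoff_fn(message: str):
        # Inject the context into the message
        enhanced_message = f"{context}\n\nUser request: {message}"
        # Return the enriched message together with the target agent
        return agent, enhanced_message
    return handoff_fn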

1.5 Context Management Strategies for Handoffs

Losing context during a handoff is the most common problem. A few practical strategies for managing it follow:

from dataclasses import dataclass, field
from typing import Any

@dataclass
class HandoffContext:
    """Container for the context that travels with a handoff."""
    original_query: str
    accumulated_findings: list[str] = field(default_factory=list)
    tool_results: dict[str, Any] = field(default_factory=dict)
    metadata: dict[str, Any] = field(default_factory=dict)
    parent_agent: str = ""
    depth: int = 0  # guards against unbounded handoff chains

    def add_finding(self, finding: str):
        self.accumulated_findings.append(finding)

    def to_prompt_segment(self) -> str:
        findings_text = "\n".join(f"- {f}" for f in self.accumulated_findings)
        return f"""
<handoff_context>
Original request: {self.original_query}
Findings so far:
{findings_text}
Depth: {self.depth}
</handoff_context>
"""

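A depth limit prevents endless handoff loops:

MAX_HANDOFF_DEPTH = 3

def safe_handoff(context: HandoffContext, target_agent: Agent) -> Any:
    if context.depth >= MAX_HANDOFF_DEPTH:
        return "Maximum handoff depth reached; returning the current result."

    # Copy every field so the child cannot mutate the parent's state
    new_context = HandoffContext(
        original_query=context.original_query,
        accumulated_findings=context.accumulated_findings.copy(),
        tool_results=context.tool_results.copy(),
        metadata=context.metadata.copy(),
        depth=context.depth + 1,
        parent_agent=context.parent_agent,
    )
    # Assumes the target agent exposes run(context)
    return target_agent.run(new_context)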
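To see the depth guard stop a runaway chain, here is a minimal self-contained sketch. `Ctx`, `EchoAgent`, and `guarded_handoff` are hypothetical stand-ins written for this illustration, not part of any SDK. The stub agent always tries to hand off again, so only the guard ends the chain.

```python
from dataclasses import dataclass, field

MAX_HANDOFF_DEPTH = 3  # same limit as in the snippet above


@dataclass
class Ctx:
    """Trimmed-down, hypothetical stand-in for HandoffContext."""
    original_query: str
    findings: list[str] = field(default_factory=list)
    depth: int = 0


class EchoAgent:
    """Hypothetical stub agent that always tries to hand off again."""

    def __init__(self, name: str):
        self.name = name

    def run(self, ctx: Ctx) -> str:
        ctx.findings.append(f"{self.name} saw depth {ctx.depth}")
        return guarded_handoff(ctx, self)


def guarded_handoff(ctx: Ctx, target: EchoAgent) -> str:
    # Stop once the chain is too deep instead of recursing forever
    if ctx.depth >= MAX_HANDOFF_DEPTH:
        return f"stopped at depth {ctx.depth} after {len(ctx.findings)} findings"
    child = Ctx(ctx.original_query, ctx.findings.copy(), ctx.depth + 1)
    return target.run(child)


result = guarded_handoff(Ctx("why is the build failing?"), EchoAgent("researcher"))
print(result)
```

Because each hop works on a copy of the findings, the caller's context is never mutated by agents further down the chain.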

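2. The Fan-out/Fan-in Parallel Pattern

2.1 How It Works

Fan-out dispatches a task to several agents; fan-in collects and combines their results:

# Fan-out/Fan-in example
# Assumes research_agent and synthesis_agent expose an async run()
import asyncio

async def parallel_research(query: str, sources: list):
    # Fan-out: dispatch the task to several research agents
    tasks = []
    for source in sources:
        task = research_agent.run(f"Research {query} in {source}")
        tasks.append(task)

    # Fan-in: gather the results and synthesize them
    results = await asyncio.gather(*tasks)
    synthesis = await synthesis_agent.run(f"Synthesize: {results}")

    return synthesis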

2.2 Execution Flow

graph TB
    A["User request"] --> B["Fan-out dispatch"]
    B --> C["Agent 1"]
    B --> D["Agent 2"]
    B --> E["Agent 3"]
    C --> F["Fan-in aggregation"]
    D --> F
    E --> F
    F --> G["Final answer"]

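2.3 Timeouts and Error Handling

import asyncio
from tenacity import retry, stop_after_attempt

async def run_with_timeout(query: str, source: str, timeout: float = 30.0):
    # Time out each branch on its own, so one slow source
    # cannot throw away the results of the others
    return await asyncio.wait_for(
        research_agent.run(query, source=source), timeout
    )

@retry(stop=stop_after_attempt(3))
async def fan_out_with_retry(query: str, sources: list):
    tasks = [run_with_timeout(query, source) for source in sources]
    results = await asyncio.gather(*tasks, return_exceptions=True)
    # Drop branches that failed or timed out
    valid_results = [r for r in results if not isinstance(r, Exception)]
    if not valid_results:
        # Raising is what lets tenacity retry the whole fan-out
        raise RuntimeError("All research branches failed or timed out")
    # Synthesize whatever partial results are available
    return await synthesis_agent.run(valid_results)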

2.4 Fan-in Merge Strategies

Merging results from several branches is more than concatenation. Pick a merge strategy that fits the task:

from collections import Counter
from enum import Enum
from typing import Any

class MergeStrategy(Enum):
    VOTE = "vote"             # majority vote
    RANK = "rank"             # rank and keep the best
    INTERSECT = "intersect"   # intersection
    UNION = "union"           # union
    SUMMARIZE = "summarize"   # merge via an LLM summary

# judge_quality, synthesis_llm, and normalize are assumed helpers defined elsewhere
async def fan_in(results: list[Any], strategy: MergeStrategy) -> Any:
    """Merge parallel results according to the chosen strategy."""
    if strategy == MergeStrategy.VOTE:
        # Vote: the answer most branches agree on wins
        counter = Counter(str(r) for r in results)
        return counter.most_common(1)[0][0]

    elif strategy == MergeStrategy.RANK:
        # Rank: have an LLM score the quality of each result
        scored = []
        for r in results:
            score = await judge_quality(r)
            scored.append((score, r))
        # Sort on the score only, so ties never compare the results themselves
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[0][1]

    elif strategy == MergeStrategy.SUMMARIZE:
        # Summarize: have an LLM merge information from all sources
        combined = "\n\n---\n\n".join(str(r) for r in results)
        return await synthesis_llm.complete(
            f"Merge the following research results into one complete report:\n{combined}"
        )

    elif strategy == MergeStrategy.INTERSECT:
        # Intersect: keep only what every source agrees on
        sets = [set(normalize(r)) for r in results]
        return set.intersection(*sets)

    elif strategy == MergeStrategy.UNION:
        # Union: collect the distinct information from all sources
        sets = [set(normalize(r)) for r in results]
        return set.union(*sets)
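The vote, intersect, and union strategies need no LLM, so they are easy to try in isolation. The sketch below is one possible reading: `normalize` is a hypothetical helper (the text above does not define it) that treats each non-empty line as one claim, and the sample reports are made up for illustration.

```python
from collections import Counter


def normalize(result: str) -> list[str]:
    """Hypothetical normalizer: one claim per line, lower-cased and stripped."""
    return [line.strip().lower() for line in result.splitlines() if line.strip()]


def merge_vote(results: list[str]) -> str:
    # The most frequent answer wins; ties go to the answer seen first
    return Counter(results).most_common(1)[0][0]


def merge_intersect(results: list[str]) -> set[str]:
    sets = [set(normalize(r)) for r in results]
    # Guard the empty case: set.intersection() needs at least one argument
    return set.intersection(*sets) if sets else set()


def merge_union(results: list[str]) -> set[str]:
    sets = [set(normalize(r)) for r in results]
    return set().union(*sets)


reports = [
    "Redis is single-threaded\nRedis supports TTL",
    "redis supports ttl\nRedis has clustering",
    "Redis supports TTL",
]
common = merge_intersect(reports)
everything = merge_union(reports)
winner = merge_vote(["A", "B", "A"])
print(common, len(everything), winner)
```

Intersection favors precision (only claims every branch made), while union favors recall, so the choice depends on whether a missed fact or a wrong fact costs more.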

3. The Supervisor Pattern

3.1 How It Works

In the Supervisor pattern, a single coordinator decides which sub-agent to call:

class Supervisor:
    def __init__(self):
        self.agents = {
            "research": research_agent,
            "code": coding_agent,
            "review": review_agent
        }
        self.supervisor_llm = supervisor_llm

    async def route(self, message: str) -> str:
        """Let the LLM decide which agent to call."""
        decision = await self.supervisor_llm.complete(
            f"Decide which agent to use: {message}"
        )
        return decision.agent_choice

    async def run(self, message: str):
        agent_name = await self.route(message)
        agent = self.agents[agent_name]
        return await agent.run(message)
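One practical detail when routing with an LLM: the model's answer is free text, so it is worth validating before using it as a dictionary key. The following sketch shows one way to do that. `fake_supervisor_llm` and `DEFAULT_AGENT` are assumptions made for this example; a real system would call its own model and choose its own fallback.

```python
import asyncio

AGENTS = {"research", "code", "review"}  # agent names from the example above
DEFAULT_AGENT = "research"               # assumed fallback; pick your own


async def fake_supervisor_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call; returns free-form text."""
    return "  Code\n" if "bug" in prompt else "marketing"


async def route(message: str) -> str:
    raw = await fake_supervisor_llm(f"Pick one of {sorted(AGENTS)}: {message}")
    choice = raw.strip().lower()
    # Never index the agent table with unvalidated model output
    return choice if choice in AGENTS else DEFAULT_AGENT


picked = asyncio.run(route("fix this bug"))
fallback = asyncio.run(route("write a slogan"))
print(picked, fallback)
```

Falling back to a default keeps the request alive; raising an error instead is equally reasonable when a wrong route is worse than no answer.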

3.2 Flowchart

flowchart TD
    A["User request"] --> B["Supervisor LLM"]
    B --> C{"Decision"}
    C -->|"Research"| D["Research Agent"]
    C -->|"Coding"| E["Coding Agent"]
    C -->|"Review"| F["Review Agent"]
    D --> G["Return result"]
    E --> G
    F --> G

3.3 A Supervisor with State Tracking

In production, a Supervisor needs to track task progress so that it neither repeats calls nor skips steps:

from dataclasses import dataclass, field
from datetime import datetime
from typing import Any

MAX_STEPS = 10  # assumed cap on decision-loop iterations; tune per workload

@dataclass
class TaskState:
    task_id: str
    original_query: str
    status: str = "pending"  # pending / in_progress / completed / failed
    assigned_agents: list[str] = field(default_factory=list)
    results: dict[str, Any] = field(default_factory=dict)
    created_at: datetime = field(default_factory=datetime.now)
    updated_at: datetime = field(default_factory=datetime.now)

class StatefulSupervisor:
    def __init__(self):
        self.agents: dict[str, Agent] = {}
        self.active_tasks: dict[str, TaskState] = {}

    async def run(self, query: str) -> Any:
        task_id = generate_task_id()  # assumed helper that returns a unique ID
        state = TaskState(task_id=task_id, original_query=query)
        self.active_tasks[task_id] = state

        # Supervisor decision loop
        for step in range(MAX_STEPS):
            state.status = "in_progress"
            state.updated_at = datetime.now()

            # Decide the next action from the current state
            next_action = await self._decide(state)

            if next_action.action == "complete":
                state.status = "completed"
                return self._compile_result(state)

            # Assign the step to a sub-agent
            agent_name = next_action.agent_name
            state.assigned_agents.append(agent_name)

            try:
                result = await self.agents[agent_name].run(
                    next_action.prompt, context=state.results
                )
                state.results[agent_name] = result
            except Exception as e:
                # Record the failure so the next decision can react to it
                state.results[agent_name] = f"Error: {str(e)}"

        state.status = "failed"
        return "Reached the maximum number of steps; the task is incomplete."

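4. Pattern Comparison

Pattern	Best for	Complexity	Parallelism
Handoffs	Clear division of specialties	Medium	Serial
Fan-out/Fan-in	Decomposable, independent subtasks	High	Fully parallel
Supervisor	Intelligent routing	Medium	Serial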
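The parallelism column is easy to check with a small timing experiment. The sketch below uses a hypothetical `fake_agent` whose `asyncio.sleep` stands in for model latency; the delays are arbitrary values chosen for illustration, not measurements.

```python
import asyncio
import time


async def fake_agent(name: str, delay: float) -> str:
    """Hypothetical agent call; the sleep stands in for model latency."""
    await asyncio.sleep(delay)
    return f"{name} done"


async def serial(delays: list[float]) -> list[str]:
    # Handoffs / Supervisor: each step waits for the previous one
    return [await fake_agent(f"agent{i}", d) for i, d in enumerate(delays)]


async def parallel(delays: list[float]) -> list[str]:
    # Fan-out: all branches start together; gather preserves input order
    calls = (fake_agent(f"agent{i}", d) for i, d in enumerate(delays))
    return list(await asyncio.gather(*calls))


async def timed(coro) -> tuple[list[str], float]:
    start = time.perf_counter()
    out = await coro
    return out, time.perf_counter() - start


delays = [0.05, 0.05, 0.05]
serial_out, serial_s = asyncio.run(timed(serial(delays)))
parallel_out, parallel_s = asyncio.run(timed(parallel(delays)))
print(f"serial {serial_s:.2f}s, parallel {parallel_s:.2f}s")
```

Serial latency grows with the sum of the step latencies, while fan-out latency is bounded by the slowest branch. That gain only materializes when the subtasks really are independent.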

5. Combining Patterns in Practice

5.1 A Complex Workflow

async def complex_workflow(user_request: str):
    # 1. The Supervisor decides the task type
    task_type = await supervisor.route(user_request)

    # 2. Pick a pattern based on the type
    if task_type == "research_report":
        # Fan-out for research, then hand off to a report specialist
        results = await fan_out_research(user_request)
        return await handoff_to_report_agent(results)

    elif task_type == "code_task":
        # Hand off to the coding agent
        return await handoff_to_coder(user_request)

5.2 Multi-level Supervisors

graph TD
    A["Top-level Supervisor"] --> B["Engineering Supervisor"]
    A --> C["Operations Supervisor"]
    B --> D["Frontend Agent"]
    B --> E["Backend Agent"]
    C --> F["Customer Service Agent"]
    C --> G["Analytics Agent"]

5.3 Engineering Considerations for Combined Patterns

When several patterns are combined, the following engineering details deserve particular attention:

import asyncio
from dataclasses import dataclass

@dataclass
class WorkflowConfig:
    """Workflow configuration."""
    max_parallel_agents: int = 5          # maximum number of concurrent agents
    handoff_timeout_seconds: float = 30   # handoff timeout
    supervisor_max_retries: int = 3       # Supervisor retry count
    enable_tracing: bool = True           # whether tracing is enabled
    cost_budget_usd: float = 1.0          # cost budget per workflow run

class HybridWorkflow:
    """Hybrid workflow engine: Handoffs + Fan-out + Supervisor."""

    def __init__(self, config: WorkflowConfig):
        self.config = config
        self.semaphore = asyncio.Semaphore(config.max_parallel_agents)
        self.tracer = get_tracer() if config.enable_tracing else None

    async def execute(self, request: str) -> dict:
        # Cost accounting is left unimplemented in this sketch
        total_cost = 0.0

        # Phase 1: the Supervisor classifies the request
        with self._trace("supervisor_classify"):
            classification = await supervisor.classify(request)

        # Phase 2: pick an execution strategy from the classification
        if classification.type == "parallel_research":
            results = await self._fan_out_phase(classification.subtasks)
        elif classification.type == "sequential_pipeline":
            results = await self._handoff_phase(classification.pipeline)
        else:
            results = await self._single_agent_phase(classification.agent, request)

        # Phase 3: validate the results
        with self._trace("validate"):
            validated = await self._validate_results(results)

        return {"results": validated, "cost_usd": total_cost}

    async def _fan_out_phase(self, subtasks: list) -> list:
        async def run_with_semaphore(task):
            # The semaphore caps how many agents run at once
            async with self.semaphore:
                return await agent.run(task)  # `agent`: an assumed worker agent

        tasks = [run_with_semaphore(t) for t in subtasks]
        raw_results = await asyncio.gather(*tasks, return_exceptions=True)
        return [r for r in raw_results if not isinstance(r, Exception)]

    async def _handoff_phase(self, pipeline: list) -> list:
        results = []
        context = ""
        for step in pipeline:
            agent = self.agents[step["agent"]]  # assumes an agent registry on self
            result = await agent.run(step["prompt"], context=context)
            # Each step sees the accumulated output of the earlier steps
            context += f"\n{result}"
            results.append(result)
        return results
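The `cost_budget_usd` setting only helps if something enforces it. A minimal sketch of such a guard follows. `CostTracker`, `BudgetExceeded`, and the per-token prices are hypothetical and chosen purely for illustration; real prices depend on the model and provider.

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when a charge would push spending past the budget."""


@dataclass
class CostTracker:
    budget_usd: float
    spent_usd: float = 0.0

    def charge(self, input_tokens: int, output_tokens: int,
               usd_per_1k_in: float, usd_per_1k_out: float) -> float:
        cost = (input_tokens / 1000 * usd_per_1k_in
                + output_tokens / 1000 * usd_per_1k_out)
        # Refuse the charge rather than silently overspending
        if self.spent_usd + cost > self.budget_usd:
            raise BudgetExceeded(
                f"charge of ${cost:.2f} would exceed the ${self.budget_usd:.2f} budget"
            )
        self.spent_usd += cost
        return cost


tracker = CostTracker(budget_usd=1.0)
first = tracker.charge(10_000, 2_000, usd_per_1k_in=0.01, usd_per_1k_out=0.03)
print(f"spent so far: ${tracker.spent_usd:.2f}")
```

In a workflow such as the one above, the tracker would be charged after every agent call, and a `BudgetExceeded` error would abort the remaining phases.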

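6. Single Agent vs. Multi-Agent: A Decision Framework

6.1 Decision Tree

Not every scenario needs Multi-Agent. A Single Agent suits simple tasks; Multi-Agent suits complex collaboration. The decision framework:

flowchart TD
    A["Task requirements"] --> B{"Needs multiple specialties?"}
    B -->|"No"| C{"Steps parallelizable?"}
    B -->|"Yes"| D{"Subtasks independent?"}
    C -->|"No"| E["Single Agent"]
    C -->|"Yes"| F["Fan-out/Fan-in"]
    D -->|"No"| G["Supervisor pattern"]
    D -->|"Yes"| H{"Needs context passing?"}
    H -->|"Yes"| G
    H -->|"No"| F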
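The decision tree translates directly into a small function. This is a sketch of the branches exactly as drawn; the function name, parameter names, and return labels are made up for this example.

```python
def choose_architecture(multi_specialty: bool,
                        parallelizable: bool = False,
                        independent_subtasks: bool = False,
                        needs_context_passing: bool = False) -> str:
    """Walk the decision tree and return the suggested architecture."""
    if not multi_specialty:
        # One specialty: only parallelism justifies more than one agent
        return "fan_out_fan_in" if parallelizable else "single_agent"
    if not independent_subtasks:
        # Dependent subtasks need a coordinator
        return "supervisor"
    # Independent subtasks: a coordinator is needed only to pass context
    return "supervisor" if needs_context_passing else "fan_out_fan_in"


print(choose_architecture(multi_specialty=False))
print(choose_architecture(multi_specialty=True, independent_subtasks=True))
```

Writing the tree as code makes it easy to unit-test the selection logic before wiring it to an LLM-based task analyzer.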

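6.2 Comparison

Dimension	Single Agent	Multi-Agent
Development complexity	Low	High
Debugging difficulty	Low	High
Token cost	Low	High (the system prompt is sent many times)
Task quality	Good on simple tasks, mediocre on complex ones	Good on complex tasks
Parallelism	None	Yes
Best for	A single domain, fewer than 5 steps	Multiple domains that must collaborate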
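The token-cost row can be made concrete with rough arithmetic. The sketch below assumes a deliberately simple model in which every agent call resends its own system prompt plus the task text. The token counts are illustrative assumptions, not measurements.

```python
def prompt_tokens(system_tokens: int, task_tokens: int, agent_calls: int) -> int:
    """Input tokens when every agent call resends its system prompt and the task."""
    return agent_calls * (system_tokens + task_tokens)


# Illustrative numbers only: an 800-token system prompt and a 1200-token task
single = prompt_tokens(system_tokens=800, task_tokens=1200, agent_calls=1)
multi = prompt_tokens(system_tokens=800, task_tokens=1200, agent_calls=4)
overhead = multi - single
print(single, multi, overhead)
```

Under these assumptions, a four-agent pipeline spends four times the input tokens of a single agent, before counting any coordination messages between agents.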

6.3 A Selection Example

class ArchitectureSelector:
    """Pick an architecture automatically from the task's characteristics."""

    def select(self, task_description: str) -> str:
        analysis = self._analyze(task_description)

        # Rule 1: at most 3 steps in a single domain -> Single Agent
        if analysis.step_count <= 3 and analysis.domain_count == 1:
            return "single_agent"

        # Rule 2: needs research + coding + review -> Supervisor
        if analysis.requires_domains(["research", "coding", "review"]):
            return "supervisor"

        # Rule 3: several independent data sources -> Fan-out
        if analysis.parallel_data_sources > 1:
            return "fan_out_fan_in"

        # Rule 4: a linear flow with more than 3 well-defined steps -> Handoffs
        if analysis.is_sequential and analysis.step_count > 3:
            return "handoffs"

        return "supervisor"  # default to Supervisor

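7. ReAct vs. Plan-and-Execute vs. Reflection

7.1 Comparing the Three Core Reasoning Architectures

The previous sections covered communication patterns (Handoffs, Fan-out, Supervisor). This section adds three reasoning architectures, which determine how an agent thinks internally.

Architecture	Core idea	Strength	Weakness
ReAct	Think while acting, alternating reasoning and action	Flexible, real-time	Can get stuck in loops
Plan-Execute	Plan first, then execute	Global view	The plan can go stale
Reflection	Reflect after execution and improve	Continuous improvement	Extra tokens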

7.2 Implementing ReAct

from typing import TypedDict

class ReActState(TypedDict):
    thought: str
    action: str
    action_input: dict
    observation: str

REACT_PROMPT = """You are an assistant that follows the ReAct paradigm.

Available tools:
{tool_descriptions}

Answer strictly in the following format:
Thought: reason about the current situation and the next step
Action: tool name
Action Input: {{"param": "value"}}
Observation: (the tool's result)
... (Thought/Action/Observation may repeat several times)
Thought: I now know the final answer
Final Answer: the final answer

Note: run at most {max_steps} steps.

Question: {question}"""

# llm, format_tools, parse_action, and parse_final_answer are assumed helpers
async def react_loop(question: str, tools: dict, max_steps: int = 5) -> str:
    """The ReAct reasoning loop."""
    messages = [
        {"role": "system", "content": REACT_PROMPT.format(
            tool_descriptions=format_tools(tools),
            max_steps=max_steps,
            question=question
        )}
    ]

    for step in range(max_steps):
        response = await llm.complete(messages)
        messages.append({"role": "assistant", "content": response})

        # Parse the action out of the model's reply
        action, action_input = parse_action(response)

        if action == "Final Answer":
            return parse_final_answer(response)

        # Run the tool
        if action in tools:
            observation = await tools[action](**action_input)
        else:
            observation = f"Error: tool {action} does not exist"

        # Feed the observation back for the next round of reasoning
        messages.append({"role": "user", "content": f"Observation: {observation}"})

    return "Reached the step limit without completing the task."
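The loop above depends on `parse_action` and `parse_final_answer`, which the text does not define. One possible sketch, assuming the model follows the prompt format and emits the action input as single-line JSON, is shown below. The regular expressions are this example's own and would need hardening for real model output.

```python
import json
import re

ACTION_RE = re.compile(r"^Action:\s*(.+?)\s*$", re.MULTILINE)
INPUT_RE = re.compile(r"^Action Input:\s*(\{.*\})\s*$", re.MULTILINE)
FINAL_RE = re.compile(r"^Final Answer:\s*(.*)", re.MULTILINE | re.DOTALL)


def parse_action(response: str) -> tuple[str, dict]:
    """Return (action, action_input); action is "Final Answer" when the model is done."""
    if FINAL_RE.search(response):
        return "Final Answer", {}
    action = ACTION_RE.search(response)
    if action is None:
        raise ValueError("no Action line found in the model response")
    raw_input = INPUT_RE.search(response)
    # A missing Action Input is treated as a call with no arguments
    return action.group(1), json.loads(raw_input.group(1)) if raw_input else {}


def parse_final_answer(response: str) -> str:
    match = FINAL_RE.search(response)
    return match.group(1).strip() if match else ""


step = 'Thought: I need data\nAction: search\nAction Input: {"query": "redis ttl"}'
done = "Thought: I now know the final answer\nFinal Answer: 42"
print(parse_action(step))
print(parse_final_answer(done))
```

Raising on a malformed reply, rather than guessing, lets the caller decide whether to re-prompt the model or abort the loop.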

7.3 Implementing Plan-and-Execute

from dataclasses import dataclass
from typing import Any

@dataclass
class PlanStep:
    step_id: int
    description: str
    tool: str | None = None
    tool_input: dict | None = None
    status: str = "pending"  # pending / running / done / failed
    result: Any = None

@dataclass
class Plan:
    goal: str
    steps: list[PlanStep]
    current_step: int = 0

PLAN_PROMPT = """Draw up an execution plan for the following task.

Task: {goal}

Available tools: {tools}

Output the plan as JSON:
{{
    "steps": [
        {{"description": "step description", "tool": "tool name", "tool_input": {{"param": "value"}}}},
        ...
    ]
}}"""

async def plan_and_execute(goal: str, tools: dict, max_replans: int = 2) -> str:
    """Plan-and-Execute executor."""

    # Phase 1: draw up the plan
    plan_response = await planner_llm.complete(
        PLAN_PROMPT.format(goal=goal, tools=format_tools(tools))
    )
    plan = parse_plan(plan_response)

    # Phase 2: execute step by step
    # (steps appended by a replan are picked up by this same loop)
    replans = 0
    for step in plan.steps:
        step.status = "running"

        try:
            if step.tool and step.tool in tools:
                # tool_input may be None for tools that take no arguments
                step.result = await tools[step.tool](**(step.tool_input or {}))
            else:
                # A pure reasoning step with no tool
                step.result = await executor_llm.complete(
                    f"Task: {step.description}\nContext: {get_previous_results(plan)}"
                )
            step.status = "done"
        except Exception as e:
            step.status = "failed"
            step.result = str(e)

            # Replan, capped so repeated failures cannot loop forever
            if replans < max_replans:
                replans += 1
                new_plan = await replan(plan, str(e))
                plan.steps.extend(new_plan.steps)

    # Phase 3: synthesize the results
    return await synthesize(plan)

7.4 The Reflection (Self-Critique) Architecture

REFLECT_PROMPT = """You have just completed the following task:

Task: {task}
Execution trace: {trajectory}
Final result: {result}

Reflect on it:
1. Does the result complete the task fully and accurately?
2. Is there a more efficient approach?
3. Could any intermediate step be improved?

Give suggestions for improvement and a score (0-10)."""

async def reflective_agent(task: str, max_iterations: int = 3) -> str | None:
    """An agent with a reflection loop."""
    best_result = None
    best_score = float("-inf")  # so a first attempt scoring 0 is still kept
    last_reflection = None

    for i in range(max_iterations):
        # Run the task, feeding back the critique from the previous round
        result = await execute_agent(task, previous_reflection=last_reflection)

        # Reflect on and score the attempt
        reflection = await reflector_llm.complete(
            REFLECT_PROMPT.format(
                task=task,
                trajectory=get_trajectory(),
                result=result
            )
        )
        score = parse_score(reflection)
        last_reflection = reflection

        if score > best_score:
            best_score = score
            best_result = result

        # Stop early once the score is high enough
        if score >= 8.0:
            break

    return best_result

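7.5 Choosing an Architecture

flowchart TD
    A["Task characteristics"] --> B{"Can the task be planned up front?"}
    B -->|"Yes"| C{"Does execution tend to fail?"}
    B -->|"No"| D["ReAct"]
    C -->|"Often"| E["Reflection"]
    C -->|"Usually smooth"| F["Plan-and-Execute"]
    D --> G["Search, customer support, real-time Q&A"]
    F --> H["Report generation, data analysis"]
    E --> I["Code generation, complex reasoning"]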

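8. Tool-Augmented vs. Knowledge-Augmented Architectures

8.1 Two Paths to Augmentation

An agent's capabilities can be extended in two directions: tool augmentation (Tool-augmented) and knowledge augmentation (Knowledge-augmented).

flowchart LR
    subgraph ToolAugmented["Tool-augmented architecture"]
        T1["LLM"] --> T2["Search engine"]
        T1 --> T3["Code executor"]
        T1 --> T4["API calls"]
        T1 --> T5["Database"]
    end
    subgraph KnowledgeAugmented["Knowledge-augmented architecture"]
        K1["LLM"] --> K2["Vector database"]
        K1 --> K3["Knowledge graph"]
        K1 --> K4["Document retrieval"]
        K1 --> K5["FAQ store"]
    end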

8.2 Tool Augmentation in Detail

Tool augmentation lets an agent take real actions, which suits scenarios that involve interacting with external systems:

from langchain.tools import tool
from langchain.agents import create_react_agent

@tool
def search_web(query: str) -> str:
    """Search the internet for up-to-date information."""
    return web_search_api(query)

@tool
def execute_python(code: str) -> str:
    """Run Python code and return the result."""
    return sandbox.run(code, timeout=30)

@tool
def query_database(sql: str) -> str:
    """Query the database."""
    # Safety check: allow a single SELECT statement only.
    # This is a coarse guard; a read-only database role is the real protection.
    statement = sql.strip().rstrip(";")
    if not statement.upper().startswith("SELECT") or ";" in statement:
        return "Error: only a single SELECT query is allowed"
    return db.execute(statement)

@tool
def send_email(to: str, subject: str, body: str) -> str:
    """Send an email."""
    return email_service.send(to, subject, body)

# Combine them into a tool-augmented agent
tools = [search_web, execute_python, query_database, send_email]
tool_agent = create_react_agent(llm, tools, prompt)

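8.3 Knowledge Augmentation in Detail

Knowledge augmentation lets an agent draw on private data, which suits scenarios that call for specialized domain knowledge:

# Recent LangChain releases ship these integrations in separate packages;
# older releases exposed them as langchain.vectorstores / langchain.embeddings
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings

class KnowledgeAugmentedAgent:
    """A knowledge-augmented agent."""

    def __init__(self, knowledge_sources: list[str]):
        # Build the vector index
        self.vectorstore = Chroma.from_documents(
            documents=load_documents(knowledge_sources),
            embedding=OpenAIEmbeddings()
        )

    async def answer(self, query: str) -> str:
        # 1. Retrieve the relevant knowledge
        docs = self.vectorstore.similarity_search(query, k=5)
        context = format_documents(docs)

        # 2. Generate an answer grounded in that knowledge
        response = await llm.complete(f"""
Answer the question using the knowledge below. If the knowledge does not cover it, say so explicitly.

Knowledge:
{context}

Question: {query}
""")
        return response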

8.4 Hybrid Architecture: Tools + Knowledge

Production systems usually need both:

class HybridAgent:
    """Hybrid augmentation: tools + knowledge."""

    def __init__(self):
        self.vectorstore = Chroma(...)        # knowledge base
        self.tools = [search, calculator]     # tool set

    async def process(self, query: str) -> str:
        # Step 1: consult the knowledge base first
        relevant_docs = self.vectorstore.similarity_search(query, k=3)
        knowledge_context = format_documents(relevant_docs)

        # Step 2: check whether the knowledge base covers the query
        coverage = await assess_coverage(query, knowledge_context)

        if coverage.score >= 0.8:
            # The knowledge base covers it; answer directly
            return await self._answer_from_knowledge(query, knowledge_context)
        else:
            # Knowledge is insufficient; supplement it with tools
            return await self._answer_with_tools(query, knowledge_context)
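The text does not say how `assess_coverage` computes its score. As a cheap baseline, before reaching for an LLM judge, one could use term overlap between the query and the retrieved context. The sketch below is that heuristic only. `coverage_score` and `pick_path` are hypothetical names, and word-level matching works poorly for languages written without spaces.

```python
import re


def _terms(text: str) -> set[str]:
    # Lower-cased word tokens; a deliberately crude tokenizer
    return set(re.findall(r"\w+", text.lower()))


def coverage_score(query: str, context: str) -> float:
    """Fraction of distinct query terms that also appear in the retrieved context."""
    query_terms = _terms(query)
    if not query_terms:
        return 0.0
    return len(query_terms & _terms(context)) / len(query_terms)


def pick_path(query: str, context: str, threshold: float = 0.8) -> str:
    # Same threshold as the example above: answer from knowledge when covered
    return "knowledge" if coverage_score(query, context) >= threshold else "tools"


score = coverage_score("refund policy duration", "Our refund policy lasts 30 days")
print(round(score, 2), pick_path("refund policy", "Our refund policy lasts 30 days"))
```

Lexical overlap misses paraphrases ("duration" vs. "lasts"), which is exactly the case where an embedding-based or LLM-based judge earns its cost.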

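8.5 Selection Guidance

Scenario	Augmentation	Typical implementation
Customer support Q&A	Knowledge	RAG + FAQ store
Data analysis	Tools	SQL + Python executor
Research assistant	Hybrid	Search + literature store
Coding assistant	Tools	Code execution + Git operations
Compliance review	Knowledge	Regulation vector store + rule engine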

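9. Memory Architecture Patterns

9.1 Three-Tier Memory

An agent's memory system is usually split into three tiers, each with its own storage medium and access pattern:

flowchart TB
    A["Working Memory"] --> B["Short-term Memory"]
    B --> C["Long-term Memory"]
    A --> A1["Current conversation context; storage: RAM / token window"]
    B --> B1["Session-level information; storage: Redis / session"]
    C --> C1["Persistent knowledge; storage: vector database"]
    style A fill:#90EE90
    style B fill:#87CEEB
    style C fill:#DDA0DD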

9.2 Implementing Working Memory

Working memory is the context window of the current conversation, placed directly in the LLM prompt:

SUMMARY_PREFIX = "History summary: "

class WorkingMemory:
    """Working memory: manages the context window of the current conversation."""

    def __init__(self, max_tokens: int = 4000):
        self.messages: list[dict] = []
        self.max_tokens = max_tokens

    def add(self, role: str, content: str):
        self.messages.append({"role": role, "content": content})
        self._compress_if_needed()

    def get_context(self) -> list[dict]:
        return self.messages

    def _compress_if_needed(self):
        """Compress the history when it exceeds the token limit."""
        total = count_tokens(self.messages)
        if total <= self.max_tokens:
            return

        # Strategy: keep system prompts + the most recent turns + a summary of the rest
        def is_summary(m):
            return m["role"] == "system" and m["content"].startswith(SUMMARY_PREFIX)

        system = [m for m in self.messages if m["role"] == "system" and not is_summary(m)]
        dialogue = [m for m in self.messages if m["role"] != "system"]
        recent = dialogue[-6:]  # the last 3 turns
        # Fold earlier summaries into the new one so they do not pile up
        old = [m for m in self.messages if is_summary(m)] + dialogue[:-6]
        if old:
            summary = summarize_messages(old)
            self.messages = system + [{"role": "system", "content": f"{SUMMARY_PREFIX}{summary}"}] + recent
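To watch the compression strategy work without an LLM, the helpers can be stubbed out. In the sketch below, `count_tokens` counts whitespace-separated words and `summarize_messages` returns a placeholder string. Both are stand-ins invented for this example, and the `compress` function is a simplified free-standing version of the method above.

```python
def count_tokens(messages: list[dict]) -> int:
    """Crude stand-in: one token per whitespace-separated word."""
    return sum(len(m["content"].split()) for m in messages)


def summarize_messages(messages: list[dict]) -> str:
    """Placeholder for an LLM summarization call."""
    return f"{len(messages)} earlier messages omitted"


def compress(messages: list[dict], max_tokens: int, keep_recent: int = 2) -> list[dict]:
    if count_tokens(messages) <= max_tokens:
        return messages
    system = [m for m in messages if m["role"] == "system"]
    dialogue = [m for m in messages if m["role"] != "system"]
    # Split by position so identical messages are never confused
    old, recent = dialogue[:-keep_recent], dialogue[-keep_recent:]
    if not old:
        return messages
    summary = {"role": "system",
               "content": f"History summary: {summarize_messages(old)}"}
    return system + [summary] + recent


history = [{"role": "system", "content": "You are helpful"}]
history += [
    {"role": "user" if i % 2 == 0 else "assistant", "content": f"message number {i}"}
    for i in range(6)
]
compressed = compress(history, max_tokens=10)
for m in compressed:
    print(m["role"], "|", m["content"])
```

Splitting the dialogue by index rather than by membership tests matters once a conversation contains repeated messages such as "ok" or "thanks".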

9.3 Implementing Short-term Memory

Short-term memory keeps information across conversation turns and relies on external storage:

import json
from typing import Any

import redis

class ShortTermMemory:
    """Short-term memory: session-level information storage."""

    def __init__(self, session_id: str, ttl: int = 3600):
        self.client = redis.Redis()
        self.session_id = session_id
        self.ttl = ttl  # seconds until a session entry expires

    def save(self, key: str, value: Any):
        """Save session-level information."""
        full_key = f"session:{self.session_id}:{key}"
        self.client.setex(full_key, self.ttl, json.dumps(value))

    def load(self, key: str) -> Any | None:
        """Load session-level information."""
        full_key = f"session:{self.session_id}:{key}"
        data = self.client.get(full_key)
        return json.loads(data) if data else None

    def save_user_preference(self, preference: dict):
        """Save preferences the user expressed during the current session."""
        existing = self.load("preferences") or {}
        existing.update(preference)
        self.save("preferences", existing)

9.4 Implementing Long-term Memory

Long-term memory is stored in a vector database and supports semantic retrieval:

from datetime import datetime, timedelta

from langchain_core.documents import Document

class LongTermMemory:
    """Long-term memory: persistent knowledge and experience."""

    def __init__(self, vectorstore):
        # Assumed: a store wrapper that also exposes get_all() and delete(id)
        self.vectorstore = vectorstore

    async def store(self, content: str, metadata: dict | None = None):
        """Store one long-term memory."""
        memory_entry = Document(
            page_content=content,
            metadata={
                "timestamp": datetime.now().isoformat(),
                "access_count": 0,
                **(metadata or {}),
            },
        )
        self.vectorstore.add_documents([memory_entry])

    async def recall(self, query: str, k: int = 5) -> list[Document]:
        """Retrieve related memories by semantic search."""
        results = self.vectorstore.similarity_search(query, k=k)
        # Bump access counts (frequently accessed memories matter more).
        # This updates the returned copies only; persisting it depends on the store.
        for doc in results:
            doc.metadata["access_count"] = doc.metadata.get("access_count", 0) + 1
            doc.metadata["last_accessed"] = datetime.now().isoformat()
        return results

    async def consolidate(self):
        """Consolidation: merge similar memories and delete stale ones."""
        all_memories = self.vectorstore.get_all()

        # Find pairs of memories that are nearly identical
        merged = set()
        for i, m1 in enumerate(all_memories):
            if i in merged:
                continue
            for j, m2 in enumerate(all_memories[i+1:], i+1):
                if j in merged:
                    continue
                similarity = compute_similarity(m1, m2)
                if similarity > 0.95:
                    # Merge into one stronger memory and drop the two originals
                    combined = merge_memories(m1, m2)
                    await self.store(combined.content, combined.metadata)
                    self.vectorstore.delete(m1.id)
                    self.vectorstore.delete(m2.id)
                    merged.add(i)
                    merged.add(j)
                    break  # m1 is gone; move on to the next memory

        # Delete stale memories (older than 90 days and never accessed)
        cutoff = datetime.now() - timedelta(days=90)
        for idx, memory in enumerate(all_memories):
            if idx in merged:
                continue  # already deleted above
            if (parse_date(memory.metadata.get("timestamp")) < cutoff
                    and memory.metadata.get("access_count", 0) == 0):
                self.vectorstore.delete(memory.id)

9.5 The Episodic Memory Pattern

Episodic memory stores concrete events and experiences and supports retrieval by time and situation:

from dataclasses import dataclass
from datetime import datetime

from langchain_core.documents import Document

@dataclass
class EpisodicMemory:
    """Episodic memory: a record of one concrete event."""
    event: str               # what happened
    timestamp: datetime      # when it happened
    context: str             # surrounding context
    outcome: str             # how it turned out
    emotional_valence: float # emotional valence (-1 to 1)
    lessons: list[str]       # lessons learned from it

class EpisodicMemoryStore:
    def __init__(self, vectorstore):
        self.vectorstore = vectorstore

    async def record_episode(self, event: EpisodicMemory):
        """Record one experience."""
        doc = Document(
            page_content=f"{event.event} -> {event.outcome}",
            metadata={
                "timestamp": event.timestamp.isoformat(),
                # Joined into one string: some stores accept only scalar metadata
                "lessons": "\n".join(event.lessons),
                "valence": event.emotional_valence,
            }
        )
        self.vectorstore.add_documents([doc])

    async def recall_similar_episodes(self, current_situation: str, k: int = 3):
        """Recall similar experiences to avoid repeating past mistakes."""
        results = self.vectorstore.similarity_search(
            current_situation, k=k
        )
        # Put negative experiences first (lowest valence), so their lessons surface
        results.sort(key=lambda x: x.metadata.get("valence", 0))
        return results

10. Real-World Architecture Examples

10.1 Customer Support System

flowchart TD
    U["User message"] --> R["Routing layer: intent classification"]
    R -->|"Inquiry"| KB["Knowledge base agent: RAG + FAQ"]
    R -->|"Operation"| OP["Operations agent: tickets / refunds"]
    R -->|"Complaint"| SP["Complaint agent: escalation flow"]
    KB --> V["Result validation"]
    OP --> V
    SP --> V
    V --> U2["Reply to user"]
    KB -.->|"Unresolved"| H["Human support"]
    SP -.->|"Severe"| H

class CustomerSupportArchitecture:
    """Customer support system architecture."""

    def __init__(self):
        self.router = IntentRouter()
        self.kb_agent = KnowledgeBaseAgent(vectorstore=faq_store)
        self.ops_agent = OperationsAgent(systems=[ticket_system, refund_api])
        self.complaint_agent = ComplaintAgent(escalation=human_queue)

    async def handle(self, message: str, user_id: str) -> str:
        # Route by intent
        intent = await self.router.classify(message)

        if intent.type == "inquiry":
            result = await self.kb_agent.answer(message, user_id)
        elif intent.type == "operation":
            result = await self.ops_agent.process(message, user_id)
        elif intent.type == "complaint":
            result = await self.complaint_agent.handle(message, user_id)
        else:
            # Unknown intents fall back to the knowledge base agent
            result = await self.kb_agent.answer(message, user_id)

        # Validate: low-confidence answers go to a human
        if result.confidence < 0.7:
            return "Let me transfer you to a human agent..."
        return result.response

10.2 Research Assistant

import asyncio

class ResearchAssistantArchitecture:
    """Research assistant: a hybrid of Fan-out and Supervisor."""

    def __init__(self):
        self.supervisor = ResearchSupervisor()
        self.search_agent = SearchAgent()
        self.arxiv_agent = ArxivAgent()
        self.analysis_agent = AnalysisAgent()
        self.writing_agent = WritingAgent()

    async def research(self, topic: str, depth: str = "medium") -> str:
        # Phase 1: fan out the searches
        search_tasks = [
            self.search_agent.search(topic, source="web"),
            self.search_agent.search(topic, source="news"),
            self.arxiv_agent.search(topic),
        ]
        raw_results = await asyncio.gather(*search_tasks, return_exceptions=True)
        # Keep going with the sources that succeeded
        raw_results = [r for r in raw_results if not isinstance(r, Exception)]

        # Phase 2: the Supervisor filters for relevance
        filtered = await self.supervisor.filter_relevant(raw_results, threshold=0.6)

        # Phase 3: analysis
        analysis = await self.analysis_agent.analyze(filtered, topic)

        # Phase 4: write the report
        report = await self.writing_agent.write_report(
            topic=topic,
            sources=filtered,
            analysis=analysis
        )

        return report

10.3 Coding Assistant

class CodingAssistantArchitecture:
    """Coding assistant: a serial Handoffs pipeline."""

    def __init__(self):
        self.understand_agent = CodeUnderstandingAgent()
        self.plan_agent = CodePlanningAgent()
        self.code_agent = CodeGenerationAgent()
        self.review_agent = CodeReviewAgent()
        self.test_agent = TestGenerationAgent()

    async def implement(self, requirement: str, codebase: dict) -> dict:
        # Step 1: understand the requirement and the codebase
        context = await self.understand_agent.analyze(requirement, codebase)

        # Step 2: draw up an implementation plan
        plan = await self.plan_agent.create_plan(requirement, context)

        # Step 3: write the code
        code = await self.code_agent.generate(plan, context)

        # Step 4: review the code and apply one round of fixes
        review = await self.review_agent.review(code, plan)
        if review.has_issues:
            code = await self.code_agent.fix(code, review.issues)

        # Step 5: generate tests
        tests = await self.test_agent.generate(code, plan)

        return {
            "code": code,
            "tests": tests,
            "review": review.summary,
        }

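11. Summary

Pattern	Core strength	Typical framework
Handoffs	Specialization	OpenAI Agents SDK
Fan-out/in	Parallel speed-up	LangChain
Supervisor	Intelligent routing	AutoGen

11.1 Architecture Selection Cheat Sheet

Requirement	Recommended architecture
Simple Q&A	Single Agent + RAG
Collaboration across several specialties	Supervisor
Parallel research over multiple data sources	Fan-out/Fan-in
Pipeline-style task processing	Handoffs
Repeated trial and error	Reflection
Complex multi-step tasks	Plan-and-Execute
Hybrid complex systems	Supervisor + Fan-out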