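In 2022, researchers from Princeton University and Google Research published a paper that reshaped how AI agents are built.
Its title: "ReAct: Synergizing Reasoning and Acting in Language Models".
The core idea is simple: have a large language model interleave reasoning and acting instead of doing either in isolation.
This paradigm became a foundation for modern AI agents and has been adopted, in some form, by many agent frameworks, including LangChain and AutoGPT.
ReAct defines the basic way an AI agent works.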
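Key points
- Background and motivation of the ReAct paper
- The Reasoning + Acting synergy paradigm
- The Thought/Action/Observation loop
- Experimental results on HotpotQA and FEVER
- Comparison with reasoning-only and acting-only approaches
- How tool use is integrated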
1. Background
1.1 Motivation
1.2 Paper Details
Paper: ReAct: Synergizing Reasoning and Acting in Language Models
Authors: Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, Yuan Cao
Affiliations: Princeton University, Google Research
Published: ICLR 2023
Citations: 5,000+
2. The ReAct Paradigm in Detail
2.1 The Thought/Action/Observation Loop
2.2 A Complete Example
2.3 Code Implementation
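```python
class ReActAgent:
    """A minimal ReAct agent.

    Assumes `llm.generate(prompt) -> str` and tools that expose `name`,
    `description`, and `run(input) -> str`.
    """

    def __init__(self, llm, tools, max_iterations=10):
        self.llm = llm
        self.tools = {tool.name: tool for tool in tools}
        self.max_iterations = max_iterations

    def run(self, task: str) -> str:
        """Run the ReAct loop until the model finishes or the step limit is hit."""
        history = []
        prompt = self._build_prompt(task)

        for _ in range(self.max_iterations):
            # 1. The LLM generates a Thought and an Action
            response = self.llm.generate(prompt + self._format_history(history))
            thought, action, action_input = self._parse_response(response)

            history.append({
                "thought": thought,
                "action": action,
                "action_input": action_input,
            })

            # 2. Check whether the task is finished
            if action == "Finish":
                return action_input

            # 3. Execute the Action
            if action in self.tools:
                observation = self.tools[action].run(action_input)
            else:
                observation = f"Error: Unknown action {action}"

            history[-1]["observation"] = observation

        return "Maximum iterations reached"

    def _format_tools(self) -> str:
        """List each tool as 'name: description'."""
        return "\n".join(f"{t.name}: {t.description}" for t in self.tools.values())

    def _tool_names(self) -> str:
        """Comma-separated tool names for the prompt."""
        return ", ".join(self.tools)

    def _format_history(self, history) -> str:
        """Render earlier steps in the same format the prompt asks for."""
        text = ""
        for step in history:
            text += (
                f"\nThought: {step['thought']}"
                f"\nAction: {step['action']}"
                f"\nAction Input: {step['action_input']}"
                f"\nObservation: {step.get('observation', '')}"
            )
        return text + "\n"

    def _build_prompt(self, task: str) -> str:
        """Build the ReAct prompt."""
        return f"""Answer the following questions as best you can. You have access to the following tools:

{self._format_tools()}

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{self._tool_names()}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: {task}"""

    def _parse_response(self, response: str):
        """Extract the Thought, Action, and Action Input from one LLM response."""
        thought = ""
        action = ""
        action_input = ""

        for line in response.strip().split("\n"):
            line = line.strip()
            if line.startswith("Thought:"):
                thought = line[len("Thought:"):].strip()
            elif line.startswith("Action Input:"):
                action_input = line[len("Action Input:"):].strip()
            elif line.startswith("Action:"):
                action = line[len("Action:"):].strip()
            elif line.startswith("Final Answer:"):
                # The prompt ends with "Final Answer:", so map it to Finish
                return thought, "Finish", line[len("Final Answer:"):].strip()
            elif line.startswith("Observation:"):
                # Ignore observations the model invents; the tool supplies the real one
                break

        return thought, action, action_input
```
3. Comparison with Reasoning-Only and Acting-Only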
3.1 Comparing the Three Modes
3.2 HotpotQA Experimental Results
| Method            | HotpotQA | FEVER | Characteristics            |
|-------------------|----------|-------|----------------------------|
| Act-only          | 28.0%    | 60.9% | Acts blindly               |
| Reason-only       | 29.4%    | 56.3% | Lacks external information |
| ReAct             | 35.1%    | 64.0% | Best through synergy       |
| ReAct + Reflexion | 37.2%    | 66.5% | Stronger with reflection   |
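Key findings:
- ReAct clearly outperforms either single mode
- Reasoning and acting reinforce each other
- External information reduces hallucination
3.3 Error Type Analysis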
4. Tool Use Integration
4.1 Tool Definition
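```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class Tool:
    """Tool definition."""
    name: str
    description: str
    func: Callable
    parameters: Dict[str, Any]  # JSON Schema

    def run(self, action_input: str) -> str:
        """Call the wrapped function with the raw Action Input string."""
        return str(self.func(action_input))


# Placeholder backends so the tools below are defined; replace with real ones.
def search_function(query: str) -> str:
    raise NotImplementedError("connect a search backend")


def calculate_function(expression: str) -> str:
    raise NotImplementedError("connect a safe expression evaluator")


def lookup_function(entity: str) -> str:
    raise NotImplementedError("connect a knowledge base")


# Common tool examples
search_tool = Tool(
    name="search",
    description="Search for information on the web",
    func=search_function,
    parameters={
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "The search query",
            }
        },
        "required": ["query"],
    },
)

calculate_tool = Tool(
    name="calculate",
    description="Perform mathematical calculations",
    func=calculate_function,
    parameters={
        "type": "object",
        "properties": {
            "expression": {
                "type": "string",
                "description": "The mathematical expression to evaluate",
            }
        },
        "required": ["expression"],
    },
)

lookup_tool = Tool(
    name="lookup",
    description="Look up a specific entity in a knowledge base",
    func=lookup_function,
    parameters={
        "type": "object",
        "properties": {
            "entity": {
                "type": "string",
                "description": "The entity to look up",
            }
        },
        "required": ["entity"],
    },
)
```
4.2 Tool Selection Strategy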
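One common way to let the model choose among tools is to render each tool's name and description into the prompt, then validate the model's choice before dispatching. The sketch below assumes a simplified `Tool` that takes a single string input. `render_tool_menu` and `dispatch` are hypothetical helper names, not part of the paper or of any framework.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Tool:
    # Simplified stand-in for the Tool definition (assumption: one string input)
    name: str
    description: str
    func: Callable[[str], str]


def render_tool_menu(tools: Dict[str, Tool]) -> str:
    """Render one 'name: description' line per tool for the prompt."""
    return "\n".join(f"{t.name}: {t.description}" for t in tools.values())


def dispatch(tools: Dict[str, Tool], action: str, action_input: str) -> str:
    """Run the chosen tool, or return an error string the model can read."""
    tool = tools.get(action)
    if tool is None:
        valid = ", ".join(tools)
        return f"Error: unknown action '{action}'. Valid actions: {valid}"
    try:
        return str(tool.func(action_input))
    except Exception as exc:
        # Surface tool failures as observations instead of crashing the loop
        return f"Error: {action} failed: {exc}"


# Toy tools used only for illustration
tools = {
    "upper": Tool("upper", "Upper-case the input text", str.upper),
    "length": Tool("length", "Count characters in the input", lambda s: str(len(s))),
}

print(render_tool_menu(tools))
print(dispatch(tools, "upper", "abc"))
print(dispatch(tools, "search", "x"))
```

Returning the error as an observation, rather than raising, gives the model a chance to correct its tool choice on the next step.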
5. ReAct Variants and Improvements
5.1 Reflexion + ReAct
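```python
def reflexion_react(agent, task, max_attempts=3):
    """Reflexion + ReAct: retry the task, feeding back reflections on failures.

    Assumes `evaluate(result, task) -> bool` is defined elsewhere and that the
    agent exposes `reflect` and `update_prompt`.
    """
    reflections = []
    result = None  # returned as-is if max_attempts is 0

    for attempt in range(max_attempts):
        # Run ReAct
        result = agent.run(task)
        success = evaluate(result, task)

        if success:
            return result

        # Reflect on why the attempt failed
        reflection = agent.reflect(task, result)
        reflections.append(reflection)

        # Update the prompt to include the reflections
        agent.update_prompt(reflections)

    return result
```
5.2 Plan-and-Execute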
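Plan-and-Execute is usually described as separating planning from execution: a planner first produces the full list of steps, and an executor then carries them out one by one. The following is a minimal sketch under that assumption. `plan_and_execute`, `toy_planner`, and `toy_executor` are hypothetical names, and the toy functions stand in for LLM and tool calls.

```python
from typing import Callable, List


def plan_and_execute(
    task: str,
    planner: Callable[[str], List[str]],
    executor: Callable[[str, List[str]], str],
) -> List[str]:
    """Plan all steps up front, then execute them in order."""
    steps = planner(task)
    results: List[str] = []
    for step in steps:
        # The executor sees earlier results so later steps can build on them
        results.append(executor(step, list(results)))
    return results


def toy_planner(task: str) -> List[str]:
    # Stand-in for an LLM call that returns a plan
    return [f"research: {task}", f"summarize: {task}"]


def toy_executor(step: str, previous: List[str]) -> str:
    # Stand-in for running one step with tools
    return f"done({step}) after {len(previous)} prior step(s)"


results = plan_and_execute("ReAct paper", toy_planner, toy_executor)
for line in results:
    print(line)
```

The contrast with ReAct is that ReAct picks its next action only after seeing each observation, whereas this sketch fixes the plan before any step runs.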
6. ReAct's Impact on Agents
6.1 Becoming the Standard Agent Paradigm
6.2 Evolution of Modern Agent Architectures
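FAQ
Q1: What is the difference between ReAct and Chain of Thought?
A: CoT is pure reasoning: the chain of thought stays inside the model. ReAct alternates reasoning with action, calling external tools to fetch information so that the reasoning is grounded in real data.
Q2: How should the maximum number of steps be set?
A: Set it according to task complexity: roughly 3-5 steps for simple question answering and 10-20 for complex research. Too many steps can drive up cost.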
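The trade-off between step count and cost can be made explicit by capping both. This is a rough sketch: `StepBudget` and `run_with_budget` are hypothetical names, and the token counts are made-up numbers used only to exercise the logic.

```python
from dataclasses import dataclass


@dataclass
class StepBudget:
    max_steps: int
    max_tokens: int
    steps_used: int = 0
    tokens_used: int = 0

    def charge(self, tokens: int) -> None:
        """Record one completed step and the tokens it consumed."""
        self.steps_used += 1
        self.tokens_used += tokens

    def exhausted(self) -> bool:
        return self.steps_used >= self.max_steps or self.tokens_used >= self.max_tokens


def run_with_budget(step_fn, budget: StepBudget):
    """Call step_fn until it returns an answer or the budget runs out."""
    while not budget.exhausted():
        answer, tokens = step_fn(budget.steps_used)
        budget.charge(tokens)
        if answer is not None:
            return answer
    return None


def fake_step(i: int):
    # Pretend the agent finds the answer on its third step (index 2),
    # and that every step costs 500 tokens (an invented figure).
    return ("42" if i == 2 else None, 500)


ok = run_with_budget(fake_step, StepBudget(max_steps=5, max_tokens=10_000))
cut_off = run_with_budget(fake_step, StepBudget(max_steps=2, max_tokens=10_000))
too_costly = run_with_budget(fake_step, StepBudget(max_steps=5, max_tokens=1_000))
print(ok, cut_off, too_costly)
```

A token cap catches the case where a generous step limit would otherwise let a few very long steps run up the bill.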
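Q3: Can ReAct solve every task?
A: No. ReAct suits tasks that need external information. For tasks that need no tool calls, such as answering from knowledge the model already has or creative writing, direct generation is more efficient.
Q4: How do you evaluate a ReAct agent?
A: 1) task success rate; 2) step efficiency; 3) tool-use accuracy; 4) final answer quality.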
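The first three metrics can be computed directly from run logs. The sketch below assumes each run is recorded with the fields shown. `RunRecord` and `summarize` are hypothetical names, and the sample records are invented for illustration. Final answer quality is left out because it typically needs human or model-based grading.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RunRecord:
    success: bool          # did the run solve the task?
    steps: int             # Thought/Action/Observation iterations used
    tool_calls: int        # total tool calls attempted
    valid_tool_calls: int  # calls that named a real tool with usable input


def summarize(runs: List[RunRecord]) -> dict:
    """Aggregate success rate, average steps, and tool-use accuracy."""
    if not runs:
        raise ValueError("need at least one run")
    total_calls = sum(r.tool_calls for r in runs)
    return {
        "success_rate": sum(r.success for r in runs) / len(runs),
        "avg_steps": sum(r.steps for r in runs) / len(runs),
        "tool_accuracy": (
            sum(r.valid_tool_calls for r in runs) / total_calls if total_calls else 0.0
        ),
    }


runs = [
    RunRecord(success=True, steps=3, tool_calls=2, valid_tool_calls=2),
    RunRecord(success=False, steps=5, tool_calls=4, valid_tool_calls=3),
    RunRecord(success=True, steps=4, tool_calls=2, valid_tool_calls=2),
    RunRecord(success=True, steps=4, tool_calls=0, valid_tool_calls=0),
]
report = summarize(runs)
print(report)
```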
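Q5: How are ReAct and function calling related?
A: ReAct is a paradigm; function calling is an implementation technique. A ReAct agent can use function calling to execute its Actions, although the original paper expressed actions as plain text in the prompt. The relationship is one of paradigm to implementation.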
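To make the paradigm-versus-implementation point concrete, the sketch below parses the same action from two encodings: ReAct-style text lines and a JSON function call. The JSON shape is illustrative only and is not any specific provider's API. `parse_text_action` and `parse_function_call` are hypothetical names.

```python
import json
from typing import Dict, Tuple


def parse_text_action(text: str) -> Tuple[str, Dict[str, str]]:
    """Read 'Action:' and 'Action Input:' lines from ReAct-style text."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in ("Action", "Action Input"):
            fields[key.strip()] = value.strip()
    return fields["Action"], {"input": fields["Action Input"]}


def parse_function_call(payload: str) -> Tuple[str, Dict[str, str]]:
    """Read a structured call of the form {"name": ..., "arguments": {...}}."""
    call = json.loads(payload)
    return call["name"], call["arguments"]


text_step = (
    "Thought: I need the capital.\n"
    "Action: search\n"
    "Action Input: capital of France"
)
json_step = '{"name": "search", "arguments": {"input": "capital of France"}}'

from_text = parse_text_action(text_step)
from_json = parse_function_call(json_step)
print(from_text == from_json)
```

Both encodings carry the same decision (which tool, with what input). Only the transport differs, which is why the loop itself does not change when switching between them.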
Summary
ReAct defines the basic way an AI agent works: reasoning and acting alternate.
Core contribution:
ReAct turns AI from "thinking only" into "thinking while acting".
References
- ReAct: Synergizing Reasoning and Acting in Language Models - Yao et al. 2022
- Reflexion: Language Agents with Verbal Reinforcement Learning - Shinn et al. 2023
- Chain-of-Thought Prompting Elicits Reasoning in Large Language Models - Wei et al. 2022
- MRKL Systems: A Modular, Neuro-Symbolic Architecture - Karpas et al. 2022