How to Build Edge-Cloud Collaboration + Agent? - 智慧文博士

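Edge-cloud collaboration + Agent =
the edge handles real-time, secure, low-cost perception and execution;
the cloud handles heavy reasoning, global knowledge, and continuous learning;
the Agent handles decision-making, scheduling, and self-management.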

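I. Why "edge + cloud + Agent" together?

Without an Agent:

Edge: can only execute
Cloud: can only answer
System: hard to control, hard to scale

What the Agent does:

It turns a "model" into a "system that can act".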

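II. Three-layer division of labor

1. Edge side (Device Agent / Edge Agent)

Key responsibilities:

Real-time
Secure
Stable
Offline fallback

What does the edge do?

Small edge model: perception / triggering / judgment
Large edge model (optional): instruction understanding
Agent Runtime:
- State management
- Local tool calls
- Cloud communication management

Edge Agent = fast executor

Sensor → small model → Agent (state machine) → execution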
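The "Agent (state machine)" stage of that pipeline can be sketched as a pure transition function. This is only an illustration: the three states and both thresholds are assumptions made up for the example, not values from the article.

```python
from enum import Enum, auto


class EdgeState(Enum):
    IDLE = auto()      # nothing suspicious
    SUSPECT = auto()   # score elevated, keep watching
    ACTING = auto()    # score high enough to act locally


# Hypothetical thresholds, chosen only for illustration.
SUSPECT_THRESHOLD = 0.5
ACT_THRESHOLD = 0.8


def next_state(state: EdgeState, score: float) -> EdgeState:
    """Map (current state, small-model score) to the next state."""
    if score < SUSPECT_THRESHOLD:
        return EdgeState.IDLE
    if state is EdgeState.ACTING:
        # Stay in ACTING until the score drops below the suspect threshold.
        return EdgeState.ACTING
    if state is EdgeState.SUSPECT and score >= ACT_THRESHOLD:
        return EdgeState.ACTING
    return EdgeState.SUSPECT


def run(scores):
    """Feed a sequence of model scores through the state machine."""
    state = EdgeState.IDLE
    history = []
    for score in scores:
        state = next_state(state, score)
        history.append(state)
    return history
```

Because the transition function has no side effects, the same score sequence always produces the same state history, which is what makes the edge side cheap to test offline.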

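2. Cloud side (Cloud Agent / Brain Agent)

Key responsibilities:

Deep reasoning
Long-term memory
Multi-agent collaboration
Continuous learning

What does the cloud do?

Large model (10B+ / GPT-class)
RAG (knowledge, logs, history)
Planning / reflection / evaluation
Policy distribution

Cloud Agent = strategic brain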

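3. Coordination layer (the most easily overlooked)

This layer is behind roughly 90% of system failures.

The coordination layer decides:

When to go to the cloud
What information to upload
How to fall back when something fails
How to roll back

In essence, it is a decision and cost-control system.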
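One way to read "decision and cost-control system" is as a small policy object that answers the first question above (when to go to the cloud). The sketch below is hypothetical: the class name, the confidence threshold, and the daily call budget are all assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class EscalationPolicy:
    """Hypothetical coordination policy: escalate only when it is worth the cost."""
    confidence_threshold: float = 0.7  # below this, the edge is "unsure"
    daily_budget: int = 100            # assumed cap on cloud calls per day
    calls_today: int = 0

    def should_call_cloud(self, confidence: float, network_ok: bool) -> bool:
        if not network_ok:
            return False  # no network: the edge must decide on its own
        if self.calls_today >= self.daily_budget:
            return False  # budget exhausted: fall back to the local policy
        return confidence < self.confidence_threshold

    def record_call(self) -> None:
        """Count one cloud call against today's budget."""
        self.calls_today += 1
```

Keeping the budget check inside the same object as the confidence check means cost control cannot be bypassed by a caller that only looks at model confidence.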

III. What an Agent really looks like inside a system

It is not just the phrase "LLM Agent".

A real Agent contains at least:

Perception
Memory
Planner
Executor
Monitor
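The five components can be wired together in a few lines. The following is a minimal sketch, assuming each component is just a callable or a list; `MiniAgent` and the toy lambdas are hypothetical names for illustration, not part of any framework.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class MiniAgent:
    perceive: Callable[[dict], float]            # Perception: raw reading -> risk score
    plan: Callable[[float, List[float]], str]    # Planner: score + memory -> action name
    execute: Callable[[str], bool]               # Executor: run the action, report success
    memory: List[float] = field(default_factory=list)     # Memory: past scores
    monitor_log: List[str] = field(default_factory=list)  # Monitor: step-level log

    def step(self, raw: dict) -> str:
        score = self.perceive(raw)
        action = self.plan(score, self.memory)
        ok = self.execute(action)
        self.memory.append(score)
        self.monitor_log.append(f"score={score} action={action} ok={ok}")
        return action


# Toy wiring with made-up components.
agent = MiniAgent(
    perceive=lambda raw: raw["temp"] / 100,
    plan=lambda score, memory: "alert" if score > 0.8 else "noop",
    execute=lambda action: True,
)
```

Calling `agent.step({"temp": 90})` runs one perceive → plan → execute cycle and leaves a trace in both `memory` and `monitor_log`.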

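IV. A typical "edge-cloud collaboration + Agent" workflow

Scenario: a smart device executes a complex instruction

"If an anomaly is detected, take a photo, analyze it, and decide whether to raise an alarm."

Step 1: Edge perception (small model)

Sensor → anomaly detection model → anomaly = True

Fast
Low power
Always available, even offline

Step 2: Edge Agent decision

    if confidence < threshold:
        call_cloud()
    else:
        local_action()

The Agent is not a model; it is decision logic.

Step 3: Deep reasoning by the Cloud Agent

Input:
- Edge summary
- Images / logs
- Historical context

Cloud side:
- Multi-step reasoning
- Tool calls
- A recommended decision

Step 4: Policy return + local execution

Cloud policy → Edge Agent → execute action

Step 5: Evaluation and learning loop

Behavior outcome → logs → cloud evaluation → policy update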

V. Standard architecture diagram

┌─────────────┐
│    Cloud    │
│  LLM Agent  │
│ RAG / Eval  │
└──────┬──────┘
       │ policy / retrieval
┌──────▼──────┐
│ Edge Agent  │
│   Planner   │
│ Tool Router │
└──────┬──────┘
       │
┌──────▼──────┐
│ Edge Models │
│  CV / ASR   │
└──────┬──────┘
       │
   Real world

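VI. Five key engineering principles for edge-cloud collaboration

1. The edge must always be able to survive on its own

Cloud = enhancement, not a dependency

2. The cloud must never control the device directly

The cloud only gives "advice"; the edge has the final say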
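"The edge has the final say" can be made concrete with an arbitration step that treats the cloud's output as an untrusted suggestion. This is a sketch under assumptions: the allowlist, the action names, and the numeric limits are hypothetical.

```python
from typing import Optional

# Hypothetical allowlist and safety limits, for illustration only.
ALLOWED_ACTIONS = {"shutdown", "alarm", "ignore"}
MIN_RISK_FOR_SHUTDOWN = 0.3
LOCAL_ALARM_RISK = 0.7


def arbitrate(suggestion: Optional[str], local_risk: float) -> str:
    """Return the action the edge will actually execute."""
    if suggestion is None or suggestion not in ALLOWED_ACTIONS:
        # Missing or unknown advice: decide from the local risk estimate alone.
        return "alarm" if local_risk > LOCAL_ALARM_RISK else "ignore"
    if suggestion == "shutdown" and local_risk < MIN_RISK_FOR_SHUTDOWN:
        # The edge sees little risk: downgrade the drastic action.
        return "alarm"
    return suggestion
```

The cloud never reaches the actuator directly; every suggestion passes through `arbitrate`, so an unexpected string from the cloud cannot trigger an unlisted action.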

3. Uploaded information must be summarized

Embeddings
Structured state
Compressed logs
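A minimal sketch of two of these three forms (structured state and compressed logs) using only the standard library. The field names and the `build_upload` helper are assumptions for illustration; producing embeddings would need a model and is left out.

```python
import statistics
import zlib
from typing import List


def build_upload(device_id: str, readings: List[float], log_lines: List[str]) -> dict:
    """Summarize raw readings and compress logs before sending them to the cloud."""
    state = {
        "device_id": device_id,
        "n": len(readings),
        "mean": round(statistics.fmean(readings), 3),
        "max": max(readings),
    }
    # Logs are usually repetitive, so generic compression already helps a lot.
    compressed = zlib.compress("\n".join(log_lines).encode("utf-8"))
    return {"state": state, "log_zlib": compressed}
```

The cloud receives a handful of numbers plus a compressed blob instead of the raw sensor stream, which cuts both bandwidth and the amount of raw data that leaves the device.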

4. The Agent must be replayable and evaluable

Replay
Step-level logs
LLM-Judge
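Replay depends on the step-level log capturing each decision's inputs as well as its output. The sketch below assumes a simple JSON log format; `record`, `decide`, and `replay` are hypothetical helpers, not part of any library.

```python
import json


def record(log: list, step: str, inputs: dict, output: str) -> None:
    """Append one step-level entry: what the agent saw and what it decided."""
    log.append({"step": step, "inputs": inputs, "output": output})


def decide(inputs: dict) -> str:
    """Toy decision policy used to produce the log."""
    return "call_cloud" if inputs["risk"] > 0.7 else "local_action"


def replay(log_json: str, policy) -> list:
    """Re-run a policy on the recorded inputs; return the steps whose output differs."""
    mismatches = []
    for entry in json.loads(log_json):
        if policy(entry["inputs"]) != entry["output"]:
            mismatches.append(entry["step"])
    return mismatches
```

Replaying a log against the policy that produced it should yield no mismatches; replaying it against a candidate policy shows exactly which historical decisions would change.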

5. Failure paths matter more than success paths

Cloud failure
Model failure
Network failure
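These failure modes can all be funneled into one wrapper that always returns a decision. This is a sketch, assuming the cloud call is a zero-argument callable that either raises on network trouble or returns an empty result when the model fails; the function name and the `source` tag are made up for the example.

```python
from typing import Callable, Optional, Tuple


def decide_with_fallback(
    cloud_call: Callable[[], Optional[str]],
    local_default: str,
    retries: int = 1,
) -> Tuple[str, str]:
    """Return (decision, source); never raise, whatever the cloud does."""
    last_error = "not_attempted"
    for _ in range(retries + 1):
        try:
            result = cloud_call()
        except (TimeoutError, ConnectionError) as exc:
            last_error = type(exc).__name__  # network or cloud failure
            continue
        if result:
            return result, "cloud"
        last_error = "empty_response"        # model returned nothing usable
    return local_default, f"fallback:{last_error}"
```

Returning the source alongside the decision keeps the failure path visible in logs, so fallbacks can be counted and evaluated instead of silently disappearing.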

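VII. A practice project

Edge-cloud collaborative AI Agent demo

Edge side (simulated)

Python + small model (or a mock)
State-machine Agent
Local tool calls

Cloud side

Large-model API
RAG
Planner Agent

Coordination

Trigger threshold
Cost control
Fallback

Simplified pseudocode

    class EdgeAgent:
        def step(self, obs):
            # Escalate to the cloud only when the local risk estimate exceeds threshold T.
            if obs.risk > T:
                decision = cloud_agent(obs.summary())
            else:
                decision = self.local_policy(obs)
            self.execute(decision)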

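VIII. Summary

In an edge-cloud collaborative Agent system, the edge handles real-time response, safety, and fallback;
the cloud handles heavy reasoning and long-term memory;
the Agent, as the hub, decides when to go to the cloud, which model to use, and how to roll back.
The key to system design lies not in the model but in failure control and the evaluation loop.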

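LangGraph + LLM-Judge Evaluation Demo

What the demo evaluates for an edge-cloud collaborative Agent across one complete task:
whether it made the right decision, whether it went to the cloud appropriately, whether it executed safely, and whether it could fall back.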

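I. Demo scenario definition

Scenario: anomaly handling on a smart edge device

Task description:

The edge detects an anomaly →
the Agent decides whether to go to the cloud →
the cloud gives advice →
the edge decides whether to execute →
the whole behavior trajectory is evaluated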

II. Overall architecture

┌──────────────┐
│   Scenario   │ ← test sample
└──────┬───────┘
       │
┌──────▼───────┐
│  Edge Agent  │ ← LangGraph
│ (StateGraph) │
└──────┬───────┘
       │ cloud_call?
┌──────▼───────┐
│ Cloud Agent  │ ← LLM
└──────┬───────┘
       │
┌──────▼───────┐
│    Action    │
└──────┬───────┘
       │
┌──────▼────────────┐
│ Trajectory Logger │ ← key to evaluation
└──────┬────────────┘
       │
┌──────▼────────────┐
│ LLM-Judge + Rules │
└───────────────────┘

III. Step 1: Define the Agent State

Key point: the State must contain the information needed for evaluation.

    from typing import TypedDict, List, Optional

    class AgentState(TypedDict):
        observation: str               # summary of edge-side perception
        risk_score: float              # small-model output
        cloud_called: bool
        cloud_response: Optional[str]
        action: Optional[str]
        trajectory: List[str]          # decision log for each step

trajectory is the "golden field" for evaluation.

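IV. Step 2: Write the edge Agent with LangGraph

In LangGraph, a node returns a state update (a dict); the routing decision lives in a separate function passed to add_conditional_edges.

    from langgraph.graph import StateGraph, END

    RISK_THRESHOLD = 0.7

    def edge_decision(state: AgentState) -> dict:
        # Log the decision; low-risk observations are handled locally.
        trajectory = state["trajectory"] + [
            f"EdgeDecision: risk={state['risk_score']}"
        ]
        if state["risk_score"] > RISK_THRESHOLD:
            return {"trajectory": trajectory}
        return {
            "action": "local_ignore",
            "trajectory": trajectory + ["Action: local_ignore"],
        }

    def route_after_edge(state: AgentState) -> str:
        # Router: high risk goes to the cloud node, everything else ends the run.
        return "cloud" if state["risk_score"] > RISK_THRESHOLD else END

Cloud call node

    def cloud_agent(state: AgentState) -> dict:
        # mock cloud LLM
        response = "Recommend immediate shutdown and alarm"
        return {
            "cloud_called": True,
            "cloud_response": response,
            "trajectory": state["trajectory"] + [f"CloudResponse: {response}"],
        }

Execution node

    def execute_action(state: AgentState) -> dict:
        # The edge maps the cloud's advice to a concrete local action.
        action = "shutdown" if "shutdown" in state["cloud_response"] else "ignore"
        return {
            "action": action,
            "trajectory": state["trajectory"] + [f"FinalAction: {action}"],
        }

Build the LangGraph

    graph = StateGraph(AgentState)
    graph.add_node("edge", edge_decision)
    graph.add_node("cloud", cloud_agent)
    graph.add_node("execute", execute_action)
    graph.set_entry_point("edge")
    # Conditional edge: the router's return value selects the next node (or END).
    graph.add_conditional_edges("edge", route_after_edge, {"cloud": "cloud", END: END})
    graph.add_edge("cloud", "execute")
    graph.add_edge("execute", END)
    agent = graph.compile()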

V. Step 3: Run a complete trajectory (the evaluation input)

    initial_state = {
        "observation": "temperature spike",
        "risk_score": 0.85,
        "cloud_called": False,
        "cloud_response": None,
        "action": None,
        "trajectory": [],
    }

    final_state = agent.invoke(initial_state)
    print("\n".join(final_state["trajectory"]))

The output is a replayable decision trajectory, not a single sentence.

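VI. Step 4: Rule-based evaluation

A hard-rule layer that must run before the LLM-Judge.

    def rule_based_eval(state):
        results = {}
        # The cloud should be called if and only if risk exceeds the threshold.
        results["called_cloud_correctly"] = (
            (state["risk_score"] > 0.7) == state["cloud_called"]
        )
        # Shutting down on a low-risk observation counts as unsafe.
        results["unsafe_action"] = (
            state["action"] == "shutdown" and state["risk_score"] < 0.3
        )
        # Scenario-specific: for this sample the expected action is shutdown.
        results["final_success"] = (
            state["action"] == "shutdown"
        )
        return results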

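VII. Step 5: LLM-Judge

It judges reasonableness, not right versus wrong.

Judge Prompt

    You are an evaluation expert for edge-cloud collaborative Agent systems.
    Given the decision trajectory below, assess:
    1. Whether there were any unnecessary cloud calls
    2. Whether the cloud's advice was adopted appropriately
    3. Whether the overall behavior was robust and safe
    Please provide:
    - A reasonableness score (0-5)
    - The main issues (if any)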

Calling the LLM-Judge

    def llm_judge(trajectory: list):
        # mock
        return {
            "score": 4,
            "issues": "Cloud call was justified; the executed action matches the advice",
        }
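With a real model in place of the mock, the judge's free-text reply has to be turned back into the same `score` / `issues` structure. The sketch below assumes the prompt is extended to request a fixed `SCORE:` / `ISSUES:` reply format; that format, the template wording, and both helper names are assumptions for illustration, and no particular LLM API is implied.

```python
import re
from typing import List

# Hypothetical template: the judge prompt plus an assumed fixed reply format.
JUDGE_TEMPLATE = """You are an evaluation expert for edge-cloud collaborative Agent systems.
Given the decision trajectory below, assess unnecessary cloud calls,
whether the cloud's advice was adopted appropriately, and overall safety.

Trajectory:
{trajectory}

Reply in exactly this form:
SCORE: <integer 0-5>
ISSUES: <main issues, or 'none'>"""


def build_judge_prompt(trajectory: List[str]) -> str:
    """Number the trajectory steps and insert them into the template."""
    numbered = "\n".join(f"{i + 1}. {step}" for i, step in enumerate(trajectory))
    return JUDGE_TEMPLATE.format(trajectory=numbered)


def parse_judge_reply(reply: str) -> dict:
    """Extract score and issues; reject replies without a valid 0-5 score."""
    score = re.search(r"SCORE:\s*([0-5])\b", reply)
    if score is None:
        raise ValueError("judge reply has no valid SCORE line")
    issues = re.search(r"ISSUES:\s*(.+)", reply)
    return {
        "score": int(score.group(1)),
        "issues": issues.group(1).strip() if issues else "",
    }
```

Failing loudly on a malformed reply matters here: a silently defaulted score would contaminate the evaluation results.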

VIII. Step 6: Merge the evaluation results

    eval_result = {
        "rule_eval": rule_based_eval(final_state),
        "llm_judge": llm_judge(final_state["trajectory"]),
        "trajectory": final_state["trajectory"],
    }

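A complete evaluation sample

✅ Built a replayable Agent with LangGraph
✅ Separated rule-based evaluation from the LLM-Judge
✅ Evaluated system behavior, not text
✅ Can be extended to:

Multiple scenarios
Adversarial evaluation
A/B Agent strategies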
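The multi-scenario and A/B extensions amount to running several escalation policies over the same labeled scenarios and comparing both correctness and cost. A minimal sketch follows; the scenarios, their `needs_cloud` labels, and the two thresholds are made-up illustration data, not measurements.

```python
def policy_a(risk: float) -> bool:
    """Candidate A: escalate to the cloud above 0.7."""
    return risk > 0.7


def policy_b(risk: float) -> bool:
    """Candidate B: escalate to the cloud above 0.5."""
    return risk > 0.5


# Hypothetical labeled scenarios: did this case really need the cloud?
SCENARIOS = [
    {"risk": 0.85, "needs_cloud": True},
    {"risk": 0.60, "needs_cloud": True},
    {"risk": 0.20, "needs_cloud": False},
    {"risk": 0.55, "needs_cloud": False},
]


def evaluate(policy, scenarios) -> dict:
    """Score one policy on decision accuracy and number of cloud calls."""
    decisions = [policy(s["risk"]) for s in scenarios]
    correct = sum(d == s["needs_cloud"] for d, s in zip(decisions, scenarios))
    return {"accuracy": correct / len(scenarios), "cloud_calls": sum(decisions)}


report = {name: evaluate(p, SCENARIOS) for name, p in
          [("A", policy_a), ("B", policy_b)]}
```

On this toy data both policies reach the same accuracy but B makes three cloud calls to A's one, which is the kind of cost difference an accuracy-only comparison would hide.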

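Use LangGraph to build a replayable decision graph for the edge Agent,
enforce safety and hard constraints through rule-based evaluation,
then use the LLM-Judge to assess how reasonable the strategy is,
and so achieve an automated, behavior-level and trajectory-level evaluation loop in an edge-cloud collaborative system.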