iconsulting/docs/architecture/07-memory-manager-agent.md

# 07 - Memory Manager Agent (Memory Management): Design Details

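## 1. Core Responsibilities

The Memory Manager is the system's **steward of user information**. It extracts user information from conversations, organizes it into long-term memory, and retrieves a user's historical context on demand. It replaces the manual extraction logic of `StrategyEngineService.extractUserInfo()` in the old architecture.

Core capabilities:
1. **Information extraction** -- automatically identify and extract the user's personal information (age, education, job, etc.) from conversation text
2. **Memory storage** -- classify and label the extracted information, then save it to long-term memory in knowledge-service
3. **Context loading** -- retrieve the user's stored memories to give other Agents the user's background
4. **Summary generation** -- consolidate user information scattered across conversations into a structured user profile
5. **Deduplication and merging** -- avoid storing the same information twice; newer information overrides older information

> Design principle: **Extract only facts the user states explicitly; never infer.** "I'm probably thirty-something" is extracted as "30+", not as a specific number.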
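
The deduplication rule (capability 5) can be sketched as a merge keyed by field. This is only an illustration: the `MemoryRecord` shape, its `field` and `createdAt` keys, and the `mergeMemories` helper are assumptions made for this sketch and are not part of the knowledge-service API described in this document.

```typescript
// Hypothetical record shape; `field` and `createdAt` are assumed keys for this sketch.
interface MemoryRecord {
  field: string;      // what the memory is about, e.g. "age"
  content: string;    // the stored statement, e.g. "30+"
  createdAt: number;  // epoch milliseconds
}

// Merge memories field by field: an exact repeat is skipped, and a record
// that is at least as new as the stored one replaces it.
function mergeMemories(existing: MemoryRecord[], incoming: MemoryRecord[]): MemoryRecord[] {
  const byField: Record<string, MemoryRecord> = {};
  for (const m of existing.concat(incoming)) {
    const current = byField[m.field];
    if (current !== undefined && current.content === m.content) continue; // duplicate
    if (current === undefined || m.createdAt >= current.createdAt) byField[m.field] = m;
  }
  return Object.keys(byField).map(field => byField[field]);
}
```

With this rule, a later "32" for `age` replaces an earlier "30+", while a repeated education statement keeps its original record.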

## 2. Model and Parameters

```typescript
{
  model: 'claude-3-5-haiku-20241022',  // structured extraction task; Haiku is sufficient
  max_tokens: 1000,
  temperature: 0,                       // extraction should be as deterministic as possible
  system: [
    {
      type: 'text',
      text: memoryManagerPrompt,
      cache_control: { type: 'ephemeral' }
    }
  ],
}
```

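Why Haiku:
- Information extraction is a relatively simple NLP task (entity recognition + classification)
- It is called frequently (it may be triggered after any turn), so cost matters
- Input and output are highly structured and do not need Sonnet's reasoning ability
- Latency matters -- saving information should not block the main conversation flow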

## 3. Available Tools

The Memory Manager has **2 tools**:

### 3.1 save_user_memory

```typescript
{
  name: 'save_user_memory',
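  description: 'Save user information to long-term memory. Each memory has a type, content, and a related immigration category.',
  input_schema: {
    type: 'object',
    properties: {
      memoryType: {
        type: 'string',
        enum: ['FACT', 'PREFERENCE', 'INTENT'],
        description: 'FACT = facts stated by the user (age, education, etc.); PREFERENCE = user preferences (preferred immigration route); INTENT = user intent (planned timing, budget, etc.)'
      },
      content: {
        type: 'string',
        description: 'The memory content to save, written as a concise declarative sentence, e.g. "User is 32, MSc in Computer Science from Tsinghua University"'
      },
      category: {
        type: 'string',
        enum: ['QMAS', 'GEP', 'IANG', 'TTPS', 'CIES', 'TECHTAS'],
        description: 'The related immigration category (if there is a clear link)'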
      }
    },
    required: ['memoryType', 'content']
  }
}
```

**Underlying implementation**: calls the knowledge-service endpoint `POST /api/v1/memory/user`

```typescript
const response = await axios.post('http://knowledge-service:3003/api/v1/memory/user', {
  userId: context.userId,
  tenantId: context.tenantId,
  memoryType: input.memoryType,
  content: input.content,
  category: input.category,
  importance: IMPORTANCE_MAP[input.memoryType],  // FACT:70, INTENT:80, PREFERENCE:60
  source: 'conversation',
  conversationId: context.conversationId,
});
```
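
The request above references `IMPORTANCE_MAP` without defining it. A minimal sketch of the map and of the payload construction follows, using the weights from the inline comment. The `AgentContext` shape and the `buildSavePayload` helper are hypothetical names introduced here for illustration; the HTTP call itself is left out so the snippet stays self-contained.

```typescript
type MemoryType = 'FACT' | 'PREFERENCE' | 'INTENT';
type Category = 'QMAS' | 'GEP' | 'IANG' | 'TTPS' | 'CIES' | 'TECHTAS';

// Importance weights taken from the inline comment in the request above.
const IMPORTANCE_MAP: Record<MemoryType, number> = { FACT: 70, INTENT: 80, PREFERENCE: 60 };

interface SaveInput {
  memoryType: MemoryType;
  content: string;
  category?: Category;
}

// Assumed shape of the per-request context; not specified in the original design.
interface AgentContext {
  userId: string;
  tenantId: string;
  conversationId: string;
}

// Build the request body for POST /api/v1/memory/user (hypothetical helper).
function buildSavePayload(input: SaveInput, context: AgentContext) {
  return {
    userId: context.userId,
    tenantId: context.tenantId,
    memoryType: input.memoryType,
    content: input.content,
    category: input.category,               // stays undefined when no category applies
    importance: IMPORTANCE_MAP[input.memoryType],
    source: 'conversation',
    conversationId: context.conversationId,
  };
}
```

Typing the map as `Record<MemoryType, number>` makes the compiler flag a missing weight if a new memory type is added to the enum.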

### 3.2 get_user_context

```typescript
{
  name: 'get_user_context',
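  description: 'Get the user\'s stored memories and background information. Returns a list of memories sorted by importance and relevance.',
  input_schema: {
    type: 'object',
    properties: {
      query: {
        type: 'string',
        description: 'Retrieval query, e.g. "the user\'s basic information and immigration intent"'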
      }
    },
    required: ['query']
  }
}
```
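
The tool description says results are sorted by importance and relevance, but the document does not specify how the two are combined. The sketch below shows one possible ordering, assuming the retrieval step attaches a hypothetical `relevance` score in the range 0 to 1 to each memory; the scoring formula and the `rankMemories` name are illustrative assumptions, not the knowledge-service behavior.

```typescript
interface ScoredMemory {
  content: string;
  importance: number;  // 0-100, as stored with the memory
  relevance: number;   // 0-1, assumed to come from the retrieval step
}

// Order memories by importance weighted by relevance, highest first,
// and keep at most `limit` entries. The input array is not mutated.
function rankMemories(memories: ScoredMemory[], limit: number = 10): ScoredMemory[] {
  const score = (m: ScoredMemory): number => (m.importance / 100) * m.relevance;
  return memories
    .slice()
    .sort((a, b) => score(b) - score(a))
    .slice(0, limit);
}
```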

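## 4. System Prompt Key Points

```
# Identity
You are iConsulting's memory management specialist. You extract, save, and retrieve user information from conversations.

# Core Principles
1. Extract only information the user states explicitly; do not infer
2. Keep vague information vague ("thirty-something" → "30+"; do not guess 35)
3. Do not save the same information twice; check existing memories before deciding whether to save
4. Express each memory as a concise declarative sentence
5. Classify memoryType correctly:
   - FACT: objective facts (age, education, job, income, etc.)
   - PREFERENCE: subjective preferences (preferred immigration route, timing preferences, etc.)
   - INTENT: action intent (when the user plans to apply, budget range, preparation already under way, etc.)

# Core Fields to Extract
## Basic Information
- age: age
- education: highest degree (doctorate/master's/bachelor's/associate/high school)
- university: university attended
- major: field of study

## Career Information
- currentJobTitle: current job title
- currentIndustry: industry
- currentCompany: current company (if mentioned)
- totalYearsOfExperience: years of work experience
- annualIncome: annual income (note the currency)

## Immigration
- targetCategory: immigration category of interest
- hasHongKongEmployer: whether the user has a Hong Kong employer
- hasTechBackground: whether the user has a technology background
- investmentAmount: amount available to invest
- immigrationTimeline: planned immigration timeline

## Family
- maritalStatus: marital status
- hasChildren: whether the user has children
- familySupport: degree of family support

# Operation Types
Perform the operation named by the action parameter passed in by the Coordinator:
- load_context: load the user's context (call get_user_context)
- save_info: extract information from the conversation and save it (call save_user_memory)
- summarize: load all memories and generate a structured summary
```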

## 5. Input/Output Formats

The Memory Manager's input and output depend on the `action`:

### Action: load_context

**Input**:

```typescript
interface MemoryManagerLoadInput {
  action: 'load_context';
  /** Current user question (used to retrieve relevant memories) */
  query: string;
}
```

**Output**:

```typescript
interface MemoryManagerLoadOutput {
  action: 'load_context';
  /** List of retrieved user memories */
  memories: Array<{
    type: 'FACT' | 'PREFERENCE' | 'INTENT';
    content: string;
    importance: number;
    category?: string;
    createdAt?: string;
  }>;
  /** User profile summary (if there is enough information) */
  user_profile_summary?: string;
  /** Total number of memories */
  total_memories: number;
}
```

### Action: save_info

**Input**:

```typescript
interface MemoryManagerSaveInput {
  action: 'save_info';
  /** Latest user message */
  userMessage: string;
  /** Latest assistant reply */
  assistantMessage: string;
  /** User information already on record (to avoid extracting it again) */
  existingInfo?: Record<string, unknown>;
}
```

**Output**:

```typescript
interface MemoryManagerSaveOutput {
  action: 'save_info';
  /** Information extracted and saved in this call */
  extracted_info: Record<string, unknown>;
  /** Number of memories saved */
  saved_count: number;
  /** Details of the saved memories */
  saved_memories: Array<{
    memoryType: 'FACT' | 'PREFERENCE' | 'INTENT';
    content: string;
    field: string;      // corresponding field name
  }>;
  /** Fields skipped because they already exist */
  skipped_fields: string[];
}
```

### Action: summarize

**Input**:

```typescript
interface MemoryManagerSummarizeInput {
  action: 'summarize';
}
```

**Output**:

```typescript
interface MemoryManagerSummarizeOutput {
  action: 'summarize';
  /** Structured user profile */
  user_profile: {
    // Basic information
    basic: {
      age?: string;
      education?: string;
      university?: string;
      major?: string;
      nationality?: string;
      location?: string;
    };
    // Career information
    career: {
      jobTitle?: string;
      industry?: string;
      company?: string;
      yearsOfExperience?: string;
      annualIncome?: string;
    };
    // Immigration intent
    immigration: {
      targetCategories?: string[];
      timeline?: string;
      hasHKEmployer?: boolean;
      investmentCapacity?: string;
    };
    // Family situation
    family: {
      maritalStatus?: string;
      hasChildren?: boolean;
      familySupport?: string;
    };
  };
  /** Information completeness */
  completeness: {
    score: number;            // 0-100
    filled_fields: string[];
    missing_fields: string[];
  };
  /** One-paragraph summary */
  narrative_summary: string;
}
```
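
Because all three inputs carry a literal `action` field, they can be treated as a TypeScript discriminated union. The sketch below restates the three input shapes inline so it is self-contained, and maps each action to the tool named for it in the System Prompt's operation list. `MemoryManagerInput`, `toolsForAction`, and `assertNever` are hypothetical names used only for illustration.

```typescript
// The three input shapes from this section, combined into one union.
type MemoryManagerInput =
  | { action: 'load_context'; query: string }
  | { action: 'save_info'; userMessage: string; assistantMessage: string; existingInfo?: Record<string, unknown> }
  | { action: 'summarize' };

// Compile-time exhaustiveness guard: adding a new action without handling it
// makes the call in `default` fail to type-check.
function assertNever(value: never): never {
  throw new Error(`Unknown action: ${JSON.stringify(value)}`);
}

// Tools each action is expected to call, per the operation list in the System Prompt.
function toolsForAction(input: MemoryManagerInput): string[] {
  switch (input.action) {
    case 'load_context':
      return ['get_user_context'];
    case 'save_info':
      return ['save_user_memory'];
    case 'summarize':
      return ['get_user_context'];
    default:
      return assertNever(input);
  }
}
```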

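## 6. When to Trigger

The Coordinator calls `invoke_memory_manager` at different points depending on the `action`:

### When to trigger load_context

| Scenario | Trigger condition | Purpose |
|------|----------|------|
| Conversation start | First turn of the conversation | Load the user's history to enable cross-session memory |
| User background needed | The Coordinator needs to consult the user profile | Provide context for decisions |
| Pre-assessment preparation | Before calling the Assessment Expert | Supply information already on record but not mentioned in this conversation |

### When to trigger save_info

| Scenario | Trigger condition | Purpose |
|------|----------|------|
| User shares personal information | Age, education, job, or similar details appear in the conversation | Save user data in real time |
| After a turn with new information | After the assistant reply is generated, when the Coordinator judges that the turn contains new user information | Make sure no user information is missed |
| User expresses a preference or intent | "I'd rather go with the Top Talent Pass" / "I plan to apply next year" | Save the user's preferences and intent |

### When to trigger summarize

| Scenario | Trigger condition | Purpose |
|------|----------|------|
| Pre-assessment summary | When preparing a full assessment | Generate a complete user profile for the Assessment Expert |
| Conversation end | The conversation is about to end | Update the user profile summary |

**Recommended call frequency**:
- `load_context`: once at the start of each new conversation
- `save_info`: not on every turn; only when the Coordinator judges that the user has provided new information
- `summarize`: at most 1-2 times per conversation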
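
One way the Coordinator could enforce these recommendations is a small per-conversation counter. This is a sketch under stated assumptions: the `CallBudget` class and the `MAX_CALLS` table are hypothetical, and treating the recommendations as hard limits is a design choice the original text does not mandate.

```typescript
type Action = 'load_context' | 'save_info' | 'summarize';

// Limits derived from the recommendations above; save_info is left uncapped
// because the Coordinator decides case by case whether new information exists.
const MAX_CALLS: Record<Action, number> = {
  load_context: 1,
  save_info: Infinity,
  summarize: 2,
};

// Tracks how many times each action has been invoked in one conversation.
class CallBudget {
  private counts: Record<Action, number> = { load_context: 0, save_info: 0, summarize: 0 };

  // Returns true and records the call if the action is still within budget.
  tryConsume(action: Action): boolean {
    if (this.counts[action] >= MAX_CALLS[action]) return false;
    this.counts[action] += 1;
    return true;
  }
}
```

A Coordinator would create one `CallBudget` per conversation and skip the Memory Manager call whenever `tryConsume` returns `false`.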

## 7. Internal Loop

The Memory Manager's internal loop depends on the action:

### load_context flow

```
┌────────────────────────────────────────────────────────────────┐
│  load_context Loop (max 1 turn)                                │
│                                                                │
│  Turn 0:                                                       │
│  ├── get_user_context({query: input.query})                    │
│  ├── Organize the returned memory list                         │
│  ├── If enough memories exist, generate user_profile_summary   │
│  └── Return the structured output                              │
└────────────────────────────────────────────────────────────────┘
```

### save_info flow

```
┌────────────────────────────────────────────────────────────────┐
│  save_info Loop (max 2 turns)                                  │
│                                                                │
│  Turn 0: extract and save                                      │
│  ├── LLM analyzes userMessage + assistantMessage               │
│  ├── Compare against existingInfo; drop known fields           │
│  ├── Call save_user_memory for each new item                   │
│  │   (multiple saves may run in parallel)                      │
│  │                                                             │
│  Turn 1: confirm save results (if needed)                      │
│  ├── Check whether every save succeeded                        │
│  └── Return a summary of the save results                      │
└────────────────────────────────────────────────────────────────┘
```

### summarize flow

```
┌────────────────────────────────────────────────────────────────┐
│  summarize Loop (max 2 turns)                                  │
│                                                                │
│  Turn 0: load all memories                                     │
│  ├── get_user_context({query: "all user background info"})     │
│  │                                                             │
│  Turn 1: generate the summary                                  │
│  ├── Consolidate scattered memories into user_profile          │
│  ├── Compute the completeness score                            │
│  └── Generate narrative_summary                                │
└────────────────────────────────────────────────────────────────┘
```
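
The document does not say how the completeness score in the summarize flow is computed. One simple possibility is the share of expected fields that have a value, sketched below over a flattened profile. The `computeCompleteness` helper, the flat `Record` input, and the unweighted ratio are all assumptions; the score in Scenario 3 may rely on a different weighting.

```typescript
interface Completeness {
  score: number;            // 0-100
  filled_fields: string[];
  missing_fields: string[];
}

// A field counts as filled when it holds any value other than undefined, null,
// or an empty string. Note that `false` is a real answer and counts as filled.
function computeCompleteness(
  profile: Record<string, unknown>,
  expectedFields: string[],
): Completeness {
  const isFilled = (v: unknown): boolean => v !== undefined && v !== null && v !== '';
  const filled_fields = expectedFields.filter(f => isFilled(profile[f]));
  const missing_fields = expectedFields.filter(f => !isFilled(profile[f]));
  const score = expectedFields.length === 0
    ? 0
    : Math.round((filled_fields.length / expectedFields.length) * 100);
  return { score, filled_fields, missing_fields };
}
```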

**Pseudocode for the `save_info` extraction logic**:

```typescript
async function extractAndSave(input: MemoryManagerSaveInput): Promise<MemoryManagerSaveOutput> {
  // Step 1: extract information with the LLM (fast on Haiku)
  const extractionPrompt = `
    Extract user information from the conversation below. Extract only what can be determined with certainty; do not guess.
    Existing information (do not extract again): ${JSON.stringify(input.existingInfo ?? {})}

    User message: ${input.userMessage}
    Assistant reply: ${input.assistantMessage}

    Return JSON in {field: value} format.
  `;

  const extracted = await llm.extract(extractionPrompt);
  const newFields = filterOutExisting(extracted, input.existingInfo);

  // Step 2: save each item in parallel; allSettled keeps one failed save
  // from rejecting the whole batch
  const settled = await Promise.allSettled(
    Object.entries(newFields).map(([field, value]) =>
      saveUserMemory({
        memoryType: inferMemoryType(field),  // FACT/PREFERENCE/INTENT
        content: `${FIELD_LABELS[field]}: ${value}`,
        category: inferCategory(field, value),
      }),
    ),
  );
  // Keep only the saves that both resolved and reported success
  const saved = settled.flatMap(s =>
    s.status === 'fulfilled' && s.value.success ? [s.value.detail] : [],
  );

  return {
    action: 'save_info',
    extracted_info: newFields,
    saved_count: saved.length,
    saved_memories: saved,
    // Anything extracted but filtered out counts as skipped, including falsy values such as false or 0
    skipped_fields: Object.keys(extracted).filter(k => !(k in newFields)),
  };
}
```
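
The pseudocode relies on helpers such as `filterOutExisting` and `inferMemoryType` that are not defined in this document. The sketch below shows one plausible shape for them. The field-to-type mapping is an assumption derived from the memoryType definitions in the System Prompt, and dropping only *unchanged* fields is a design choice made here so that updated values can still override older ones.

```typescript
type MemoryType = 'FACT' | 'PREFERENCE' | 'INTENT';

// Assumed mapping from field names to memory types; everything else is a FACT.
const INTENT_FIELDS: string[] = ['immigrationTimeline', 'investmentAmount'];
const PREFERENCE_FIELDS: string[] = ['targetCategory'];

function inferMemoryType(field: string): MemoryType {
  if (INTENT_FIELDS.indexOf(field) !== -1) return 'INTENT';
  if (PREFERENCE_FIELDS.indexOf(field) !== -1) return 'PREFERENCE';
  return 'FACT';
}

// Keep a field when it is new or when its value differs from the stored one.
// Presence is checked by key, so stored falsy values (false, 0) are respected.
function filterOutExisting(
  extracted: Record<string, unknown>,
  existing: Record<string, unknown> = {},
): Record<string, unknown> {
  const fresh: Record<string, unknown> = {};
  for (const key of Object.keys(extracted)) {
    const known = Object.prototype.hasOwnProperty.call(existing, key);
    if (!known || existing[key] !== extracted[key]) fresh[key] = extracted[key];
  }
  return fresh;
}
```

The strict `!==` comparison is only reliable for primitive values; array or object values would need a deep comparison.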

## 8. Relationship to Other Agents

```
┌──────────────┐     invoke_memory_manager      ┌──────────────┐
│              │ ─────────────────────────────→  │              │
│  Coordinator │     MemoryManagerOutput        │ Memory       │
│              │ ←─────────────────────────────  │ Manager      │
└──────┬───────┘                                 └──────┬───────┘
       │                                                │
       │  Memory Manager is the data supplier           ├── save_user_memory
       │  for the other Agents                          └── get_user_context
       │                                                       │
       │                                                       ▼
       │                                              ┌──────────────┐
       │                                              │ Knowledge    │
       │                                              │ Service      │
       │                                              │ (Memory API) │
       │                                              └──────────────┘
       │
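       │  Collaboration patterns:
       │
       │  1. Conversation start:
       │     Coordinator → invoke_memory_manager({action: 'load_context'})
       │     → get the user profile → inject it into the context of all later Agent calls
       │
       │  2. Information-gathering turns:
       │     User sends a message → Coordinator replies → Coordinator calls
       │     invoke_memory_manager({action: 'save_info', userMessage, assistantMessage})
       │     → information is archived automatically (async, does not block the main flow)
       │
       │  3. Assessment preparation:
       │     Coordinator → invoke_memory_manager({action: 'summarize'})
       │     → get the full user profile → pass it to the Assessment Expert
       │
       │  4. Cross-session continuity:
       │     New conversation starts → load_context → information from the previous conversation is still available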
```

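**Mapping from the old architecture**:

| Old architecture | New architecture |
|--------|--------|
| `StrategyEngineService.extractUserInfo()` | The Memory Manager's `save_info` action |
| `ConsultingState.collectedInfo` | Loaded by the Memory Manager from knowledge-service |
| User information concatenated into the System Prompt by hand | The Memory Manager's `load_context` + `summarize` |
| Information valid only within a single conversation | Information persisted to knowledge-service and valid across sessions |

**Asynchronous save optimization**:
`save_info` can run **asynchronously** so that it does not block the Coordinator's main reply flow: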

```typescript
// Coordinator main flow
const reply = yield* generateReply(userMessage);  // reply to the user first

// Save user information asynchronously (non-blocking)
invokeMemoryManager({
  action: 'save_info',
  userMessage,
  assistantMessage: reply,
  existingInfo: currentCollectedInfo,
}).catch(err => logger.warn('Memory save failed:', err));
```

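## 9. Example Scenarios

### Scenario 1: Extracting information from a conversation (save_info)

**Coordinator call**:

```json
{
  "tool": "invoke_memory_manager",
  "input": {
    "action": "save_info",
    "userMessage": "I'm 32, with a master's in computer science from Zhejiang University. I work as a front-end developer in Hangzhou, have 8 years of experience, and earn about 800,000 a year",
    "assistantMessage": "Thank you for sharing this! You have a strong background...",
    "existingInfo": {}
  }
}
```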

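**Internal execution**:

```
Turn 0: LLM analyzes the conversation, extracts information, and saves it
→ Identifies 8 fields: age=32, university="Zhejiang University", education="Master's",
  major="Computer Science", currentJobTitle="Front-end developer", totalYearsOfExperience=8,
  annualIncome=800000, currentLocation="Hangzhou"
→ Merges related fields and saves 5 memories in parallel:
→ save_user_memory({memoryType: "FACT", content: "User is 32 years old"})
→ save_user_memory({memoryType: "FACT", content: "Master's degree in computer science from Zhejiang University"})
→ save_user_memory({memoryType: "FACT", content: "8 years of front-end development experience"})
→ save_user_memory({memoryType: "FACT", content: "Annual salary of about RMB 800,000"})
→ save_user_memory({memoryType: "FACT", content: "Currently works in Hangzhou"})

Turn 1: Confirm that all 5 saves succeeded and return the result summary
```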

**Returned result**:

```json
{
  "action": "save_info",
  "extracted_info": {
    "age": 32,
    "education": "Master's",
    "university": "Zhejiang University",
    "major": "Computer Science",
    "currentJobTitle": "Front-end developer",
    "totalYearsOfExperience": 8,
    "annualIncome": 800000,
    "currentLocation": "Hangzhou"
  },
  "saved_count": 5,
  "saved_memories": [
    {"memoryType": "FACT", "content": "User is 32 years old", "field": "age"},
    {"memoryType": "FACT", "content": "Master's degree in computer science from Zhejiang University", "field": "education+university+major"},
    {"memoryType": "FACT", "content": "8 years of front-end development experience", "field": "currentJobTitle+totalYearsOfExperience"},
    {"memoryType": "FACT", "content": "Annual salary of about RMB 800,000", "field": "annualIncome"},
    {"memoryType": "FACT", "content": "Currently works in Hangzhou", "field": "currentLocation"}
  ],
  "skipped_fields": []
}
```

### Scenario 2: Loading context across sessions (load_context)

**Coordinator call** (at the start of a new conversation):

```json
{
  "tool": "invoke_memory_manager",
  "input": {
    "action": "load_context",
    "query": "The user's basic background and immigration intent"
  }
}
```

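**Returned result**:

```json
{
  "action": "load_context",
  "memories": [
    {"type": "FACT", "content": "User is 32 years old", "importance": 70, "createdAt": "2026-02-05"},
    {"type": "FACT", "content": "Master's degree in computer science from Zhejiang University", "importance": 70, "createdAt": "2026-02-05"},
    {"type": "FACT", "content": "8 years of front-end development experience", "importance": 70, "createdAt": "2026-02-05"},
    {"type": "FACT", "content": "Annual salary of about RMB 800,000", "importance": 70, "createdAt": "2026-02-05"},
    {"type": "INTENT", "content": "Very interested in Top Talent Pass Scheme Category B", "importance": 80, "category": "TTPS", "createdAt": "2026-02-05"},
    {"type": "PREFERENCE", "content": "Wants to proceed as soon as possible; the shorter the timeline the better", "importance": 60, "createdAt": "2026-02-05"}
  ],
  "user_profile_summary": "32 years old, master's in computer science from Zhejiang University, 8 years of front-end development experience, annual salary of RMB 800,000. Showed strong interest in Top Talent Pass Scheme Category B in the previous conversation and prefers a fast track.",
  "total_memories": 6
}
```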

### Scenario 3: Generating a user profile summary (summarize)

**Coordinator call** (before running an assessment):

```json
{
  "tool": "invoke_memory_manager",
  "input": {
    "action": "summarize"
  }
}
```

**Returned result**:

```json
{
  "action": "summarize",
  "user_profile": {
    "basic": {
      "age": "32 years old",
      "education": "Master's",
      "university": "Zhejiang University",
      "major": "Computer Science",
      "location": "Hangzhou"
    },
    "career": {
      "jobTitle": "Front-end development engineer",
      "industry": "IT/Internet",
      "yearsOfExperience": "8 years",
      "annualIncome": "About RMB 800,000"
    },
    "immigration": {
      "targetCategories": ["TTPS"],
      "timeline": "As soon as possible"
    },
    "family": {
      "maritalStatus": "Unknown",
      "hasChildren": null,
      "familySupport": "Unknown"
    }
  },
  "completeness": {
    "score": 65,
    "filled_fields": ["age", "education", "university", "major", "location", "jobTitle", "industry", "yearsOfExperience", "annualIncome", "targetCategories", "timeline"],
    "missing_fields": ["nationality", "maritalStatus", "hasChildren", "languageSkills", "hasHongKongEmployer", "hasTechBackground", "investmentAmount"]
  },
  "narrative_summary": "The user is a 32-year-old front-end development engineer with a master's degree in computer science from Zhejiang University, 8 years of experience in the IT industry, and an annual salary of about RMB 800,000, currently working in Hangzhou. The user is most interested in Top Talent Pass Scheme Category B and wants to proceed as soon as possible. Family situation, language skills, and similar details have not been collected yet."
}
```