Commit Graph

26 Commits

Author SHA1 Message Date
hailin 1eca8b3af8 fix(mpc-service): reduce Kafka session timeout from 5min to 30s 2025-12-09 03:24:00 -08:00
hailin c05722bbb3 fix(mpc-service): enable shutdown hooks for graceful Kafka disconnect
Root cause: When mpc-service restarts, the old Kafka consumer doesn't
properly disconnect, causing the broker to wait for sessionTimeout
(~5 minutes) before completing rebalance. This blocks app startup.

Solution: Enable NestJS shutdown hooks with app.enableShutdownHooks().
This ensures onModuleDestroy() is called on SIGTERM/SIGINT, which calls
consumer.disconnect() and allows immediate rebalance on next startup.

Also reverted the "don't await consumer.run()" workaround since the
proper fix is graceful shutdown.

Sources:
- https://github.com/tulios/kafkajs/issues/807
- https://kafka.js.org/docs/consuming

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-09 03:12:33 -08:00
hailin af896db36c fix(mpc-service): don't await consumer.run() to prevent app startup blocking
consumer.run() never resolves as it runs continuously. Awaiting it
blocks onApplicationBootstrap which prevents app.listen() from being
called, causing the service to never start listening on port 3006.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-09 03:04:48 -08:00
hailin d983525aa5 fix(wallet): resolve account creation and wallet status query issues
This commit fixes three critical bugs that prevented the wallet creation
flow from completing successfully:

1. mpc-service: extraPayload not included in Kafka messages
   - KeygenCompletedEvent's extraPayload (containing userId, accountSequence,
     username, derivedAddresses) was being set dynamically but not serialized
   - identity-service received events without userId and skipped processing
   - Fix: Merge extraPayload into the published payload in event-publisher

2. mpc-service: KAFKA_BROKERS hostname mismatch
   - mpc-service used KAFKA_BROKERS=rwa-kafka:29092
   - Kafka advertises itself as kafka:29092 in cluster metadata
   - During consumer group rebalance, mpc-service couldn't connect to
     the coordinator address returned by Kafka
   - Fix: Use kafka:29092 to match Kafka's advertised listener

3. blockchain-service: recovery_mnemonics table missing
   - RecoveryMnemonic model exists in schema.prisma but not in migration
   - prisma migrate deploy found no pending migrations
   - Address derivation failed with "table does not exist" error
   - Fix: Use prisma db push instead of migrate deploy to sync schema

Tested: E2E flow now completes successfully
- POST /user/auto-create creates account
- MPC keygen completes and publishes event with extraPayload
- blockchain-service derives addresses and saves recovery mnemonic
- GET /user/wallet returns status=ready with 3 addresses and mnemonic

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-08 07:57:17 -08:00
hailin 1bfbaa06f1 feat(mnemonic): propagate accountSequence through MPC keygen flow (DDD)
Changes across all three services to properly associate recovery mnemonics
with account sequence numbers instead of user IDs, following DDD principles:

identity-service:
- Add accountSequence to MpcKeygenRequestedEvent payload
- Pass accountSequence when publishing keygen request
- Remove direct access to recoveryMnemonic table (now in blockchain-service)
- Call blockchain-service for mnemonic backup marking
- BlockchainWalletHandler no longer saves mnemonic (stored in blockchain-service)

mpc-service:
- Add accountSequence to KeygenRequestedPayload interface
- Pass accountSequence through to blockchain-service when deriving addresses
- Include accountSequence in KeygenCompleted event extraPayload

blockchain-service:
- Add accountSequence to derive-address API and internal interfaces
- Add accountSequence to KeygenCompletedPayload extraPayload
- Add PUT /internal/mnemonic/backup API for marking mnemonic as backed up
- Store recovery mnemonic with accountSequence association

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-08 01:08:27 -08:00
hailin ab8852907d fix(mpc-service): increase Kafka consumer session timeout
- Increase sessionTimeout from 30s to 5 minutes
- Increase heartbeatInterval from 3s to 10s
- Add rebalanceTimeout of 5 minutes
- This prevents consumer from being kicked out during long MPC keygen operations

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-07 03:58:40 -08:00
hailin 84e653d284 fix(mpc-service): add /api/v1 prefix to blockchain-service calls
blockchain-service uses global API prefix, need to include it in URLs.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-07 02:11:57 -08:00
hailin 3925b19229 fix(mpc-service): use JWT auth instead of X-API-Key
mpc-account-service expects JWT Bearer tokens, not X-API-Key header.
Added JWT token generation and use MPC_JWT_SECRET env var.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-07 02:02:17 -08:00
hailin 2ae174692e fix(mpc-service): use correct env var name MPC_ACCOUNT_SERVICE_URL
Changed from MPC_SYSTEM_URL to MPC_ACCOUNT_SERVICE_URL to match docker-compose config.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-07 01:48:02 -08:00
hailin 0ab1bf0dcc feat(mpc-service): add blockchain-service client for address derivation
- Add BlockchainClientService to call blockchain-service /internal/derive-address
- Call derive addresses after keygen completes with MPC public key
- Include derived addresses (BSC, KAVA, DST) in keygen completed event

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 23:27:30 -08:00
hailin 383a9540a0 refactor: move backup-service client from identity-service to mpc-service
Architecture change: delegate share storage is now handled by mpc-service.
- identity-service no longer calls backup-service directly
- mpc-service calls backup-service after keygen completion
- This follows proper domain boundaries (MPC domain handles share storage)

Flow:
1. identity-service publishes mpc.KeygenRequested
2. mpc-service calls mpc-system for keygen
3. mpc-service stores delegate share to backup-service
4. mpc-service publishes mpc.KeygenCompleted
5. identity-service updates user wallet address

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 22:56:35 -08:00
hailin 23043d5d79 feat: add detailed debug logging for MPC Kafka event flow
- Add comprehensive [INIT], [CONNECT], [PUBLISH], [RECEIVE], [HANDLE] logs
  to identity-service and mpc-service Kafka services
- Add KeygenStarted event for tracking keygen progress
- Add MpcKeygenCompletedHandler to process keygen completion events
- Fix topic routing for MPC events between services

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 19:49:06 -08:00
hailin c459387c42 feat: add event-driven communication between identity-service and mpc-service
Replace synchronous HTTP polling with Kafka event-driven model for MPC operations:

- Add MPC event consumer service in mpc-service for keygen/signing requests
- Add keygen-requested and signing-requested event handlers
- Add MPC event consumer in identity-service for completion events
- Extend mpc-client.service with async event-driven methods
- Support backward compatibility via MPC_USE_EVENT_DRIVEN env var

Topics: mpc.KeygenRequested, mpc.SigningRequested, mpc.KeygenCompleted,
        mpc.SigningCompleted, mpc.SessionFailed

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 18:17:44 -08:00
hailin 9c36e6772b refactor: simplify mpc-service to gateway mode
mpc-service 重新定位为网关服务,转发请求到 mpc-system:
- 简化 Prisma schema:只保留 MpcWallet 和 MpcShare
- 添加 delegate share 存储(keygen 后保存用户的 share)
- 保留六边形架构结构,清理不再需要的实现
- 删除旧的 command handlers、queries、repository 实现
- 简化 infrastructure module,只保留 PrismaService

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 17:16:14 -08:00
hailin d652f1d7a4 feat: add MPC coordinator service and keygen/signing API
- Add MPCController with keygen and sign endpoints
- Add MPCCoordinatorService to coordinate with mpc-system
- Add MpcWallet, MpcShare, MpcSession entities for data storage
- Update identity-service mpc-client to call mpc-service

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 16:03:17 -08:00
hailin 747e4ae8ef refactor(mpc-system): migrate to party-driven architecture with PartyID-based routing
- Remove Address field from PartyEndpoint (parties connect to router themselves)
- Update K8s Discovery to only manage PartyID and Role labels
- Add Party registration and SessionEvent protobuf definitions
- Implement PartyRegistry and SessionEventBroadcaster domain logic
- Add RegisterParty and SubscribeSessionEvents gRPC handlers
- Prepare infrastructure for party-driven MPC coordination

This is the first phase of migrating from coordinator-driven to party-driven
architecture following international MPC system design patterns.
2025-12-05 08:11:28 -08:00
Developer c26a24b544 fix(mpc-service): 确保 keygen 会话包含完整的参与者列表
问题:account-service 要求 participants 数量必须等于 threshold_n
原因:createKeygenSession 传入的 participants 可能不足 3 个

修复:
- 在 createKeygenSession 中自动补全参与者列表
- 对于 2-of-3 配置,确保有 3 个参与者:
  - user-party (用户端)
  - server-party-1 (服务端)
  - server-party-2 (备份)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-04 06:04:47 -08:00
Developer 4db5534372 feat(mpc): 添加 server-party-api 服务,实现用户 share 生成
新增 mpc-system/services/server-party-api:
- 为 mpc-service 提供同步的 TSS keygen/signing API
- 参与 TSS 协议生成用户 share 并直接返回(不存储)
- 支持 API Key 认证
- 端口 8083 对外暴露

更新 mpc-service TSSWrapper:
- 改为调用 server-party-api 而非本地二进制
- 新增 MPC_SERVER_PARTY_API_URL 配置
- 超时时间调整为 10 分钟

架构: mpc-service -> account-service -> server-party-api -> TSS

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-04 05:32:41 -08:00
Developer 178a5c9f8b feat(mpc-service): 实现混合传输模式 (WebSocket + HTTP轮询)
- 优先尝试 WebSocket 连接 (5秒超时)
- WebSocket 失败自动降级到 HTTP 轮询
- HTTP 轮询间隔 100ms,总超时 5分钟
- 新增 getTransportMode() 方法查看当前传输模式
- 修复 message-router 404 导致的 socket hang up 问题

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-04 00:00:49 -08:00
Developer a701f55342 fix(mpc-service): 修复 WebSocket 导入方式
将 `import WebSocket from 'ws'` 改为 `import * as WebSocket from 'ws'`
以兼容 CommonJS 模块系统

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-03 22:58:00 -08:00
Developer e51edc2ce4 fix(mpc-service): 修复 MPC 会话流程,先创建会话再加入
问题:mpc-service 尝试用 identity-service 生成的 SHA256 哈希作为
joinToken 加入会话,但 session-coordinator 期望的是由它自己
CreateSession 接口生成的 JWT token。

修复:
- coordinator-client.ts: 添加 createSession() 方法
- participate-keygen.handler.ts: 先创建会话获取 JWT,再加入
- participate-signing.handler.ts: 同上
- rotate-share.handler.ts: 同上(使用 keygen 类型)

流程变更:
1. CreateSession -> 获取 sessionId + JWT joinToken
2. JoinSession 使用 JWT token 加入会话

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-03 21:21:22 -08:00
Developer 467206fd61 fix(mpc-service): 修复 coordinator-client 请求/响应格式
session-coordinator 使用 camelCase JSON 格式:

请求:
- session_id, party_id, join_token -> joinToken, partyId
- 添加必需字段 deviceType, deviceId

响应:
- session_info.session_id -> sessionId
- other_parties -> participants

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-03 21:11:16 -08:00
Developer e4abc7eb83 fix(mpc-service): 添加 /api/v1 前缀到 coordinator-client 路径
session-coordinator 的 API 路由注册在 /api/v1/sessions 下,
但 coordinator-client 调用的是 /sessions(404 错误)。

修复所有端点路径:
- /sessions/join -> /api/v1/sessions/join
- /sessions/report-completion -> /api/v1/sessions/report-completion
- /sessions/{id}/status -> /api/v1/sessions/{id}/status
- /sessions/report-failure -> /api/v1/sessions/report-failure

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-03 21:06:06 -08:00
Developer e068b99dc1 fix(mpc-service): 将 keygen/signing 接口标记为 Public
临时解决 identity-service 调用 mpc-service 时的 401 认证错误:
- keygen/participate
- keygen/participate-sync
- signing/participate
- signing/participate-sync

TODO: 添加适当的服务间认证机制(API key 或 service JWT)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-03 18:05:31 -08:00
Developer 00e359b412 fix(mpc-service): 直接从环境变量读取配置
ConfigService.get('port') 读取不到嵌套配置
改为直接使用 process.env.APP_PORT

修复服务监听错误端口 (6379 -> 3006)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-02 10:29:59 -08:00
hailin 6fa4d7ac1d feat: 添加MPC多方计算服务模块
新增 mpc-service 微服务,实现 MPC-TSS 门限签名功能:

架构设计:
- 采用六边形架构(Hexagonal Architecture)
- 实现 CQRS 命令查询职责分离模式
- 遵循 DDD 领域驱动设计原则

核心功能:
- Keygen: 分布式密钥生成协议参与
- Signing: 门限签名协议参与
- Share Rotation: 密钥份额轮换
- Share Management: 份额查询和管理

技术栈:
- NestJS + TypeScript
- Prisma ORM
- Redis (缓存和分布式锁)
- Kafka (事件发布)
- Jest (单元/集成/E2E测试)

测试覆盖:
- 单元测试: 81个
- 集成测试: 30个
- E2E测试: 15个
- 总计: 111个测试全部通过

文档:
- ARCHITECTURE.md: 架构设计文档
- API.md: REST API接口文档
- TESTING.md: 测试架构说明
- DEVELOPMENT.md: 开发指南
- DEPLOYMENT.md: 部署运维文档

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-11-27 17:31:43 -08:00