Commit Graph

23 Commits

Author SHA1 Message Date
hailin f305a8cd97 feat(session): broadcast participant_joined event via gRPC for real-time UI updates
Backend changes (session-coordinator):
- Add PublishParticipantJoined method to JoinSessionMessageRouterClient interface
- Implement PublishParticipantJoined in MessageRouterClient to broadcast events
- Call PublishParticipantJoined in join_session.go after participant joins
- Add detailed logging for debugging event broadcast

Android changes (service-party-android):
- Add detailed logging in TssRepository for session event handling
- Add detailed logging in MainViewModel for participant_joined processing
- Log activeSession state, event matching, and participant updates

This enables the initiator's waiting screen to receive real-time updates
when participants join the session, matching the expected behavior.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-01 08:34:47 -08:00
hailin 136a5ba851 fix(android): change address format from Cosmos to EVM and fix balance query
Changes:
- Change address derivation from deriveKavaAddress to deriveEvmAddress
  in TssRepository.kt (3 locations)
- Add AddressUtils.isEvmAddress() and getEvmAddress() helper methods
  to handle both old Cosmos and new EVM address formats
- Fix balance query for old wallets by deriving EVM address from
  public key when needed (MainViewModel.fetchBalanceForShare)
- Add retry logic for optimistic lock conflicts in join_session.go
  to prevent party_index collision during concurrent joins

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-01 07:48:52 -08:00
hailin e114723ab0 feat(mpc-system): add server-party-co-managed for co_managed_keygen sessions
- Create new server-party-co-managed service with two-phase event handling
  - Phase 1 (session_created): Store join token and wait
  - Phase 2 (session_started): Execute TSS protocol (same timing as service-party-app)
- Add PartyRoleCoManagedPersistent role to isolate from normal keygen/sign
- Update docker-compose.yml with 3 co-managed party instances
- Update deploy.sh service lists
- Modify selectPartiesByCompositionForCoManaged to use new role

This ensures co_managed_keygen sessions use dedicated parties that behave
100% compatible with service-party-app, without affecting existing keygen/sign flows.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-29 23:54:45 -08:00
hailin 6de545fcb9 fix(session-coordinator): generate wildcard token for co_managed_keygen external participants 2025-12-29 13:35:05 -08:00
hailin df8a14211e debug(mpc-system): 添加 joinToken 调试日志
- service-party-app: validateInviteCode 记录 token 长度
- service-party-app: joinSession 记录 token 信息
- service-party-app: 修复 ValidateInviteCodeResult 类型缺少 joinToken 字段
- session-coordinator: JoinSession 记录 token 解析详情

用于调试 "invalid token" 错误

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-29 05:55:46 -08:00
hailin 5f4c7c135f feat(mpc-system): 完善 co_managed_keygen 流程并添加调试控制台
主要改动:
- service-party-app: 发起方创建会话后自动加入并设置 activeKeygenSession
- service-party-app: 添加轮询机制确保 100% 可靠触发 keygen
- service-party-app: 添加 DebugConsole 组件 (Ctrl+Shift+D 打开)
- service-party-app: 主进程添加 debugLog 系统,日志可实时显示到前端
- session-coordinator: JoinSession 加入 messageRouterClient 发布事件
- session-coordinator: 添加 PublishSessionStarted 方法

修复:
- 发起方不设置 activeKeygenSession 导致无法触发 keygen 的问题
- 加入方可能错过 session_started 事件的时序问题

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-29 05:32:40 -08:00
hailin a5ab2e8350 fix(session-coordinator): 支持 co_managed_keygen 动态参与者加入
问题: 通过邀请码加入共管钱包会话时报 "party not invited" 错误
原因: 外部参与者不在 party pool 中,CreateSession 时无法预先选择

修复:
- join_session.go: 对于 co_managed_keygen + wildcard token,允许动态添加参与者
- create_session.go: 新增 selectPartiesByCompositionForCoManaged,跳过 TemporaryCount 选择
- report_completion.go: 使用 IsKeygen() 方法,co_managed_keygen 完成后也创建账户记录

注意: 所有修改仅对 co_managed_keygen 类型生效,不影响现有 keygen/sign 流程

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-29 04:25:11 -08:00
hailin 21985abde5 fix(session-coordinator): 保存 WalletName 和 InviteCode 到数据库
- CreateSessionInput 添加 WalletName 和 InviteCode 字段
- gRPC handler 从请求中读取并传递这些字段
- CreateSession use case 在创建会话时设置这些字段

修复: 通过邀请码查询会话时找不到记录的问题

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-29 03:07:44 -08:00
hailin a01284678d feat(wallet/mpc): 增强提现和充值流程可靠性
## 主要改进

### MPC 签名系统 (mpc-system)
- 添加签名缓存机制,避免重复签名请求
- 修复 yParity 恢复逻辑,确保签名格式正确
- 优化签名完成报告流程

### 区块链服务 (blockchain-service)
- EIP-1559 降级为 Legacy 交易(KAVA 测试网兼容)
- 修复 gas 估算逻辑

### 钱包服务 (wallet-service)
- 添加乐观锁机制 (version 字段) 防止并发修改
- 提现确认流程添加事务保护 + 乐观锁
- 提现失败时正确解冻 amount + fee
- 充值流程添加事务保护 + 乐观锁
- Kafka consumer 添加错误重抛,触发重试机制

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-15 19:47:20 -08:00
hailin ac4d9283dc fix: preserve original PartyIndex from keygen for signing sessions
- Add PartyIndex field to protobuf ParticipantInfo message
- Pass original PartyIndex from account shares to session coordinator
- Use original PartyIndex instead of loop variable when creating participants
- This fixes TSS signing failures when non-consecutive parties are selected
2025-12-06 10:45:05 -08:00
hailin 3d176e1132 feat: complete keygen_session_id implementation for signing sessions
- Regenerate protobuf Go code with KeygenSessionId fields
- Session Coordinator correctly parses, stores, and returns keygen_session_id
- Message Router Client parses keygen_session_id in JoinSession response
- participate_signing.go uses keygen_session_id for precise share lookup
- Database schema already includes keygen_session_id column

This fixes the signing issue where wrong keyshares were loaded for multi-account scenarios.
2025-12-06 08:57:30 -08:00
hailin 5f12404be4 fix: remove dynamic participant join to fix concurrent party_index assignment
- Remove dynamic participant addition in JoinSession
- Participants must be pre-created in CreateSession
- Add ErrPartyNotInvited error for unauthorized join attempts
- Fix Redis adapter to include version parameter in ReconstructSession
- This fixes VSS verification failures caused by inconsistent party indices
2025-12-06 04:54:40 -08:00
hailin b72268c1ce feat(mpc-system): implement optimistic locking for session updates
Implement version-based optimistic locking to prevent concurrent update conflicts
when multiple parties simultaneously report completion during keygen operations.

Changes:
- Add version column to mpc_sessions table (migration 004)
- Add Version field to MPCSession entity
- Define ErrOptimisticLockConflict error
- Update SessionPostgresRepo.Update() to check version and increment on success
- Add automatic retry logic (max 3 attempts) to ReportCompletionUseCase
- Update Save and all query methods (FindByStatus, FindExpired, etc.) to handle version field

This replaces pessimistic locking (FOR UPDATE) with optimistic locking using
the industry-standard pattern: WHERE version = $n and checking rowsAffected.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-06 04:16:32 -08:00
hailin fb9c85f883 debug(coordinator): add detailed logging to track concurrent update issue
Add comprehensive debug logs to:
1. report_completion.go - log all participant statuses at key points
2. session_postgres_repo.go - log before/after each participant update

This will help identify why server-party-1 status remains 'invited'
despite successfully reporting completion.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-06 02:11:28 -08:00
hailin 00b48bab50 fix(coordinator): handle all participant states in ReportCompletion with proper state transitions
- Add switch-case to handle Invited, Joined, and Ready states
- Auto-transition Invited -> Joined -> Ready -> Completed
- Auto-transition Joined -> Ready -> Completed
- Auto-transition Ready -> Completed
- Return error for invalid states (Failed, Completed, etc.)
- Fixes 'cannot transition to completed status' error
- Applies to all parties including server-party-api
2025-12-06 01:09:49 -08:00
hailin 4e14212147 fix(coordinator): auto-transition participant to Ready before Completed
ReportCompletion was failing with "cannot transition to completed status"
because participants were in Joined state trying to transition directly to
Completed, which violates the state machine flow: Joined -> Ready -> Completed.

Changes:
- Check participant status before marking as Completed
- Auto-transition Joined -> Ready if needed
- Then transition Ready -> Completed
- Add debug logging for auto-transition

This fixes the error seen during keygen completion.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-06 00:33:22 -08:00
hailin aa74e2b2e2 feat(mpc-system): add signing parties configuration and delegate signing support
- Add signing-config API endpoints (POST/PUT/DELETE/GET) for configuring
  which parties should participate in signing operations
- Add SigningParties field to Account entity with database migration
- Modify CreateSigningSession to use configured parties if set,
  otherwise use all active parties (backward compatible)
- Add delegate party signing support: user provides encrypted share
  at sign time for delegate party to use
- Update protobuf definitions for DelegateUserShare in session events
- Add ShareTypeDelegate to support hybrid custody model

API endpoints:
- POST /accounts/:id/signing-config - Set signing parties (first time)
- PUT /accounts/:id/signing-config - Update signing parties
- DELETE /accounts/:id/signing-config - Clear config (use all parties)
- GET /accounts/:id/signing-config - Get current configuration

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-05 22:47:55 -08:00
hailin 135e821386 feat(mpc-system): integrate reliability mechanisms and enable party-driven architecture
- Enable SubscribeSessionEvents for automatic session participation
- Integrate heartbeat mechanism with pending message count
- Add ACK sending after message receipt for reliable delivery
- Add party activity tracking in session coordinator
- Add CountPendingByParty for heartbeat response
- Add retry package with exponential backoff for gRPC clients
- Add memory-based message broker and event publisher adapters
- Add account service integration for keygen completion
- Add party timeout checking background job
- Add notification service stub for future implementation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-05 20:30:03 -08:00
hailin c976fd3eb1 feat(mpc-system): implement party-driven architecture with SessionEvent broadcasting
Fully implemented party-driven architecture according to international standards (Fireblocks, ING Bank, ZenGo patterns):

**Architecture Changes:**
- Parties actively connect to Message Router (not passively called by coordinator)
- Session Coordinator publishes SessionEvents when creating sessions
- Parties automatically subscribe and respond to SessionEvents
- PartyID-based routing instead of network addresses

**New Features:**
1. Session Coordinator → Message Router gRPC Client
   - PublishSessionEvent RPC for broadcasting session lifecycle events
   - Automatic event publishing after session creation

2. Message Router SessionEvent Broadcasting
   - SubscribeSessionEvents RPC for party subscriptions
   - PublishSessionEvent RPC for coordinator publishing
   - Targeted broadcasting to selected parties

3. Server-Party Auto-Registration & Subscription
   - RegisterParty on startup with role (persistent/delegate/temporary)
   - SubscribeSessionEvents for automatic session notifications
   - Event handler for automatic MPC participation

**Files Modified:**
- api/proto/message_router.proto: Added SessionEvent messages and RPCs
- services/message-router/adapters/input/grpc/message_grpc_handler.go: PublishSessionEvent handler
- services/session-coordinator/adapters/output/grpc/message_router_client.go: NEW - gRPC client
- services/session-coordinator/application/use_cases/create_session.go: SessionEvent publishing
- services/session-coordinator/cmd/server/main.go: Message Router client initialization
- services/server-party/adapters/output/grpc/message_router_client.go: RegisterParty + SubscribeSessionEvents
- services/server-party/cmd/server/main.go: Party registration and event subscription (commented pending full integration)
- go.mod/go.sum: Updated grpc to v1.77.0

**Technical Details:**
- gRPC streaming for SessionEvent subscriptions
- Non-blocking channel broadcasts prevent slow subscribers from blocking
- PartyRole support (persistent/delegate/temporary)
- Join tokens distributed via SessionEvent

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-05 08:44:05 -08:00
hailin 747e4ae8ef refactor(mpc-system): migrate to party-driven architecture with PartyID-based routing
- Remove Address field from PartyEndpoint (parties connect to router themselves)
- Update K8s Discovery to only manage PartyID and Role labels
- Add Party registration and SessionEvent protobuf definitions
- Implement PartyRegistry and SessionEventBroadcaster domain logic
- Add RegisterParty and SubscribeSessionEvents gRPC handlers
- Prepare infrastructure for party-driven MPC coordination

This is the first phase of migrating from coordinator-driven to party-driven
architecture following international MPC system design patterns.
2025-12-05 08:11:28 -08:00
hailin e975e9d86c feat(mpc-system): implement party role labels with strict persistent-only default
Implement Solution 1 (Party Role Labels) to differentiate between persistent
and delegate parties, with strict security guarantees for MPC threshold systems.

Key Features:
- PartyRole enum: persistent, delegate, temporary
- K8s pod labels (party-role) for role identification
- Role-based party filtering and selection
- Strict persistent-only default policy (no fallback)
- Optional PartyComposition for custom party requirements

Security Guarantees:
- Default: MUST use persistent parties (store shares in database)
- Fail fast with clear error if insufficient persistent parties
- No silent fallback to mixed/delegate parties
- Empty PartyComposition validation prevents accidental bypass
- MPC system compatibility maintained

Implementation:
1. Added PartyRole type with persistent/delegate/temporary constants
2. Extended PartyEndpoint with Role field
3. K8s party discovery extracts role from pod labels (defaults to persistent)
4. Session creation logic with strict persistent requirement
5. PartyComposition support for explicit mixed-role sessions
6. K8s deployment files with party-role labels

Files Modified:
- services/session-coordinator/application/ports/output/party_pool_port.go
- services/session-coordinator/infrastructure/k8s/party_discovery.go
- services/session-coordinator/application/ports/input/session_management_port.go
- services/session-coordinator/application/use_cases/create_session.go
- k8s/server-party-deployment.yaml (persistent role)

Files Added:
- k8s/server-party-api-deployment.yaml (delegate role)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-05 07:08:59 -08:00
hailin cf534ec178 feat(mpc-system): implement Kubernetes-based dynamic party pool architecture
Major architectural refactoring to align with international MPC standards
and enable horizontal scalability.

## Core Changes

### 1. DeviceInfo Made Optional
- Modified DeviceInfo.Validate() to allow empty device information
- Aligns with international MPC protocol standards
- MPC protocol layer should not mandate device-specific metadata
- Location: services/session-coordinator/domain/entities/device_info.go

### 2. Kubernetes Party Discovery Service
- Created infrastructure/k8s/party_discovery.go (220 lines)
- Implements dynamic service discovery via Kubernetes API
- Supports in-cluster config and kubeconfig fallback
- Auto-refreshes party list every 30s (configurable)
- Health-aware selection (only ready pods)
- Uses pod names as unique party IDs

### 3. Party Pool Architecture
- Defined PartyPoolPort interface for abstraction
- CreateSessionUseCase now supports automatic party selection
- When no participants specified, selects from K8s pool
- Graceful fallback to dynamic join mode if discovery fails
- Location: services/session-coordinator/application/ports/output/party_pool_port.go

### 4. Integration Updates
- Modified CreateSessionUseCase to inject partyPool
- Updated session-coordinator main.go to initialize K8s discovery
- gRPC handler already supports optional participants
- Added k8s client-go dependencies (v0.29.0) to go.mod

## Kubernetes Deployment

### New K8s Manifests
- k8s/namespace.yaml: mpc-system namespace
- k8s/configmap.yaml: shared configuration
- k8s/secrets-example.yaml: secrets template
- k8s/server-party-deployment.yaml: scalable party pool (3+ replicas)
- k8s/session-coordinator-deployment.yaml: coordinator with RBAC
- k8s/README.md: comprehensive deployment guide

### RBAC Configuration
- ServiceAccount for session-coordinator
- Role with pods/services get/list/watch permissions
- RoleBinding to grant discovery capabilities

## Key Features

 Dynamic service discovery via Kubernetes API
 Horizontal scaling (kubectl scale deployment)
 No hardcoded party IDs
 Health-aware party selection
 Graceful degradation when K8s unavailable
 MPC protocol compliance (optional DeviceInfo)

## Deployment Modes

### Docker Compose (Existing)
- Fixed 3 parties (server-party-1/2/3)
- Quick setup for development
- Backward compatible

### Kubernetes (New)
- Dynamic party pool
- Auto-discovery and scaling
- Production-ready

## Documentation

- Updated main README.md with deployment options
- Added architecture diagram showing scalable party pool
- Created comprehensive k8s/README.md with:
  - Quick start guide
  - Scaling instructions
  - Troubleshooting section
  - RBAC configuration details

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-05 06:12:49 -08:00
Developer 393c0ef04d > 2025-11-29 01:35:10 -08:00