Commit Graph

388 Commits

Author SHA1 Message Date
hailin 63e543933f fix(mpc-service): connect to mpc-system network
- Changed MPC URLs from 192.168.1.111 to Docker internal names
- Added mpc-system_mpc-network to mpc-service for cross-network communication

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-07 01:52:25 -08:00
hailin 2ae174692e fix(mpc-service): use correct env var name MPC_ACCOUNT_SERVICE_URL
Changed from MPC_SYSTEM_URL to MPC_ACCOUNT_SERVICE_URL to match docker-compose config.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-07 01:48:02 -08:00
hailin 3332124250 fix(identity-service): change dynamic import to static import
Dynamic imports with path aliases (@/domain/events) don't work at runtime.
Changed to static import to fix module resolution error.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-07 01:40:46 -08:00
hailin 742cc21395 fix(identity-service): extend avatar_url column to 2000 chars
SVG avatars can be up to 745+ characters, exceeding the previous 500 char limit.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-07 01:31:44 -08:00
hailin 34fc008f9d fix(blockchain-service): add global API prefix and increase healthcheck start_period
- Add app.setGlobalPrefix('api/v1') to main.ts so health endpoint
  is at /api/v1/health consistent with other services
- Increase healthcheck start_period to 60s to allow time for
  Prisma migrations on first startup

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-07 01:20:35 -08:00
hailin be6abd3034 fix(blockchain-service): standardize Dockerfile with other services
- Use node:20-slim instead of alpine for OpenSSL compatibility
- Add startup script with prisma migrate/push
- Increase healthcheck start-period to 60s
- Add non-root user for security

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-07 01:17:55 -08:00
hailin 1125dd98ef fix(blockchain-service): auto-create database and run migrations on startup
- Add rwa_blockchain to init-databases.sh script
- Change Dockerfile CMD to run prisma db push before starting app

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-07 01:14:50 -08:00
hailin e20e3cb7af fix(blockchain-service): add openssl and curl for Prisma and healthcheck 2025-12-07 01:09:25 -08:00
hailin 9eb2d5a206 fix(blockchain-service): import DomainModule for ConfirmationPolicyService 2025-12-07 01:06:45 -08:00
hailin 3af4234c89 feat: add blockchain-service to root docker-compose.yml 2025-12-07 00:58:23 -08:00
hailin dba9d16074 . 2025-12-07 00:40:19 -08:00
hailin 6451cd6fc3 refactor: unify docker-compose configs to use shared infrastructure
All microservices now use the shared rwa-network and connect to:
- rwa-postgres: Shared PostgreSQL database server
- rwa-redis: Shared Redis cache
- rwa-kafka: Shared Kafka message broker

Each service's docker-compose.yml now only defines the application
container and uses `networks: external: true` to connect to the
shared infrastructure defined in the root docker-compose.yml.

This prevents duplicate infrastructure containers and ensures all
services can communicate via Kafka and share the same Redis/PostgreSQL.

Services updated:
- admin-service
- backup-service
- blockchain-service
- identity-service
- leaderboard-service
- mpc-service
- presence-service

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-07 00:35:56 -08:00
hailin ce281b0657 fix(blockchain-service): install @scure/bip39 dependency
Add @scure/bip39 package for mnemonic-based address derivation.
Package.json already had the dependency listed but node_modules was missing it.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-07 00:25:27 -08:00
hailin bbd8a701a8 fix(blockchain-service): add @scure/bip39 dependency 2025-12-07 00:20:49 -08:00
hailin fcb949c799 fix(identity-service): remove WalletGeneratorService from app.module.ts 2025-12-07 00:15:08 -08:00
hailin 852073ae11 refactor: move mnemonic verification from identity-service to blockchain-service
- Add /internal/verify-mnemonic API to blockchain-service
- Add /internal/derive-from-mnemonic API to blockchain-service
- Create MnemonicDerivationAdapter for BIP39 mnemonic address derivation
- Create BlockchainClientService in identity-service to call blockchain-service
- Remove WalletGeneratorService from identity-service
- Update recover-by-mnemonic handler to use blockchain-service API

This enforces proper domain boundaries - all blockchain/crypto operations
are now handled by blockchain-service.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-07 00:11:06 -08:00
hailin a181fd0d2d fix(mpc-service): change healthcheck from wget to curl
Docker compose healthcheck was using wget which is not installed in the
node:20-slim image. Changed to use curl and corrected endpoint path.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 23:42:48 -08:00
hailin 5e93bbac33 fix(mpc-service): convert docker-entrypoint.sh line endings from CRLF to LF 2025-12-06 23:34:40 -08:00
hailin 54b9a66041 fix(backup-service): convert deploy.sh line endings from CRLF to LF 2025-12-06 23:29:58 -08:00
hailin 0ab1bf0dcc feat(mpc-service): add blockchain-service client for address derivation
- Add BlockchainClientService to call blockchain-service /internal/derive-address
- Call derive addresses after keygen completes with MPC public key
- Include derived addresses (BSC, KAVA, DST) in keygen completed event

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 23:27:30 -08:00
hailin cf308efecf refactor(identity-service): remove deposit/blockchain code, belongs to wallet-service
- Remove DepositController, DepositService, BlockchainQueryService
- Deposit address and balance queries should be in wallet-service
- identity-service now only handles user identity

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 23:22:06 -08:00
hailin 9ae26d0f1f refactor(identity-service): replace direct RPC with blockchain-service API calls
- Remove ethers.js direct RPC connection to blockchain
- Add HTTP client to call blockchain-service /balance API
- Add ConfigService for BLOCKCHAIN_SERVICE_URL configuration
- Enforce proper microservice boundaries

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 23:07:46 -08:00
hailin 383a9540a0 refactor: move backup-service client from identity-service to mpc-service
Architecture change: delegate share storage is now handled by mpc-service.
- identity-service no longer calls backup-service directly
- mpc-service calls backup-service after keygen completion
- This follows proper domain boundaries (MPC domain handles share storage)

Flow:
1. identity-service publishes mpc.KeygenRequested
2. mpc-service calls mpc-system for keygen
3. mpc-service stores delegate share to backup-service
4. mpc-service publishes mpc.KeygenCompleted
5. identity-service updates user wallet address

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 22:56:35 -08:00
hailin f4f0466616 fix(mpc-service): convert deploy.sh line endings from CRLF to LF
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 22:45:35 -08:00
hailin 32c806b90c fix(identity-service): add MpcEventConsumerService to app.module.ts
The InfrastructureModule was defined inline in app.module.ts, not using
the separate infrastructure.module.ts file. Added MpcEventConsumerService
to the inline module definition.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 22:36:01 -08:00
hailin 417f580df8 fix(identity-service): add MpcEventConsumerService to InfrastructureModule
Add missing MpcEventConsumerService provider to fix NestJS dependency injection error.
MpcClientService requires MpcEventConsumerService but it was not registered in the module.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 22:15:35 -08:00
hailin 10e4fa4a5f fix(identity-service): convert deploy.sh line endings from CRLF to LF
Fix bash interpreter error caused by Windows-style CRLF line endings

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 22:03:46 -08:00
hailin 39804aa981 fix(mobile-app): update share link domain to rwaapi.szaiai.com
Changed invite share URLs from rwa-durian.app to rwaapi.szaiai.com

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-06 21:15:55 -08:00
hailin 2e815cec6e feat: move address derivation from identity-service to blockchain-service
- Add Cosmos address derivation (bech32) to blockchain-service
  - KAVA: kava1... format
  - DST: dst1... format
  - BSC: 0x... EVM format

- Create MpcEventConsumerService in blockchain-service to consume mpc.KeygenCompleted events

- Create BlockchainEventConsumerService in identity-service to consume blockchain.WalletAddressCreated events

- Simplify identity-service MpcKeygenCompletedHandler to only manage status updates

- Add CosmosAddress value object for Cosmos chain addresses

Event flow:
1. identity-service -> mpc.KeygenRequested
2. mpc-service -> mpc.KeygenCompleted (with publicKey)
3. blockchain-service consumes mpc.KeygenCompleted, derives addresses
4. blockchain-service -> blockchain.WalletAddressCreated (with all chain addresses)
5. identity-service consumes blockchain.WalletAddressCreated, saves to user account

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 21:08:21 -08:00
hailin 50388c1115 feat(blockchain-service): implement complete blockchain service with DDD + Hexagonal architecture
- Domain layer: ChainType, EvmAddress, TxHash, TokenAmount, BlockNumber value objects
- Domain events: DepositDetected, DepositConfirmed, WalletAddressCreated, TransactionBroadcasted
- Aggregates: DepositTransaction, MonitoredAddress, TransactionRequest
- Infrastructure: Prisma ORM, Redis cache, Kafka messaging, EVM blockchain adapters
- Application services: AddressDerivation, DepositDetection, BalanceQuery
- API: Health, Balance, Internal controllers with Swagger documentation
- Deployment: Docker, docker-compose, deploy.sh, health-check scripts

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 20:54:58 -08:00
hailin 6150617c14 docs: update blockchain-service guide with address derivation responsibilities
- Add public key → address derivation as primary responsibility
- Add AddressDerivationAdapter for EVM/Cosmos address derivation
- Add WalletAddressCreated event definition
- Add MPC event consumption (mpc.KeygenCompleted)
- Add MpcKeygenCompletedHandler for processing keygen events
- Add section 17: MPC integration event flow with diagrams
- Update document version to 1.2.0

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 20:02:50 -08:00
hailin 23043d5d79 feat: add detailed debug logging for MPC Kafka event flow
- Add comprehensive [INIT], [CONNECT], [PUBLISH], [RECEIVE], [HANDLE] logs
  to identity-service and mpc-service Kafka services
- Add KeygenStarted event for tracking keygen progress
- Add MpcKeygenCompletedHandler to process keygen completion events
- Fix topic routing for MPC events between services

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 19:49:06 -08:00
hailin 289691dc3c fix: align mpc-service migration with schema and fix identity-service compile errors
- Update mpc-service migration to match new gateway mode schema (mpc_wallets, mpc_shares)
- Remove old MySQL migrations (party_shares, session_states, share_backups)
- Fix MpcSignature type to use string format (64 bytes hex: R + S)
- Add persistence layer conversion functions for DB compatibility
- Fix method names in domain services (checkDeviceNotRegistered, generateNextUserSequence)
- Update wallet generator interface to use delegateShare instead of clientShareData

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 18:42:48 -08:00
hailin ba91a89b16 feat(wallet-service): add Redis caching for wallet queries
- Add ioredis dependency for Redis connectivity
- Create Redis service and module with DB 1 configuration
- Implement WalletCacheService for wallet data caching (60s TTL)
- Integrate cache-aside pattern in getMyWallet query
- Add cache invalidation on all wallet mutations:
  - handleDeposit, deductForPlanting, addRewards
  - claimRewards, settleRewards

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-06 18:37:13 -08:00
hailin c459387c42 feat: add event-driven communication between identity-service and mpc-service
Replace synchronous HTTP polling with Kafka event-driven model for MPC operations:

- Add MPC event consumer service in mpc-service for keygen/signing requests
- Add keygen-requested and signing-requested event handlers
- Add MPC event consumer in identity-service for completion events
- Extend mpc-client.service with async event-driven methods
- Support backward compatibility via MPC_USE_EVENT_DRIVEN env var

Topics: mpc.KeygenRequested, mpc.SigningRequested, mpc.KeygenCompleted,
        mpc.SigningCompleted, mpc.SessionFailed

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 18:17:44 -08:00
hailin 17fd663fe3 refactor: improve auto-create API semantics and use real device ID
Frontend (account_service.dart):
- Use Android ID instead of random UUID for deviceId
- Add DeviceHardwareInfo class with full hardware details
- Remove provinceCode/cityCode from CreateAccountRequest
- Simplify to: deviceId (required), deviceName (optional JSON), inviterReferralCode (optional)

Backend (identity-service):
- Rename validateDeviceId() to checkDeviceNotRegistered() for clarity
- Rename generateNext() to generateNextUserSequence() for semantics
- Update error message: "该设备已创建过账户"

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-06 18:05:11 -08:00
hailin bc82f58549 feat: add infrastructure components for observability and service discovery
Add modular infrastructure stack with:
- Consul: service discovery and configuration center
- Jaeger: distributed tracing
- Loki + Promtail: log aggregation
- Prometheus: metrics collection with alert rules
- Grafana: unified visualization dashboard

All components are optional and can be enabled on-demand using Docker profiles.
No changes required to existing microservices.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 17:51:29 -08:00
hailin 9c36e6772b refactor: simplify mpc-service to gateway mode
mpc-service 重新定位为网关服务,转发请求到 mpc-system:
- 简化 Prisma schema:只保留 MpcWallet 和 MpcShare
- 添加 delegate share 存储(keygen 后保存用户的 share)
- 保留六边形架构结构,清理不再需要的实现
- 删除旧的 command handlers、queries、repository 实现
- 简化 infrastructure module,只保留 PrismaService

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 17:16:14 -08:00
hailin d652f1d7a4 feat: add MPC coordinator service and keygen/signing API
- Add MPCController with keygen and sign endpoints
- Add MPCCoordinatorService to coordinate with mpc-system
- Add MpcWallet, MpcShare, MpcSession entities for data storage
- Update identity-service mpc-client to call mpc-service

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 16:03:17 -08:00
hailin 9fc41cfa53 fix: add keygen index to sorted index mapping for signing session
When signing with a subset of parties (e.g., party-1 and party-3 in 2-of-3),
the TSS library creates a sorted array of party IDs. Messages contain the
original keygen party index, but we need to map it to the sorted array index.

This fixes the 'invalid FromPartyIndex' error when signing with non-consecutive
party indices.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 11:04:19 -08:00
hailin f769c7eebf test: update signing test username 2025-12-06 10:54:22 -08:00
hailin ac4d9283dc fix: preserve original PartyIndex from keygen for signing sessions
- Add PartyIndex field to protobuf ParticipantInfo message
- Pass original PartyIndex from account shares to session coordinator
- Use original PartyIndex instead of loop variable when creating participants
- This fixes TSS signing failures when non-consecutive parties are selected
2025-12-06 10:45:05 -08:00
hailin 1d507a7afd test: update signing test to use wallet with configured parties 2025-12-06 10:34:14 -08:00
hailin 8dd1c50eb9 fix: update test username for signing parties API test 2025-12-06 10:29:30 -08:00
hailin 1044cfe635 fix: correct signing parties count validation to T+1 (required signers for TSS) 2025-12-06 10:20:21 -08:00
hailin 47a98da4e4 test: add signing parties API test script 2025-12-06 10:18:19 -08:00
hailin 93eab1931e test: update wallet username
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 10:08:17 -08:00
hailin dbe630dbd6 fix: add wait time before TSS protocol to prevent race condition
Wait 500ms after subscribing to messages to ensure all parties have
completed subscription before starting TSS protocol. This prevents
broadcast messages from being lost when some parties haven't subscribed yet.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 10:04:10 -08:00
hailin 0e8dff0371 test: update wallet username for signing test
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 10:01:56 -08:00
hailin 98731cc133 debug: add more logging to message broker for broadcast diagnostics
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 09:57:34 -08:00
hailin c257ad1639 test: update test_signing.go with new wallet username
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 09:52:58 -08:00
hailin 378970048b debug: add TSS signing debug logs to diagnose stuck issue
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 09:41:31 -08:00
hailin f70ece0d4f test: update test_signing.go to use current wallet username
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 09:33:58 -08:00
hailin fd74bc825a chore: add detailed logging for keygen_session_id tracing
Add logging at key points to trace keygen_session_id flow:
- Account Handler: log keygen_session_id when creating signing session
- Session Coordinator: log keygen_session_id in CreateSession and JoinSession
- Message Router: log keygen_session_id when proxying JoinSession
- Server Party: log keygen_session_id when joining session

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 09:21:19 -08:00
hailin a1b2b760ab feat(migration): add keygen_session_id column to mpc_sessions table
For sign sessions, this column stores the reference to the keygen session
whose key shares should be used for signing.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-06 09:16:31 -08:00
hailin 3d176e1132 feat: complete keygen_session_id implementation for signing sessions
- Regenerate protobuf Go code with KeygenSessionId fields
- Session Coordinator correctly parses, stores, and returns keygen_session_id
- Message Router Client parses keygen_session_id in JoinSession response
- participate_signing.go uses keygen_session_id for precise share lookup
- Database schema already includes keygen_session_id column

This fixes the signing issue where wrong keyshares were loaded for multi-account scenarios.
2025-12-06 08:57:30 -08:00
hailin 23eff00d76 feat: add KeygenSessionID to MPCSession entity
- Add KeygenSessionID field to MPCSession struct for tracking which keygen's shares to use
- This is the first step in完整的修复流程
2025-12-06 08:40:38 -08:00
hailin 382386733d feat: add keygen_session_id to signing session flow
- Add keygen_session_id field to CreateSessionRequest and SessionInfo protobuf
- Modify CreateSigningSessionAuto to accept and pass keygenSessionID
- Update Account Handler to pass account's keygen_session_id when creating signing session
- This enables parties to load the correct keyshare by session ID
2025-12-06 08:39:40 -08:00
hailin 7660868a38 fix(account): select t+1 parties for threshold signing
TSS threshold semantics: for threshold parameter t, the required number of signers is t+1.
For 2-of-3 with t=2, we need 2+1=3 signers (all parties must participate).

Previous error: 't+1=3 is not satisfied by the key count of 2'
Fix: Changed from selecting t parties to selecting t+1 parties.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-06 07:46:32 -08:00
hailin 0ea64e02ae fix(account): use only threshold_t parties for signing instead of all active parties
For 2-of-3 threshold signing, only 2 parties should participate in signing, not all 3. This fixes the 'failed to calculate Bob_mid' error that occurred when all parties tried to sign.

Changes:
- Modified CreateSigningSession to select exactly threshold_t parties when no signing config exists
- For 2-of-3: now selects 2 parties instead of all 3
- Added logging to show party selection details

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-06 07:35:03 -08:00
hailin 672b6e1630 feat(schema): make email field optional in accounts table
Only username is required, all other fields (email, phone, public_key, etc.) are now optional.

Changes:
- Modified 001_init_schema.up.sql to remove NOT NULL constraints
- Added partial unique index for email (only for non-NULL values)
- Created migration 006_make_email_optional for existing databases
- Set default status to 'active'

This allows automatic account creation from keygen without requiring user info.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-06 07:16:34 -08:00
hailin eb63b9341b fix(tss): correct threshold signing to support t-of-n properly
Previously, signing incorrectly required all n parties from keygen to participate. For 2-of-3 threshold, it required all 3 parties instead of just 2.

Root cause: tss.NewParameters was using len(currentSigners) instead of the original n from keygen.

Changes:
- Added TotalParties field to SigningConfig to store original n from keygen
- Modified participate_signing.go to read threshold_n from database
- Updated tss.NewParameters to use TotalParties instead of current signer count
- Added logging to show t, n, and current_signers

For 2-of-3: threshold_t=2, threshold_n=3, any 2 parties can now sign.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-06 07:16:24 -08:00
hailin 6fdd2905b1 test(signing): add signing session test script
Created test_signing.go to test MPC signing functionality:
- Generates JWT token for authentication
- Creates SHA-256 hash of test message
- Calls POST /api/v1/mpc/sign API
- Tests signing with persistent parties (non-delegate mode)

Usage: go run test_signing.go

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-06 06:58:54 -08:00
hailin e786219f37 debug(keygen): add detailed logging for message flow tracking
Added comprehensive debug logging to track message conversion and
party index mapping in keygen protocol:

1. Log party index map construction with all participants
2. Log received MPC messages before conversion
3. Log when messages are dropped due to unknown sender
4. Log successful message conversion and TSS forwarding
5. Show known_parties map when dropping messages

This will help identify why delegate party receives messages but
doesn't process them during keygen.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-06 06:45:23 -08:00
hailin 5344af465b fix(server-party): fix context leak in GetPendingMessages acknowledgment
Fixed the acknowledgment goroutine in GetPendingMessages to use parent
context instead of context.Background(), preventing orphan goroutines
that can't be cancelled.

This completes all context bug fixes:
- server-party-api event handler (commit 450163a)
- server-party event handler (commit 99ff3ac)
- message acknowledgment in SubscribeMessages (commit 450163a)
- message acknowledgment in GetPendingMessages (this commit)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-06 06:42:07 -08:00
hailin 99ff3ac130 fix(server-party): use parent context in event handler for proper cancellation
- Fixed server-party event handler to use parent context with timeout
- Prevents orphan goroutines when session fails or party exits
- Consistent with server-party-api fix
2025-12-06 06:39:23 -08:00
hailin 450163a94d fix(context): use parent context instead of Background() to allow proper cancellation
- Fixed delegate party event handler to use parent context with timeout
- Fixed message acknowledgment to use parent context
- Prevents orphan goroutines when session fails or party exits
- Resolves system crash after delegate party failure
2025-12-06 06:36:34 -08:00
hailin 3adc091140 fix(docker): add PARTY_ROLE environment variable for server-party-api
Add PARTY_ROLE=delegate environment variable to server-party-api service
to fix nil pointer dereference when determining party role during keygen.

Without this variable, the party defaults to "persistent" role which tries
to access keyShareRepo (nil for delegate parties), causing a panic.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-06 06:00:28 -08:00
hailin 13e81e37c9 fix(db): update repository to save and load delegate_party_id field
Update session repository to properly handle delegate_party_id column:
- Add delegate_party_id to Save method INSERT and UPDATE statements
- Add DelegatePartyID field to sessionRow struct
- Update FindByUUID, FindByStatus, FindExpired, FindActive SELECT queries
- Update scanSessions method to scan and pass delegate_party_id
- Remove placeholder empty string, now loads actual value from database

This completes the delegate party functionality by ensuring the delegate party ID
is persisted and retrieved correctly from the database.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-06 05:52:34 -08:00
hailin 391448063f feat(db): add delegate_party_id column to mpc_sessions table
Add delegate_party_id column to track which party is acting as delegate
(generates and returns user share instead of storing it).

Changes:
- Add delegate_party_id VARCHAR(255) column with default empty string
- Add partial index for faster lookups when delegate party is present
- Include up and down migrations

This fixes the issue where delegate party selection worked but the delegate_party
field was not being returned in API responses due to missing database column.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-06 05:50:01 -08:00
hailin 36e1359f43 fix(session-coordinator): pass PartyComposition from gRPC request to use case
Fixed critical bug where PartyComposition (persistent/delegate party counts) was being sent
by account-service in gRPC request but was not being extracted and passed to the CreateSession
use case, causing delegate party selection to fail.

Changes:
- Extract PartyComposition from protobuf request and pass to CreateSessionInput
- Add logging for party composition values in gRPC handler
- Return delegate_party_id and selected_parties in CreateSessionResponse
- Load session after creation to get delegate party ID

This fixes the issue where require_delegate=true had no effect and all parties selected
were persistent parties instead of 2 persistent + 1 delegate.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-06 05:38:38 -08:00
hailin c5d3840835 fix(docker-compose): add ACCOUNT_SERVICE_ADDR to session-coordinator
- Add ACCOUNT_SERVICE_ADDR environment variable pointing to account-service:8080
- Fixes "connection refused" error when session-coordinator tries to auto-create account after keygen
- Session-coordinator can now properly call account service to create account records

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-06 05:21:06 -08:00
hailin b8d66921e0 feat(docker-compose): add PARTY_ID to server-party-api configuration
- Add explicit PARTY_ID environment variable for delegate party
- Set PARTY_ID=delegate-party for server-party-api service
- This ensures the delegate party properly registers to Message Router party pool
- Enables delegate party selection for keygen sessions with require_delegate=true

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-06 05:17:03 -08:00
hailin 5f12404be4 fix: remove dynamic participant join to fix concurrent party_index assignment
- Remove dynamic participant addition in JoinSession
- Participants must be pre-created in CreateSession
- Add ErrPartyNotInvited error for unauthorized join attempts
- Fix Redis adapter to include version parameter in ReconstructSession
- This fixes VSS verification failures caused by inconsistent party indices
2025-12-06 04:54:40 -08:00
hailin b72268c1ce feat(mpc-system): implement optimistic locking for session updates
Implement version-based optimistic locking to prevent concurrent update conflicts
when multiple parties simultaneously report completion during keygen operations.

Changes:
- Add version column to mpc_sessions table (migration 004)
- Add Version field to MPCSession entity
- Define ErrOptimisticLockConflict error
- Update SessionPostgresRepo.Update() to check version and increment on success
- Add automatic retry logic (max 3 attempts) to ReportCompletionUseCase
- Update Save and all query methods (FindByStatus, FindExpired, etc.) to handle version field

This replaces pessimistic locking (FOR UPDATE) with optimistic locking using
the industry-standard pattern: WHERE version = $n and checking rowsAffected.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-06 04:16:32 -08:00
hailin 63e00a64f5 fix(test): update JWT secret to match .env configuration
Fixed JWT secret in test_create_session.go to use the same secret key
as configured in .env file, resolving 401 Unauthorized errors during
keygen session creation tests.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-06 03:54:12 -08:00
hailin 77fa40d27f test(logger): set Development=true to test if it affects debug logging
Changed Development from false to true to test if this is preventing
debug logs from being output. Development mode may affect how the
logger handles different log levels.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-06 03:41:50 -08:00
hailin 47dd2d1cb5 test(logger): add internal debug test immediately after Build()
Added Log.Debug() and Log.Info() calls immediately after Build()
to test if the logger can output debug logs right after creation.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-06 03:41:00 -08:00
hailin 3a247562ea debug(logger): add AtomicLevel tracking to diagnose level changes
Added debug output to track:
1. AtomicLevel value when created
2. AtomicLevel value after Build()
3. Log.Level() value after Build()

This will help identify if Build() or something else is changing the level.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-06 03:32:45 -08:00
hailin bfe129da51 test(logger): add debug log test to verify debug level works
Added test debug log immediately after logger initialization.
If debug logging is working, we should see this message.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-06 03:23:24 -08:00
hailin bac623f63c debug(logger): add detailed debug output for level initialization
Added println statements to trace:
1. Level value after UnmarshalText
2. Logger level after Build()

This will help diagnose why debug level is not being applied.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-06 03:16:05 -08:00
hailin ac1858a19e fix(logger): remove init() function that was overriding config level
Problem: Logger was always using info level despite MPC_LOGGER_LEVEL=debug
Root cause: The init() function in logger.go was calling InitProduction()
which created a zap.NewProduction() logger with hardcoded info level.
This happened before main() called logger.Init(cfg), so the config was
being ignored.

Solution:
1. Removed init() function to prevent early logger initialization
2. Added zap.ReplaceGlobals() in Init() to ensure config takes effect
3. Removed unused "os" import

References:
- https://pkg.go.dev/go.uber.org/zap
- https://stackoverflow.com/questions/57745017/

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-06 03:07:30 -08:00
hailin 32e3970f34 debug: add logger level debug output 2025-12-06 02:56:26 -08:00
hailin 5764f3d50d chore: set logger level to debug for debugging 2025-12-06 02:42:04 -08:00
hailin e321501c32 chore: set default environment to development for debug logging 2025-12-06 02:37:36 -08:00
hailin 6df7355abe fix: add username field to keygen request 2025-12-06 02:35:03 -08:00
hailin ac64c2d012 fix: add Authorization header to test_create_session.go 2025-12-06 02:33:59 -08:00
hailin fb9c85f883 debug(coordinator): add detailed logging to track concurrent update issue
Add comprehensive debug logs to:
1. report_completion.go - log all participant statuses at key points
2. session_postgres_repo.go - log before/after each participant update

This will help identify why server-party-1 status remains 'invited'
despite successfully reporting completion.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-06 02:11:28 -08:00
hailin 380bf46fb6 fix(coordinator): add row-level locking to prevent concurrent update conflicts
Problem:
Multiple parties reporting completion simultaneously caused lost updates
because each transaction would read the full session, modify their
participant status, then update ALL participants - causing last-write-wins
behavior.

Solution:
Add SELECT ... FOR UPDATE locks on both mpc_sessions and participants
tables at the start of the Update transaction. This serializes concurrent
updates and prevents lost updates.

Lock order:
1. Lock session row (FOR UPDATE)
2. Lock all participant rows for this session (FOR UPDATE)
3. Perform updates
4. Commit (releases locks)

This ensures that concurrent ReportCompletion calls are fully serialized
and each participant status update is preserved.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-06 01:58:05 -08:00
hailin aab88834f9 fix(coordinator): prevent lost updates in concurrent participant status changes
Fix critical concurrency bug where simultaneous ReportCompletion calls from
multiple parties could cause lost database updates. Changed from UPSERT-all
to UPDATE-individual pattern to ensure each participant status update is
atomic and won't be overwritten by concurrent transactions.

Before: All participants were UPSERTed in single transaction, causing
last-commit-wins behavior that lost earlier status updates.

After: Each participant is UPDATEd individually using UPDATE...WHERE, then
INSERT only if row doesn't exist. This prevents concurrent updates to
different participants from conflicting.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-06 01:35:38 -08:00
hailin 00b48bab50 fix(coordinator): handle all participant states in ReportCompletion with proper state transitions
- Add switch-case to handle Invited, Joined, and Ready states
- Auto-transition Invited -> Joined -> Ready -> Completed
- Auto-transition Joined -> Ready -> Completed
- Auto-transition Ready -> Completed
- Return error for invalid states (Failed, Completed, etc.)
- Fixes 'cannot transition to completed status' error
- Applies to all parties including server-party-api
2025-12-06 01:09:49 -08:00
hailin 4e14212147 fix(coordinator): auto-transition participant to Ready before Completed
ReportCompletion was failing with "cannot transition to completed status"
because participants were in Joined state trying to transition directly to
Completed, which violates the state machine flow: Joined -> Ready -> Completed.

Changes:
- Check participant status before marking as Completed
- Auto-transition Joined -> Ready if needed
- Then transition Ready -> Completed
- Add debug logging for auto-transition

This fixes the error seen during keygen completion.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-06 00:33:22 -08:00
hailin 8e683064ed chore: regenerate coordinator proto with party_index field
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-06 00:11:24 -08:00
hailin 78119bc6a4 fix(proto): add party_index to JoinSessionResponse for correct index assignment
The JoinSessionResponse from coordinator was missing party_index field,
causing message router to try finding self's index in OtherParties (which
only contains other parties). This resulted in incorrect party index
assignment leading to "duplicate indexes" error in TSS keygen.

Changes:
- Add party_index field to coordinator's JoinSessionResponse proto
- Coordinator now includes PartyIndex in gRPC response
- Message router uses party_index from coordinator instead of searching

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-06 00:08:47 -08:00
hailin b51d5687b2 fix(server-party): include self in participants list for keygen
The JoinSession response contains OtherParties (excluding self) and
PartyIndex (self's index). The participants list passed to TSS keygen
must include all parties including self, otherwise validation fails
with "invalid party count" error.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-06 00:01:14 -08:00
hailin 54061b4c16 feat(mpc-system): add event sourcing for session tracking
- Add SessionEventRepository interface for append-only event storage
- Implement PostgreSQL session_event_repo with immutable event log
- Add database migration for session_events table with indexes
- Record events for keygen and sign session creation
- Record events for signing-config APIs (set, update, clear)
- Wire up sessionEventRepo in main.go and account handler
- Update API documentation with event sourcing design

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-05 23:31:04 -08:00
hailin aa74e2b2e2 feat(mpc-system): add signing parties configuration and delegate signing support
- Add signing-config API endpoints (POST/PUT/DELETE/GET) for configuring
  which parties should participate in signing operations
- Add SigningParties field to Account entity with database migration
- Modify CreateSigningSession to use configured parties if set,
  otherwise use all active parties (backward compatible)
- Add delegate party signing support: user provides encrypted share
  at sign time for delegate party to use
- Update protobuf definitions for DelegateUserShare in session events
- Add ShareTypeDelegate to support hybrid custody model

API endpoints:
- POST /accounts/:id/signing-config - Set signing parties (first time)
- PUT /accounts/:id/signing-config - Update signing parties
- DELETE /accounts/:id/signing-config - Clear config (use all parties)
- GET /accounts/:id/signing-config - Get current configuration

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-05 22:47:55 -08:00
hailin 55f5ec49f2 chore(mpc-system): remove duplicate protobuf generated files
Remove redundant .pb.go files from api/proto/ directory.
The actual generated files are in api/grpc/coordinator/v1/ and api/grpc/router/v1/.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-05 20:47:55 -08:00
hailin 135e821386 feat(mpc-system): integrate reliability mechanisms and enable party-driven architecture
- Enable SubscribeSessionEvents for automatic session participation
- Integrate heartbeat mechanism with pending message count
- Add ACK sending after message receipt for reliable delivery
- Add party activity tracking in session coordinator
- Add CountPendingByParty for heartbeat response
- Add retry package with exponential backoff for gRPC clients
- Add memory-based message broker and event publisher adapters
- Add account service integration for keygen completion
- Add party timeout checking background job
- Add notification service stub for future implementation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-05 20:30:03 -08:00
hailin 34f0f7b897 chore(mpc-system): update Dockerfiles to Go 1.24 and fix line endings
- Update all Dockerfiles from Go 1.21 to Go 1.24 (required by go.mod)
- Fix line endings in deploy.sh and .env.example for Unix compatibility

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-05 16:40:32 -08:00