Major architectural refactoring to align with international MPC standards
and enable horizontal scalability.
## Core Changes
### 1. DeviceInfo Made Optional
- Modified DeviceInfo.Validate() to allow empty device information
- Aligns with international MPC protocol standards
- MPC protocol layer should not mandate device-specific metadata
- Location: services/session-coordinator/domain/entities/device_info.go
### 2. Kubernetes Party Discovery Service
- Created infrastructure/k8s/party_discovery.go (220 lines)
- Implements dynamic service discovery via Kubernetes API
- Supports in-cluster config and kubeconfig fallback
- Auto-refreshes party list every 30s (configurable)
- Health-aware selection (only ready pods)
- Uses pod names as unique party IDs
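For illustration, a minimal sketch of this discovery flow using client-go; the helper names (newClientset, listReadyParties, watchParties) and the app=server-party label selector are assumptions, with the actual implementation living in party_discovery.go:

```go
package k8s

import (
	"context"
	"time"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/clientcmd"
)

// newClientset prefers in-cluster config and falls back to a kubeconfig path.
func newClientset(kubeconfig string) (*kubernetes.Clientset, error) {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		if cfg, err = clientcmd.BuildConfigFromFlags("", kubeconfig); err != nil {
			return nil, err
		}
	}
	return kubernetes.NewForConfig(cfg)
}

// listReadyParties returns the names of Ready pods matching the selector;
// pod names double as unique party IDs.
func listReadyParties(ctx context.Context, cs kubernetes.Interface, ns, selector string) ([]string, error) {
	pods, err := cs.CoreV1().Pods(ns).List(ctx, metav1.ListOptions{LabelSelector: selector})
	if err != nil {
		return nil, err
	}
	var ids []string
	for _, pod := range pods.Items {
		for _, cond := range pod.Status.Conditions {
			if cond.Type == corev1.PodReady && cond.Status == corev1.ConditionTrue {
				ids = append(ids, pod.Name)
				break
			}
		}
	}
	return ids, nil
}

// watchParties re-lists the pool on a fixed interval (30s by default).
func watchParties(ctx context.Context, cs kubernetes.Interface, ns, selector string,
	interval time.Duration, out chan<- []string) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			if ids, err := listReadyParties(ctx, cs, ns, selector); err == nil {
				out <- ids
			}
		}
	}
}
```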
### 3. Party Pool Architecture
- Defined PartyPoolPort interface for abstraction
- CreateSessionUseCase now supports automatic party selection
- When no participants are specified, parties are selected from the K8s pool (see the sketch below)
- Graceful fallback to dynamic join mode if discovery fails
- Location: services/session-coordinator/application/ports/output/party_pool_port.go
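A sketch of how the port and the selection fallback fit together; only the PartyPoolPort name comes from this change, while the method set and the use-case fragment are assumptions:

```go
package output

import "context"

// PartyPoolPort abstracts the party pool; the K8s discovery service is
// one implementation. The method shape is illustrative.
type PartyPoolPort interface {
	// SelectParties returns up to n healthy party IDs from the pool.
	SelectParties(ctx context.Context, n int) ([]string, error)
}

// Hypothetical fragment of CreateSessionUseCase: choose parties
// automatically when the caller supplies none.
type CreateSessionUseCase struct {
	partyPool PartyPoolPort
}

func (uc *CreateSessionUseCase) resolveParticipants(ctx context.Context, requested []string, n int) []string {
	if len(requested) > 0 {
		return requested // explicit participants always win
	}
	parties, err := uc.partyPool.SelectParties(ctx, n)
	if err != nil || len(parties) < n {
		return nil // empty list falls back to dynamic join mode
	}
	return parties
}
```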
### 4. Integration Updates
- Modified CreateSessionUseCase to inject partyPool
- Updated session-coordinator main.go to initialize K8s discovery
- gRPC handler already supports optional participants
- Added k8s client-go dependencies (v0.29.0) to go.mod
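In go.mod this amounts to roughly the following; client-go v0.29.0 is stated above, and the matching k8s.io/api and k8s.io/apimachinery versions it requires are an assumption:

    require (
        k8s.io/api v0.29.0
        k8s.io/apimachinery v0.29.0
        k8s.io/client-go v0.29.0
    )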
## Kubernetes Deployment
### New K8s Manifests
- k8s/namespace.yaml: mpc-system namespace
- k8s/configmap.yaml: shared configuration
- k8s/secrets-example.yaml: secrets template
- k8s/server-party-deployment.yaml: scalable party pool (3+ replicas)
- k8s/session-coordinator-deployment.yaml: coordinator with RBAC
- k8s/README.md: comprehensive deployment guide
### RBAC Configuration
- ServiceAccount for session-coordinator
- Role with pods/services get/list/watch permissions
- RoleBinding to grant discovery capabilities
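Concretely, the Role reduces to something like this (role name illustrative):

    apiVersion: rbac.authorization.k8s.io/v1
    kind: Role
    metadata:
      name: party-discovery
      namespace: mpc-system
    rules:
      - apiGroups: [""]
        resources: ["pods", "services"]
        verbs: ["get", "list", "watch"]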
## Key Features
✅ Dynamic service discovery via Kubernetes API
✅ Horizontal scaling (kubectl scale deployment)
✅ No hardcoded party IDs
✅ Health-aware party selection
✅ Graceful degradation when K8s unavailable
✅ MPC protocol compliance (optional DeviceInfo)
## Deployment Modes
### Docker Compose (Existing)
- Fixed set of 3 parties (server-party-1/2/3)
- Quick setup for development
- Backward compatible
### Kubernetes (New)
- Dynamic party pool
- Auto-discovery and scaling
- Production-ready
## Documentation
- Updated main README.md with deployment options
- Added architecture diagram showing scalable party pool
- Created comprehensive k8s/README.md with:
  - Quick start guide
  - Scaling instructions
  - Troubleshooting section
  - RBAC configuration details
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Changes:
- Modified CreateAccountRequest to make email optional (omitempty)
- Changed Account.Email from string to *string pointer type
- Updated PostgreSQL repository to handle nullable email with sql.NullString
- Username remains required and auto-generated by identity-service
This supports anonymous account creation without requiring email registration.
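A minimal sketch of the pointer-and-NullString mapping, assuming illustrative struct and helper names:

```go
package account

import "database/sql"

// Account.Email is now a pointer: nil means the account was created
// anonymously, without an email address.
type Account struct {
	Username string  // still required, auto-generated by identity-service
	Email    *string // optional
}

// toNullString converts the optional field for a nullable column.
func toNullString(s *string) sql.NullString {
	if s == nil {
		return sql.NullString{}
	}
	return sql.NullString{String: *s, Valid: true}
}

// fromNullString converts a scanned nullable column back to a pointer.
func fromNullString(ns sql.NullString) *string {
	if !ns.Valid {
		return nil
	}
	return &ns.String
}
```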
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Added comprehensive documentation for MPC system integration:
- MPC_INTEGRATION_GUIDE.md: Complete integration guide for backend developers
  * System architecture explanation
  * Service responsibilities and relationships
  * Standard MPC session types (keygen/sign/recovery)
  * Integration examples (Go/Python/HTTP)
  * Troubleshooting guide
- VERIFICATION_REPORT.md: System verification report
  * Service health status checks
  * API functionality verification
  * E2E test issue analysis
  * System maturity assessment
- test_real_scenario.sh: Real scenario test script
  * Automated verification workflow
  * Keygen session creation test
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Simplified participant list handling in JoinSession client
- Added debug logging for party_index conversion in gRPC messages
- Removed redundant party filtering logic
- Added detailed logging to trace protobuf field values
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Added debug logging to track participant details including party_index in:
- account service MPC keygen handler
- session coordinator gRPC client
- session coordinator gRPC handler
This helps debug the party index assignment issue where all parties
were receiving index 0 instead of unique indices (0, 1, 2).
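The logging added at these three sites is roughly of this shape (message getters and variable names are assumptions); a healthy run prints distinct indices rather than three zeros:

```go
// Hypothetical debug logging around the protobuf conversion.
for _, p := range req.GetParticipants() {
	log.Printf("keygen session %s: party_id=%s party_index=%d",
		req.GetSessionId(), p.GetPartyId(), p.GetPartyIndex())
}
```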
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Update account_handler to use real gRPC calls instead of placeholders
- Add sessionCoordinatorClient field to AccountHTTPHandler
- Modify CreateKeygenSession to call session coordinator via gRPC
- Modify CreateSigningSession to call session coordinator via gRPC
- Modify GetSessionStatus to query real session data via gRPC
- Update main.go to initialize and pass sessionCoordinatorClient
- Remove separate mpc_handler.go (consolidated into account_handler)
- Regenerate protobuf files with gRPC service definitions
- Add proper imports for context, time, and grpc adapter
All MPC endpoints now create real sessions with JWT tokens and
can query actual session status from the session coordinator service.
Tested end-to-end: keygen session creation and status query working.
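The shape of the change in CreateKeygenSession, sketched below; the RPC, message, and enum names are assumptions rather than the actual proto, while the sessionCoordinatorClient field is the one added here (imports for context, time, and the grpc adapter are the ones listed above):

```go
// Hypothetical handler body: call the session coordinator over gRPC
// instead of returning placeholder data.
func (h *AccountHTTPHandler) CreateKeygenSession(w http.ResponseWriter, r *http.Request) {
	ctx, cancel := context.WithTimeout(r.Context(), 10*time.Second)
	defer cancel()

	resp, err := h.sessionCoordinatorClient.CreateSession(ctx, &pb.CreateSessionRequest{
		SessionType: pb.SessionType_KEYGEN, // enum name assumed
	})
	if err != nil {
		http.Error(w, "session coordinator unavailable", http.StatusBadGateway)
		return
	}
	// The response carries the real session ID and per-party JWT tokens.
	w.Header().Set("Content-Type", "application/json")
	_ = json.NewEncoder(w).Encode(map[string]string{"session_id": resp.GetSessionId()})
}
```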
- Add SessionCoordinatorClient gRPC adapter with connection retry logic
- Implement MPCHandler with real gRPC calls to session-coordinator
- Replace placeholder implementation with actual session creation
- Add keygen and signing session endpoints with proper validation
- Include comprehensive implementation summary documentation
This enables account-service to create real MPC sessions via gRPC
instead of returning mock data. Requires main.go integration to activate.
- Add retry mechanism for PostgreSQL connections (10 retries, 2s base delay)
- Add retry mechanism for RabbitMQ connections (10 retries, 2s base delay)
- Add retry mechanism for Redis connections (10 retries, 2s base delay)
- Use exponential backoff: delay increases with each retry attempt
- Log detailed retry information (attempt number, max retries, errors)
- Redis continues without cache if all retries fail (non-critical)
- Database and RabbitMQ return error after all retries (critical)
This resolves startup failures when dependent services are slow to initialize,
particularly RabbitMQ, which may pass health checks but not yet be fully ready.
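A generic sketch of the described policy (10 retries, 2s base delay, exponentially growing); the helper name is illustrative:

```go
package main

import (
	"fmt"
	"log"
	"time"
)

const (
	maxRetries = 10
	baseDelay  = 2 * time.Second
)

// connectWithRetry calls dial until it succeeds or attempts run out,
// doubling the delay each time (2s, 4s, 8s, ...).
func connectWithRetry(name string, dial func() error) error {
	var lastErr error
	for attempt := 1; attempt <= maxRetries; attempt++ {
		if lastErr = dial(); lastErr == nil {
			return nil
		}
		delay := baseDelay * time.Duration(1<<(attempt-1))
		log.Printf("%s: attempt %d/%d failed: %v, retrying in %s",
			name, attempt, maxRetries, lastErr, delay)
		time.Sleep(delay)
	}
	return fmt.Errorf("%s: %w after %d attempts", name, lastErr, maxRetries)
}
```

For the critical dependencies (PostgreSQL, RabbitMQ) the caller propagates the returned error; for Redis it logs the failure and continues without caching.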
- Add mkdir commands to create output directories
- Add paths=source_relative options for go_out and go-grpc_out
- This ensures *_grpc.pb.go files are generated correctly
- Fixes session-coordinator and message-router startup failures
Related: MPC services were failing to start due to missing gRPC service interface files
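For reference, the generation step then looks roughly like this, with directory and proto file names purely illustrative (the flags are standard protoc-gen-go and protoc-gen-go-grpc options):

    mkdir -p gen/session
    protoc --go_out=gen/session --go_opt=paths=source_relative \
           --go-grpc_out=gen/session --go-grpc_opt=paths=source_relative \
           proto/session.proto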
- Created .env.example files with comprehensive security warnings
- Removed hardcoded IP addresses and credentials from docker-compose files
- Made database passwords mandatory (fail-fast on missing config)
- Removed Chinese mirror sources from all Dockerfiles
- Enhanced deploy.sh scripts with .env validation and auto-creation
- Added comprehensive README.md deployment guides
- Changed ALLOWED_IPS default to enable cross-server deployment
- Updated all docker-compose files to use environment variables
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Script to enable/disable the transparent proxy on 192.168.1.100, allowing
192.168.1.111 to access the internet through the Clash proxy without any
client-side configuration.
Usage:
sudo bash scripts/tproxy.sh on # Enable
sudo bash scripts/tproxy.sh off # Disable
sudo bash scripts/tproxy.sh status # Check status
sudo bash scripts/tproxy.sh config # Show required Clash config
Features:
- Redirects TCP traffic from specified clients to Clash redir port
- Optional DNS redirect to Clash DNS
- Bypasses local/private networks
- Easy on/off switching
Prerequisites:
- Clash running with redir-port and allow-lan enabled
- 192.168.1.100 configured as gateway for 192.168.1.111
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Problem: message-router and other services were using the wrong ports (50051/8080)
instead of their configured ports (50052/8082) because mpc.env contained:
MPC_SERVER_HTTP_PORT=8080
MPC_SERVER_GRPC_PORT=50051
These global settings in mpc.env were overriding the per-service Environment=
settings in systemd unit files, causing port conflicts.
Solution:
- Remove MPC_SERVER_HTTP_PORT and MPC_SERVER_GRPC_PORT from mpc.env template
- Add fix-ports command to remove these settings from existing installations
- Add comments explaining per-service port configuration
Port assignments:
- session-coordinator: gRPC 50051, HTTP 8081
- message-router: gRPC 50052, HTTP 8082
- server-party-1/2/3: HTTP 8083/8084/8085
- account-service: HTTP 8080
To fix existing installation:
sudo bash scripts/deploy.sh fix-ports
sudo bash scripts/deploy.sh restart
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Changed sed patterns from matching specific placeholder strings to
matching entire lines (^KEY=.*), ensuring keys are replaced correctly
regardless of their current value.
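The resulting invocation has this shape (key name illustrative):

    sed -i "s|^MASTER_KEY=.*|MASTER_KEY=${MASTER_KEY}|" .env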
Tested in WSL2 - generates valid 64-char hex master key.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Major changes:
- Add TSS core library (pkg/tss) with keygen and signing protocols
- Implement gRPC clients for Server Party service
- Add MPC session endpoints to Account service
- Deploy 3 Server Party instances in docker-compose
- Add MarkPartyReady and StartSession to proto definitions
- Complete integration tests for 2-of-3, 3-of-5, 4-of-7 thresholds
- Add comprehensive documentation (architecture, API, testing, deployment)
Test results:
- 2-of-3: PASSED (keygen 93s, signing 80s)
- 3-of-5: PASSED (keygen 198s, signing 120s)
- 4-of-7: PASSED (keygen 221s, signing 150s)
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Add config.example.yaml with all configuration options documented
- Add server-party service main.go with HTTP endpoints
- Fix message-router gRPC handler registration
- All services now buildable and deployable via docker-compose
Test results:
- Unit tests: 3/3 PASS
- Integration tests: 26/26 PASS
- E2E tests: 8/8 PASS
- Docker build: All 4 services built successfully
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Fix CreateAccount to decode hex-encoded public key before storage
- Fix Login signature verification to hash challenge before verifying
- Return 401 instead of 400 for invalid hex format in login credentials
- Fix CompleteRecovery to handle direct transition from requested state
All 8 E2E tests now pass (100% pass rate):
- TestAccountRecoveryFlow, TestCompleteAccountFlow, TestDuplicateUsername, TestInvalidLogin
- TestCompleteKeygenFlow, TestExceedParticipantLimit, TestGetNonExistentSession, TestJoinSessionWithInvalidToken
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Add encoding/hex import to account handler
- Encode challenge as hex string in GenerateChallenge handler
- Decode hex-encoded challenge and signature in Login handler
- Decode hex-encoded public key in CompleteRecovery handler
This fixes compatibility between the test client (which uses hex encoding)
and the server handlers.
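Taken together with the previous fix, the login path behaves roughly like this sketch, assuming Ed25519 keys and the SHA-256 prehash described above (handler and helper names are illustrative):

```go
package handlers

import (
	"crypto/ed25519"
	"crypto/sha256"
	"encoding/hex"
)

// verifyLogin decodes the hex-encoded challenge and signature sent by
// the client, hashes the challenge, then verifies the signature.
// Invalid hex is treated as bad credentials (401), not a bad request (400).
func verifyLogin(pub ed25519.PublicKey, challengeHex, signatureHex string) bool {
	challenge, err := hex.DecodeString(challengeHex)
	if err != nil {
		return false
	}
	sig, err := hex.DecodeString(signatureHex)
	if err != nil {
		return false
	}
	digest := sha256.Sum256(challenge)
	return ed25519.Verify(pub, digest[:], sig)
}
```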
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
The test was only reporting completion for one participant, but the session
requires ALL participants to report completion before transitioning to
"completed" status. This follows the domain logic in ShouldCompleteSession()
which checks session.AllCompleted().
Changes:
- Added reportCompletion calls for all 3 parties (party_user_device,
party_server, party_recovery)
- Updated test comment to clarify all participants must report completion
- Add sessionRepo to HTTP handler for database operations
- Implement MarkPartyReady handler to update participant status
- Implement StartSession handler to start MPC sessions
- Update CanStart() to accept participants in 'ready' status
- Make Start() method idempotent to handle automatic + explicit starts
- Fix repository injection through dependency chain in main.go
- Add party_id parameter to test completion request
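A sketch of the idempotent start and relaxed readiness check; the struct fields and status constants are assumptions:

```go
package entities

import "errors"

// Assumed minimal shapes for illustration.
type Participant struct{ Status string }

type Session struct {
	Status               string
	Participants         []Participant
	ExpectedParticipants int
}

const (
	StatusInProgress       = "in_progress"
	ParticipantStatusReady = "ready"
)

var ErrNotReady = errors.New("not all participants are ready")

// Start is idempotent so the automatic start (triggered when the last
// party becomes ready) and an explicit StartSession call don't conflict.
func (s *Session) Start() error {
	if s.Status == StatusInProgress {
		return nil // already started: no-op
	}
	if !s.CanStart() {
		return ErrNotReady
	}
	s.Status = StatusInProgress
	return nil
}

// CanStart now also accepts participants in "ready" status.
func (s *Session) CanStart() bool {
	if len(s.Participants) < s.ExpectedParticipants {
		return false
	}
	for _, p := range s.Participants {
		if p.Status != ParticipantStatusReady {
			return false
		}
	}
	return true
}
```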