rwadurian/backend/mpc-system
hailin 62b2a87e90 fix(android): add safeLaunch exception handling to MainViewModel [P2]
[Architecture safety fix: coroutine exception handling in the ViewModel layer]

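## Background

The viewModelScope used by MainViewModel has no CoroutineExceptionHandler configured:
- An uncaught exception crashes the app
- Exceptions triggered by user actions give the worst user experience
- All 29 viewModelScope.launch call sites are at risk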

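## Fix

### 1. Add a safeLaunch helper function

Create an extension function that catches exceptions automatically.

### 2. Replace the critical viewModelScope.launch calls

Switch the 14 most critical user-interaction entry points to safeLaunch:

**Functions fixed:**
1. checkAllServices() - service initialization check
2. connectToServer() - connect to the server
3. createKeygenSession() - create a key generation session
4. validateInviteCode() - validate an invite code
5. joinKeygen() - join key generation
6. joinSign() - join signing
7. initiateSignSession() - initiate a signing session
8. initiateSignSessionWithOptions() - initiate signing (with options)
9. startSigningProcess() - start the signing process
10. prepareTransfer() - prepare a transfer
11. broadcastTransaction() - broadcast a transaction
12. exportShareBackup() - export a backup
13. importShareBackup() - import a backup
14. confirmTransactionInBackground() - confirm a transaction in the background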

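## Crash scenarios fixed

### Scenario 1: Network request failure
- Original problem: a network error occurs when the user taps "Create Wallet"
- Before: the app crashes immediately
- After: a "network error" message is shown and the app keeps running

### Scenario 2: Parameter validation failure
- Original problem: a malformed invite code throws IllegalArgumentException
- Before: the app crashes
- After: a "parameter error" message is shown

### Scenario 3: Inconsistent state
- Original problem: switching pages quickly leaves the state inconsistent
- Before: the app crashes and the user loses data
- After: an error message is shown and the state can be recovered

### Scenario 4: JSON parsing failure
- Original problem: the user imports a corrupted backup file
- Before: the app crashes
- After: an "import failed" message is shown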

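## Two layers of protection

There are now two layers of protection:
1. **Inner try-catch** - handles specific business exceptions inside each function
2. **Outer safeLaunch** - catches any exception left unhandled and prevents a crash

Example: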

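## Handling by exception type

A user-friendly error message is shown according to the exception type:
- SocketTimeoutException → "Network timeout, please check your network connection"
- UnknownHostException → "Cannot connect to the server, please check your network settings"
- IOException → "Network error: {message}"
- IllegalStateException → "State error: {message}"
- IllegalArgumentException → "Parameter error: {message}"
- Any other exception → "Operation failed: {message}"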

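## Scope of impact

### Code locations changed
- MainViewModel.kt - adds the safeLaunch function
- 14 critical user-interaction functions - replace viewModelScope.launch with safeLaunch

### Behavior change
- BEFORE: an uncaught exception in a coroutine crashes the app
- AFTER: the exception is caught, an error message is shown, and the app keeps running

### Fully backward compatible
- All existing try-catch logic is unchanged
- The safeLaunch handler runs only when an exception was not caught elsewhere
- Normal business flows are not affected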

## Test verification

Build status: BUILD SUCCESSFUL in 29s
- No compilation errors
- Warnings only (unused parameters), with no effect on functionality

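## Complete protection together with TssRepository

There are now two complete layers of exception protection:
1. **TssRepository layer** - exception handling for background coroutines (CoroutineExceptionHandler)
2. **MainViewModel layer** - exception handling for UI interactions (safeLaunch)

User action flow:
User taps a button → MainViewModel.safeLaunch (outer protection)
                 ↓
            Repository call → repositoryScope (background protection)
                 ↓
         Two layers of protection, greatly reducing crash risk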

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-26 22:09:52 -08:00
.claude refactor(mpc-system): migrate to party-driven architecture with PartyID-based routing 2025-12-05 08:11:28 -08:00
api fix(mpc-system): GetSessionStatus returns the actual threshold_n and threshold_t 2025-12-29 11:59:53 -08:00
docs feat(mpc-system): add event sourcing for session tracking 2025-12-05 23:31:04 -08:00
k8s feat(mpc-system): implement party role labels with strict persistent-only default 2025-12-05 07:08:59 -08:00
migrations fix(migration): make database migration scripts idempotent so they can be re-run 2025-12-28 05:26:38 -08:00
pkg fix(tss): convert threshold to tss-lib format (threshold-1) in all keygen and signing 2025-12-31 12:19:58 -08:00
scripts fix: convert deploy.sh CRLF to LF and add executable permission 2025-12-07 07:01:13 -08:00
services fix(android): add safeLaunch exception handling to MainViewModel [P2] 2026-01-26 22:09:52 -08:00
tests refactor(mpc-system): migrate to party-driven architecture with PartyID-based routing 2025-12-05 08:11:28 -08:00
.env.example docs(config): update .env.example files for production deployment 2025-12-07 04:55:21 -08:00
.env.party.example feat(mpc-system): add signing parties configuration and delegate signing support 2025-12-05 22:47:55 -08:00
.env.prod.example feat(mpc-system): add signing parties configuration and delegate signing support 2025-12-05 22:47:55 -08:00
.gitignore refactor(mpc-system): migrate to party-driven architecture with PartyID-based routing 2025-12-05 08:11:28 -08:00
DELEGATE_PARTY_GUIDE.md feat(mpc-system): implement delegate party for hybrid custody 2025-12-05 09:07:46 -08:00
MPC-Distributed-Signature-System-Complete-Spec.md refactor(mpc-system): migrate to party-driven architecture with PartyID-based routing 2025-12-05 08:11:28 -08:00
MPC_INTEGRATION_GUIDE.md refactor(mpc-system): migrate to party-driven architecture with PartyID-based routing 2025-12-05 08:11:28 -08:00
Makefile refactor(mpc-system): migrate to party-driven architecture with PartyID-based routing 2025-12-05 08:11:28 -08:00
PARTY_ROLE_VERIFICATION_REPORT.md refactor(mpc-system): migrate to party-driven architecture with PartyID-based routing 2025-12-05 08:11:28 -08:00
README.md feat(mpc-system): implement Kubernetes-based dynamic party pool architecture 2025-12-05 06:12:49 -08:00
TEST_REPORT.md refactor(mpc-system): migrate to party-driven architecture with PartyID-based routing 2025-12-05 08:11:28 -08:00
VERIFICATION_REPORT.md refactor(mpc-system): migrate to party-driven architecture with PartyID-based routing 2025-12-05 08:11:28 -08:00
config.example.yaml refactor(mpc-system): migrate to party-driven architecture with PartyID-based routing 2025-12-05 08:11:28 -08:00
deploy.sh feat(mpc-system): add server-party-co-managed for co_managed_keygen sessions 2025-12-29 23:54:45 -08:00
docker-compose.party.yml chore(docker): add timezone configuration for mpc-system, api-gateway, and infrastructure 2025-12-23 18:35:09 -08:00
docker-compose.prod.yml chore(docker): add timezone configuration for mpc-system, api-gateway, and infrastructure 2025-12-23 18:35:09 -08:00
docker-compose.yml feat(mpc-system): add server-party-co-managed for co_managed_keygen sessions 2025-12-29 23:54:45 -08:00
get-docker.sh refactor(mpc-system): migrate to party-driven architecture with PartyID-based routing 2025-12-05 08:11:28 -08:00
go.mod feat(mpc-system): implement party-driven architecture with SessionEvent broadcasting 2025-12-05 08:44:05 -08:00
go.sum feat(mpc-system): implement party-driven architecture with SessionEvent broadcasting 2025-12-05 08:44:05 -08:00
test_create_session.go feat: add keygen_session_id to signing session flow 2025-12-06 08:39:40 -08:00
test_real_scenario.sh refactor(mpc-system): migrate to party-driven architecture with PartyID-based routing 2025-12-05 08:11:28 -08:00
test_signing.go test: update signing test username 2025-12-06 10:54:22 -08:00
test_signing_parties_api.go fix: update test username for signing parties API test 2025-12-06 10:29:30 -08:00

README.md

MPC System Deployment Guide

A Multi-Party Computation (MPC) system that implements a secure threshold signature scheme (TSS) for the RWADurian project.

Overview

The MPC system implements a 2-of-3 threshold signature scheme where:

  • Server parties from a dynamically scalable pool hold key shares
  • At least 2 parties are required to generate signatures (configurable threshold)
  • User shares are generated dynamically and returned to the calling service
  • All shares are encrypted using AES-256-GCM

Key Features

  • Threshold Cryptography: Configurable N-of-M TSS for enhanced security
  • Dynamic Party Pool: Kubernetes-based service discovery for automatic party scaling
  • Distributed Architecture: Services communicate via gRPC and WebSocket
  • Secure Storage: AES-256-GCM encryption for all stored shares
  • API Authentication: API key and IP-based access control
  • Session Management: Coordinated multi-party computation sessions
  • MPC Protocol Compliance: DeviceInfo optional, aligning with international MPC standards

Architecture

┌────────────────────────────────────────────────────────────────┐
│                         MPC System                              │
│                                                                 │
│  ┌──────────────────┐        ┌──────────────────┐              │
│  │ Account Service  │        │ Server Party API │              │
│  │  (Port 4000)     │        │  (Port 8083)     │              │
│  │ External API     │        │ User Share Gen   │              │
│  └────────┬─────────┘        └────────┬─────────┘              │
│           │                           │                        │
│           ▼                           ▼                        │
│  ┌──────────────────┐        ┌──────────────────┐              │
│  │   Session        │◄──────►│ Message Router   │              │
│  │   Coordinator    │        │  (Port 8082)     │              │
│  │  (Port 8081)     │        │  WebSocket       │              │
│  └────────┬─────────┘        └────────┬─────────┘              │
│           │                           │                        │
│           ▼                           ▼                        │
│  ┌────────────────────────────────────────────┐                │
│  │   Server Party Pool (Dynamically Scalable) │                │
│  │   ┌──────────┐ ┌──────────┐ ┌──────────┐  │                │
│  │   │ Party 1  │ │ Party 2  │ │ Party 3  │  │  K8s Discovery │
│  │   │  (TSS)   │ │  (TSS)   │ │  (TSS)   │  │  Auto-selected │
│  │   └──────────┘ └──────────┘ └──────────┘  │  from pool     │
│  │   ┌──────────┐     ... can scale up/down  │                │
│  │   │ Party N  │                             │                │
│  │   └──────────┘                             │                │
│  └────────────────────────────────────────────┘                │
│                                                                 │
│  ┌────────────────────────────────────────────┐                │
│  │         Infrastructure Services            │                │
│  │  PostgreSQL  │  Redis  │  RabbitMQ         │                │
│  └────────────────────────────────────────────┘                │
└────────────────────────────────────────────────────────────────┘
                           │
                           │ Network Access
                           ▼
              ┌──────────────────────────┐
              │   Backend Services       │
              │   mpc-service (caller)   │
              └──────────────────────────┘

Deployment Options

This system supports two deployment modes:

Option 1: Docker Compose (Development/Simple Deployment)

  • Quick setup for development or simple production environments
  • Fixed 3 server parties (hardcoded IDs)
  • See instructions below in "Quick Start"

Option 2: Kubernetes (Production/Scalable Deployment)

  • Dynamic party pool with service discovery
  • Horizontally scalable server parties
  • Recommended for production environments
  • See k8s/README.md for detailed instructions

Quick Start (Docker Compose)

Prerequisites

  • Docker (version 20.10+)
  • Docker Compose (version 2.0+)
  • Network Access from backend services
  • Ports Available: 4000, 8081, 8082, 8083
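
The version minimums above can be checked before deploying. The following is a minimal sketch and not part of deploy.sh: the helper name `version_ge` is hypothetical, and it only handles purely numeric dotted versions (a suffix such as `-rc1` would need to be stripped first).

```shell
#!/bin/sh
# version_ge A B: succeed when dotted numeric version A >= B.
# Hypothetical helper for illustration; not part of deploy.sh.
version_ge() (
    a=$1 b=$2
    while [ -n "$a" ] || [ -n "$b" ]; do
        # Take the leading component of each version; missing parts count as 0.
        x=${a%%.*}; y=${b%%.*}
        x=${x:-0}; y=${y:-0}
        [ "$x" -gt "$y" ] && exit 0
        [ "$x" -lt "$y" ] && exit 1
        # Drop the component that was just compared.
        case $a in *.*) a=${a#*.} ;; *) a= ;; esac
        case $b in *.*) b=${b#*.} ;; *) b= ;; esac
    done
    exit 0
)
```

One possible use is `version_ge "$(docker version --format '{{.Server.Version}}')" 20.10 || echo "Docker is too old"`, and likewise with the output of `docker compose version --short` against 2.0.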

1. Initial Setup

cd backend/mpc-system

# Create environment configuration
cp .env.example .env

# Edit configuration for your environment
nano .env

2. Configure Environment

Edit .env and update the following REQUIRED values:

# Database password (REQUIRED)
POSTGRES_PASSWORD=your_secure_postgres_password

# RabbitMQ password (REQUIRED)
RABBITMQ_PASSWORD=your_secure_rabbitmq_password

# JWT secret key (REQUIRED, min 32 chars)
JWT_SECRET_KEY=your_jwt_secret_key_at_least_32_characters

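# Master encryption key (REQUIRED, exactly 64 hex chars)
# WARNING: If you lose this, encrypted shares cannot be recovered!
# Generate it once with: openssl rand -hex 32
# Paste the literal value here so the key does not change between runs.
CRYPTO_MASTER_KEY=your_64_hex_character_master_key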

# API key for server-to-server auth (REQUIRED)
# Must match the MPC_API_KEY in your backend mpc-service config
MPC_API_KEY=your_api_key_matching_mpc_service

# Allowed IPs (REQUIRED - update to actual backend server IP!)
ALLOWED_IPS=192.168.1.111
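
The constraints stated in the comments above (all six values required, JWT key of at least 32 characters, master key of exactly 64 hex characters) can be verified before starting the services. This is a hedged sketch: `env_value` and `check_env` are hypothetical names, the script is not part of deploy.sh, and it assumes plain unquoted `KEY=VALUE` lines.

```shell
#!/bin/sh
# Hypothetical pre-flight check for the required .env values.
# It reads KEY=VALUE lines directly instead of sourcing the file.

# env_value FILE KEY: print the value of the last KEY=... line in FILE.
env_value() {
    sed -n "s/^$2=//p" "$1" | tail -n 1
}

# check_env FILE: print one line per problem; succeed only when none is found.
check_env() (
    file=$1 status=0
    for key in POSTGRES_PASSWORD RABBITMQ_PASSWORD JWT_SECRET_KEY \
               CRYPTO_MASTER_KEY MPC_API_KEY ALLOWED_IPS; do
        if [ -z "$(env_value "$file" "$key")" ]; then
            echo "missing: $key"
            status=1
        fi
    done
    jwt=$(env_value "$file" JWT_SECRET_KEY)
    if [ -n "$jwt" ] && [ "${#jwt}" -lt 32 ]; then
        echo "too short: JWT_SECRET_KEY (need at least 32 characters)"
        status=1
    fi
    master=$(env_value "$file" CRYPTO_MASTER_KEY)
    if [ -n "$master" ] && ! printf '%s' "$master" | grep -Eq '^[0-9a-fA-F]{64}$'; then
        echo "invalid: CRYPTO_MASTER_KEY (need exactly 64 hex characters)"
        status=1
    fi
    exit "$status"
)
```

Running `check_env .env && ./deploy.sh up` would refuse to start when a value is missing or malformed. The check also rejects a master key line that still contains an unexpanded command such as `$(openssl rand -hex 32)`, since that text is not 64 hex characters.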

3. Deploy Services

# Start all services
./deploy.sh up

# Check status
./deploy.sh status

# View logs
./deploy.sh logs

4. Verify Deployment

# Health check
./deploy.sh health

# Test API
./deploy.sh test-api

Configuration

All configuration is managed through the .env file. See .env.example for complete documentation.

Critical Environment Variables

Variable            Description                   Required   Example (value or generator command)
POSTGRES_PASSWORD   Database password             Yes        openssl rand -base64 32
RABBITMQ_PASSWORD   Message broker password       Yes        openssl rand -base64 32
JWT_SECRET_KEY      JWT signing key (≥32 chars)   Yes        openssl rand -base64 48
CRYPTO_MASTER_KEY   AES-256 key (64 hex chars)    Yes        openssl rand -hex 32
MPC_API_KEY         API authentication key        Yes        openssl rand -base64 48
ALLOWED_IPS         Comma-separated allowed IPs   Yes        192.168.1.111,192.168.1.112
ENVIRONMENT         Environment name              No         production (default)
REDIS_PASSWORD      Redis password                No         Leave empty for internal network

Generating Secure Keys

# PostgreSQL & RabbitMQ passwords
openssl rand -base64 32

# JWT Secret Key
openssl rand -base64 48

# Master Encryption Key (MUST be exactly 64 hex characters)
openssl rand -hex 32

# API Key
openssl rand -base64 48

Configuration Checklist

Before deploying to production:

  • Change all default passwords
  • Generate secure CRYPTO_MASTER_KEY and back it up securely
  • Set MPC_API_KEY to match backend mpc-service configuration
  • Update ALLOWED_IPS to actual backend server IP(s)
  • Backup .env file to secure location (NOT in git!)
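
The first checklist item can be partly automated by looking for values that still carry the `your_` prefix used by the placeholders shown in "Configure Environment". A minimal sketch, assuming that prefix convention; `find_placeholders` is a hypothetical name and not part of deploy.sh.

```shell
#!/bin/sh
# find_placeholders FILE: print the keys whose values still start with
# "your_", i.e. placeholders that were never replaced. Commented-out
# lines are ignored because the match is anchored at the line start.
find_placeholders() {
    sed -n 's/^\([A-Z_][A-Z0-9_]*\)=your_.*/\1/p' "$1"
}
```

An empty result from `find_placeholders .env` means no such placeholder remains. For the last checklist item, `git ls-files --error-unmatch .env` succeeds only when .env is tracked by git, so a failing exit status is the desired outcome.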

Deployment Commands

Basic Operations

./deploy.sh up          # Start all services
./deploy.sh down        # Stop all services
./deploy.sh restart     # Restart all services
./deploy.sh logs [svc]  # View logs (all or specific service)
./deploy.sh status      # Show service status
./deploy.sh health      # Health check all services

Build Commands

./deploy.sh build            # Build Docker images
./deploy.sh build-no-cache   # Rebuild without cache

Service Management

# Infrastructure only
./deploy.sh infra up    # Start postgres, redis, rabbitmq
./deploy.sh infra down  # Stop infrastructure

# MPC services only
./deploy.sh mpc up      # Start MPC services
./deploy.sh mpc down    # Stop MPC services
./deploy.sh mpc restart # Restart MPC services

Debugging

./deploy.sh logs-tail [service]  # Last 100 log lines
./deploy.sh shell [service]      # Open shell in container
./deploy.sh test-api             # Test Account Service API

Cleanup

# WARNING: This removes all data!
./deploy.sh clean

Services

External Services (Exposed Ports)

Service               Port   Protocol         Purpose
account-service       4000   HTTP             Main API for backend integration
session-coordinator   8081   HTTP/gRPC        Session coordination
message-router        8082   WebSocket/gRPC   Message routing
server-party-api      8083   HTTP             User share generation

Internal Services

Service              Purpose
server-party-1/2/3   TSS parties (Docker Compose mode - fixed IDs)
server-party-pool    TSS party pool (Kubernetes mode - dynamic scaling)
postgres             Database for session/account data
redis                Cache and temporary data
rabbitmq             Message broker for inter-service communication

Note: In Kubernetes mode, server parties are discovered dynamically using K8s service discovery. Parties can be scaled up/down without service interruption.

Service Dependencies

Infrastructure Services (postgres, redis, rabbitmq)
    ↓
Session Coordinator & Message Router
    ↓
Server Parties (1, 2, 3) & Server Party API
    ↓
Account Service (external API)
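
Because each layer depends on the one above it, a script that drives the system may need to wait until a layer responds before using the next. The sketch below is a generic retry helper under that assumption; `wait_until` is a hypothetical name, and deploy.sh may already handle startup ordering on its own.

```shell
#!/bin/sh
# wait_until TRIES DELAY CMD...: run CMD until it succeeds, at most TRIES
# times, sleeping DELAY seconds between attempts. Fails if CMD never succeeds.
wait_until() (
    tries=$1 delay=$2
    shift 2
    n=1
    while ! "$@"; do
        [ "$n" -ge "$tries" ] && exit 1
        n=$((n + 1))
        sleep "$delay"
    done
    exit 0
)
```

For example, `wait_until 30 2 curl -fsS http://localhost:4000/health` polls the Account Service health endpoint for up to about a minute before giving up.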

Security

Access Control

  1. IP Whitelisting: Only IPs in ALLOWED_IPS can access the API
  2. API Key Authentication: Requests must send the MPC_API_KEY value in the X-API-Key header
  3. Network Isolation: Services communicate within Docker network

Data Protection

  1. Encryption at Rest: All shares encrypted with AES-256-GCM
  2. Master Key: CRYPTO_MASTER_KEY must be securely stored and backed up
  3. Secure Transport: Use HTTPS/TLS for external communication

Best Practices

  • Never commit .env to version control
  • Backup CRYPTO_MASTER_KEY to multiple secure locations
  • Rotate API keys regularly
  • Use strong passwords (min 32 chars)
  • Restrict database ports (don't expose to internet)
  • Monitor failed authentication attempts
  • Enable audit logging

Key Backup

# Backup master key (CRITICAL!)
# The anchored match skips commented-out lines; umask keeps the file private.
(umask 077; grep '^CRYPTO_MASTER_KEY=' .env > master_key.backup)

# Move the backup to secure storage (encrypted USB, password manager, vault),
# then delete the local copy. NEVER leave it in plaintext on the server.

Troubleshooting

Services won't start

# Check logs
./deploy.sh logs

# Check specific service
./deploy.sh logs postgres

# Common issues:
# 1. Ports already in use
# 2. .env file missing or misconfigured
# 3. Database initialization failed

Database connection errors

# Check postgres health
docker compose ps postgres

# View postgres logs
./deploy.sh logs postgres

# Restart infrastructure
./deploy.sh infra down
./deploy.sh infra up

API returns 403 Forbidden

# Check ALLOWED_IPS configuration
grep ALLOWED_IPS .env

# Verify caller's IP is in the list
# Update .env and restart:
./deploy.sh restart

API returns 401 Unauthorized

# Verify MPC_API_KEY matches between:
# 1. This system's .env
# 2. Backend mpc-service configuration

# Check API key
grep MPC_API_KEY .env

# Restart after updating
./deploy.sh restart
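
The 403 and 401 cases above can be folded into one small lookup that turns an HTTP status code into the matching hint. A sketch only: `explain_status` is a hypothetical helper, and the hints simply restate the troubleshooting steps in this section.

```shell
#!/bin/sh
# explain_status CODE: print the troubleshooting hint for an HTTP status code.
explain_status() {
    case $1 in
        2??) echo "ok" ;;
        401) echo "401: check that MPC_API_KEY matches the backend mpc-service configuration" ;;
        403) echo "403: check that the caller's IP is listed in ALLOWED_IPS" ;;
        000) echo "no response: check that the service is running and the port is reachable" ;;
        *)   echo "unexpected status $1: check ./deploy.sh logs" ;;
    esac
}
```

The code can be obtained with curl's `-w '%{http_code}'` option, for example `explain_status "$(curl -s -o /dev/null -w '%{http_code}' -H "X-API-Key: $MPC_API_KEY" http://localhost:4000/health)"`; curl reports 000 when no HTTP response was received.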

Keygen or signing fails

# Check all server parties are healthy
./deploy.sh health

# View server party logs
./deploy.sh logs server-party-1
./deploy.sh logs server-party-2
./deploy.sh logs server-party-3

# Check message router
./deploy.sh logs message-router

# Restart MPC services
./deploy.sh mpc restart

Lost master encryption key

CRITICAL: If CRYPTO_MASTER_KEY is lost, encrypted shares cannot be recovered!

Prevention:

  • Backup key immediately after generation
  • Store in multiple secure locations
  • Use enterprise key management system in production

Production Deployment

Pre-Deployment Checklist

  • Generate all secure keys and passwords
  • Backup CRYPTO_MASTER_KEY to secure locations
  • Configure ALLOWED_IPS for actual backend server
  • Sync MPC_API_KEY with backend mpc-service
  • Set up database backups
  • Configure log aggregation
  • Set up monitoring and alerts
  • Document recovery procedures
  • Test disaster recovery

Deployment Steps

Step 1: Prepare Environment

# On MPC server
git clone <repo> /opt/rwadurian
cd /opt/rwadurian/backend/mpc-system

# Configure environment
cp .env.example .env
nano .env  # Set all required values

# Generate the master key, set it as CRYPTO_MASTER_KEY in .env, and back it up
(umask 077; openssl rand -hex 32 > master_key.txt)
# Copy to secure storage, then delete:
# rm master_key.txt

Step 2: Deploy Services

# Build images
./deploy.sh build

# Start services
./deploy.sh up

# Verify all healthy
./deploy.sh health

Step 3: Configure Firewall

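# If you administer this host over SSH, allow SSH first (for example: sudo ufw allow 22/tcp);
# otherwise enabling the firewall below blocks new SSH sessions.

# Allow backend server to access MPC ports
sudo ufw allow from <BACKEND_IP> to any port 4000
sudo ufw allow from <BACKEND_IP> to any port 8081
sudo ufw allow from <BACKEND_IP> to any port 8082
sudo ufw allow from <BACKEND_IP> to any port 8083

# Deny all other external access
sudo ufw default deny incoming
sudo ufw enable

# Note: ports published by Docker are opened through Docker's own iptables rules and
# may not be filtered by ufw; confirm from an outside host that the ports are closed.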

Step 4: Test Integration

# From backend server, test API access
curl -H "X-API-Key: YOUR_MPC_API_KEY" \
  http://<MPC_SERVER_IP>:4000/health

Monitoring

Monitor these metrics:

  • Service health status
  • API request rate and latency
  • Failed authentication attempts
  • Database connection pool usage
  • RabbitMQ queue depths
  • Key generation/signing success rates

Backup Strategy

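# Database backup (daily); -T disables TTY allocation so the redirected dump stays clean
docker compose exec -T postgres pg_dump -U mpc_user mpc_system > backup_$(date +%Y%m%d).sql

# Configuration backup (drop kong.yml from the list if your deployment has no such file)
tar -czf config_backup_$(date +%Y%m%d).tar.gz .env kong.yml

# Encryption key backup (secure storage only!)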

Disaster Recovery

  1. Service Failure: Restart services using ./deploy.sh restart (or ./deploy.sh mpc restart for the MPC services only)
  2. Database Corruption: Restore from latest backup
  3. Key Loss: If CRYPTO_MASTER_KEY is lost, all encrypted shares are unrecoverable
  4. Full System Recovery: Redeploy from backups, restore database
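
Restoring the database from a dump produced by the pg_dump command in "Backup Strategy" might look like the following. This is a sketch under assumptions: `restore_db` is a hypothetical helper, the defaults `mpc_user` and `mpc_system` are taken from that backup command, and the target database is assumed to be empty (a plain dump does not drop existing objects).

```shell
#!/bin/sh
# restore_db FILE [USER] [DB]: load a plain SQL dump into the postgres container.
restore_db() (
    file=$1 user=${2:-mpc_user} db=${3:-mpc_system}
    if [ ! -s "$file" ]; then
        echo "backup file missing or empty: $file" >&2
        exit 1
    fi
    # -T disables TTY allocation so the dump can be fed through stdin;
    # ON_ERROR_STOP makes psql abort on the first failing statement.
    docker compose exec -T postgres psql -U "$user" -d "$db" -v ON_ERROR_STOP=1 < "$file"
)
```

A call such as `restore_db backup_20260101.sql` (the file name is illustrative) is best rehearsed on a non-production host first, which also covers the "Test disaster recovery" item in the pre-deployment checklist.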

Performance Tuning

# docker-compose.yml - adjust resources
services:
  session-coordinator:
    deploy:
      resources:
        limits:
          cpus: '2'
          memory: 2G

API Reference

Account Service API (Port 4000)

# Health check
curl http://localhost:4000/health

# Create account (keygen)
curl -X POST http://localhost:4000/api/v1/accounts \
  -H "X-API-Key: YOUR_MPC_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"user_id": "user123"}'

# Sign transaction
curl -X POST http://localhost:4000/api/v1/accounts/{account_id}/sign \
  -H "X-API-Key: YOUR_MPC_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"message": "tx_hash"}'

Server Party API (Port 8083)

# Generate user share
curl -X POST http://localhost:8083/api/v1/shares/generate \
  -H "X-API-Key: YOUR_MPC_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"session_id": "session123"}'

Getting Help

  • Check logs: ./deploy.sh logs
  • Health check: ./deploy.sh health
  • View commands: ./deploy.sh help
  • Review .env.example for configuration options

License

Copyright © 2024 RWADurian. All rights reserved.