rwadurian/backend/mpc-system
hailin b4d6b0f264 feat(mpc-system): add connection retry logic with exponential backoff
- Add retry mechanism for PostgreSQL connections (10 retries, 2s base delay)
- Add retry mechanism for RabbitMQ connections (10 retries, 2s base delay)
- Add retry mechanism for Redis connections (10 retries, 2s base delay)
- Use exponential backoff: delay increases with each retry attempt
- Log detailed retry information (attempt number, max retries, errors)
- Redis continues without cache if all retries fail (non-critical)
- Database and RabbitMQ return error after all retries (critical)

This resolves startup failures when dependent services are slow to initialize,
particularly RabbitMQ which may pass health checks but not be fully ready.
2025-12-04 23:12:15 -08:00
..
.claude . 2025-11-29 07:14:44 -08:00
api feat: Complete MPC TSS implementation with t-of-n threshold signing 2025-11-29 06:57:53 -08:00
docs docs: Add detailed DDD + Hexagonal architecture documentation 2025-11-29 07:13:38 -08:00
migrations > 2025-11-29 01:35:10 -08:00
pkg feat: Complete MPC TSS implementation with t-of-n threshold signing 2025-11-29 06:57:53 -08:00
scripts fix(tproxy): detect clash process with any name (clash-linux-amd64, etc.) 2025-12-02 01:28:17 -08:00
services feat(mpc-system): add connection retry logic with exponential backoff 2025-12-04 23:12:15 -08:00
tests feat: Complete MPC TSS implementation with t-of-n threshold signing 2025-11-29 06:57:53 -08:00
.env.example refactor: separate configuration from code following 12-Factor App principles 2025-12-04 21:46:35 -08:00
.gitignore chore(mpc-system): 添加 .gitignore 排除敏感配置 2025-12-03 17:25:12 -08:00
MPC-Distributed-Signature-System-Complete-Spec.md > 2025-11-29 01:35:10 -08:00
Makefile fix(mpc-system): fix protobuf generation in Makefile to generate gRPC service files 2025-12-04 22:54:59 -08:00
README.md refactor: separate configuration from code following 12-Factor App principles 2025-12-04 21:46:35 -08:00
TEST_REPORT.md > 2025-11-29 01:35:10 -08:00
config.example.yaml feat: Complete production deployment configuration 2025-11-29 04:10:06 -08:00
deploy.sh refactor: separate configuration from code following 12-Factor App principles 2025-12-04 21:46:35 -08:00
docker-compose.yml refactor: separate configuration from code following 12-Factor App principles 2025-12-04 21:46:35 -08:00
get-docker.sh > 2025-11-29 01:35:10 -08:00
go.mod feat: Complete MPC TSS implementation with t-of-n threshold signing 2025-11-29 06:57:53 -08:00
go.sum feat: Complete MPC TSS implementation with t-of-n threshold signing 2025-11-29 06:57:53 -08:00

README.md

MPC System Deployment Guide

Multi-Party Computation (MPC) system for secure threshold signature scheme (TSS) implementation in the RWADurian project.

Table of Contents

Overview

The MPC system implements a 2-of-3 threshold signature scheme where:

  • 3 server parties hold key shares
  • At least 2 parties are required to generate signatures
  • User shares are generated dynamically and returned to the calling service
  • All shares are encrypted using AES-256-GCM

Key Features

  • Threshold Cryptography: 2-of-3 TSS for enhanced security
  • Distributed Architecture: Services communicate via gRPC and WebSocket
  • Secure Storage: AES-256-GCM encryption for all stored shares
  • API Authentication: API key and IP-based access control
  • Session Management: Coordinated multi-party computation sessions

Architecture

┌────────────────────────────────────────────────────────────────┐
│                         MPC System                              │
│                                                                 │
│  ┌──────────────────┐        ┌──────────────────┐              │
│  │ Account Service  │        │ Server Party API │              │
│  │  (Port 4000)     │        │  (Port 8083)     │              │
│  │ External API     │        │ User Share Gen   │              │
│  └────────┬─────────┘        └────────┬─────────┘              │
│           │                           │                        │
│           ▼                           ▼                        │
│  ┌──────────────────┐        ┌──────────────────┐              │
│  │   Session        │◄──────►│ Message Router   │              │
│  │   Coordinator    │        │  (Port 8082)     │              │
│  │  (Port 8081)     │        │  WebSocket       │              │
│  └────────┬─────────┘        └────────┬─────────┘              │
│           │                           │                        │
│           ▼                           ▼                        │
│  ┌────────────────────────────────────────────┐                │
│  │          Server Parties (3 instances)      │                │
│  │   ┌──────────┐ ┌──────────┐ ┌──────────┐  │                │
│  │   │ Party 1  │ │ Party 2  │ │ Party 3  │  │                │
│  │   │  (TSS)   │ │  (TSS)   │ │  (TSS)   │  │                │
│  │   └──────────┘ └──────────┘ └──────────┘  │                │
│  └────────────────────────────────────────────┘                │
│                                                                 │
│  ┌────────────────────────────────────────────┐                │
│  │         Infrastructure Services            │                │
│  │  PostgreSQL  │  Redis  │  RabbitMQ         │                │
│  └────────────────────────────────────────────┘                │
└────────────────────────────────────────────────────────────────┘
                           │
                           │ Network Access
                           ▼
              ┌──────────────────────────┐
              │   Backend Services       │
              │   mpc-service (caller)   │
              └──────────────────────────┘

Quick Start

Prerequisites

  • Docker (version 20.10+)
  • Docker Compose (version 2.0+)
  • Network Access from backend services
  • Ports Available: 4000, 8081, 8082, 8083

1. Initial Setup

cd backend/mpc-system

# Create environment configuration
cp .env.example .env

# Edit configuration for your environment
nano .env

2. Configure Environment

Edit .env and update the following REQUIRED values:

# Database password (REQUIRED)
POSTGRES_PASSWORD=your_secure_postgres_password

# RabbitMQ password (REQUIRED)
RABBITMQ_PASSWORD=your_secure_rabbitmq_password

# JWT secret key (REQUIRED, min 32 chars)
JWT_SECRET_KEY=your_jwt_secret_key_at_least_32_characters

# Master encryption key (REQUIRED, exactly 64 hex chars)
# WARNING: If you lose this, encrypted shares cannot be recovered!
CRYPTO_MASTER_KEY=$(openssl rand -hex 32)

# API key for server-to-server auth (REQUIRED)
# Must match the MPC_API_KEY in your backend mpc-service config
MPC_API_KEY=your_api_key_matching_mpc_service

# Allowed IPs (REQUIRED - update to actual backend server IP!)
ALLOWED_IPS=192.168.1.111

3. Deploy Services

# Start all services
./deploy.sh up

# Check status
./deploy.sh status

# View logs
./deploy.sh logs

4. Verify Deployment

# Health check
./deploy.sh health

# Test API
./deploy.sh test-api

Configuration

All configuration is managed through .env file. See .env.example for complete documentation.

Critical Environment Variables

Variable Description Required Example
POSTGRES_PASSWORD Database password Yes openssl rand -base64 32
RABBITMQ_PASSWORD Message broker password Yes openssl rand -base64 32
JWT_SECRET_KEY JWT signing key (≥32 chars) Yes openssl rand -base64 48
CRYPTO_MASTER_KEY AES-256 key (64 hex chars) Yes openssl rand -hex 32
MPC_API_KEY API authentication key Yes openssl rand -base64 48
ALLOWED_IPS Comma-separated allowed IPs Yes 192.168.1.111,192.168.1.112
ENVIRONMENT Environment name No production (default)
REDIS_PASSWORD Redis password No Leave empty for internal network

Generating Secure Keys

# PostgreSQL & RabbitMQ passwords
openssl rand -base64 32

# JWT Secret Key
openssl rand -base64 48

# Master Encryption Key (MUST be exactly 64 hex characters)
openssl rand -hex 32

# API Key
openssl rand -base64 48

Configuration Checklist

Before deploying to production:

  • Change all default passwords
  • Generate secure CRYPTO_MASTER_KEY and back it up securely
  • Set MPC_API_KEY to match backend mpc-service configuration
  • Update ALLOWED_IPS to actual backend server IP(s)
  • Backup .env file to secure location (NOT in git!)

Deployment Commands

Basic Operations

./deploy.sh up          # Start all services
./deploy.sh down        # Stop all services
./deploy.sh restart     # Restart all services
./deploy.sh logs [svc]  # View logs (all or specific service)
./deploy.sh status      # Show service status
./deploy.sh health      # Health check all services

Build Commands

./deploy.sh build            # Build Docker images
./deploy.sh build-no-cache   # Rebuild without cache

Service Management

# Infrastructure only
./deploy.sh infra up    # Start postgres, redis, rabbitmq
./deploy.sh infra down  # Stop infrastructure

# MPC services only
./deploy.sh mpc up      # Start MPC services
./deploy.sh mpc down    # Stop MPC services
./deploy.sh mpc restart # Restart MPC services

Debugging

./deploy.sh logs-tail [service]  # Last 100 log lines
./deploy.sh shell [service]      # Open shell in container
./deploy.sh test-api             # Test Account Service API

Cleanup

# WARNING: This removes all data!
./deploy.sh clean

Services

External Services (Exposed Ports)

Service Port Protocol Purpose
account-service 4000 HTTP Main API for backend integration
session-coordinator 8081 HTTP/gRPC Session coordination
message-router 8082 WebSocket/gRPC Message routing
server-party-api 8083 HTTP User share generation

Internal Services

Service Purpose
server-party-1 TSS party 1 (stores server shares)
server-party-2 TSS party 2 (stores server shares)
server-party-3 TSS party 3 (stores server shares)
postgres Database for session/account data
redis Cache and temporary data
rabbitmq Message broker for inter-service communication

Service Dependencies

Infrastructure Services (postgres, redis, rabbitmq)
    ↓
Session Coordinator & Message Router
    ↓
Server Parties (1, 2, 3) & Server Party API
    ↓
Account Service (external API)

Security

Access Control

  1. IP Whitelisting: Only IPs in ALLOWED_IPS can access the API
  2. API Key Authentication: Requires valid MPC_API_KEY header
  3. Network Isolation: Services communicate within Docker network

Data Protection

  1. Encryption at Rest: All shares encrypted with AES-256-GCM
  2. Master Key: CRYPTO_MASTER_KEY must be securely stored and backed up
  3. Secure Transport: Use HTTPS/TLS for external communication

Best Practices

  • Never commit .env to version control
  • Backup CRYPTO_MASTER_KEY to multiple secure locations
  • Rotate API keys regularly
  • Use strong passwords (min 32 chars)
  • Restrict database ports (don't expose to internet)
  • Monitor failed authentication attempts
  • Enable audit logging

Key Backup

# Backup master key (CRITICAL!)
echo "CRYPTO_MASTER_KEY=$(grep CRYPTO_MASTER_KEY .env | cut -d= -f2)" > master_key.backup

# Store securely (encrypted USB, password manager, vault)
# NEVER store in plaintext on the server

Troubleshooting

Services won't start

# Check logs
./deploy.sh logs

# Check specific service
./deploy.sh logs postgres

# Common issues:
# 1. Ports already in use
# 2. .env file missing or misconfigured
# 3. Database initialization failed

Database connection errors

# Check postgres health
docker compose ps postgres

# View postgres logs
./deploy.sh logs postgres

# Restart infrastructure
./deploy.sh infra down
./deploy.sh infra up

API returns 403 Forbidden

# Check ALLOWED_IPS configuration
grep ALLOWED_IPS .env

# Verify caller's IP is in the list
# Update .env and restart:
./deploy.sh restart

API returns 401 Unauthorized

# Verify MPC_API_KEY matches between:
# 1. This system's .env
# 2. Backend mpc-service configuration

# Check API key
grep MPC_API_KEY .env

# Restart after updating
./deploy.sh restart

Keygen or signing fails

# Check all server parties are healthy
./deploy.sh health

# View server party logs
./deploy.sh logs server-party-1
./deploy.sh logs server-party-2
./deploy.sh logs server-party-3

# Check message router
./deploy.sh logs message-router

# Restart MPC services
./deploy.sh mpc restart

Lost master encryption key

CRITICAL: If CRYPTO_MASTER_KEY is lost, encrypted shares cannot be recovered!

Prevention:

  • Backup key immediately after generation
  • Store in multiple secure locations
  • Use enterprise key management system in production

Production Deployment

Pre-Deployment Checklist

  • Generate all secure keys and passwords
  • Backup CRYPTO_MASTER_KEY to secure locations
  • Configure ALLOWED_IPS for actual backend server
  • Sync MPC_API_KEY with backend mpc-service
  • Set up database backups
  • Configure log aggregation
  • Set up monitoring and alerts
  • Document recovery procedures
  • Test disaster recovery

Deployment Steps

Step 1: Prepare Environment

# On MPC server
git clone <repo> /opt/rwadurian
cd /opt/rwadurian/backend/mpc-system

# Configure environment
cp .env.example .env
nano .env  # Set all required values

# Generate and backup keys
openssl rand -hex 32 > master_key.txt
# Copy to secure storage, then delete:
# rm master_key.txt

Step 2: Deploy Services

# Build images
./deploy.sh build

# Start services
./deploy.sh up

# Verify all healthy
./deploy.sh health

Step 3: Configure Firewall

# Allow backend server to access MPC ports
sudo ufw allow from <BACKEND_IP> to any port 4000
sudo ufw allow from <BACKEND_IP> to any port 8081
sudo ufw allow from <BACKEND_IP> to any port 8082
sudo ufw allow from <BACKEND_IP> to any port 8083

# Deny all other external access
sudo ufw default deny incoming
sudo ufw enable

Step 4: Test Integration

# From backend server, test API access
curl -H "X-API-Key: YOUR_MPC_API_KEY" \
  http://<MPC_SERVER_IP>:4000/health

Monitoring

Monitor these metrics:

  • Service health status
  • API request rate and latency
  • Failed authentication attempts
  • Database connection pool usage
  • RabbitMQ queue depths
  • Key generation/signing success rates

Backup Strategy

# Database backup (daily)
docker compose exec postgres pg_dump -U mpc_user mpc_system > backup_$(date +%Y%m%d).sql

# Configuration backup
tar -czf config_backup_$(date +%Y%m%d).tar.gz .env kong.yml

# Encryption key backup (secure storage only!)

Disaster Recovery

  1. Service Failure: Restart affected service using ./deploy.sh restart
  2. Database Corruption: Restore from latest backup
  3. Key Loss: If CRYPTO_MASTER_KEY lost, all encrypted shares are unrecoverable
  4. Full System Recovery: Redeploy from backups, restore database

Performance Tuning

# docker-compose.yml - adjust resources
services:
  session-coordinator:
    deploy:
      resources:
        limits:
          cpus: '2'
          memory: 2G

API Reference

Account Service API (Port 4000)

# Health check
curl http://localhost:4000/health

# Create account (keygen)
curl -X POST http://localhost:4000/api/v1/accounts \
  -H "X-API-Key: YOUR_MPC_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"user_id": "user123"}'

# Sign transaction
curl -X POST http://localhost:4000/api/v1/accounts/{account_id}/sign \
  -H "X-API-Key: YOUR_MPC_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"message": "tx_hash"}'

Server Party API (Port 8083)

# Generate user share
curl -X POST http://localhost:8083/api/v1/shares/generate \
  -H "X-API-Key: YOUR_MPC_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"session_id": "session123"}'

Getting Help

  • Check logs: ./deploy.sh logs
  • Health check: ./deploy.sh health
  • View commands: ./deploy.sh help
  • Review .env.example for configuration options

License

Copyright © 2024 RWADurian. All rights reserved.