# MPC System Deployment Guide
Multi-Party Computation (MPC) system for secure threshold signature scheme (TSS) implementation in the RWADurian project.
## Table of Contents
- [Overview](#overview)
- [Architecture](#architecture)
- [Deployment Options](#deployment-options)
- [Quick Start](#quick-start-docker-compose)
- [Configuration](#configuration)
- [Deployment Commands](#deployment-commands)
- [Services](#services)
- [Security](#security)
- [Troubleshooting](#troubleshooting)
- [Production Deployment](#production-deployment)
- [API Reference](#api-reference)
- [Getting Help](#getting-help)
- [License](#license)
## Overview
The MPC system implements a 2-of-3 threshold signature scheme where:
- Server parties from a dynamically scalable pool hold key shares
- At least 2 parties are required to generate signatures (configurable threshold)
- User shares are generated dynamically and returned to the calling service
- All shares are encrypted using AES-256-GCM
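The threshold rule above can be expressed as a one-line check. This is only an illustrative sketch: `can_sign` is a hypothetical helper, not part of `deploy.sh` or the MPC services, and it assumes you already know how many parties are currently healthy.

```bash
# can_sign HEALTHY THRESHOLD
# Succeeds when the number of healthy parties reaches the signing threshold.
can_sign() {
    healthy=$1
    threshold=$2
    [ "$healthy" -ge "$threshold" ]
}

# Example: 2 of the 3 server parties are healthy and the threshold is 2.
if can_sign 2 2; then
    echo "signing possible"
else
    echo "not enough healthy parties"
fi
```

With a 2-of-3 setup, one party can be down and signing still succeeds; with two parties down it cannot.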
### Key Features
- **Threshold Cryptography**: Configurable N-of-M TSS for enhanced security
- **Dynamic Party Pool**: Kubernetes-based service discovery for automatic party scaling
- **Distributed Architecture**: Services communicate via gRPC and WebSocket
- **Secure Storage**: AES-256-GCM encryption for all stored shares
- **API Authentication**: API key and IP-based access control
- **Session Management**: Coordinated multi-party computation sessions
- **MPC Protocol Compliance**: DeviceInfo optional, aligning with international MPC standards
## Architecture
```
┌────────────────────────────────────────────────────────────────┐
│                           MPC System                           │
│                                                                │
│  ┌──────────────────┐        ┌──────────────────┐              │
│  │ Account Service  │        │ Server Party API │              │
│  │   (Port 4000)    │        │   (Port 8083)    │              │
│  │  External API    │        │  User Share Gen  │              │
│  └────────┬─────────┘        └────────┬─────────┘              │
│           │                           │                        │
│           ▼                           ▼                        │
│  ┌──────────────────┐        ┌──────────────────┐              │
│  │     Session      │◄──────►│  Message Router  │              │
│  │   Coordinator    │        │   (Port 8082)    │              │
│  │   (Port 8081)    │        │    WebSocket     │              │
│  └────────┬─────────┘        └────────┬─────────┘              │
│           │                           │                        │
│           ▼                           ▼                        │
│  ┌────────────────────────────────────────────┐                │
│  │  Server Party Pool (Dynamically Scalable)  │                │
│  │  ┌──────────┐  ┌──────────┐  ┌──────────┐  │                │
│  │  │ Party 1  │  │ Party 2  │  │ Party 3  │  │ K8s Discovery  │
│  │  │  (TSS)   │  │  (TSS)   │  │  (TSS)   │  │ Auto-selected  │
│  │  └──────────┘  └──────────┘  └──────────┘  │ from pool      │
│  │  ┌──────────┐   ... can scale up/down      │                │
│  │  │ Party N  │                              │                │
│  │  └──────────┘                              │                │
│  └────────────────────────────────────────────┘                │
│                                                                │
│  ┌────────────────────────────────────────────┐                │
│  │          Infrastructure Services           │                │
│  │       PostgreSQL │ Redis │ RabbitMQ        │                │
│  └────────────────────────────────────────────┘                │
└────────────────────────────────────────────────────────────────┘
                                │ Network Access
                   ┌──────────────────────────┐
                   │     Backend Services     │
                   │   mpc-service (caller)   │
                   └──────────────────────────┘
```
## Deployment Options
This system supports two deployment modes:
### Option 1: Docker Compose (Development/Simple Deployment)
- Quick setup for development or simple production environments
- Fixed 3 server parties (hardcoded IDs)
- See instructions below in "Quick Start"
### Option 2: Kubernetes (Production/Scalable Deployment)
- Dynamic party pool with service discovery
- Horizontally scalable server parties
- Recommended for production environments
- See `k8s/README.md` for detailed instructions
## Quick Start (Docker Compose)
### Prerequisites
- **Docker** (version 20.10+)
- **Docker Compose** (version 2.0+)
- **Network Access** from backend services
- **Ports Available**: 4000, 8081, 8082, 8083
### 1. Initial Setup
```bash
cd backend/mpc-system
# Create environment configuration
cp .env.example .env
# Edit configuration for your environment
nano .env
```
### 2. Configure Environment
Edit `.env` and update the following **REQUIRED** values:
```bash
# Database password (REQUIRED)
POSTGRES_PASSWORD=your_secure_postgres_password
# RabbitMQ password (REQUIRED)
RABBITMQ_PASSWORD=your_secure_rabbitmq_password
# JWT secret key (REQUIRED, min 32 chars)
JWT_SECRET_KEY=your_jwt_secret_key_at_least_32_characters
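# Master encryption key (REQUIRED, exactly 64 hex chars)
# WARNING: If you lose this, encrypted shares cannot be recovered!
# Generate it with `openssl rand -hex 32` and paste the output here.
# Docker Compose does not expand $(...) in .env files.
CRYPTO_MASTER_KEY=paste_64_hex_characters_here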
# API key for server-to-server auth (REQUIRED)
# Must match the MPC_API_KEY in your backend mpc-service config
MPC_API_KEY=your_api_key_matching_mpc_service
# Allowed IPs (REQUIRED - update to actual backend server IP!)
ALLOWED_IPS=192.168.1.111
```
### 3. Deploy Services
```bash
# Start all services
./deploy.sh up
# Check status
./deploy.sh status
# View logs
./deploy.sh logs
```
### 4. Verify Deployment
```bash
# Health check
./deploy.sh health
# Test API
./deploy.sh test-api
```
## Configuration
All configuration is managed through the `.env` file. See `.env.example` for complete documentation.
### Critical Environment Variables
| Variable | Description | Required | Example value or generator |
|----------|-------------|----------|---------|
| `POSTGRES_PASSWORD` | Database password | Yes | `openssl rand -base64 32` |
| `RABBITMQ_PASSWORD` | Message broker password | Yes | `openssl rand -base64 32` |
| `JWT_SECRET_KEY` | JWT signing key (≥32 chars) | Yes | `openssl rand -base64 48` |
| `CRYPTO_MASTER_KEY` | AES-256 key (64 hex chars) | Yes | `openssl rand -hex 32` |
| `MPC_API_KEY` | API authentication key | Yes | `openssl rand -base64 48` |
| `ALLOWED_IPS` | Comma-separated allowed IPs | Yes | `192.168.1.111,192.168.1.112` |
| `ENVIRONMENT` | Environment name | No | `production` (default) |
| `REDIS_PASSWORD` | Redis password | No | Leave empty for internal network |
### Generating Secure Keys
```bash
# PostgreSQL & RabbitMQ passwords
openssl rand -base64 32
# JWT Secret Key
openssl rand -base64 48
# Master Encryption Key (MUST be exactly 64 hex characters)
openssl rand -hex 32
# API Key
openssl rand -base64 48
```
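The length and format requirements for `CRYPTO_MASTER_KEY` and `JWT_SECRET_KEY` can be checked before the values are written to `.env`. The following is a minimal sketch; `valid_master_key` and `valid_jwt_secret` are hypothetical helper names, not commands provided by `deploy.sh`.

```bash
# valid_master_key VALUE: succeeds only for exactly 64 hexadecimal characters.
valid_master_key() {
    case $1 in
        *[!0-9a-fA-F]*) return 1 ;;   # contains a non-hex character
    esac
    [ "${#1}" -eq 64 ]
}

# valid_jwt_secret VALUE: succeeds when the secret has at least 32 characters.
valid_jwt_secret() {
    [ "${#1}" -ge 32 ]
}

# Dummy value for illustration only; never use it as a real key.
candidate="0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef"
if valid_master_key "$candidate"; then
    echo "master key format OK"
fi
```

In practice the candidate would come from `openssl rand -hex 32`, which emits 32 random bytes as 64 hex characters.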
### Configuration Checklist
Before deploying to production:
- [ ] Change all default passwords
- [ ] Generate secure `CRYPTO_MASTER_KEY` and back it up securely
- [ ] Set `MPC_API_KEY` to match backend mpc-service configuration
- [ ] Update `ALLOWED_IPS` to actual backend server IP(s)
- [ ] Back up the `.env` file to a secure location (NOT in git!)
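Part of this checklist can be automated. The sketch below reports any required variable that is missing or empty in a given env file; `check_required_env` is a hypothetical helper, and the variable list is taken from the table of critical environment variables above. It checks presence only, not whether a value is still a placeholder.

```bash
# check_required_env FILE: print each required variable that is missing or
# empty in FILE, and succeed only when none are missing.
check_required_env() {
    env_file=$1
    missing=0
    for name in POSTGRES_PASSWORD RABBITMQ_PASSWORD JWT_SECRET_KEY \
                CRYPTO_MASTER_KEY MPC_API_KEY ALLOWED_IPS; do
        # Require "NAME=" at line start followed by at least one character,
        # so commented-out lines and empty assignments do not count.
        if ! grep -Eq "^${name}=.+" "$env_file" 2>/dev/null; then
            echo "missing or empty: $name"
            missing=1
        fi
    done
    [ "$missing" -eq 0 ]
}
```

One possible use is to gate startup on it: `check_required_env .env && ./deploy.sh up`.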
## Deployment Commands
### Basic Operations
```bash
./deploy.sh up # Start all services
./deploy.sh down # Stop all services
./deploy.sh restart # Restart all services
./deploy.sh logs [svc] # View logs (all or specific service)
./deploy.sh status # Show service status
./deploy.sh health # Health check all services
```
### Build Commands
```bash
./deploy.sh build # Build Docker images
./deploy.sh build-no-cache # Rebuild without cache
```
### Service Management
```bash
# Infrastructure only
./deploy.sh infra up # Start postgres, redis, rabbitmq
./deploy.sh infra down # Stop infrastructure
# MPC services only
./deploy.sh mpc up # Start MPC services
./deploy.sh mpc down # Stop MPC services
./deploy.sh mpc restart # Restart MPC services
```
### Debugging
```bash
./deploy.sh logs-tail [service] # Last 100 log lines
./deploy.sh shell [service] # Open shell in container
./deploy.sh test-api # Test Account Service API
```
### Cleanup
```bash
# WARNING: This removes all data!
./deploy.sh clean
```
## Services
### External Services (Exposed Ports)
| Service | Port | Protocol | Purpose |
|---------|------|----------|---------|
| account-service | 4000 | HTTP | Main API for backend integration |
| session-coordinator | 8081 | HTTP/gRPC | Session coordination |
| message-router | 8082 | WebSocket/gRPC | Message routing |
| server-party-api | 8083 | HTTP | User share generation |
### Internal Services
| Service | Purpose |
|---------|---------|
| server-party-1/2/3 | TSS parties (Docker Compose mode - fixed IDs) |
| server-party-pool | TSS party pool (Kubernetes mode - dynamic scaling) |
| postgres | Database for session/account data |
| redis | Cache and temporary data |
| rabbitmq | Message broker for inter-service communication |
**Note**: In Kubernetes mode, server parties are discovered dynamically using K8s service discovery. Parties can be scaled up/down without service interruption.
### Service Dependencies
```
Infrastructure Services (postgres, redis, rabbitmq)
        ↓
Session Coordinator & Message Router
        ↓
Server Parties (1, 2, 3) & Server Party API
        ↓
Account Service (external API)
```
## Security
### Access Control
1. **IP Whitelisting**: Only IPs in `ALLOWED_IPS` can access the API
2. **API Key Authentication**: Requests must send a valid `MPC_API_KEY` value in the `X-API-Key` header
3. **Network Isolation**: Services communicate within Docker network
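To reason about which callers pass the IP check, it can help to test an address against the `ALLOWED_IPS` value by hand. The sketch below assumes exact-match comparison against a comma-separated list; how the services actually evaluate `ALLOWED_IPS` (for example, whether CIDR ranges are accepted) is not stated here, and `ip_allowed` is a hypothetical helper.

```bash
# ip_allowed IP LIST: succeed when IP appears as an exact entry in the
# comma-separated LIST (spaces around entries are ignored).
ip_allowed() {
    ip=$1
    [ -n "$ip" ] || return 1
    # Turn "a, b,c" into one entry per line, then look for an exact match.
    printf '%s\n' "$2" | tr ',' '\n' | tr -d ' ' | grep -Fxq -- "$ip"
}

ALLOWED_IPS="192.168.1.111,192.168.1.112"
if ip_allowed "192.168.1.112" "$ALLOWED_IPS"; then
    echo "allowed"
else
    echo "denied"
fi
```

Exact matching avoids a common pitfall of substring checks, where `192.168.1.11` would wrongly match `192.168.1.111`.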
### Data Protection
1. **Encryption at Rest**: All shares encrypted with AES-256-GCM
2. **Master Key**: `CRYPTO_MASTER_KEY` must be securely stored and backed up
3. **Secure Transport**: Use HTTPS/TLS for external communication
### Best Practices
- **Never commit `.env` to version control**
- **Back up `CRYPTO_MASTER_KEY` to multiple secure locations**
- **Rotate API keys regularly**
- **Use strong passwords (min 32 chars)**
- **Restrict database ports (don't expose to internet)**
- **Monitor failed authentication attempts**
- **Enable audit logging**
### Key Backup
```bash
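# Back up master key (CRITICAL!)
# The subshell's umask makes the backup file readable by the current user only.
(umask 077; grep '^CRYPTO_MASTER_KEY=' .env > master_key.backup)
# Move master_key.backup to secure storage (encrypted USB, password manager, vault),
# then delete the local copy. NEVER leave it in plaintext on the server.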
```
## Troubleshooting
### Services won't start
```bash
# Check logs
./deploy.sh logs
# Check specific service
./deploy.sh logs postgres
# Common issues:
# 1. Ports already in use
# 2. .env file missing or misconfigured
# 3. Database initialization failed
```
### Database connection errors
```bash
# Check postgres health
docker compose ps postgres
# View postgres logs
./deploy.sh logs postgres
# Restart infrastructure
./deploy.sh infra down
./deploy.sh infra up
```
### API returns 403 Forbidden
```bash
# Check ALLOWED_IPS configuration
grep ALLOWED_IPS .env
# Verify caller's IP is in the list
# Update .env and restart:
./deploy.sh restart
```
### API returns 401 Unauthorized
```bash
# Verify MPC_API_KEY matches between:
# 1. This system's .env
# 2. Backend mpc-service configuration
# Check API key
grep MPC_API_KEY .env
# Restart after updating
./deploy.sh restart
```
### Keygen or signing fails
```bash
# Check all server parties are healthy
./deploy.sh health
# View server party logs
./deploy.sh logs server-party-1
./deploy.sh logs server-party-2
./deploy.sh logs server-party-3
# Check message router
./deploy.sh logs message-router
# Restart MPC services
./deploy.sh mpc restart
```
### Lost master encryption key
**CRITICAL**: If `CRYPTO_MASTER_KEY` is lost, encrypted shares cannot be recovered!
Prevention:
- Back up the key immediately after generation
- Store it in multiple secure locations
- Use an enterprise key management system in production
## Production Deployment
### Pre-Deployment Checklist
- [ ] Generate all secure keys and passwords
- [ ] Back up `CRYPTO_MASTER_KEY` to secure locations
- [ ] Configure `ALLOWED_IPS` for actual backend server
- [ ] Sync `MPC_API_KEY` with backend mpc-service
- [ ] Set up database backups
- [ ] Configure log aggregation
- [ ] Set up monitoring and alerts
- [ ] Document recovery procedures
- [ ] Test disaster recovery
### Deployment Steps
**Step 1: Prepare Environment**
```bash
# On MPC server
git clone <repo> /opt/rwadurian
cd /opt/rwadurian/backend/mpc-system
# Configure environment
cp .env.example .env
nano .env # Set all required values
# Generate and backup keys
openssl rand -hex 32 > master_key.txt
# Copy to secure storage, then delete:
# rm master_key.txt
```
**Step 2: Deploy Services**
```bash
# Build images
./deploy.sh build
# Start services
./deploy.sh up
# Verify all healthy
./deploy.sh health
```
**Step 3: Configure Firewall**
```bash
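# Deny all other inbound traffic by default
sudo ufw default deny incoming
# Keep SSH reachable so enabling the firewall does not lock you out
sudo ufw allow ssh
# Allow backend server to access MPC ports
sudo ufw allow from <BACKEND_IP> to any port 4000 proto tcp
sudo ufw allow from <BACKEND_IP> to any port 8081 proto tcp
sudo ufw allow from <BACKEND_IP> to any port 8082 proto tcp
sudo ufw allow from <BACKEND_IP> to any port 8083 proto tcp
sudo ufw enable
# Note: ports published by Docker can bypass ufw, because Docker manages its own
# iptables rules. Verify from another host that the ports are actually closed.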
```
**Step 4: Test Integration**
```bash
# From backend server, test API access
curl -H "X-API-Key: YOUR_MPC_API_KEY" \
http://<MPC_SERVER_IP>:4000/health
```
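When the integration test fails, the HTTP status code usually points at the cause described in the Troubleshooting section. The helper below is a small sketch that maps a status code to the relevant check; `explain_status` is a hypothetical name, and `MPC_SERVER_IP` in the commented usage is a placeholder variable you would set yourself.

```bash
# explain_status CODE: map an HTTP status code to the likely cause.
explain_status() {
    case $1 in
        200) echo "OK: API reachable and request accepted" ;;
        401) echo "Unauthorized: check that MPC_API_KEY matches on both sides" ;;
        403) echo "Forbidden: check that the caller's IP is in ALLOWED_IPS" ;;
        000) echo "No response: check firewall rules and that services are up" ;;
        *)   echo "Unexpected status $1: check ./deploy.sh logs" ;;
    esac
}

# Usage from the backend server (uncomment and set the variables first):
# code=$(curl -s -o /dev/null -w '%{http_code}' \
#     -H "X-API-Key: $MPC_API_KEY" "http://$MPC_SERVER_IP:4000/health")
# explain_status "$code"
```

curl reports `000` through `%{http_code}` when no HTTP response was received, which typically means a connection or firewall problem rather than an authentication one.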
### Monitoring
Monitor these metrics:
- Service health status
- API request rate and latency
- Failed authentication attempts
- Database connection pool usage
- RabbitMQ queue depths
- Key generation/signing success rates
### Backup Strategy
```bash
# Database backup (daily); -T disables TTY allocation so the dump is written cleanly
docker compose exec -T postgres pg_dump -U mpc_user mpc_system > backup_$(date +%Y%m%d).sql
# Configuration backup
tar -czf config_backup_$(date +%Y%m%d).tar.gz .env kong.yml
# Encryption key backup (secure storage only!)
```
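Daily dumps accumulate, so some form of retention is usually needed. The sketch below deletes dump files older than a given number of days; `prune_backups` is a hypothetical helper, and the retention period is an assumption you should set to match your own recovery requirements.

```bash
# prune_backups DIR DAYS: delete backup_*.sql files directly inside DIR that
# were last modified more than DAYS days ago. Other files are left alone.
prune_backups() {
    find "$1" -maxdepth 1 -type f -name 'backup_*.sql' -mtime +"$2" -exec rm -f {} +
}
```

For example, `prune_backups /path/to/backups 30` would keep roughly the last month of dumps. Copy backups off the MPC server before pruning if they are your only copy.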
### Disaster Recovery
1. **Service Failure**: Restart affected service using `./deploy.sh restart`
2. **Database Corruption**: Restore from latest backup
3. **Key Loss**: If `CRYPTO_MASTER_KEY` is lost, all encrypted shares are unrecoverable
4. **Full System Recovery**: Redeploy from backups, restore database
### Performance Tuning
```yaml
# docker-compose.yml - adjust resources
services:
  session-coordinator:
    deploy:
      resources:
        limits:
          cpus: '2'
          memory: 2G
```
## API Reference
### Account Service API (Port 4000)
```bash
# Health check
curl http://localhost:4000/health
# Create account (keygen)
curl -X POST http://localhost:4000/api/v1/accounts \
-H "X-API-Key: YOUR_MPC_API_KEY" \
-H "Content-Type: application/json" \
-d '{"user_id": "user123"}'
# Sign transaction (replace {account_id} with the real account ID)
curl -X POST http://localhost:4000/api/v1/accounts/{account_id}/sign \
-H "X-API-Key: YOUR_MPC_API_KEY" \
-H "Content-Type: application/json" \
-d '{"message": "tx_hash"}'
```
### Server Party API (Port 8083)
```bash
# Generate user share
curl -X POST http://localhost:8083/api/v1/shares/generate \
-H "X-API-Key: YOUR_MPC_API_KEY" \
-H "Content-Type: application/json" \
-d '{"session_id": "session123"}'
```
## Getting Help
- Check logs: `./deploy.sh logs`
- Health check: `./deploy.sh health`
- View commands: `./deploy.sh help`
- Review `.env.example` for configuration options
## License
Copyright © 2024 RWADurian. All rights reserved.