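MPC Distributed Signing System - Deployment Guide

1. Deployment Architecture

1.1 Minimal Deployment (Development/Testing)

A 2-of-3 scheme deployed across 4 servers: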

┌─────────────────────────────────────────────────────────────────────┐
│             Server 1 - Coordinator (coordination node)              │
│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐     │
│  │ Session         │  │ Message         │  │ Account         │     │
│  │ Coordinator     │  │ Router          │  │ Service         │     │
│  │ :50051/:8080    │  │ :50052/:8081    │  │ :50054/:8083    │     │
│  └─────────────────┘  └─────────────────┘  └─────────────────┘     │
│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐     │
│  │ PostgreSQL      │  │ Redis           │  │ RabbitMQ        │     │
│  │ :5432           │  │ :6379           │  │ :5672           │     │
│  └─────────────────┘  └─────────────────┘  └─────────────────┘     │
└─────────────────────────────────────────────────────────────────────┘

┌──────────────────┐  ┌──────────────────┐  ┌──────────────────┐
│   Server 2       │  │   Server 3       │  │   Server 4       │
│   Server Party 1 │  │   Server Party 2 │  │   Server Party 3 │
│   :50053/:8082   │  │   :50055/:8084   │  │   :50056/:8085   │
└──────────────────┘  └──────────────────┘  └──────────────────┘
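
The port layout in the diagram above can be captured in a small lookup helper, which may be convenient when scripting health checks or firewall rules. This is only a minimal sketch: the function name `service_ports` is hypothetical, and the port numbers are copied from the diagram.

```bash
# Print "<grpc_port> <http_port>" for a service in the minimal deployment.
# Port numbers mirror the diagram above; service_ports is an illustrative name.
service_ports() {
  case "$1" in
    session-coordinator) echo "50051 8080" ;;
    message-router)      echo "50052 8081" ;;
    server-party-1)      echo "50053 8082" ;;
    account-service)     echo "50054 8083" ;;
    server-party-2)      echo "50055 8084" ;;
    server-party-3)      echo "50056 8085" ;;
    *) echo "unknown service: $1" >&2; return 1 ;;
  esac
}
```

For example, `service_ports message-router` prints `50052 8081`.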

1.2 Production Deployment (High Availability)

                    ┌─────────────────────────────────────┐
                    │         Load Balancer (Nginx)       │
                    │         (SSL Termination)           │
                    └─────────────────┬───────────────────┘
                                      │
              ┌───────────────────────┼───────────────────────┐
              │                       │                       │
              ▼                       ▼                       ▼
┌─────────────────────┐  ┌─────────────────────┐  ┌─────────────────────┐
│  Coordinator Pod 1  │  │  Coordinator Pod 2  │  │  Coordinator Pod 3  │
│  - Session Coord.   │  │  - Session Coord.   │  │  - Session Coord.   │
│  - Message Router   │  │  - Message Router   │  │  - Message Router   │
│  - Account Service  │  │  - Account Service  │  │  - Account Service  │
└──────────┬──────────┘  └──────────┬──────────┘  └──────────┬──────────┘
           │                        │                        │
           └────────────────────────┼────────────────────────┘
                                    │
              ┌─────────────────────┼─────────────────────┐
              │                     │                     │
              ▼                     ▼                     ▼
     ┌─────────────────┐   ┌─────────────────┐   ┌─────────────────┐
     │ Server Party 1  │   │ Server Party 2  │   │ Server Party 3  │
     │ (separate host) │   │ (separate host) │   │ (separate host) │
     └─────────────────┘   └─────────────────┘   └─────────────────┘
              │                     │                     │
              └─────────────────────┼─────────────────────┘
                                    │
                    ┌───────────────┴───────────────┐
                    │                               │
                    ▼                               ▼
           ┌─────────────────┐             ┌─────────────────┐
           │ PostgreSQL      │             │ Redis Cluster   │
           │ (Primary/Replica)│            │                 │
           └─────────────────┘             └─────────────────┘

2. Docker Compose Deployment

2.1 Configuration File

# docker-compose.yml
version: '3.8'

services:
  # ============================================
  # Infrastructure
  # ============================================
  postgres:
    image: postgres:14-alpine
    container_name: mpc-postgres
    ports:
      - "5432:5432"
    environment:
      POSTGRES_USER: mpc_user
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-mpc_secret_password}
      POSTGRES_DB: mpc_system
    volumes:
      - postgres-data:/var/lib/postgresql/data
      - ./migrations:/docker-entrypoint-initdb.d
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U mpc_user"]
      interval: 10s
      timeout: 5s
      retries: 5

  redis:
    image: redis:7-alpine
    container_name: mpc-redis
    ports:
      - "6379:6379"
    command: redis-server --appendonly yes
    volumes:
      - redis-data:/data
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 5

  # ============================================
  # Core services
  # ============================================
  session-coordinator:
    build:
      context: .
      dockerfile: services/session-coordinator/Dockerfile
    container_name: mpc-session-coordinator
    ports:
      - "50051:50051"
      - "8080:8080"
    environment:
      MPC_DATABASE_HOST: postgres
      MPC_DATABASE_PORT: 5432
      MPC_DATABASE_USER: mpc_user
      MPC_DATABASE_PASSWORD: ${POSTGRES_PASSWORD:-mpc_secret_password}
      MPC_DATABASE_DBNAME: mpc_system
      MPC_REDIS_HOST: redis
      MPC_REDIS_PORT: 6379
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "wget", "-q", "--spider", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 3

  message-router:
    build:
      context: .
      dockerfile: services/message-router/Dockerfile
    container_name: mpc-message-router
    ports:
      - "50052:50051"
      - "8081:8080"
    environment:
      MPC_REDIS_HOST: redis
      MPC_REDIS_PORT: 6379
    depends_on:
      redis:
        condition: service_healthy

  # ============================================
  # Server Parties (3 instances)
  # ============================================
  server-party-1:
    build:
      context: .
      dockerfile: services/server-party/Dockerfile
    container_name: mpc-server-party-1
    ports:
      - "50053:50051"
      - "8082:8080"
    environment:
      SESSION_COORDINATOR_ADDR: session-coordinator:50051
      MESSAGE_ROUTER_ADDR: message-router:50051
      MPC_DATABASE_HOST: postgres
      MPC_CRYPTO_MASTER_KEY: ${CRYPTO_MASTER_KEY}
      PARTY_ID: server-party-1
    depends_on:
      - session-coordinator
      - message-router

  server-party-2:
    build:
      context: .
      dockerfile: services/server-party/Dockerfile
    container_name: mpc-server-party-2
    ports:
      - "50055:50051"
      - "8084:8080"
    environment:
      SESSION_COORDINATOR_ADDR: session-coordinator:50051
      MESSAGE_ROUTER_ADDR: message-router:50051
      MPC_DATABASE_HOST: postgres
      MPC_CRYPTO_MASTER_KEY: ${CRYPTO_MASTER_KEY}
      PARTY_ID: server-party-2
    depends_on:
      - session-coordinator
      - message-router

  server-party-3:
    build:
      context: .
      dockerfile: services/server-party/Dockerfile
    container_name: mpc-server-party-3
    ports:
      - "50056:50051"
      - "8085:8080"
    environment:
      SESSION_COORDINATOR_ADDR: session-coordinator:50051
      MESSAGE_ROUTER_ADDR: message-router:50051
      MPC_DATABASE_HOST: postgres
      MPC_CRYPTO_MASTER_KEY: ${CRYPTO_MASTER_KEY}
      PARTY_ID: server-party-3
    depends_on:
      - session-coordinator
      - message-router

  account-service:
    build:
      context: .
      dockerfile: services/account/Dockerfile
    container_name: mpc-account-service
    ports:
      - "50054:50051"
      - "8083:8080"
    environment:
      MPC_DATABASE_HOST: postgres
      SESSION_COORDINATOR_ADDR: session-coordinator:50051
    depends_on:
      - session-coordinator
      - postgres

volumes:
  postgres-data:
  redis-data:

networks:
  default:
    name: mpc-network

2.2 Environment Variable File

# .env
# Database
POSTGRES_PASSWORD=your_secure_password_here

# Encryption master key (64 hex characters, 256 bits)
CRYPTO_MASTER_KEY=0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef

# Service configuration
LOG_LEVEL=info
ENVIRONMENT=production
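
The master key value shown in the `.env` file is a placeholder and should be replaced with a random key before any real use. A minimal sketch of generating and validating such a key follows; the helper names `is_valid_master_key` and `generate_master_key` are hypothetical, and the sketch assumes the `openssl` CLI is installed.

```bash
# Return 0 if the argument is exactly 64 hexadecimal characters (256 bits).
# is_valid_master_key is an illustrative helper, not part of the project.
is_valid_master_key() {
  [ "${#1}" -eq 64 ] || return 1
  case "$1" in
    *[!0-9a-fA-F]*) return 1 ;;
  esac
  return 0
}

# Generate a fresh random 256-bit key as 64 hex characters.
# Requires the openssl CLI.
generate_master_key() {
  openssl rand -hex 32
}

# Usage:
#   key=$(generate_master_key)
#   is_valid_master_key "$key" && echo "CRYPTO_MASTER_KEY=$key" >> .env
```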

2.3 Starting the Services

# Build the images
docker-compose build

# Start all services
docker-compose up -d

# Check status
docker-compose ps

# Follow the logs
docker-compose logs -f

# Stop the services
docker-compose down
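
`docker-compose up -d` returns before the services are necessarily ready to accept requests. A small retry helper can bridge that gap in scripts. The sketch below is one possible approach; `wait_until` is a hypothetical name, and the example URL assumes the Session Coordinator health endpoint from the Compose file above.

```bash
# Retry a command until it succeeds or the attempts run out.
# Usage: wait_until <attempts> <delay_seconds> <command> [args...]
wait_until() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example: poll the Session Coordinator health endpoint up to 30 times, 2 s apart.
#   wait_until 30 2 curl -fsS -o /dev/null http://localhost:8080/health
```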

3. Kubernetes Deployment

3.1 Namespace

# k8s/namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: mpc-system

3.2 ConfigMap

# k8s/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: mpc-config
  namespace: mpc-system
data:
  LOG_LEVEL: "info"
  ENVIRONMENT: "production"
  DATABASE_HOST: "postgres-service"
  DATABASE_PORT: "5432"
  DATABASE_NAME: "mpc_system"
  REDIS_HOST: "redis-service"
  REDIS_PORT: "6379"

3.3 Secret

# k8s/secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: mpc-secrets
  namespace: mpc-system
type: Opaque
data:
  DATABASE_PASSWORD: <base64-encoded-password>
  CRYPTO_MASTER_KEY: <base64-encoded-key>
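
Values under `data:` in a Secret must be base64-encoded. The sketch below shows one way to produce them. `encode_secret_value` is a hypothetical helper name, and the literal values in the comments are placeholders.

```bash
# Base64-encode a value for the `data:` section of a Secret.
# printf avoids the trailing newline that `echo` would add to the encoded value.
encode_secret_value() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Example (placeholder value):
#   encode_secret_value 'your_secure_password_here'
#
# Alternatively, let kubectl do the encoding:
#   kubectl create secret generic mpc-secrets -n mpc-system \
#     --from-literal=DATABASE_PASSWORD='your_secure_password_here' \
#     --from-literal=CRYPTO_MASTER_KEY="$CRYPTO_MASTER_KEY"
```

Creating the Secret with `kubectl create secret` keeps the real values out of the manifest file.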

3.4 Session Coordinator Deployment

# k8s/session-coordinator.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: session-coordinator
  namespace: mpc-system
spec:
  replicas: 2
  selector:
    matchLabels:
      app: session-coordinator
  template:
    metadata:
      labels:
        app: session-coordinator
    spec:
      containers:
      - name: session-coordinator
        image: mpc-system/session-coordinator:latest
        ports:
        - containerPort: 50051
          name: grpc
        - containerPort: 8080
          name: http
        envFrom:
        - configMapRef:
            name: mpc-config
        - secretRef:
            name: mpc-secrets
        resources:
          requests:
            memory: "256Mi"
            cpu: "250m"
          limits:
            memory: "512Mi"
            cpu: "500m"
        livenessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /health
            port: 8080
          initialDelaySeconds: 5
          periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: session-coordinator-service
  namespace: mpc-system
spec:
  selector:
    app: session-coordinator
  ports:
  - name: grpc
    port: 50051
    targetPort: 50051
  - name: http
    port: 8080
    targetPort: 8080

3.5 Server Party StatefulSet

# k8s/server-party.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: server-party
  namespace: mpc-system
spec:
  serviceName: server-party
  replicas: 3
  selector:
    matchLabels:
      app: server-party
  template:
    metadata:
      labels:
        app: server-party
    spec:
      containers:
      - name: server-party
        image: mpc-system/server-party:latest
        ports:
        - containerPort: 50051
          name: grpc
        - containerPort: 8080
          name: http
        env:
        - name: PARTY_ID
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: SESSION_COORDINATOR_ADDR
          value: "session-coordinator-service:50051"
        - name: MESSAGE_ROUTER_ADDR
          value: "message-router-service:50051"
        envFrom:
        - configMapRef:
            name: mpc-config
        - secretRef:
            name: mpc-secrets
        volumeMounts:
        - name: keyshare-storage
          mountPath: /data/keyshares
        resources:
          requests:
            memory: "512Mi"
            cpu: "500m"
          limits:
            memory: "1Gi"
            cpu: "1000m"
  volumeClaimTemplates:
  - metadata:
      name: keyshare-storage
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 10Gi
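
Because `PARTY_ID` is taken from `metadata.name`, the pods of this StatefulSet identify themselves as `server-party-0`, `server-party-1`, and `server-party-2`. The Docker Compose file uses `server-party-1` through `server-party-3` instead. If the two environments need to agree on IDs, a mapping along these lines could be used; both helper names are hypothetical and shown for illustration only.

```bash
# Extract the ordinal from a StatefulSet pod name such as "server-party-2".
party_ordinal() {
  echo "${1##*-}"
}

# Map a Kubernetes PARTY_ID to the 1-based naming used in docker-compose.yml.
# compose_party_id is a hypothetical helper for illustration only.
compose_party_id() {
  ordinal=$(party_ordinal "$1")
  echo "server-party-$((ordinal + 1))"
}
```

For example, `compose_party_id server-party-0` prints `server-party-1`.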

3.6 Ingress

# k8s/ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: mpc-ingress
  namespace: mpc-system
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
spec:
  ingressClassName: nginx
  tls:
  - hosts:
    - mpc-api.example.com
    secretName: mpc-tls
  rules:
  - host: mpc-api.example.com
    http:
      paths:
      - path: /api/v1/sessions
        pathType: Prefix
        backend:
          service:
            name: session-coordinator-service
            port:
              number: 8080
      - path: /api/v1/accounts
        pathType: Prefix
        backend:
          service:
            name: account-service
            port:
              number: 8080

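3.7 Deployment Commands

# Apply all manifests. kubectl processes the files in k8s/ in alphabetical order,
# so create the namespace first to avoid "namespace not found" errors.
kubectl apply -f k8s/namespace.yaml
kubectl apply -f k8s/

# Check deployment status
kubectl get pods -n mpc-system

# View logs
kubectl logs -f deployment/session-coordinator -n mpc-system

# Scale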
kubectl scale statefulset server-party --replicas=5 -n mpc-system

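4. Security Configuration

4.1 TLS Configuration

# Generate a self-signed certificate (development only)
openssl req -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem -days 365 -nodes

# In production, use Let's Encrypt or an enterprise CA

4.2 Network Policy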

# k8s/network-policy.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: server-party-policy
  namespace: mpc-system
spec:
  podSelector:
    matchLabels:
      app: server-party
  policyTypes:
  - Ingress
  - Egress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: message-router
    - podSelector:
        matchLabels:
          app: session-coordinator
    ports:
    - protocol: TCP
      port: 50051
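  egress:
  - to:
    - podSelector:
        matchLabels:
          app: message-router
    - podSelector:
        matchLabels:
          app: session-coordinator
    - podSelector:
        matchLabels:
          app: postgres
  # DNS must be allowed explicitly once Egress is restricted,
  # otherwise service names such as message-router-service cannot be resolved
  - to:
    - namespaceSelector: {}
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53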

4.3 Key Management

For production, a managed key service is recommended:

  • AWS KMS
  • HashiCorp Vault
  • Azure Key Vault
  • GCP Cloud KMS

# Vault example
vault kv put secret/mpc/master-key value=<key>

# Read the key when starting the application
export CRYPTO_MASTER_KEY=$(vault kv get -field=value secret/mpc/master-key)

5. Monitoring and Logging

5.1 Prometheus Metrics

# k8s/servicemonitor.yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: mpc-services
  namespace: mpc-system
spec:
  selector:
    matchLabels:
      monitoring: enabled
  endpoints:
  - port: http
    path: /metrics
    interval: 30s

5.2 Grafana Dashboard

Key metrics:

  • Session creation/completion rate
  • TSS protocol latency
  • Error rate
  • Active connections

5.3 Log Aggregation

# Fluentd configuration
<source>
  @type tail
  path /var/log/containers/mpc-*.log
  pos_file /var/log/fluentd-mpc.log.pos
  tag mpc.*
  <parse>
    @type json
  </parse>
</source>

<match mpc.**>
  @type elasticsearch
  host elasticsearch
  port 9200
  index_name mpc-logs
</match>

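6. Operations

6.1 Health Checks

# Check the health of all services
curl http://localhost:8080/health  # Session Coordinator
curl http://localhost:8081/health  # Message Router
curl http://localhost:8082/health  # Server Party 1
curl http://localhost:8083/health  # Account Service
curl http://localhost:8084/health  # Server Party 2
curl http://localhost:8085/health  # Server Party 3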
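
To probe every endpoint in one pass and get a non-zero exit code when something is down, the checks can be wrapped in a loop. This is a sketch under the assumption that the host ports match the Docker Compose file in section 2.1 (8080 through 8085); `health_urls` and `check_all` are hypothetical names.

```bash
# HTTP health endpoints of the Docker Compose deployment (host ports from section 2.1).
health_urls() {
  for port in 8080 8081 8082 8083 8084 8085; do
    echo "http://localhost:${port}/health"
  done
}

# Probe every endpoint and report the result; returns non-zero if any check fails.
check_all() {
  failed=0
  for url in $(health_urls); do
    if curl -fsS -o /dev/null --max-time 5 "$url"; then
      echo "OK   $url"
    else
      echo "FAIL $url"
      failed=1
    fi
  done
  return "$failed"
}
```

Calling `check_all` from a cron job or CI step gives a single pass/fail signal for the whole stack.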

6.2 Database Backup

# PostgreSQL backup
pg_dump -h localhost -U mpc_user mpc_system > backup_$(date +%Y%m%d).sql

# Restore
psql -h localhost -U mpc_user mpc_system < backup_20240115.sql
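
For scheduled backups it can help to compress the dump and to stop if `pg_dump` itself fails. The following is a minimal sketch, not a complete backup strategy. The function names are hypothetical, and it assumes the password is supplied through `PGPASSWORD` or `~/.pgpass`.

```bash
# Build a dated backup file name, e.g. backup_20240115.sql.gz.
# An explicit date can be passed as the first argument; otherwise today's date is used.
backup_name() {
  echo "backup_${1:-$(date +%Y%m%d)}.sql.gz"
}

# Dump to a plain .sql file first so that a pg_dump failure is detected,
# then compress it. Assumes credentials come from PGPASSWORD or ~/.pgpass.
backup_db() {
  out=$(backup_name)
  pg_dump -h localhost -U mpc_user -f "${out%.gz}" mpc_system || return 1
  gzip -f "${out%.gz}"
}

# Restore from a compressed backup:
#   gunzip -c backup_20240115.sql.gz | psql -h localhost -U mpc_user mpc_system
```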

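6.3 Key Rotation

# 1. Generate a new master key
NEW_KEY=$(openssl rand -hex 32)

# 2. Roll the update out to each Party node
kubectl set env statefulset/server-party CRYPTO_MASTER_KEY="$NEW_KEY" -n mpc-system

# 3. Re-encrypt the existing key shares (requires a custom migration script)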
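
Note that `kubectl set env` stores the key as a plain value in the StatefulSet spec, whereas the manifests in section 3 load `CRYPTO_MASTER_KEY` from the `mpc-secrets` Secret. An alternative is to update the Secret and restart the pods. The sketch below illustrates that variant under those assumptions. The function names are hypothetical, and re-encrypting the existing key shares still depends on the custom migration mentioned in step 3.

```bash
# Build a merge-patch body that replaces CRYPTO_MASTER_KEY in the mpc-secrets Secret.
# stringData lets the API server handle the base64 encoding.
master_key_patch() {
  printf '{"stringData":{"CRYPTO_MASTER_KEY":"%s"}}' "$1"
}

# Rotate: update the Secret, then restart the parties so they pick up the new value.
# Key shares encrypted under the old key still need the custom migration from step 3.
rotate_master_key() {
  new_key=$(openssl rand -hex 32) || return 1
  kubectl patch secret mpc-secrets -n mpc-system \
    -p "$(master_key_patch "$new_key")" || return 1
  kubectl rollout restart statefulset/server-party -n mpc-system
}
```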

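7. Troubleshooting

7.1 Common Issues

| Issue | Possible cause | Resolution |
| --- | --- | --- |
| Connection timeout | Network/firewall | Check that the required ports are open |
| TSS protocol failure | A participant is offline | Check the status of every Party |
| Signing failure | Corrupted key share | Restore from backup |
| Database connection failure | Wrong credentials | Check the environment variables |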

7.2 Debugging Commands

# Check network connectivity
kubectl exec -it pod/session-coordinator-xxx -n mpc-system -- nc -zv message-router-service 50051

# View detailed logs
kubectl logs -f pod/server-party-0 -n mpc-system --tail=100

# Open a shell in the container for debugging
kubectl exec -it pod/session-coordinator-xxx -n mpc-system -- /bin/sh
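
If `nc` is not available in an image, a TCP check can be approximated with bash's `/dev/tcp` pseudo-device. This sketch assumes that `bash` and `timeout` exist in the container, which is often not the case for minimal images. `port_open` is a hypothetical helper name.

```bash
# Return 0 if a TCP connection to host:port succeeds within 2 seconds.
# Uses bash's /dev/tcp pseudo-device, so bash and timeout must be installed.
port_open() {
  timeout 2 bash -c ': </dev/tcp/"$0"/"$1"' "$1" "$2" 2>/dev/null
}

# Usage:
#   port_open message-router-service 50051 && echo "reachable" || echo "unreachable"
```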