feat(mpc-system): implement Kubernetes-based dynamic party pool architecture
Major architectural refactoring to align with international MPC standards and enable horizontal scalability. ## Core Changes ### 1. DeviceInfo Made Optional - Modified DeviceInfo.Validate() to allow empty device information - Aligns with international MPC protocol standards - MPC protocol layer should not mandate device-specific metadata - Location: services/session-coordinator/domain/entities/device_info.go ### 2. Kubernetes Party Discovery Service - Created infrastructure/k8s/party_discovery.go (220 lines) - Implements dynamic service discovery via Kubernetes API - Supports in-cluster config and kubeconfig fallback - Auto-refreshes party list every 30s (configurable) - Health-aware selection (only ready pods) - Uses pod names as unique party IDs ### 3. Party Pool Architecture - Defined PartyPoolPort interface for abstraction - CreateSessionUseCase now supports automatic party selection - When no participants specified, selects from K8s pool - Graceful fallback to dynamic join mode if discovery fails - Location: services/session-coordinator/application/ports/output/party_pool_port.go ### 4. Integration Updates - Modified CreateSessionUseCase to inject partyPool - Updated session-coordinator main.go to initialize K8s discovery - gRPC handler already supports optional participants - Added k8s client-go dependencies (v0.29.0) to go.mod ## Kubernetes Deployment ### New K8s Manifests - k8s/namespace.yaml: mpc-system namespace - k8s/configmap.yaml: shared configuration - k8s/secrets-example.yaml: secrets template - k8s/server-party-deployment.yaml: scalable party pool (3+ replicas) - k8s/session-coordinator-deployment.yaml: coordinator with RBAC - k8s/README.md: comprehensive deployment guide ### RBAC Configuration - ServiceAccount for session-coordinator - Role with pods/services get/list/watch permissions - RoleBinding to grant discovery capabilities ## Key Features ✅ Dynamic service discovery via Kubernetes API ✅ Horizontal scaling (kubectl scale deployment) ✅ No hardcoded party IDs ✅ Health-aware party selection ✅ Graceful degradation when K8s unavailable ✅ MPC protocol compliance (optional DeviceInfo) ## Deployment Modes ### Docker Compose (Existing) - Fixed 3 parties (server-party-1/2/3) - Quick setup for development - Backward compatible ### Kubernetes (New) - Dynamic party pool - Auto-discovery and scaling - Production-ready ## Documentation - Updated main README.md with deployment options - Added architecture diagram showing scalable party pool - Created comprehensive k8s/README.md with: - Quick start guide - Scaling instructions - Troubleshooting section - RBAC configuration details 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
This commit is contained in:
parent
8e386c7683
commit
cf534ec178
|
|
@ -30,7 +30,8 @@
|
||||||
"Bash(wsl.exe -- bash -c 'cd ~/rwadurian/backend/mpc-system && docker compose logs server-party-1 | grep -E \"\"Starting|gRPC|port\"\" | tail -10')",
|
"Bash(wsl.exe -- bash -c 'cd ~/rwadurian/backend/mpc-system && docker compose logs server-party-1 | grep -E \"\"Starting|gRPC|port\"\" | tail -10')",
|
||||||
"Bash(wsl.exe -- bash -c 'find ~/rwadurian/backend/mpc-system/services/server-party -name \"\"main.go\"\" -path \"\"*/cmd/server/*\"\"')",
|
"Bash(wsl.exe -- bash -c 'find ~/rwadurian/backend/mpc-system/services/server-party -name \"\"main.go\"\" -path \"\"*/cmd/server/*\"\"')",
|
||||||
"Bash(wsl.exe -- bash -c 'cat ~/rwadurian/backend/mpc-system/services/server-party/cmd/server/main.go | grep -E \"\"grpc|GRPC|gRPC|50051\"\" | head -20')",
|
"Bash(wsl.exe -- bash -c 'cat ~/rwadurian/backend/mpc-system/services/server-party/cmd/server/main.go | grep -E \"\"grpc|GRPC|gRPC|50051\"\" | head -20')",
|
||||||
"Bash(wsl.exe -- bash:*)"
|
"Bash(wsl.exe -- bash:*)",
|
||||||
|
"Bash(dir:*)"
|
||||||
],
|
],
|
||||||
"deny": [],
|
"deny": [],
|
||||||
"ask": []
|
"ask": []
|
||||||
|
|
|
||||||
|
|
@ -17,18 +17,20 @@ Multi-Party Computation (MPC) system for secure threshold signature scheme (TSS)
|
||||||
## Overview
|
## Overview
|
||||||
|
|
||||||
The MPC system implements a 2-of-3 threshold signature scheme where:
|
The MPC system implements a 2-of-3 threshold signature scheme where:
|
||||||
- 3 server parties hold key shares
|
- Server parties from a dynamically scalable pool hold key shares
|
||||||
- At least 2 parties are required to generate signatures
|
- At least 2 parties are required to generate signatures (configurable threshold)
|
||||||
- User shares are generated dynamically and returned to the calling service
|
- User shares are generated dynamically and returned to the calling service
|
||||||
- All shares are encrypted using AES-256-GCM
|
- All shares are encrypted using AES-256-GCM
|
||||||
|
|
||||||
### Key Features
|
### Key Features
|
||||||
|
|
||||||
- **Threshold Cryptography**: 2-of-3 TSS for enhanced security
|
- **Threshold Cryptography**: Configurable N-of-M TSS for enhanced security
|
||||||
|
- **Dynamic Party Pool**: Kubernetes-based service discovery for automatic party scaling
|
||||||
- **Distributed Architecture**: Services communicate via gRPC and WebSocket
|
- **Distributed Architecture**: Services communicate via gRPC and WebSocket
|
||||||
- **Secure Storage**: AES-256-GCM encryption for all stored shares
|
- **Secure Storage**: AES-256-GCM encryption for all stored shares
|
||||||
- **API Authentication**: API key and IP-based access control
|
- **API Authentication**: API key and IP-based access control
|
||||||
- **Session Management**: Coordinated multi-party computation sessions
|
- **Session Management**: Coordinated multi-party computation sessions
|
||||||
|
- **MPC Protocol Compliance**: DeviceInfo optional, aligning with international MPC standards
|
||||||
|
|
||||||
## Architecture
|
## Architecture
|
||||||
|
|
||||||
|
|
@ -51,11 +53,14 @@ The MPC system implements a 2-of-3 threshold signature scheme where:
|
||||||
│ │ │ │
|
│ │ │ │
|
||||||
│ ▼ ▼ │
|
│ ▼ ▼ │
|
||||||
│ ┌────────────────────────────────────────────┐ │
|
│ ┌────────────────────────────────────────────┐ │
|
||||||
│ │ Server Parties (3 instances) │ │
|
│ │ Server Party Pool (Dynamically Scalable) │ │
|
||||||
│ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ │
|
│ │ ┌──────────┐ ┌──────────┐ ┌──────────┐ │ │
|
||||||
│ │ │ Party 1 │ │ Party 2 │ │ Party 3 │ │ │
|
│ │ │ Party 1 │ │ Party 2 │ │ Party 3 │ │ K8s Discovery │
|
||||||
│ │ │ (TSS) │ │ (TSS) │ │ (TSS) │ │ │
|
│ │ │ (TSS) │ │ (TSS) │ │ (TSS) │ │ Auto-selected │
|
||||||
│ │ └──────────┘ └──────────┘ └──────────┘ │ │
|
│ │ └──────────┘ └──────────┘ └──────────┘ │ from pool │
|
||||||
|
│ │ ┌──────────┐ ... can scale up/down │ │
|
||||||
|
│ │ │ Party N │ │ │
|
||||||
|
│ │ └──────────┘ │ │
|
||||||
│ └────────────────────────────────────────────┘ │
|
│ └────────────────────────────────────────────┘ │
|
||||||
│ │
|
│ │
|
||||||
│ ┌────────────────────────────────────────────┐ │
|
│ ┌────────────────────────────────────────────┐ │
|
||||||
|
|
@ -72,7 +77,24 @@ The MPC system implements a 2-of-3 threshold signature scheme where:
|
||||||
└──────────────────────────┘
|
└──────────────────────────┘
|
||||||
```
|
```
|
||||||
|
|
||||||
## Quick Start
|
## Deployment Options
|
||||||
|
|
||||||
|
This system supports two deployment modes:
|
||||||
|
|
||||||
|
### Option 1: Docker Compose (Development/Simple Deployment)
|
||||||
|
|
||||||
|
- Quick setup for development or simple production environments
|
||||||
|
- Fixed 3 server parties (hardcoded IDs)
|
||||||
|
- See instructions below in "Quick Start"
|
||||||
|
|
||||||
|
### Option 2: Kubernetes (Production/Scalable Deployment)
|
||||||
|
|
||||||
|
- Dynamic party pool with service discovery
|
||||||
|
- Horizontally scalable server parties
|
||||||
|
- Recommended for production environments
|
||||||
|
- See `k8s/README.md` for detailed instructions
|
||||||
|
|
||||||
|
## Quick Start (Docker Compose)
|
||||||
|
|
||||||
### Prerequisites
|
### Prerequisites
|
||||||
|
|
||||||
|
|
@ -248,13 +270,14 @@ Before deploying to production:
|
||||||
|
|
||||||
| Service | Purpose |
|
| Service | Purpose |
|
||||||
|---------|---------|
|
|---------|---------|
|
||||||
| server-party-1 | TSS party 1 (stores server shares) |
|
| server-party-1/2/3 | TSS parties (Docker Compose mode - fixed IDs) |
|
||||||
| server-party-2 | TSS party 2 (stores server shares) |
|
| server-party-pool | TSS party pool (Kubernetes mode - dynamic scaling) |
|
||||||
| server-party-3 | TSS party 3 (stores server shares) |
|
|
||||||
| postgres | Database for session/account data |
|
| postgres | Database for session/account data |
|
||||||
| redis | Cache and temporary data |
|
| redis | Cache and temporary data |
|
||||||
| rabbitmq | Message broker for inter-service communication |
|
| rabbitmq | Message broker for inter-service communication |
|
||||||
|
|
||||||
|
**Note**: In Kubernetes mode, server parties are discovered dynamically using K8s service discovery. Parties can be scaled up/down without service interruption.
|
||||||
|
|
||||||
### Service Dependencies
|
### Service Dependencies
|
||||||
|
|
||||||
```
|
```
|
||||||
|
|
|
||||||
|
|
@ -16,6 +16,9 @@ require (
|
||||||
go.uber.org/zap v1.26.0
|
go.uber.org/zap v1.26.0
|
||||||
golang.org/x/crypto v0.16.0
|
golang.org/x/crypto v0.16.0
|
||||||
google.golang.org/grpc v1.60.0
|
google.golang.org/grpc v1.60.0
|
||||||
|
k8s.io/api v0.29.0
|
||||||
|
k8s.io/apimachinery v0.29.0
|
||||||
|
k8s.io/client-go v0.29.0
|
||||||
)
|
)
|
||||||
|
|
||||||
replace github.com/agl/ed25519 => github.com/agl/ed25519 v0.0.0-20170116200512-5312a6153412
|
replace github.com/agl/ed25519 => github.com/agl/ed25519 v0.0.0-20170116200512-5312a6153412
|
||||||
|
|
|
||||||
|
|
@ -0,0 +1,198 @@
|
||||||
|
# Kubernetes Deployment for MPC System
|
||||||
|
|
||||||
|
This directory contains Kubernetes manifests for deploying the MPC system with dynamic party pool service discovery.
|
||||||
|
|
||||||
|
## Architecture Overview
|
||||||
|
|
||||||
|
The Kubernetes deployment implements a **Party Pool** architecture where:
|
||||||
|
|
||||||
|
- **Server parties are dynamically discovered** via Kubernetes service discovery
|
||||||
|
- **Session coordinator** automatically selects available parties from the pool
|
||||||
|
- **Parties can be scaled** up/down without code changes (just scale the deployment)
|
||||||
|
- **No hardcoded party IDs** - each pod gets a unique name as its party ID
|
||||||
|
|
||||||
|
## Prerequisites
|
||||||
|
|
||||||
|
- Kubernetes cluster (v1.24+)
|
||||||
|
- kubectl configured to access your cluster
|
||||||
|
- Docker images built for all services
|
||||||
|
- PostgreSQL, Redis, and RabbitMQ deployed (see infrastructure/)
|
||||||
|
|
||||||
|
## Quick Start
|
||||||
|
|
||||||
|
### 1. Create namespace
|
||||||
|
|
||||||
|
```bash
|
||||||
|
kubectl apply -f namespace.yaml
|
||||||
|
```
|
||||||
|
|
||||||
|
### 2. Create secrets
|
||||||
|
|
||||||
|
Copy the example secrets file and fill in your actual values:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
cp secrets-example.yaml secrets.yaml
|
||||||
|
# Edit secrets.yaml with your base64-encoded secrets
|
||||||
|
# Generate base64: echo -n "your-secret" | base64
|
||||||
|
kubectl apply -f secrets.yaml
|
||||||
|
```
|
||||||
|
|
||||||
|
### 3. Create ConfigMap
|
||||||
|
|
||||||
|
```bash
|
||||||
|
kubectl apply -f configmap.yaml
|
||||||
|
```
|
||||||
|
|
||||||
|
### 4. Deploy Session Coordinator
|
||||||
|
|
||||||
|
```bash
|
||||||
|
kubectl apply -f session-coordinator-deployment.yaml
|
||||||
|
```
|
||||||
|
|
||||||
|
The session coordinator requires RBAC permissions to discover party pods.
|
||||||
|
|
||||||
|
### 5. Deploy Server Party Pool
|
||||||
|
|
||||||
|
```bash
|
||||||
|
kubectl apply -f server-party-deployment.yaml
|
||||||
|
```
|
||||||
|
|
||||||
|
This creates a deployment with 3 replicas by default. Each pod gets a unique name (e.g., `mpc-server-party-0`, `mpc-server-party-1`, etc.) which serves as its party ID.
|
||||||
|
|
||||||
|
### 6. Deploy other services
|
||||||
|
|
||||||
|
```bash
|
||||||
|
kubectl apply -f message-router-deployment.yaml
|
||||||
|
kubectl apply -f account-service-deployment.yaml
|
||||||
|
kubectl apply -f server-party-api-deployment.yaml
|
||||||
|
```
|
||||||
|
|
||||||
|
## Scaling Server Parties
|
||||||
|
|
||||||
|
To scale the party pool, simply adjust the replica count:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# Scale up to 5 parties
|
||||||
|
kubectl scale deployment mpc-server-party -n mpc-system --replicas=5
|
||||||
|
|
||||||
|
# Scale down to 2 parties
|
||||||
|
kubectl scale deployment mpc-server-party -n mpc-system --replicas=2
|
||||||
|
```
|
||||||
|
|
||||||
|
The session coordinator will automatically discover new parties within 30 seconds (configurable via `MPC_PARTY_DISCOVERY_INTERVAL`).
|
||||||
|
|
||||||
|
## Service Discovery Configuration
|
||||||
|
|
||||||
|
The session coordinator uses environment variables to configure party discovery:
|
||||||
|
|
||||||
|
- `K8S_NAMESPACE`: Namespace to search for parties (auto-detected from pod metadata)
|
||||||
|
- `MPC_PARTY_SERVICE_NAME`: Service name to discover (`mpc-server-party`)
|
||||||
|
- `MPC_PARTY_LABEL_SELECTOR`: Label selector (`app=mpc-server-party`)
|
||||||
|
- `MPC_PARTY_GRPC_PORT`: gRPC port for parties (`50051`)
|
||||||
|
- `MPC_PARTY_DISCOVERY_INTERVAL`: Refresh interval (`30s`)
|
||||||
|
|
||||||
|
## RBAC Permissions
|
||||||
|
|
||||||
|
The session coordinator requires the following Kubernetes permissions:
|
||||||
|
|
||||||
|
- `pods`: get, list, watch (to discover party pods)
|
||||||
|
- `services`: get, list, watch (to discover services)
|
||||||
|
|
||||||
|
These permissions are granted via the `mpc-session-coordinator-role` Role and RoleBinding.
|
||||||
|
|
||||||
|
## Health Checks
|
||||||
|
|
||||||
|
All services expose a `/health` endpoint on their HTTP port (8080) for:
|
||||||
|
|
||||||
|
- Liveness probes: Detects if the service is alive
|
||||||
|
- Readiness probes: Detects if the service is ready to accept traffic
|
||||||
|
|
||||||
|
## Monitoring Party Pool
|
||||||
|
|
||||||
|
Check available parties:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
# View all party pods
|
||||||
|
kubectl get pods -n mpc-system -l app=mpc-server-party
|
||||||
|
|
||||||
|
# Check party pod logs
|
||||||
|
kubectl logs -n mpc-system -l app=mpc-server-party --tail=50
|
||||||
|
|
||||||
|
# Check session coordinator logs for party discovery
|
||||||
|
kubectl logs -n mpc-system -l app=mpc-session-coordinator | grep "party"
|
||||||
|
```
|
||||||
|
|
||||||
|
## Troubleshooting
|
||||||
|
|
||||||
|
### Session coordinator can't discover parties
|
||||||
|
|
||||||
|
1. Check RBAC permissions:
|
||||||
|
```bash
|
||||||
|
kubectl get role,rolebinding -n mpc-system
|
||||||
|
```
|
||||||
|
|
||||||
|
2. Check if service account is correctly assigned:
|
||||||
|
```bash
|
||||||
|
kubectl get pod -n mpc-system -l app=mpc-session-coordinator -o yaml | grep serviceAccount
|
||||||
|
```
|
||||||
|
|
||||||
|
3. Check coordinator logs:
|
||||||
|
```bash
|
||||||
|
kubectl logs -n mpc-system -l app=mpc-session-coordinator
|
||||||
|
```
|
||||||
|
|
||||||
|
### Parties not showing as ready
|
||||||
|
|
||||||
|
1. Check party pod status:
|
||||||
|
```bash
|
||||||
|
kubectl get pods -n mpc-system -l app=mpc-server-party
|
||||||
|
```
|
||||||
|
|
||||||
|
2. Check readiness probe:
|
||||||
|
```bash
|
||||||
|
kubectl describe pod -n mpc-system <party-pod-name>
|
||||||
|
```
|
||||||
|
|
||||||
|
3. Check party logs:
|
||||||
|
```bash
|
||||||
|
kubectl logs -n mpc-system <party-pod-name>
|
||||||
|
```
|
||||||
|
|
||||||
|
## Migration from Docker Compose
|
||||||
|
|
||||||
|
Key differences from docker-compose deployment:
|
||||||
|
|
||||||
|
1. **No hardcoded party IDs**: In docker-compose, parties had static IDs (`server-party-1`, `server-party-2`, `server-party-3`). In K8s, pod names are used as party IDs.
|
||||||
|
|
||||||
|
2. **Dynamic scaling**: Can scale parties up/down without restarting other services.
|
||||||
|
|
||||||
|
3. **Service discovery**: Automatic discovery via Kubernetes API instead of DNS.
|
||||||
|
|
||||||
|
4. **DeviceInfo optional**: `DeviceInfo` is now optional in the protocol layer, aligning with international MPC standards.
|
||||||
|
|
||||||
|
## Advanced Configuration
|
||||||
|
|
||||||
|
### Custom party selection strategy
|
||||||
|
|
||||||
|
The default selection strategy is "first N available parties". To implement custom strategies (e.g., load-based, geo-aware), modify the `SelectParties()` method in `services/session-coordinator/infrastructure/k8s/party_discovery.go`.
|
||||||
|
|
||||||
|
### Party affinity
|
||||||
|
|
||||||
|
To ensure parties run on different nodes for fault tolerance:
|
||||||
|
|
||||||
|
```yaml
|
||||||
|
spec:
|
||||||
|
template:
|
||||||
|
spec:
|
||||||
|
affinity:
|
||||||
|
podAntiAffinity:
|
||||||
|
preferredDuringSchedulingIgnoredDuringExecution:
|
||||||
|
- weight: 100
|
||||||
|
podAffinityTerm:
|
||||||
|
labelSelector:
|
||||||
|
matchLabels:
|
||||||
|
app: mpc-server-party
|
||||||
|
topologyKey: kubernetes.io/hostname
|
||||||
|
```
|
||||||
|
|
||||||
|
Add this to `server-party-deployment.yaml` under `spec.template`.
|
||||||
|
|
@ -0,0 +1,10 @@
|
||||||
|
apiVersion: v1
|
||||||
|
kind: ConfigMap
|
||||||
|
metadata:
|
||||||
|
name: mpc-config
|
||||||
|
namespace: mpc-system
|
||||||
|
data:
|
||||||
|
environment: "production"
|
||||||
|
postgres_host: "postgres.mpc-system.svc.cluster.local"
|
||||||
|
redis_host: "redis.mpc-system.svc.cluster.local"
|
||||||
|
rabbitmq_host: "rabbitmq.mpc-system.svc.cluster.local"
|
||||||
|
|
@ -0,0 +1,7 @@
|
||||||
|
apiVersion: v1
|
||||||
|
kind: Namespace
|
||||||
|
metadata:
|
||||||
|
name: mpc-system
|
||||||
|
labels:
|
||||||
|
name: mpc-system
|
||||||
|
app: mpc
|
||||||
|
|
@ -0,0 +1,19 @@
|
||||||
|
# IMPORTANT: This is an example file. DO NOT commit real secrets to git!
|
||||||
|
# Copy this file to secrets.yaml and fill in your actual base64-encoded values
|
||||||
|
# Generate base64 values: echo -n "your-value" | base64
|
||||||
|
|
||||||
|
apiVersion: v1
|
||||||
|
kind: Secret
|
||||||
|
metadata:
|
||||||
|
name: mpc-secrets
|
||||||
|
namespace: mpc-system
|
||||||
|
type: Opaque
|
||||||
|
data:
|
||||||
|
postgres_user: bXBjX3VzZXI= # mpc_user (example)
|
||||||
|
postgres_password: Y2hhbmdlbWU= # changeme (example - REPLACE THIS!)
|
||||||
|
redis_password: "" # empty if no password
|
||||||
|
rabbitmq_user: bXBjX3VzZXI= # mpc_user (example)
|
||||||
|
rabbitmq_password: Y2hhbmdlbWU= # changeme (example - REPLACE THIS!)
|
||||||
|
jwt_secret_key: Y2hhbmdlLXRoaXMtdG8tYS1zZWN1cmUtcmFuZG9tLXN0cmluZw== # REPLACE THIS!
|
||||||
|
crypto_master_key: Y2hhbmdlLXRoaXMtdG8tYS1zZWN1cmUtcmFuZG9tLXN0cmluZw== # REPLACE THIS!
|
||||||
|
mpc_api_key: Y2hhbmdlLXRoaXMtdG8tYS1zZWN1cmUtcmFuZG9tLXN0cmluZw== # REPLACE THIS!
|
||||||
|
|
@ -0,0 +1,125 @@
|
||||||
|
apiVersion: v1
|
||||||
|
kind: ServiceAccount
|
||||||
|
metadata:
|
||||||
|
name: mpc-server-party
|
||||||
|
namespace: mpc-system
|
||||||
|
---
|
||||||
|
apiVersion: apps/v1
|
||||||
|
kind: Deployment
|
||||||
|
metadata:
|
||||||
|
name: mpc-server-party
|
||||||
|
namespace: mpc-system
|
||||||
|
labels:
|
||||||
|
app: mpc-server-party
|
||||||
|
component: compute
|
||||||
|
spec:
|
||||||
|
replicas: 3 # Start with 3 parties, can scale up/down dynamically
|
||||||
|
selector:
|
||||||
|
matchLabels:
|
||||||
|
app: mpc-server-party
|
||||||
|
template:
|
||||||
|
metadata:
|
||||||
|
labels:
|
||||||
|
app: mpc-server-party
|
||||||
|
component: compute
|
||||||
|
spec:
|
||||||
|
serviceAccountName: mpc-server-party
|
||||||
|
containers:
|
||||||
|
- name: server-party
|
||||||
|
image: mpc-system/server-party:latest
|
||||||
|
imagePullPolicy: IfNotPresent
|
||||||
|
ports:
|
||||||
|
- name: grpc
|
||||||
|
containerPort: 50051
|
||||||
|
protocol: TCP
|
||||||
|
- name: http
|
||||||
|
containerPort: 8080
|
||||||
|
protocol: TCP
|
||||||
|
env:
|
||||||
|
- name: MPC_SERVER_GRPC_PORT
|
||||||
|
value: "50051"
|
||||||
|
- name: MPC_SERVER_HTTP_PORT
|
||||||
|
value: "8080"
|
||||||
|
- name: MPC_SERVER_ENVIRONMENT
|
||||||
|
valueFrom:
|
||||||
|
configMapKeyRef:
|
||||||
|
name: mpc-config
|
||||||
|
key: environment
|
||||||
|
- name: MPC_DATABASE_HOST
|
||||||
|
valueFrom:
|
||||||
|
configMapKeyRef:
|
||||||
|
name: mpc-config
|
||||||
|
key: postgres_host
|
||||||
|
- name: MPC_DATABASE_PORT
|
||||||
|
value: "5432"
|
||||||
|
- name: MPC_DATABASE_USER
|
||||||
|
valueFrom:
|
||||||
|
secretKeyRef:
|
||||||
|
name: mpc-secrets
|
||||||
|
key: postgres_user
|
||||||
|
- name: MPC_DATABASE_PASSWORD
|
||||||
|
valueFrom:
|
||||||
|
secretKeyRef:
|
||||||
|
name: mpc-secrets
|
||||||
|
key: postgres_password
|
||||||
|
- name: MPC_DATABASE_DBNAME
|
||||||
|
value: "mpc_system"
|
||||||
|
- name: MPC_DATABASE_SSLMODE
|
||||||
|
value: "disable"
|
||||||
|
- name: SESSION_COORDINATOR_ADDR
|
||||||
|
value: "mpc-session-coordinator:50051"
|
||||||
|
- name: MESSAGE_ROUTER_ADDR
|
||||||
|
value: "mpc-message-router:50051"
|
||||||
|
- name: MPC_CRYPTO_MASTER_KEY
|
||||||
|
valueFrom:
|
||||||
|
secretKeyRef:
|
||||||
|
name: mpc-secrets
|
||||||
|
key: crypto_master_key
|
||||||
|
- name: PARTY_ID
|
||||||
|
valueFrom:
|
||||||
|
fieldRef:
|
||||||
|
fieldPath: metadata.name # Use pod name as unique party ID
|
||||||
|
resources:
|
||||||
|
requests:
|
||||||
|
memory: "256Mi"
|
||||||
|
cpu: "250m"
|
||||||
|
limits:
|
||||||
|
memory: "512Mi"
|
||||||
|
cpu: "500m"
|
||||||
|
livenessProbe:
|
||||||
|
httpGet:
|
||||||
|
path: /health
|
||||||
|
port: 8080
|
||||||
|
initialDelaySeconds: 30
|
||||||
|
periodSeconds: 10
|
||||||
|
timeoutSeconds: 5
|
||||||
|
failureThreshold: 3
|
||||||
|
readinessProbe:
|
||||||
|
httpGet:
|
||||||
|
path: /health
|
||||||
|
port: 8080
|
||||||
|
initialDelaySeconds: 10
|
||||||
|
periodSeconds: 5
|
||||||
|
timeoutSeconds: 3
|
||||||
|
failureThreshold: 2
|
||||||
|
---
|
||||||
|
apiVersion: v1
|
||||||
|
kind: Service
|
||||||
|
metadata:
|
||||||
|
name: mpc-server-party
|
||||||
|
namespace: mpc-system
|
||||||
|
labels:
|
||||||
|
app: mpc-server-party
|
||||||
|
spec:
|
||||||
|
selector:
|
||||||
|
app: mpc-server-party
|
||||||
|
clusterIP: None # Headless service for service discovery
|
||||||
|
ports:
|
||||||
|
- name: grpc
|
||||||
|
port: 50051
|
||||||
|
targetPort: 50051
|
||||||
|
protocol: TCP
|
||||||
|
- name: http
|
||||||
|
port: 8080
|
||||||
|
targetPort: 8080
|
||||||
|
protocol: TCP
|
||||||
|
|
@ -0,0 +1,189 @@
|
||||||
|
apiVersion: v1
|
||||||
|
kind: ServiceAccount
|
||||||
|
metadata:
|
||||||
|
name: mpc-session-coordinator
|
||||||
|
namespace: mpc-system
|
||||||
|
---
|
||||||
|
apiVersion: rbac.authorization.k8s.io/v1
|
||||||
|
kind: Role
|
||||||
|
metadata:
|
||||||
|
name: mpc-session-coordinator-role
|
||||||
|
namespace: mpc-system
|
||||||
|
rules:
|
||||||
|
- apiGroups: [""]
|
||||||
|
resources: ["pods"]
|
||||||
|
verbs: ["get", "list", "watch"]
|
||||||
|
- apiGroups: [""]
|
||||||
|
resources: ["services"]
|
||||||
|
verbs: ["get", "list", "watch"]
|
||||||
|
---
|
||||||
|
apiVersion: rbac.authorization.k8s.io/v1
|
||||||
|
kind: RoleBinding
|
||||||
|
metadata:
|
||||||
|
name: mpc-session-coordinator-rolebinding
|
||||||
|
namespace: mpc-system
|
||||||
|
subjects:
|
||||||
|
- kind: ServiceAccount
|
||||||
|
name: mpc-session-coordinator
|
||||||
|
namespace: mpc-system
|
||||||
|
roleRef:
|
||||||
|
kind: Role
|
||||||
|
name: mpc-session-coordinator-role
|
||||||
|
apiGroup: rbac.authorization.k8s.io
|
||||||
|
---
|
||||||
|
apiVersion: apps/v1
|
||||||
|
kind: Deployment
|
||||||
|
metadata:
|
||||||
|
name: mpc-session-coordinator
|
||||||
|
namespace: mpc-system
|
||||||
|
labels:
|
||||||
|
app: mpc-session-coordinator
|
||||||
|
component: core
|
||||||
|
spec:
|
||||||
|
replicas: 2 # Can scale horizontally for high availability
|
||||||
|
selector:
|
||||||
|
matchLabels:
|
||||||
|
app: mpc-session-coordinator
|
||||||
|
template:
|
||||||
|
metadata:
|
||||||
|
labels:
|
||||||
|
app: mpc-session-coordinator
|
||||||
|
component: core
|
||||||
|
spec:
|
||||||
|
serviceAccountName: mpc-session-coordinator
|
||||||
|
containers:
|
||||||
|
- name: session-coordinator
|
||||||
|
image: mpc-system/session-coordinator:latest
|
||||||
|
imagePullPolicy: IfNotPresent
|
||||||
|
ports:
|
||||||
|
- name: grpc
|
||||||
|
containerPort: 50051
|
||||||
|
protocol: TCP
|
||||||
|
- name: http
|
||||||
|
containerPort: 8080
|
||||||
|
protocol: TCP
|
||||||
|
env:
|
||||||
|
- name: MPC_SERVER_GRPC_PORT
|
||||||
|
value: "50051"
|
||||||
|
- name: MPC_SERVER_HTTP_PORT
|
||||||
|
value: "8080"
|
||||||
|
- name: MPC_SERVER_ENVIRONMENT
|
||||||
|
valueFrom:
|
||||||
|
configMapKeyRef:
|
||||||
|
name: mpc-config
|
||||||
|
key: environment
|
||||||
|
- name: MPC_DATABASE_HOST
|
||||||
|
valueFrom:
|
||||||
|
configMapKeyRef:
|
||||||
|
name: mpc-config
|
||||||
|
key: postgres_host
|
||||||
|
- name: MPC_DATABASE_PORT
|
||||||
|
value: "5432"
|
||||||
|
- name: MPC_DATABASE_USER
|
||||||
|
valueFrom:
|
||||||
|
secretKeyRef:
|
||||||
|
name: mpc-secrets
|
||||||
|
key: postgres_user
|
||||||
|
- name: MPC_DATABASE_PASSWORD
|
||||||
|
valueFrom:
|
||||||
|
secretKeyRef:
|
||||||
|
name: mpc-secrets
|
||||||
|
key: postgres_password
|
||||||
|
- name: MPC_DATABASE_DBNAME
|
||||||
|
value: "mpc_system"
|
||||||
|
- name: MPC_DATABASE_SSLMODE
|
||||||
|
value: "disable"
|
||||||
|
- name: MPC_REDIS_HOST
|
||||||
|
valueFrom:
|
||||||
|
configMapKeyRef:
|
||||||
|
name: mpc-config
|
||||||
|
key: redis_host
|
||||||
|
- name: MPC_REDIS_PORT
|
||||||
|
value: "6379"
|
||||||
|
- name: MPC_REDIS_PASSWORD
|
||||||
|
valueFrom:
|
||||||
|
secretKeyRef:
|
||||||
|
name: mpc-secrets
|
||||||
|
key: redis_password
|
||||||
|
optional: true
|
||||||
|
- name: MPC_RABBITMQ_HOST
|
||||||
|
valueFrom:
|
||||||
|
configMapKeyRef:
|
||||||
|
name: mpc-config
|
||||||
|
key: rabbitmq_host
|
||||||
|
- name: MPC_RABBITMQ_PORT
|
||||||
|
value: "5672"
|
||||||
|
- name: MPC_RABBITMQ_USER
|
||||||
|
valueFrom:
|
||||||
|
secretKeyRef:
|
||||||
|
name: mpc-secrets
|
||||||
|
key: rabbitmq_user
|
||||||
|
- name: MPC_RABBITMQ_PASSWORD
|
||||||
|
valueFrom:
|
||||||
|
secretKeyRef:
|
||||||
|
name: mpc-secrets
|
||||||
|
key: rabbitmq_password
|
||||||
|
- name: MPC_JWT_SECRET_KEY
|
||||||
|
valueFrom:
|
||||||
|
secretKeyRef:
|
||||||
|
name: mpc-secrets
|
||||||
|
key: jwt_secret_key
|
||||||
|
- name: MPC_JWT_ISSUER
|
||||||
|
value: "mpc-system"
|
||||||
|
# K8s service discovery configuration
|
||||||
|
- name: K8S_NAMESPACE
|
||||||
|
valueFrom:
|
||||||
|
fieldRef:
|
||||||
|
fieldPath: metadata.namespace
|
||||||
|
- name: MPC_PARTY_SERVICE_NAME
|
||||||
|
value: "mpc-server-party"
|
||||||
|
- name: MPC_PARTY_LABEL_SELECTOR
|
||||||
|
value: "app=mpc-server-party"
|
||||||
|
- name: MPC_PARTY_GRPC_PORT
|
||||||
|
value: "50051"
|
||||||
|
- name: MPC_PARTY_DISCOVERY_INTERVAL
|
||||||
|
value: "30s"
|
||||||
|
resources:
|
||||||
|
requests:
|
||||||
|
memory: "256Mi"
|
||||||
|
cpu: "250m"
|
||||||
|
limits:
|
||||||
|
memory: "512Mi"
|
||||||
|
cpu: "500m"
|
||||||
|
livenessProbe:
|
||||||
|
httpGet:
|
||||||
|
path: /health
|
||||||
|
port: 8080
|
||||||
|
initialDelaySeconds: 30
|
||||||
|
periodSeconds: 10
|
||||||
|
timeoutSeconds: 5
|
||||||
|
failureThreshold: 3
|
||||||
|
readinessProbe:
|
||||||
|
httpGet:
|
||||||
|
path: /health
|
||||||
|
port: 8080
|
||||||
|
initialDelaySeconds: 10
|
||||||
|
periodSeconds: 5
|
||||||
|
timeoutSeconds: 3
|
||||||
|
failureThreshold: 2
|
||||||
|
---
|
||||||
|
apiVersion: v1
|
||||||
|
kind: Service
|
||||||
|
metadata:
|
||||||
|
name: mpc-session-coordinator
|
||||||
|
namespace: mpc-system
|
||||||
|
labels:
|
||||||
|
app: mpc-session-coordinator
|
||||||
|
spec:
|
||||||
|
selector:
|
||||||
|
app: mpc-session-coordinator
|
||||||
|
type: ClusterIP
|
||||||
|
ports:
|
||||||
|
- name: grpc
|
||||||
|
port: 50051
|
||||||
|
targetPort: 50051
|
||||||
|
protocol: TCP
|
||||||
|
- name: http
|
||||||
|
port: 8080
|
||||||
|
targetPort: 8080
|
||||||
|
protocol: TCP
|
||||||
|
|
@ -0,0 +1,17 @@
|
||||||
|
package output
|
||||||
|
|
||||||
|
// PartyEndpoint represents a party endpoint from the pool
|
||||||
|
type PartyEndpoint struct {
|
||||||
|
Address string
|
||||||
|
PartyID string
|
||||||
|
Ready bool
|
||||||
|
}
|
||||||
|
|
||||||
|
// PartyPoolPort defines the interface for party pool management
|
||||||
|
type PartyPoolPort interface {
|
||||||
|
// GetAvailableParties returns all available party endpoints
|
||||||
|
GetAvailableParties() []PartyEndpoint
|
||||||
|
|
||||||
|
// SelectParties selects n parties from the available pool
|
||||||
|
SelectParties(n int) ([]PartyEndpoint, error)
|
||||||
|
}
|
||||||
|
|
@ -19,6 +19,7 @@ type CreateSessionUseCase struct {
|
||||||
sessionRepo repositories.SessionRepository
|
sessionRepo repositories.SessionRepository
|
||||||
tokenGen jwt.TokenGenerator
|
tokenGen jwt.TokenGenerator
|
||||||
eventPublisher output.MessageBrokerPort
|
eventPublisher output.MessageBrokerPort
|
||||||
|
partyPool output.PartyPoolPort
|
||||||
coordinatorSvc *services.SessionCoordinatorService
|
coordinatorSvc *services.SessionCoordinatorService
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
@ -27,11 +28,13 @@ func NewCreateSessionUseCase(
|
||||||
sessionRepo repositories.SessionRepository,
|
sessionRepo repositories.SessionRepository,
|
||||||
tokenGen jwt.TokenGenerator,
|
tokenGen jwt.TokenGenerator,
|
||||||
eventPublisher output.MessageBrokerPort,
|
eventPublisher output.MessageBrokerPort,
|
||||||
|
partyPool output.PartyPoolPort,
|
||||||
) *CreateSessionUseCase {
|
) *CreateSessionUseCase {
|
||||||
return &CreateSessionUseCase{
|
return &CreateSessionUseCase{
|
||||||
sessionRepo: sessionRepo,
|
sessionRepo: sessionRepo,
|
||||||
tokenGen: tokenGen,
|
tokenGen: tokenGen,
|
||||||
eventPublisher: eventPublisher,
|
eventPublisher: eventPublisher,
|
||||||
|
partyPool: partyPool,
|
||||||
coordinatorSvc: services.NewSessionCoordinatorService(),
|
coordinatorSvc: services.NewSessionCoordinatorService(),
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
@ -79,12 +82,59 @@ func (uc *CreateSessionUseCase) Execute(
|
||||||
// 5. Add participants and generate join tokens
|
// 5. Add participants and generate join tokens
|
||||||
tokens := make(map[string]string)
|
tokens := make(map[string]string)
|
||||||
if len(req.Participants) == 0 {
|
if len(req.Participants) == 0 {
|
||||||
// For dynamic joining, generate a universal join token with wildcard party ID
|
// No participants provided - use party pool for automatic selection
|
||||||
universalToken, err := uc.tokenGen.GenerateJoinToken(session.ID.UUID(), "*", expiresIn)
|
if uc.partyPool != nil {
|
||||||
if err != nil {
|
// Select parties from K8s pool based on threshold
|
||||||
return nil, err
|
selectedParties, err := uc.partyPool.SelectParties(threshold.N())
|
||||||
|
if err != nil {
|
||||||
|
logger.Warn("failed to select parties from pool, falling back to dynamic join",
|
||||||
|
zap.Error(err),
|
||||||
|
zap.Int("required_parties", threshold.N()))
|
||||||
|
|
||||||
|
// Fallback: generate universal join token for dynamic joining
|
||||||
|
universalToken, err := uc.tokenGen.GenerateJoinToken(session.ID.UUID(), "*", expiresIn)
|
||||||
|
if err != nil {
|
||||||
|
return nil, err
|
||||||
|
}
|
||||||
|
tokens["*"] = universalToken
|
||||||
|
} else {
|
||||||
|
// Add selected parties as participants
|
||||||
|
for i, party := range selectedParties {
|
||||||
|
partyID, err := value_objects.NewPartyID(party.PartyID)
|
||||||
|
if err != nil {
|
||||||
|
return nil, err
|
||||||
|
}
|
||||||
|
|
||||||
|
// Create participant with empty DeviceInfo (server parties don't have device info)
|
||||||
|
participant, err := entities.NewParticipant(partyID, i, entities.DeviceInfo{})
|
||||||
|
if err != nil {
|
||||||
|
return nil, err
|
||||||
|
}
|
||||||
|
|
||||||
|
if err := session.AddParticipant(participant); err != nil {
|
||||||
|
return nil, err
|
||||||
|
}
|
||||||
|
|
||||||
|
// Generate join token for this party
|
||||||
|
token, err := uc.tokenGen.GenerateJoinToken(session.ID.UUID(), party.PartyID, expiresIn)
|
||||||
|
if err != nil {
|
||||||
|
return nil, err
|
||||||
|
}
|
||||||
|
tokens[party.PartyID] = token
|
||||||
|
}
|
||||||
|
|
||||||
|
logger.Info("selected parties from K8s pool",
|
||||||
|
zap.String("session_id", session.ID.String()),
|
||||||
|
zap.Int("party_count", len(selectedParties)))
|
||||||
|
}
|
||||||
|
} else {
|
||||||
|
// No party pool configured - fallback to dynamic join
|
||||||
|
universalToken, err := uc.tokenGen.GenerateJoinToken(session.ID.UUID(), "*", expiresIn)
|
||||||
|
if err != nil {
|
||||||
|
return nil, err
|
||||||
|
}
|
||||||
|
tokens["*"] = universalToken
|
||||||
}
|
}
|
||||||
tokens["*"] = universalToken
|
|
||||||
} else {
|
} else {
|
||||||
// For pre-registered participants, generate individual tokens
|
// For pre-registered participants, generate individual tokens
|
||||||
for i, pInfo := range req.Participants {
|
for i, pInfo := range req.Participants {
|
||||||
|
|
|
||||||
|
|
@ -30,6 +30,7 @@ import (
|
||||||
redisadapter "github.com/rwadurian/mpc-system/services/session-coordinator/adapters/output/redis"
|
redisadapter "github.com/rwadurian/mpc-system/services/session-coordinator/adapters/output/redis"
|
||||||
"github.com/rwadurian/mpc-system/services/session-coordinator/application/use_cases"
|
"github.com/rwadurian/mpc-system/services/session-coordinator/application/use_cases"
|
||||||
"github.com/rwadurian/mpc-system/services/session-coordinator/domain/repositories"
|
"github.com/rwadurian/mpc-system/services/session-coordinator/domain/repositories"
|
||||||
|
"github.com/rwadurian/mpc-system/services/session-coordinator/infrastructure/k8s"
|
||||||
"go.uber.org/zap"
|
"go.uber.org/zap"
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
@ -96,8 +97,18 @@ func main() {
|
||||||
cfg.JWT.RefreshExpiry,
|
cfg.JWT.RefreshExpiry,
|
||||||
)
|
)
|
||||||
|
|
||||||
|
// Initialize K8s party discovery (optional - will fallback gracefully if not in K8s)
|
||||||
|
partyPool, err := k8s.NewPartyDiscovery(logger.Log)
|
||||||
|
if err != nil {
|
||||||
|
logger.Warn("K8s party discovery not available, will use dynamic join mode",
|
||||||
|
zap.Error(err))
|
||||||
|
partyPool = nil // Set to nil so CreateSessionUseCase can handle gracefully
|
||||||
|
} else {
|
||||||
|
logger.Info("K8s party discovery initialized successfully")
|
||||||
|
}
|
||||||
|
|
||||||
// Initialize use cases
|
// Initialize use cases
|
||||||
createSessionUC := use_cases.NewCreateSessionUseCase(sessionRepo, jwtService, eventPublisher)
|
createSessionUC := use_cases.NewCreateSessionUseCase(sessionRepo, jwtService, eventPublisher, partyPool)
|
||||||
joinSessionUC := use_cases.NewJoinSessionUseCase(sessionRepo, jwtService, eventPublisher)
|
joinSessionUC := use_cases.NewJoinSessionUseCase(sessionRepo, jwtService, eventPublisher)
|
||||||
getSessionStatusUC := use_cases.NewGetSessionStatusUseCase(sessionRepo)
|
getSessionStatusUC := use_cases.NewGetSessionStatusUseCase(sessionRepo)
|
||||||
reportCompletionUC := use_cases.NewReportCompletionUseCase(sessionRepo, eventPublisher)
|
reportCompletionUC := use_cases.NewReportCompletionUseCase(sessionRepo, eventPublisher)
|
||||||
|
|
|
||||||
|
|
@ -45,9 +45,18 @@ func (d DeviceInfo) IsRecovery() bool {
|
||||||
}
|
}
|
||||||
|
|
||||||
// Validate validates the device info
|
// Validate validates the device info
|
||||||
|
// DeviceInfo is now optional - empty device info is valid
|
||||||
func (d DeviceInfo) Validate() error {
|
func (d DeviceInfo) Validate() error {
|
||||||
if d.DeviceType == "" {
|
// Allow empty DeviceInfo for server parties or anonymous participants
|
||||||
return ErrInvalidDeviceInfo
|
// Only validate if DeviceType is provided
|
||||||
|
if d.DeviceType != "" {
|
||||||
|
// If DeviceType is set, validate it's a known type
|
||||||
|
switch d.DeviceType {
|
||||||
|
case DeviceTypeAndroid, DeviceTypeIOS, DeviceTypePC, DeviceTypeServer, DeviceTypeRecovery:
|
||||||
|
return nil
|
||||||
|
default:
|
||||||
|
return ErrInvalidDeviceInfo
|
||||||
|
}
|
||||||
}
|
}
|
||||||
return nil
|
return nil
|
||||||
}
|
}
|
||||||
|
|
|
||||||
|
|
@ -0,0 +1,219 @@
|
||||||
|
package k8s
|
||||||
|
|
||||||
|
import (
|
||||||
|
"context"
|
||||||
|
"fmt"
|
||||||
|
"os"
|
||||||
|
"sync"
|
||||||
|
"time"
|
||||||
|
|
||||||
|
"github.com/rwadurian/mpc-system/services/session-coordinator/application/ports/output"
|
||||||
|
"go.uber.org/zap"
|
||||||
|
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
|
||||||
|
"k8s.io/client-go/kubernetes"
|
||||||
|
"k8s.io/client-go/rest"
|
||||||
|
"k8s.io/client-go/tools/clientcmd"
|
||||||
|
)
|
||||||
|
|
||||||
|
// PartyEndpoint represents a discovered party endpoint
|
||||||
|
type PartyEndpoint struct {
|
||||||
|
Address string
|
||||||
|
PodName string
|
||||||
|
Ready bool
|
||||||
|
}
|
||||||
|
|
||||||
|
// PartyDiscovery handles Kubernetes-based party service discovery
|
||||||
|
type PartyDiscovery struct {
|
||||||
|
clientset *kubernetes.Clientset
|
||||||
|
namespace string
|
||||||
|
serviceName string
|
||||||
|
labelSelector string
|
||||||
|
logger *zap.Logger
|
||||||
|
endpoints []PartyEndpoint
|
||||||
|
mu sync.RWMutex
|
||||||
|
refreshInterval time.Duration
|
||||||
|
}
|
||||||
|
|
||||||
|
// NewPartyDiscovery creates a new Kubernetes party discovery service
|
||||||
|
func NewPartyDiscovery(logger *zap.Logger) (*PartyDiscovery, error) {
|
||||||
|
var config *rest.Config
|
||||||
|
var err error
|
||||||
|
|
||||||
|
// Try in-cluster config first (when running inside K8s)
|
||||||
|
config, err = rest.InClusterConfig()
|
||||||
|
if err != nil {
|
||||||
|
// Fallback to kubeconfig for local development
|
||||||
|
kubeconfig := os.Getenv("KUBECONFIG")
|
||||||
|
if kubeconfig == "" {
|
||||||
|
kubeconfig = os.Getenv("HOME") + "/.kube/config"
|
||||||
|
}
|
||||||
|
config, err = clientcmd.BuildConfigFromFlags("", kubeconfig)
|
||||||
|
if err != nil {
|
||||||
|
return nil, fmt.Errorf("failed to create k8s config: %w", err)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
clientset, err := kubernetes.NewForConfig(config)
|
||||||
|
if err != nil {
|
||||||
|
return nil, fmt.Errorf("failed to create k8s clientset: %w", err)
|
||||||
|
}
|
||||||
|
|
||||||
|
namespace := os.Getenv("K8S_NAMESPACE")
|
||||||
|
if namespace == "" {
|
||||||
|
namespace = "default"
|
||||||
|
}
|
||||||
|
|
||||||
|
serviceName := os.Getenv("MPC_PARTY_SERVICE_NAME")
|
||||||
|
if serviceName == "" {
|
||||||
|
serviceName = "mpc-server-party"
|
||||||
|
}
|
||||||
|
|
||||||
|
labelSelector := os.Getenv("MPC_PARTY_LABEL_SELECTOR")
|
||||||
|
if labelSelector == "" {
|
||||||
|
labelSelector = "app=mpc-server-party"
|
||||||
|
}
|
||||||
|
|
||||||
|
refreshInterval := 30 * time.Second
|
||||||
|
if interval := os.Getenv("MPC_PARTY_DISCOVERY_INTERVAL"); interval != "" {
|
||||||
|
if d, err := time.ParseDuration(interval); err == nil {
|
||||||
|
refreshInterval = d
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
pd := &PartyDiscovery{
|
||||||
|
clientset: clientset,
|
||||||
|
namespace: namespace,
|
||||||
|
serviceName: serviceName,
|
||||||
|
labelSelector: labelSelector,
|
||||||
|
logger: logger,
|
||||||
|
endpoints: []PartyEndpoint{},
|
||||||
|
refreshInterval: refreshInterval,
|
||||||
|
}
|
||||||
|
|
||||||
|
// Initial discovery
|
||||||
|
if err := pd.refresh(); err != nil {
|
||||||
|
logger.Warn("Initial party discovery failed, will retry", zap.Error(err))
|
||||||
|
}
|
||||||
|
|
||||||
|
// Start background refresh
|
||||||
|
go pd.backgroundRefresh()
|
||||||
|
|
||||||
|
return pd, nil
|
||||||
|
}
|
||||||
|
|
||||||
|
// GetAvailableParties returns a list of available party endpoints
|
||||||
|
// Implements output.PartyPoolPort interface
|
||||||
|
func (pd *PartyDiscovery) GetAvailableParties() []output.PartyEndpoint {
|
||||||
|
pd.mu.RLock()
|
||||||
|
defer pd.mu.RUnlock()
|
||||||
|
|
||||||
|
// Return only ready endpoints
|
||||||
|
available := make([]output.PartyEndpoint, 0, len(pd.endpoints))
|
||||||
|
for _, ep := range pd.endpoints {
|
||||||
|
if ep.Ready {
|
||||||
|
available = append(available, output.PartyEndpoint{
|
||||||
|
Address: ep.Address,
|
||||||
|
PartyID: ep.PodName, // Use pod name as party ID
|
||||||
|
Ready: ep.Ready,
|
||||||
|
})
|
||||||
|
}
|
||||||
|
}
|
||||||
|
return available
|
||||||
|
}
|
||||||
|
|
||||||
|
// SelectParties randomly selects n parties from the available pool
|
||||||
|
// Implements output.PartyPoolPort interface
|
||||||
|
func (pd *PartyDiscovery) SelectParties(n int) ([]output.PartyEndpoint, error) {
|
||||||
|
available := pd.GetAvailableParties()
|
||||||
|
|
||||||
|
if len(available) < n {
|
||||||
|
return nil, fmt.Errorf("insufficient parties: need %d, have %d", n, len(available))
|
||||||
|
}
|
||||||
|
|
||||||
|
// For now, return first n parties
|
||||||
|
// TODO: Implement random selection or load balancing strategy
|
||||||
|
selected := make([]output.PartyEndpoint, n)
|
||||||
|
copy(selected, available[:n])
|
||||||
|
|
||||||
|
return selected, nil
|
||||||
|
}
|
||||||
|
|
||||||
|
// refresh updates the list of party endpoints from Kubernetes
|
||||||
|
func (pd *PartyDiscovery) refresh() error {
|
||||||
|
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
|
||||||
|
defer cancel()
|
||||||
|
|
||||||
|
// Get pods matching the label selector
|
||||||
|
pods, err := pd.clientset.CoreV1().Pods(pd.namespace).List(ctx, metav1.ListOptions{
|
||||||
|
LabelSelector: pd.labelSelector,
|
||||||
|
})
|
||||||
|
if err != nil {
|
||||||
|
return fmt.Errorf("failed to list pods: %w", err)
|
||||||
|
}
|
||||||
|
|
||||||
|
endpoints := make([]PartyEndpoint, 0, len(pods.Items))
|
||||||
|
for _, pod := range pods.Items {
|
||||||
|
// Check if pod is ready
|
||||||
|
ready := false
|
||||||
|
for _, condition := range pod.Status.Conditions {
|
||||||
|
if condition.Type == "Ready" && condition.Status == "True" {
|
||||||
|
ready = true
|
||||||
|
break
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// Get pod IP
|
||||||
|
if pod.Status.PodIP != "" {
|
||||||
|
// Assuming gRPC port is 50051 (should be configurable)
|
||||||
|
grpcPort := os.Getenv("MPC_PARTY_GRPC_PORT")
|
||||||
|
if grpcPort == "" {
|
||||||
|
grpcPort = "50051"
|
||||||
|
}
|
||||||
|
|
||||||
|
endpoints = append(endpoints, PartyEndpoint{
|
||||||
|
Address: fmt.Sprintf("%s:%s", pod.Status.PodIP, grpcPort),
|
||||||
|
PodName: pod.Name,
|
||||||
|
Ready: ready,
|
||||||
|
})
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
pd.mu.Lock()
|
||||||
|
pd.endpoints = endpoints
|
||||||
|
pd.mu.Unlock()
|
||||||
|
|
||||||
|
pd.logger.Info("Party endpoints refreshed",
|
||||||
|
zap.Int("total", len(endpoints)),
|
||||||
|
zap.Int("ready", pd.countReady(endpoints)))
|
||||||
|
|
||||||
|
return nil
|
||||||
|
}
|
||||||
|
|
||||||
|
// backgroundRefresh periodically refreshes the party endpoints
|
||||||
|
func (pd *PartyDiscovery) backgroundRefresh() {
|
||||||
|
ticker := time.NewTicker(pd.refreshInterval)
|
||||||
|
defer ticker.Stop()
|
||||||
|
|
||||||
|
for range ticker.C {
|
||||||
|
if err := pd.refresh(); err != nil {
|
||||||
|
pd.logger.Error("Failed to refresh party endpoints", zap.Error(err))
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// countReady counts the number of ready endpoints
|
||||||
|
func (pd *PartyDiscovery) countReady(endpoints []PartyEndpoint) int {
|
||||||
|
count := 0
|
||||||
|
for _, ep := range endpoints {
|
||||||
|
if ep.Ready {
|
||||||
|
count++
|
||||||
|
}
|
||||||
|
}
|
||||||
|
return count
|
||||||
|
}
|
||||||
|
|
||||||
|
// Close stops the background refresh
|
||||||
|
func (pd *PartyDiscovery) Close() {
|
||||||
|
// Ticker will be stopped when the goroutine exits
|
||||||
|
pd.logger.Info("Party discovery service closed")
|
||||||
|
}
|
||||||
Loading…
Reference in New Issue