Backup Service Deployment Guide
Overview
This guide covers deploying the backup-service with Docker, Docker Compose, and Kubernetes. The service runs in a Docker container and uses PostgreSQL as its database.
Critical Security Requirement: The backup-service MUST be deployed on a physically separate server from identity-service to maintain MPC security.
Deployment Architecture
┌─────────────────────────────────────────────────────────────────────────┐
│                         Production Architecture                         │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│   Server A (Identity)                      Server B (Backup)            │
│   ┌─────────────────────┐                  ┌─────────────────────┐      │
│   │  identity-service   │                  │  backup-service     │      │
│   │  ┌───────────────┐  │                  │  ┌───────────────┐  │      │
│   │  │  PostgreSQL   │  │                  │  │  PostgreSQL   │  │      │
│   │  │ (identity-db) │  │                  │  │ (backup-db)   │  │      │
│   │  └───────────────┘  │                  │  └───────────────┘  │      │
│   └─────────────────────┘                  └─────────────────────┘      │
│              │                                        ▲                 │
│              │            Internal Network            │                 │
│              └────────────────────────────────────────┘                 │
│                       (Service-to-Service JWT)                          │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘
Docker Deployment
Production Dockerfile
# Dockerfile
# Multi-stage build for smaller image size
# Stage 1: Build
FROM node:20-alpine AS builder
WORKDIR /app
# Copy package files
COPY package*.json ./
COPY prisma ./prisma/
# Install all dependencies (including devDependencies for build)
RUN npm ci
# Copy source code
COPY . .
# Generate Prisma client
RUN npx prisma generate
# Build application
RUN npm run build
# Stage 2: Production
FROM node:20-alpine AS production
WORKDIR /app
# Create non-root user for security
RUN addgroup -g 1001 -S nodejs && \
    adduser -S nestjs -u 1001
# Copy package files
COPY package*.json ./
# Install production dependencies only
RUN npm ci --omit=dev && npm cache clean --force
# Copy built application
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules/.prisma ./node_modules/.prisma
COPY --from=builder /app/prisma ./prisma
# Change ownership to non-root user
RUN chown -R nestjs:nodejs /app
# Switch to non-root user
USER nestjs
# Expose port
EXPOSE 3002
# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
    CMD wget --no-verbose --tries=1 --spider http://localhost:3002/health || exit 1
# Start application
CMD ["node", "dist/main.js"]
Build and Push Image
# Build image
docker build -t rwa-durian/backup-service:latest .
# Tag for registry
docker tag rwa-durian/backup-service:latest registry.example.com/backup-service:v1.0.0
# Push to registry
docker push registry.example.com/backup-service:v1.0.0
Docker Compose Deployment
Production Compose File
# docker-compose.prod.yml
version: '3.8'

services:
  backup-service:
    # Use the tag that was pushed to the registry so that `pull` can find it
    image: registry.example.com/backup-service:v1.0.0
    container_name: backup-service
    restart: unless-stopped
    ports:
      - "3002:3002"
    environment:
      - DATABASE_URL=postgresql://postgres:${DB_PASSWORD}@backup-db:5432/rwa_backup?schema=public
      - APP_PORT=3002
      - APP_ENV=production
      - SERVICE_JWT_SECRET=${SERVICE_JWT_SECRET}
      - ALLOWED_SERVICES=${ALLOWED_SERVICES}
      - BACKUP_ENCRYPTION_KEY=${BACKUP_ENCRYPTION_KEY}
      - BACKUP_ENCRYPTION_KEY_ID=${BACKUP_ENCRYPTION_KEY_ID}
      - MAX_RETRIEVE_PER_DAY=3
      - MAX_STORE_PER_MINUTE=10
      - AUDIT_LOG_RETENTION_DAYS=365
    depends_on:
      backup-db:
        condition: service_healthy
    networks:
      - backup-network
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"
    deploy:
      resources:
        limits:
          cpus: '1'
          memory: 512M
        reservations:
          cpus: '0.5'
          memory: 256M

  backup-db:
    image: postgres:15-alpine
    container_name: backup-db
    restart: unless-stopped
    environment:
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_DB: rwa_backup
    volumes:
      - backup-db-data:/var/lib/postgresql/data
      - ./init-db.sql:/docker-entrypoint-initdb.d/init.sql:ro
    ports:
      - "5433:5432" # Different port to avoid conflicts
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 10s
      timeout: 5s
      retries: 5
    networks:
      - backup-network
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"

volumes:
  backup-db-data:
    driver: local

networks:
  backup-network:
    driver: bridge
Environment File
# .env.production
DB_PASSWORD=your-strong-database-password-here
SERVICE_JWT_SECRET=your-super-secret-service-jwt-key-min-32-chars
ALLOWED_SERVICES=identity-service,recovery-service
BACKUP_ENCRYPTION_KEY=your-256-bit-encryption-key-in-hex-64-chars
BACKUP_ENCRYPTION_KEY_ID=key-v1
Deploy Commands
# Pull latest image
docker-compose -f docker-compose.prod.yml pull
# Start services
docker-compose -f docker-compose.prod.yml up -d
# Run database migrations
docker-compose -f docker-compose.prod.yml exec backup-service \
npx prisma migrate deploy
# View logs
docker-compose -f docker-compose.prod.yml logs -f backup-service
# Stop services
docker-compose -f docker-compose.prod.yml down
Kubernetes Deployment
Namespace and ConfigMap
# kubernetes/namespace.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: rwa-backup
---
# kubernetes/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: backup-service-config
  namespace: rwa-backup
data:
  APP_PORT: "3002"
  APP_ENV: "production"
  ALLOWED_SERVICES: "identity-service,recovery-service"
  MAX_RETRIEVE_PER_DAY: "3"
  MAX_STORE_PER_MINUTE: "10"
  AUDIT_LOG_RETENTION_DAYS: "365"
Secrets
# kubernetes/secrets.yaml
apiVersion: v1
kind: Secret
metadata:
  name: backup-service-secrets
  namespace: rwa-backup
type: Opaque
stringData:
  DATABASE_URL: "postgresql://postgres:password@backup-db:5432/rwa_backup?schema=public"
  SERVICE_JWT_SECRET: "your-super-secret-service-jwt-key-min-32-chars"
  BACKUP_ENCRYPTION_KEY: "your-256-bit-encryption-key-in-hex-64-chars"
  BACKUP_ENCRYPTION_KEY_ID: "key-v1"
Deployment
# kubernetes/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backup-service
  namespace: rwa-backup
spec:
  replicas: 2
  selector:
    matchLabels:
      app: backup-service
  template:
    metadata:
      labels:
        app: backup-service
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 1001
      containers:
        - name: backup-service
          image: registry.example.com/backup-service:v1.0.0
          ports:
            - containerPort: 3002
          envFrom:
            - configMapRef:
                name: backup-service-config
            - secretRef:
                name: backup-service-secrets
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "500m"
          livenessProbe:
            httpGet:
              path: /health/live
              port: 3002
            initialDelaySeconds: 15
            periodSeconds: 20
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 3002
            initialDelaySeconds: 5
            periodSeconds: 10
Service
# kubernetes/service.yaml
apiVersion: v1
kind: Service
metadata:
  name: backup-service
  namespace: rwa-backup
spec:
  selector:
    app: backup-service
  ports:
    - protocol: TCP
      port: 3002
      targetPort: 3002
  type: ClusterIP
PostgreSQL StatefulSet
# kubernetes/postgres.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: backup-db
  namespace: rwa-backup
spec:
  serviceName: backup-db
  replicas: 1
  selector:
    matchLabels:
      app: backup-db
  template:
    metadata:
      labels:
        app: backup-db
    spec:
      containers:
        - name: postgres
          image: postgres:15-alpine
          ports:
            - containerPort: 5432
          env:
            - name: POSTGRES_DB
              value: rwa_backup
            - name: POSTGRES_USER
              value: postgres
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: backup-db-secrets
                  key: password
          volumeMounts:
            - name: postgres-data
              mountPath: /var/lib/postgresql/data
          resources:
            requests:
              memory: "256Mi"
              cpu: "250m"
            limits:
              memory: "512Mi"
              cpu: "500m"
  volumeClaimTemplates:
    - metadata:
        name: postgres-data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
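The StatefulSet reads its password from a Secret named backup-db-secrets (key: password) and sets serviceName: backup-db, but this guide defines neither that Secret nor a Service for the database. A minimal sketch of creating both with kubectl follows. It assumes the rwa-backup namespace already exists; create_db_prereqs and the KUBECTL override are hypothetical names used only for illustration.

```shell
# Sketch only: create_db_prereqs is a hypothetical helper name.
# KUBECTL can be overridden (for example with "echo") for a dry run.
KUBECTL="${KUBECTL:-kubectl}"

create_db_prereqs() {
  # $1 = PostgreSQL password; it must match the password inside DATABASE_URL
  "$KUBECTL" -n rwa-backup create secret generic backup-db-secrets \
    --from-literal=password="$1" || return 1
  # Headless Service named after the StatefulSet's serviceName. kubectl sets
  # the selector to app=backup-db, which matches the pod labels.
  "$KUBECTL" -n rwa-backup create service clusterip backup-db \
    --clusterip=None --tcp=5432:5432
}
```

Calling `create_db_prereqs "$DB_PASSWORD"` before applying the manifests gives the hostname backup-db in DATABASE_URL something to resolve to. Passing the password on the command line can leave it in shell history, so a secret manager may be preferable in production.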
Deploy to Kubernetes
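# Create the namespace first: files in a directory are applied in alphabetical
# order, so configmap.yaml would otherwise be applied before namespace.yaml
kubectl apply -f kubernetes/namespace.yaml
# Apply all manifests
kubectl apply -f kubernetes/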
# Check deployment status
kubectl -n rwa-backup get pods
# View logs
kubectl -n rwa-backup logs -f deployment/backup-service
# Run migrations
kubectl -n rwa-backup exec -it deployment/backup-service -- \
npx prisma migrate deploy
Database Management
Initial Setup
# Run migrations on first deployment
npx prisma migrate deploy
# Or push schema directly (development only)
npx prisma db push
Backup and Restore
# Backup database (-T disables the pseudo-TTY so the redirected dump stays clean)
docker-compose exec -T backup-db pg_dump -U postgres rwa_backup > backup.sql
# Restore database
docker-compose exec -T backup-db psql -U postgres rwa_backup < backup.sql
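For recurring backups, a timestamped and compressed dump is easier to manage than a single backup.sql. The sketch below shows one possible approach; backup_db and DUMP_CMD are hypothetical names, and the default dump command simply reuses the pg_dump invocation from above. Encryption of the resulting file is not covered here.

```shell
# Sketch only: backup_db and DUMP_CMD are hypothetical names.
DUMP_CMD="${DUMP_CMD:-docker-compose exec -T backup-db pg_dump -U postgres rwa_backup}"

backup_db() {
  dir="${1:-.}"
  out="$dir/rwa_backup-$(date -u +%Y%m%dT%H%M%SZ).sql"
  # Dump to a plain file first so a failed pg_dump is not hidden behind a pipe.
  # $DUMP_CMD is intentionally unquoted so it splits into command and arguments.
  if ! $DUMP_CMD > "$out"; then
    rm -f "$out"
    echo "database dump failed" >&2
    return 1
  fi
  # Compress in place and print the final file name
  gzip -f "$out" && echo "$out.gz"
}
```

For example, `backup_db /var/backups` writes a file such as rwa_backup-<UTC timestamp>.sql.gz into that directory. To restore, decompress with `gzip -dc` and pipe the result into the psql command shown above.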
Migration in Production
# Generate migration (development)
npx prisma migrate dev --name add_new_field
# Apply migration (production)
npx prisma migrate deploy
Environment Variables Reference
Required Variables
| Variable | Description | Example |
|---|---|---|
| DATABASE_URL | PostgreSQL connection string | postgresql://user:pass@host:5432/db |
| SERVICE_JWT_SECRET | JWT secret for service auth (min 32 chars) | Random 64+ char string |
| ALLOWED_SERVICES | Comma-separated allowed services | identity-service,recovery-service |
| BACKUP_ENCRYPTION_KEY | 256-bit key in hex (64 chars) | 64 hex characters |
| BACKUP_ENCRYPTION_KEY_ID | Key identifier | key-v1 |
Optional Variables
| Variable | Default | Description |
|---|---|---|
| APP_PORT | 3002 | Server port |
| APP_ENV | development | Environment (development/production) |
| MAX_RETRIEVE_PER_DAY | 3 | Max retrieves per user per day |
| MAX_STORE_PER_MINUTE | 10 | Max stores per minute |
| AUDIT_LOG_RETENTION_DAYS | 365 | Audit log retention period |
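The constraints in the tables above can be checked before the service starts. The following preflight sketch is one way to do that; check_backup_env is a hypothetical helper, and accepting upper-case hex digits in the key is an assumption about what the service tolerates.

```shell
# Sketch of a preflight check; check_backup_env is a hypothetical helper.
check_backup_env() {
  rc=0
  # Every required variable must be set and non-empty
  for name in DATABASE_URL SERVICE_JWT_SECRET ALLOWED_SERVICES \
              BACKUP_ENCRYPTION_KEY BACKUP_ENCRYPTION_KEY_ID; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      rc=1
    fi
  done
  # SERVICE_JWT_SECRET: at least 32 characters
  if [ -n "${SERVICE_JWT_SECRET:-}" ] && [ "${#SERVICE_JWT_SECRET}" -lt 32 ]; then
    echo "SERVICE_JWT_SECRET must be at least 32 characters" >&2
    rc=1
  fi
  # BACKUP_ENCRYPTION_KEY: exactly 64 hex characters (256 bits)
  if [ -n "${BACKUP_ENCRYPTION_KEY:-}" ]; then
    case "$BACKUP_ENCRYPTION_KEY" in
      *[!0-9a-fA-F]*)
        echo "BACKUP_ENCRYPTION_KEY must contain only hex characters" >&2
        rc=1 ;;
    esac
    if [ "${#BACKUP_ENCRYPTION_KEY}" -ne 64 ]; then
      echo "BACKUP_ENCRYPTION_KEY must be 64 characters long" >&2
      rc=1
    fi
  fi
  return "$rc"
}
```

Running `check_backup_env || exit 1` at the top of a deploy script turns a missing or malformed variable into an early, explicit failure instead of a container that will not start.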
Security Considerations
Network Security
- Isolate backup-service network
  - Use private subnets
  - Restrict access to identity-service only
  - Use VPN or VPC peering for cross-server communication
- Firewall rules
  # Allow only identity-service IP
  iptables -A INPUT -p tcp --dport 3002 -s identity-service-ip -j ACCEPT
  iptables -A INPUT -p tcp --dport 3002 -j DROP
- TLS/SSL
  - Use reverse proxy (nginx/traefik) for TLS termination
  - Enable mutual TLS for service-to-service communication
Secret Management
- Use secret management services
  - AWS Secrets Manager
  - HashiCorp Vault
  - Kubernetes Secrets with encryption at rest
- Rotate secrets regularly
  - Rotate encryption keys annually
  - Rotate JWT secrets quarterly
  - Use key versioning for encryption keys
Database Security
- Use strong passwords
- Enable SSL for database connections
- Regular backups with encryption
- Limit database user permissions
Monitoring and Logging
Health Endpoints
| Endpoint | Purpose |
|---|---|
| GET /health | Basic health check |
| GET /health/ready | Readiness probe (includes DB check) |
| GET /health/live | Liveness probe |
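Outside Kubernetes, the readiness endpoint can be polled from a deploy script before traffic is sent to a new container. A minimal sketch follows; wait_for_ready and PROBE are hypothetical names, the probe defaults to curl, and the attempt count and delay are arbitrary defaults.

```shell
# Sketch: poll an endpoint until it answers or the attempts run out.
# wait_for_ready and PROBE are hypothetical names.
PROBE="${PROBE:-curl -fsS -o /dev/null}"

wait_for_ready() {
  url="$1"
  attempts="${2:-30}"
  delay="${3:-2}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # $PROBE is intentionally unquoted so its flags split into separate words
    if $PROBE "$url"; then
      echo "ready after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not ready after $attempts attempts: $url" >&2
  return 1
}
```

A typical call would be `wait_for_ready http://localhost:3002/health/ready`. With curl's -f flag, any HTTP status of 400 or above counts as a failed attempt.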
Prometheus Metrics (Optional)
# Add to deployment
- name: PROMETHEUS_ENABLED
  value: "true"
- name: PROMETHEUS_PORT
  value: "9102"
Log Aggregation
Configure log driver for centralized logging:
logging:
  driver: "fluentd"
  options:
    fluentd-address: "localhost:24224"
    tag: "backup-service"
Troubleshooting
Common Issues
Service won't start
# Check logs
docker-compose logs backup-service
# Common causes:
# 1. Database not ready
# 2. Missing environment variables
# 3. Invalid encryption key format
Database connection failed
# Check database is running
docker-compose ps backup-db
# Check database logs
docker-compose logs backup-db
# Test connection (reports migration status without modifying the schema file)
docker-compose exec backup-service \
  npx prisma migrate status
Authentication errors
# Verify JWT secret matches between services
# Check ALLOWED_SERVICES includes calling service
# Verify token format and expiration
Recovery Procedures
Database Recovery
# Stop service
docker-compose stop backup-service
# Restore from backup
docker-compose exec -T backup-db psql -U postgres rwa_backup < backup.sql
# Start service (exec only works against a running container)
docker-compose start backup-service
# Run migrations
docker-compose exec backup-service npx prisma migrate deploy
Key Rotation
- Add new key to encryption service
- Re-encrypt existing data with new key
- Update BACKUP_ENCRYPTION_KEY_ID
- Remove old key after transition period
Scaling
Horizontal Scaling
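The service is stateless and can be horizontally scaled:
# Docker Compose scale
# (first remove container_name and the fixed "3002:3002" host port from
#  backup-service in the compose file: container names and host ports must be
#  unique, so scaling past one container fails while they are set)
docker-compose up -d --scale backup-service=3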
# Kubernetes replicas
kubectl -n rwa-backup scale deployment/backup-service --replicas=3
Load Balancing
Use a load balancer in front of multiple instances:
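# nginx.conf
upstream backup_service {
    least_conn;
    server backup-service-1:3002;
    server backup-service-2:3002;
    server backup-service-3:3002;
}

server {
    listen 443 ssl;
    server_name backup-api.example.com;

    # Placeholder certificate paths; an ssl listener needs both directives
    ssl_certificate     /etc/nginx/certs/backup-api.crt;
    ssl_certificate_key /etc/nginx/certs/backup-api.key;

    location / {
        proxy_pass http://backup_service;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}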
Database Scaling
For high availability:
- Use managed PostgreSQL (AWS RDS, GCP Cloud SQL)
- Configure read replicas for read scaling
- Use connection pooling (PgBouncer)
Maintenance
Regular Tasks
| Task | Frequency | Command |
|---|---|---|
| Database backup | Daily | pg_dump rwa_backup > backup.sql |
| Log rotation | Weekly | Automatic with log driver config |
| Security updates | Monthly | Rebuild and redeploy image |
| Audit log cleanup | Monthly | DELETE FROM share_access_logs WHERE created_at < NOW() - INTERVAL '365 days' |
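The cleanup statement in the table hard-codes 365 days, while the service reads its retention period from AUDIT_LOG_RETENTION_DAYS. The sketch below builds the same statement from that variable so the two stay in sync; audit_cleanup_sql is a hypothetical helper, and the table and column names are taken from the table above.

```shell
# Sketch: build the audit-log cleanup statement from AUDIT_LOG_RETENTION_DAYS.
# audit_cleanup_sql is a hypothetical helper name.
audit_cleanup_sql() {
  days="${AUDIT_LOG_RETENTION_DAYS:-365}"
  # Accept digits only, so the value can be placed into the SQL text safely
  case "$days" in
    *[!0-9]*)
      echo "AUDIT_LOG_RETENTION_DAYS must be a whole number" >&2
      return 1 ;;
  esac
  printf "DELETE FROM share_access_logs WHERE created_at < NOW() - INTERVAL '%s days';\n" "$days"
}
```

The output can be piped into the database container, for example `audit_cleanup_sql | docker-compose exec -T backup-db psql -U postgres rwa_backup`.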
Update Procedure
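# 1. Build new image
docker build -t rwa-durian/backup-service:v1.1.0 .
# 2. Tag and push to registry
docker tag rwa-durian/backup-service:v1.1.0 registry.example.com/backup-service:v1.1.0
docker push registry.example.com/backup-service:v1.1.0
# 3. Update deployment (point the image in the compose file at the new tag first)
docker-compose pull
docker-compose up -d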
# 4. Run migrations if needed
docker-compose exec backup-service npx prisma migrate deploy
# 5. Verify health
curl http://localhost:3002/health