# Reward Service Deployment Guide

## Deployment Overview

This document describes the deployment architecture and operating procedures for the Reward Service.

### Deployment Architecture

```
                      ┌─────────────────┐
                      │  Load Balancer  │
                      │   (Nginx/ALB)   │
                      └────────┬────────┘
                               │
              ┌────────────────┼────────────────┐
              │                │                │
       ┌──────▼──────┐  ┌──────▼──────┐  ┌──────▼──────┐
       │ Reward Svc  │  │ Reward Svc  │  │ Reward Svc  │
       │ Instance 1  │  │ Instance 2  │  │ Instance 3  │
       └──────┬──────┘  └──────┬──────┘  └──────┬──────┘
              │                │                │
              └────────────────┼────────────────┘
                               │
       ┌───────────────────────┼───────────────────────┐
       │                       │                       │
┌──────▼──────┐         ┌──────▼──────┐         ┌──────▼──────┐
│ PostgreSQL  │         │    Redis    │         │    Kafka    │
│   Primary   │         │   Cluster   │         │   Cluster   │
└──────┬──────┘         └─────────────┘         └─────────────┘
       │
┌──────▼──────┐
│ PostgreSQL  │
│   Replica   │
└─────────────┘
```

---

## Environment Requirements

### Production Configuration

| Component | Minimum | Recommended |
|------|---------|---------|
| CPU | 2 vCPU | 4 vCPU |
| Memory | 4 GB | 8 GB |
| Storage | 50 GB SSD | 100 GB SSD |
| Node.js | 20.x LTS | 20.x LTS |

### Infrastructure Requirements

| Service | Version | Notes |
|------|------|------|
| PostgreSQL | 15.x | Primary database |
| Redis | 7.x | Cache and sessions |
| Apache Kafka | 3.x | Message queue |

---

## Docker Deployment

### Dockerfile

```dockerfile
# Build stage
FROM node:20-alpine AS builder

WORKDIR /app

# Copy dependency manifests
COPY package*.json ./
COPY prisma ./prisma/

# Install dependencies
RUN npm ci

# Generate the Prisma Client
RUN npx prisma generate

# Copy the source code
COPY . .

# Build
RUN npm run build

# Production stage
FROM node:20-alpine AS production

WORKDIR /app

# Install production dependencies only
# (--omit=dev replaces the deprecated --only=production flag)
COPY package*.json ./
RUN npm ci --omit=dev

# Copy build artifacts
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/prisma ./prisma
COPY --from=builder /app/node_modules/.prisma ./node_modules/.prisma

# Set environment variables
ENV NODE_ENV=production
ENV PORT=3000

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
  CMD wget --no-verbose --tries=1 --spider http://localhost:3000/health || exit 1

# Expose the service port
EXPOSE 3000

# Start command
CMD ["node", "dist/main.js"]
```

### docker-compose.yml (production)

```yaml
version: '3.8'

services:
  reward-service:
    build:
      context: .
      dockerfile: Dockerfile
    # Note: outside Swarm mode, a fixed host port such as 3000:3000 can be bound
    # by only one replica per host; front the replicas with a load balancer or
    # drop the host port if several replicas share a host.
    ports:
      - "3000:3000"
    environment:
      - NODE_ENV=production
      - DATABASE_URL=${DATABASE_URL}
      - REDIS_HOST=${REDIS_HOST}
      - REDIS_PORT=${REDIS_PORT}
      - KAFKA_BROKERS=${KAFKA_BROKERS}
      - JWT_SECRET=${JWT_SECRET}
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
      kafka:
        condition: service_healthy
    deploy:
      replicas: 3
      resources:
        limits:
          cpus: '2'
          memory: 4G
        reservations:
          cpus: '1'
          memory: 2G
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3

  postgres:
    image: postgres:15-alpine
    volumes:
      - postgres-data:/var/lib/postgresql/data
    environment:
      - POSTGRES_USER=${POSTGRES_USER}
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
      - POSTGRES_DB=${POSTGRES_DB}
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER} -d ${POSTGRES_DB}"]
      interval: 10s
      timeout: 5s
      retries: 5

  redis:
    image: redis:7-alpine
    volumes:
      - redis-data:/data
    command: redis-server --appendonly yes
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 5

  kafka:
    image: confluentinc/cp-kafka:7.5.0
    depends_on:
      - zookeeper
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
    healthcheck:
      test: ["CMD-SHELL", "kafka-broker-api-versions --bootstrap-server localhost:9092"]
      interval: 10s
      timeout: 10s
      retries: 5

  zookeeper:
    image: confluentinc/cp-zookeeper:7.5.0
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181

volumes:
  postgres-data:
  redis-data:
```

### Build and push the image

```bash
# Build the image
docker build -t reward-service:latest .
# Tag the image
docker tag reward-service:latest your-registry/reward-service:v1.0.0

# Push to the image registry
docker push your-registry/reward-service:v1.0.0
```

---

## Kubernetes Deployment

### Deployment

```yaml
# k8s/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: reward-service
  namespace: rwadurian
  labels:
    app: reward-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: reward-service
  template:
    metadata:
      labels:
        app: reward-service
    spec:
      containers:
        - name: reward-service
          image: your-registry/reward-service:v1.0.0
          ports:
            - containerPort: 3000
          env:
            - name: NODE_ENV
              value: "production"
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: reward-service-secrets
                  key: database-url
            - name: REDIS_HOST
              valueFrom:
                configMapKeyRef:
                  name: reward-service-config
                  key: redis-host
            # REDIS_PORT and KAFKA_BROKERS are read from the ConfigMap keys
            # defined in k8s/configmap.yaml
            - name: REDIS_PORT
              valueFrom:
                configMapKeyRef:
                  name: reward-service-config
                  key: redis-port
            - name: KAFKA_BROKERS
              valueFrom:
                configMapKeyRef:
                  name: reward-service-config
                  key: kafka-brokers
            - name: JWT_SECRET
              valueFrom:
                secretKeyRef:
                  name: reward-service-secrets
                  key: jwt-secret
          resources:
            requests:
              cpu: "500m"
              memory: "512Mi"
            limits:
              cpu: "2000m"
              memory: "4Gi"
          livenessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 5
```

### Service

```yaml
# k8s/service.yaml
apiVersion: v1
kind: Service
metadata:
  name: reward-service
  namespace: rwadurian
spec:
  selector:
    app: reward-service
  ports:
    - protocol: TCP
      port: 80
      targetPort: 3000
  type: ClusterIP
```

### Ingress

```yaml
# k8s/ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: reward-service-ingress
  namespace: rwadurian
  annotations:
    # Note: with ingress-nginx 0.22+, a rewrite-target of "/" sends every
    # matched request to "/"; use capture groups if sub-paths under /rewards
    # must reach the service unchanged.
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
    - host: api.rwadurian.com
      http:
        paths:
          - path: /rewards
            pathType: Prefix
            backend:
              service:
                name: reward-service
                port:
                  number: 80
```

### ConfigMap

```yaml
# k8s/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: reward-service-config
  namespace: rwadurian
data:
  redis-host: "redis-master.rwadurian.svc.cluster.local"
  redis-port: "6379"
  kafka-brokers: "kafka-0.kafka.rwadurian.svc.cluster.local:9092"
```

### Secret
```yaml
# k8s/secret.yaml
# The values below are placeholders; replace them before applying and keep
# real secrets out of version control.
apiVersion: v1
kind: Secret
metadata:
  name: reward-service-secrets
  namespace: rwadurian
type: Opaque
stringData:
  database-url: "postgresql://user:password@postgres:5432/reward_db"
  jwt-secret: "your-jwt-secret-key"
```

### HorizontalPodAutoscaler

```yaml
# k8s/hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: reward-service-hpa
  namespace: rwadurian
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: reward-service
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
```

### Deployment commands

```bash
# Create the namespace
kubectl create namespace rwadurian

# Apply configuration
kubectl apply -f k8s/configmap.yaml
kubectl apply -f k8s/secret.yaml

# Deploy the service
kubectl apply -f k8s/deployment.yaml
kubectl apply -f k8s/service.yaml
kubectl apply -f k8s/ingress.yaml
kubectl apply -f k8s/hpa.yaml

# Check deployment status
kubectl get pods -n rwadurian
kubectl get services -n rwadurian

# Follow the logs
kubectl logs -f deployment/reward-service -n rwadurian
```

---

## Database Migration

### Production migration

```bash
# 1. Back up the database
pg_dump -h $DB_HOST -U $DB_USER -d $DB_NAME > backup_$(date +%Y%m%d).sql

# 2. Run the migrations
DATABASE_URL=$PRODUCTION_DATABASE_URL npx prisma migrate deploy

# 3. Verify the migration
npx prisma db pull --print
```

### Rollback strategy

```bash
# Show the migration history
npx prisma migrate status

# Roll back to a specific version (manual)
psql -h $DB_HOST -U $DB_USER -d $DB_NAME < rollback_migration.sql
```

---

## Environment Variables

### Production environment variables

| Variable | Description | Example |
|------|------|------|
| `NODE_ENV` | Runtime environment | `production` |
| `PORT` | Service port | `3000` |
| `DATABASE_URL` | Database connection string | `postgresql://user:pass@host:5432/db` |
| `REDIS_HOST` | Redis host | `redis-master` |
| `REDIS_PORT` | Redis port | `6379` |
| `KAFKA_BROKERS` | Kafka broker list | `kafka-0:9092,kafka-1:9092` |
| `KAFKA_CLIENT_ID` | Kafka client ID | `reward-service` |
| `KAFKA_GROUP_ID` | Kafka consumer group ID | `reward-service-group` |
| `JWT_SECRET` | JWT secret | `` |
| `LOG_LEVEL` | Log level | `info` |

---

## Monitoring and Alerting

### Health check endpoint

```http
GET /health
```

Response:

```json
{
  "status": "ok",
  "service": "reward-service",
  "timestamp": "2024-12-01T00:00:00.000Z"
}
```

### Prometheus metrics

Add `@nestjs/terminus` for health checks and expose Prometheus metrics through a controller. The `register` object in the example is assumed to be the default registry exported by `prom-client`:

```typescript
// src/api/controllers/metrics.controller.ts
import { Controller, Get, Header } from '@nestjs/common';
// Assumption: metrics are collected with prom-client's default registry
import { register } from 'prom-client';

@Controller('metrics')
export class MetricsController {
  // Serves all registered metrics in the Prometheus text format
  @Get()
  @Header('Content-Type', 'text/plain')
  async getMetrics(): Promise<string> {
    return register.metrics();
  }
}
```

### Key metrics

| Metric | Description | Alert threshold |
|------|------|---------|
| `http_request_duration_seconds` | Request latency | P99 > 2s |
| `http_requests_total` | Total requests | - |
| `http_request_errors_total` | Failed requests | Error rate > 1% |
| `reward_distributed_total` | Rewards distributed | - |
| `reward_settled_total` | Rewards settled | - |
| `reward_expired_total` | Rewards expired | - |

### Grafana dashboard

Key panels:

1. Request throughput (QPS)
2. Latency distribution (P50/P90/P99)
3. Error rate
4. Reward distribution/settlement/expiry trends
5. Database connection pool status
6. Redis cache hit rate

---

## Log Management

### Log format

Structured log output:

```json
{
  "timestamp": "2024-12-01T00:00:00.000Z",
  "level": "info",
  "context": "RewardApplicationService",
  "message": "Distributed 6 rewards for order 123",
  "metadata": {
    "orderId": "123",
    "userId": "100",
    "rewardCount": 6
  }
}
```

### Log levels

| Level | Purpose |
|------|------|
| `error` | Errors and exceptions |
| `warn` | Warnings |
| `info` | Business events |
| `debug` | Debugging output (development only) |

### ELK integration

```yaml
# filebeat.yml
filebeat.inputs:
  - type: container
    paths:
      - /var/lib/docker/containers/*/*.log
    processors:
      - add_kubernetes_metadata:

output.elasticsearch:
  hosts: ["elasticsearch:9200"]
  indices:
    - index: "reward-service-%{+yyyy.MM.dd}"
```

---

## CI/CD Pipeline

### GitHub Actions

```yaml
# .github/workflows/deploy.yml
name: Deploy to Production

on:
  push:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
      - run: npm ci
      - run: npm run lint
      - run: npm test

  build:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build Docker image
        run: docker build -t reward-service:${{ github.sha }} .
      # Assumes the runner is already authenticated to the registry
      - name: Push to registry
        run: |
          docker tag reward-service:${{ github.sha }} ${{ secrets.REGISTRY }}/reward-service:${{ github.sha }}
          docker push ${{ secrets.REGISTRY }}/reward-service:${{ github.sha }}

  deploy:
    needs: build
    runs-on: ubuntu-latest
    steps:
      # Assumes kubectl on the runner is configured with cluster credentials
      - name: Deploy to Kubernetes
        run: |
          kubectl set image deployment/reward-service \
            reward-service=${{ secrets.REGISTRY }}/reward-service:${{ github.sha }} \
            -n rwadurian
```

---

## Troubleshooting

### Common issues

#### 1. Service fails to start

```bash
# Check the logs
kubectl logs -f deployment/reward-service -n rwadurian

# Inspect the environment variables
kubectl exec -it deployment/reward-service -n rwadurian -- env

# Check the database connection
kubectl exec -it deployment/reward-service -n rwadurian -- \
  npx prisma db pull
```

#### 2. Database connection problems

```bash
# Test the database connection from inside the service namespace
kubectl run -it --rm debug --image=postgres:15-alpine --restart=Never -n rwadurian -- \
  psql -h postgres -U user -d reward_db

# Check network policies
kubectl get networkpolicy -n rwadurian
```

#### 3. Kafka connection problems

```bash
# List Kafka topics
kubectl exec -it kafka-0 -n rwadurian -- \
  kafka-topics --list --bootstrap-server localhost:9092

# Inspect the consumer group
kubectl exec -it kafka-0 -n rwadurian -- \
  kafka-consumer-groups --bootstrap-server localhost:9092 --describe --group reward-service-group
```

### Rolling back a deployment

```bash
# Show the revision history
kubectl rollout history deployment/reward-service -n rwadurian

# Roll back to the previous revision
kubectl rollout undo deployment/reward-service -n rwadurian

# Roll back to a specific revision
kubectl rollout undo deployment/reward-service -n rwadurian --to-revision=2
```

---

## Security Best Practices

1. **Secret management**: Use Kubernetes Secrets or an external secret manager (Vault)
2. **Network isolation**: Use NetworkPolicy to restrict Pod-to-Pod traffic
3. **Image security**: Scan images for vulnerabilities regularly
4. **Least privilege**: Run containers as a non-root user
5. **TLS**: Enable mTLS between services
6. **Audit logging**: Record all sensitive operations
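
As a complement to the secret-management practice above, the service could refuse to start when a required production variable is missing or still holds a documentation placeholder. The following is only a minimal sketch: the variable names come from the environment variable table in this guide, while `validateEnv`, `REQUIRED_VARS`, and `PLACEHOLDER_SECRETS` are hypothetical names that are not part of the Reward Service codebase.

```typescript
// Hypothetical startup check (not part of the actual service code).
// Variable names are taken from this guide's environment variable table.
type Env = Record<string, string | undefined>;

const REQUIRED_VARS = [
  "DATABASE_URL",
  "REDIS_HOST",
  "REDIS_PORT",
  "KAFKA_BROKERS",
  "JWT_SECRET",
] as const;

// Placeholder values used in this guide's examples; they must never reach production.
const PLACEHOLDER_SECRETS = new Set<string>(["your-jwt-secret-key"]);

// Returns a list of human-readable problems; an empty list means the
// configuration passed these (deliberately simple) checks.
function validateEnv(env: Env): string[] {
  const problems: string[] = [];

  // Every required variable must be present and non-blank.
  for (const name of REQUIRED_VARS) {
    const value = env[name];
    if (value === undefined || value.trim() === "") {
      problems.push(`${name} is missing or empty`);
    }
  }

  // Reject the placeholder secret copied from the documentation.
  const secret = env["JWT_SECRET"];
  if (secret !== undefined && PLACEHOLDER_SECRETS.has(secret)) {
    problems.push("JWT_SECRET is still set to the documentation placeholder");
  }

  // The Redis port must be numeric when it is provided.
  const port = env["REDIS_PORT"];
  if (port !== undefined && port.trim() !== "" && !/^\d+$/.test(port)) {
    problems.push("REDIS_PORT must be a number");
  }

  return problems;
}
```

One possible use is to call `validateEnv(process.env)` before the application is created and exit with a non-zero code when the returned list is non-empty, so that a misconfigured Pod fails at startup and shows up in `kubectl get pods` instead of failing later at runtime.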