# Reward Service Deployment Guide

## Deployment overview

This document describes the deployment architecture of Reward Service and the procedures for operating it.

### Deployment architecture

```
                    ┌─────────────────┐
                    │  Load Balancer  │
                    │   (Nginx/ALB)   │
                    └────────┬────────┘
                             │
            ┌────────────────┼────────────────┐
            │                │                │
     ┌──────▼──────┐  ┌──────▼──────┐  ┌──────▼──────┐
     │ Reward Svc  │  │ Reward Svc  │  │ Reward Svc  │
     │ Instance 1  │  │ Instance 2  │  │ Instance 3  │
     └──────┬──────┘  └──────┬──────┘  └──────┬──────┘
            │                │                │
            └────────────────┼────────────────┘
                             │
     ┌───────────────────────┼───────────────────────┐
     │                       │                       │
┌────▼─────┐          ┌──────▼──────┐         ┌──────▼──────┐
│PostgreSQL│          │    Redis    │         │    Kafka    │
│ Primary  │          │   Cluster   │         │   Cluster   │
└────┬─────┘          └─────────────┘         └─────────────┘
     │
┌────▼─────┐
│PostgreSQL│
│ Replica  │
└──────────┘
```

---

## Environment requirements

### Production environment

| Component | Minimum | Recommended |
|-----------|---------|-------------|
| CPU | 2 vCPU | 4 vCPU |
| Memory | 4 GB | 8 GB |
| Storage | 50 GB SSD | 100 GB SSD |
| Node.js | 20.x LTS | 20.x LTS |

### Infrastructure requirements

| Service | Version | Purpose |
|---------|---------|---------|
| PostgreSQL | 15.x | Primary database |
| Redis | 7.x | Cache and sessions |
| Apache Kafka | 3.x | Message queue |

---

## Docker deployment

### Dockerfile

```dockerfile
# Build stage
FROM node:20-alpine AS builder

WORKDIR /app

# Copy dependency manifests
COPY package*.json ./
COPY prisma ./prisma/

# Install dependencies
RUN npm ci

# Generate the Prisma Client
RUN npx prisma generate

# Copy the source code
COPY . .

# Build
RUN npm run build

# Production stage
FROM node:20-alpine AS production

WORKDIR /app

# Install production dependencies only
# (--omit=dev replaces the deprecated --only=production flag)
COPY package*.json ./
RUN npm ci --omit=dev

# Copy build artifacts
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/prisma ./prisma
COPY --from=builder /app/node_modules/.prisma ./node_modules/.prisma

# Set environment variables
ENV NODE_ENV=production
ENV PORT=3000

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
  CMD wget --no-verbose --tries=1 --spider http://localhost:3000/health || exit 1

# Expose the port
EXPOSE 3000

# Start command
CMD ["node", "dist/main.js"]
```

### docker-compose.yml (production)

```yaml
version: '3.8'

services:
  reward-service:
    build:
      context: .
      dockerfile: Dockerfile
    ports:
      - "3000:3000"
    environment:
      - NODE_ENV=production
      - DATABASE_URL=${DATABASE_URL}
      - REDIS_HOST=${REDIS_HOST}
      - REDIS_PORT=${REDIS_PORT}
      - KAFKA_BROKERS=${KAFKA_BROKERS}
      - JWT_SECRET=${JWT_SECRET}
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
      kafka:
        condition: service_healthy
    deploy:
      replicas: 3
      resources:
        limits:
          cpus: '2'
          memory: 4G
        reservations:
          cpus: '1'
          memory: 2G
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3

  postgres:
    image: postgres:15-alpine
    volumes:
      - postgres-data:/var/lib/postgresql/data
    environment:
      - POSTGRES_USER=${POSTGRES_USER}
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD}
      - POSTGRES_DB=${POSTGRES_DB}
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER} -d ${POSTGRES_DB}"]
      interval: 10s
      timeout: 5s
      retries: 5

  redis:
    image: redis:7-alpine
    volumes:
      - redis-data:/data
    command: redis-server --appendonly yes
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 5

  kafka:
    image: confluentinc/cp-kafka:7.5.0
    depends_on:
      - zookeeper
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://kafka:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
    healthcheck:
      test: ["CMD-SHELL", "kafka-broker-api-versions --bootstrap-server localhost:9092"]
      interval: 10s
      timeout: 10s
      retries: 5

  zookeeper:
    image: confluentinc/cp-zookeeper:7.5.0
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181

volumes:
  postgres-data:
  redis-data:
```

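The compose file hands configuration to the container through environment variables. As a minimal sketch, assuming the service validates them once at startup, the check could look like the following. `loadConfig` and `RewardServiceConfig` are hypothetical names used for illustration and are not taken from the Reward Service codebase; only the variable names come from this guide.

```typescript
// Hypothetical config shape; field names are illustrative, not from the service code.
interface RewardServiceConfig {
  databaseUrl: string;
  redisHost: string;
  redisPort: number;
  kafkaBrokers: string[];
  jwtSecret: string;
}

// Variables the compose file injects into the reward-service container.
const REQUIRED_VARS = [
  "DATABASE_URL",
  "REDIS_HOST",
  "REDIS_PORT",
  "KAFKA_BROKERS",
  "JWT_SECRET",
] as const;

// Validate a plain key/value map (for example process.env) and return a typed config.
// Throws with the full list of missing variables so a misconfigured container fails fast.
function loadConfig(env: Record<string, string | undefined>): RewardServiceConfig {
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }

  const redisPort = Number(env["REDIS_PORT"]);
  if (!Number.isInteger(redisPort) || redisPort < 1 || redisPort > 65535) {
    throw new Error(`REDIS_PORT is not a valid port: ${env["REDIS_PORT"]}`);
  }

  return {
    databaseUrl: env["DATABASE_URL"] as string,
    redisHost: env["REDIS_HOST"] as string,
    redisPort,
    // KAFKA_BROKERS is a comma-separated list such as "kafka-0:9092,kafka-1:9092".
    kafkaBrokers: (env["KAFKA_BROKERS"] as string).split(",").map((broker) => broker.trim()),
    jwtSecret: env["JWT_SECRET"] as string,
  };
}
```

Calling `loadConfig(process.env)` at the top of the bootstrap function would surface a missing `JWT_SECRET` before the HTTP server starts, rather than on the first authenticated request.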
### Build and push the image

```bash
# Build the image
docker build -t reward-service:latest .

# Tag the image
docker tag reward-service:latest your-registry/reward-service:v1.0.0

# Push to the image registry
docker push your-registry/reward-service:v1.0.0
```

---

## Kubernetes deployment

### Deployment

```yaml
# k8s/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: reward-service
  namespace: rwadurian
  labels:
    app: reward-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: reward-service
  template:
    metadata:
      labels:
        app: reward-service
    spec:
      containers:
        - name: reward-service
          image: your-registry/reward-service:v1.0.0
          ports:
            - containerPort: 3000
          env:
            - name: NODE_ENV
              value: "production"
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: reward-service-secrets
                  key: database-url
            - name: REDIS_HOST
              valueFrom:
                configMapKeyRef:
                  name: reward-service-config
                  key: redis-host
            - name: JWT_SECRET
              valueFrom:
                secretKeyRef:
                  name: reward-service-secrets
                  key: jwt-secret
          resources:
            requests:
              cpu: "500m"
              memory: "512Mi"
            limits:
              cpu: "2000m"
              memory: "4Gi"
          livenessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 5
```

### Service

```yaml
# k8s/service.yaml
apiVersion: v1
kind: Service
metadata:
  name: reward-service
  namespace: rwadurian
spec:
  selector:
    app: reward-service
  ports:
    - protocol: TCP
      port: 80
      targetPort: 3000
  type: ClusterIP
```

### Ingress

```yaml
# k8s/ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: reward-service-ingress
  namespace: rwadurian
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
    - host: api.rwadurian.com
      http:
        paths:
          - path: /rewards
            pathType: Prefix
            backend:
              service:
                name: reward-service
                port:
                  number: 80
```

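With `pathType: Prefix`, Kubernetes compares the request path against `/rewards` element by element rather than as a raw string prefix. The sketch below mirrors that rule to show which request paths this Ingress rule would accept; `matchesPrefix` and `pathElements` are hypothetical helpers written for illustration and are not part of any Kubernetes client library.

```typescript
// Split a URL path into its non-empty elements: "/rewards/123/" -> ["rewards", "123"].
function pathElements(path: string): string[] {
  return path.split("/").filter((element) => element.length > 0);
}

// Element-wise, case-sensitive prefix match, as described for pathType: Prefix.
function matchesPrefix(rulePath: string, requestPath: string): boolean {
  const rule = pathElements(rulePath);
  const request = pathElements(requestPath);
  if (rule.length > request.length) {
    return false;
  }
  return rule.every((element, i) => element === request[i]);
}

const samples = ["/rewards", "/rewards/", "/rewards/123/settle", "/rewardsX", "/Rewards"];
const routed = samples.filter((path) => matchesPrefix("/rewards", path));
// routed keeps "/rewards", "/rewards/" and "/rewards/123/settle";
// "/rewardsX" and "/Rewards" do not match.
```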
### ConfigMap

```yaml
# k8s/configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: reward-service-config
  namespace: rwadurian
data:
  redis-host: "redis-master.rwadurian.svc.cluster.local"
  redis-port: "6379"
  kafka-brokers: "kafka-0.kafka.rwadurian.svc.cluster.local:9092"
```

### Secret

```yaml
# k8s/secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: reward-service-secrets
  namespace: rwadurian
type: Opaque
stringData:
  database-url: "postgresql://user:password@postgres:5432/reward_db"
  jwt-secret: "your-jwt-secret-key"
```

### HorizontalPodAutoscaler

```yaml
# k8s/hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: reward-service-hpa
  namespace: rwadurian
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: reward-service
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
```

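To make the effect of the two utilization targets concrete, here is a small sketch of the scaling formula documented for the HorizontalPodAutoscaler, `desired = ceil(currentReplicas * currentMetric / targetMetric)`, with the largest proposal across metrics clamped to `minReplicas` and `maxReplicas`. Utilization is measured relative to the container's resource requests. The function and type names are hypothetical, and the real controller additionally applies a tolerance band and stabilization behaviour that this sketch leaves out.

```typescript
interface MetricReading {
  currentUtilization: number; // observed average utilization, in percent of requests
  targetUtilization: number; // averageUtilization from the HPA spec, in percent
}

// Simplified model of the HPA calculation; not the controller's actual implementation.
function desiredReplicas(
  currentReplicas: number,
  metrics: MetricReading[],
  minReplicas: number,
  maxReplicas: number,
): number {
  const proposals = metrics.map((m) =>
    Math.ceil(currentReplicas * (m.currentUtilization / m.targetUtilization)),
  );
  // With several metrics, the largest proposal wins.
  const desired = Math.max(...proposals);
  return Math.min(maxReplicas, Math.max(minReplicas, desired));
}

// 3 replicas at 105% CPU (target 70) and 60% memory (target 80).
const scaled = desiredReplicas(
  3,
  [
    { currentUtilization: 105, targetUtilization: 70 },
    { currentUtilization: 60, targetUtilization: 80 },
  ],
  3,
  10,
);
// scaled === 5: CPU proposes ceil(3 * 1.5) = 5, memory proposes ceil(3 * 0.75) = 3.
```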
### Deployment commands

```bash
# Create the namespace
kubectl create namespace rwadurian

# Apply configuration
kubectl apply -f k8s/configmap.yaml
kubectl apply -f k8s/secret.yaml

# Deploy the service
kubectl apply -f k8s/deployment.yaml
kubectl apply -f k8s/service.yaml
kubectl apply -f k8s/ingress.yaml
kubectl apply -f k8s/hpa.yaml

# Check deployment status
kubectl get pods -n rwadurian
kubectl get services -n rwadurian

# View logs
kubectl logs -f deployment/reward-service -n rwadurian
```

---

## Database migration

### Production migration

```bash
# 1. Back up the database
pg_dump -h $DB_HOST -U $DB_USER -d $DB_NAME > backup_$(date +%Y%m%d).sql

# 2. Run the migrations
DATABASE_URL=$PRODUCTION_DATABASE_URL npx prisma migrate deploy

# 3. Verify the migration against the same database
DATABASE_URL=$PRODUCTION_DATABASE_URL npx prisma db pull --print
```

### Rollback strategy

```bash
# Show the migration history
npx prisma migrate status

# Roll back to a specific version (manual)
psql -h $DB_HOST -U $DB_USER -d $DB_NAME < rollback_migration.sql
```

---

## Environment variables

### Production environment variables

| Variable | Description | Example |
|----------|-------------|---------|
| `NODE_ENV` | Runtime environment | `production` |
| `PORT` | Service port | `3000` |
| `DATABASE_URL` | Database connection string | `postgresql://user:pass@host:5432/db` |
| `REDIS_HOST` | Redis host | `redis-master` |
| `REDIS_PORT` | Redis port | `6379` |
| `KAFKA_BROKERS` | Kafka broker list | `kafka-0:9092,kafka-1:9092` |
| `KAFKA_CLIENT_ID` | Kafka client ID | `reward-service` |
| `KAFKA_GROUP_ID` | Kafka consumer group ID | `reward-service-group` |
| `JWT_SECRET` | JWT signing secret | `<strong-secret>` |
| `LOG_LEVEL` | Log level | `info` |

---

## Monitoring and alerting

### Health check endpoint

```http
GET /health
```

Response:

```json
{
  "status": "ok",
  "service": "reward-service",
  "timestamp": "2024-12-01T00:00:00.000Z"
}
```

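As a rough illustration of how a handler could produce the response above, the sketch below builds the same three fields. `buildHealthResponse` and `HealthResponse` are hypothetical names and do not come from the service; the clock is passed in so the output is reproducible.

```typescript
interface HealthResponse {
  status: "ok";
  service: string;
  timestamp: string; // ISO 8601, UTC
}

// Build the body returned by GET /health. The clock is injectable so the
// function stays deterministic in tests.
function buildHealthResponse(now: Date = new Date()): HealthResponse {
  return {
    status: "ok",
    service: "reward-service",
    timestamp: now.toISOString(),
  };
}

const sample = buildHealthResponse(new Date(Date.UTC(2024, 11, 1)));
// sample.timestamp is "2024-12-01T00:00:00.000Z" (month index 11 is December).
```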
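### Prometheus metrics

Add `@nestjs/terminus` and Prometheus metrics:

```typescript
// src/api/controllers/metrics.controller.ts
import { Controller, Get, Header } from '@nestjs/common';
// Assumption: `register` is the default registry exported by prom-client.
import { register } from 'prom-client';

@Controller('metrics')
export class MetricsController {
  // Expose all registered metrics in the Prometheus text format.
  @Get()
  @Header('Content-Type', 'text/plain')
  async getMetrics(): Promise<string> {
    return register.metrics();
  }
}
```
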
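The alert thresholds this guide uses for request metrics are a P99 latency above 2 s and an error rate above 1%. The sketch below shows the arithmetic behind those two checks on raw numbers. It is an illustration only: in a real setup the rules would be evaluated by Prometheus from histogram buckets and counters, and `percentile` and `evaluateAlerts` are hypothetical helpers.

```typescript
// Nearest-rank percentile over raw samples. Prometheus itself estimates quantiles
// from histogram buckets, so this is a simplification for illustration.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) {
    throw new Error("percentile of an empty sample set is undefined");
  }
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

interface AlertState {
  latencyAlert: boolean; // P99 latency above 2 seconds
  errorRateAlert: boolean; // error rate above 1%
}

function evaluateAlerts(
  durationsSeconds: number[],
  totalRequests: number,
  errorRequests: number,
): AlertState {
  const p99 = percentile(durationsSeconds, 99);
  const errorRate = totalRequests === 0 ? 0 : errorRequests / totalRequests;
  return {
    latencyAlert: p99 > 2,
    errorRateAlert: errorRate > 0.01,
  };
}
```

For example, 100 samples of which two take 3 s put the 99th-ranked sample at 3 s, so the latency alert fires even though 98% of requests are fast.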
### Key metrics

| Metric | Description | Alert threshold |
|--------|-------------|-----------------|
| `http_request_duration_seconds` | Request latency | P99 > 2s |
| `http_requests_total` | Total requests | - |
| `http_request_errors_total` | Failed requests | Error rate > 1% |
| `reward_distributed_total` | Rewards distributed | - |
| `reward_settled_total` | Rewards settled | - |
| `reward_expired_total` | Rewards expired | - |

### Grafana dashboard

Key panels:

1. Request throughput (QPS)
2. Latency distribution (P50/P90/P99)
3. Error rate
4. Reward distribution/settlement/expiry trends
5. Database connection pool status
6. Redis cache hit rate

---

## Log management

### Log format

```typescript
// Structured log output
{
  "timestamp": "2024-12-01T00:00:00.000Z",
  "level": "info",
  "context": "RewardApplicationService",
  "message": "Distributed 6 rewards for order 123",
  "metadata": {
    "orderId": "123",
    "userId": "100",
    "rewardCount": 6
  }
}
```

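One way to produce entries in this shape is a small formatter that also drops messages below a configured threshold such as `LOG_LEVEL`. The following is a minimal sketch under that assumption; `formatLog`, `LogEntry`, and the severity ordering are illustrative and not taken from the service's actual logger.

```typescript
type LogLevel = "error" | "warn" | "info" | "debug";

// Lower number means more severe. An entry is emitted when it is at least as
// severe as the configured threshold.
const SEVERITY: Record<LogLevel, number> = { error: 0, warn: 1, info: 2, debug: 3 };

interface LogEntry {
  timestamp: string;
  level: LogLevel;
  context: string;
  message: string;
  metadata?: Record<string, unknown>;
}

// Returns the JSON line to write, or null when the entry is filtered out.
function formatLog(
  threshold: LogLevel,
  level: LogLevel,
  context: string,
  message: string,
  metadata?: Record<string, unknown>,
  now: Date = new Date(),
): string | null {
  if (SEVERITY[level] > SEVERITY[threshold]) {
    return null;
  }
  const entry: LogEntry = { timestamp: now.toISOString(), level, context, message };
  if (metadata !== undefined) {
    entry.metadata = metadata;
  }
  return JSON.stringify(entry);
}
```

With the threshold set to `info`, a `debug` call returns `null`, while an `info` call for the order above yields a single JSON line with the same fields as the example.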
### Log levels

| Level | Purpose |
|-------|---------|
| `error` | Errors and exceptions |
| `warn` | Warnings |
| `info` | Business events |
| `debug` | Debugging information (development only) |

### ELK integration

```yaml
# filebeat.yml
filebeat.inputs:
  - type: container
    paths:
      - /var/lib/docker/containers/*/*.log
    processors:
      - add_kubernetes_metadata:

output.elasticsearch:
  hosts: ["elasticsearch:9200"]
  indices:
    - index: "reward-service-%{+yyyy.MM.dd}"
```

---

## CI/CD pipeline

### GitHub Actions

```yaml
# .github/workflows/deploy.yml
name: Deploy to Production

on:
  push:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
      - run: npm ci
      - run: npm run lint
      - run: npm test

  build:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build Docker image
        run: docker build -t reward-service:${{ github.sha }} .
      - name: Push to registry
        run: |
          docker tag reward-service:${{ github.sha }} ${{ secrets.REGISTRY }}/reward-service:${{ github.sha }}
          docker push ${{ secrets.REGISTRY }}/reward-service:${{ github.sha }}

  deploy:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - name: Deploy to Kubernetes
        run: |
          kubectl set image deployment/reward-service \
            reward-service=${{ secrets.REGISTRY }}/reward-service:${{ github.sha }} \
            -n rwadurian
```

---

## Troubleshooting

### Common issues

#### 1. Service fails to start

```bash
# Check the logs
kubectl logs -f deployment/reward-service -n rwadurian

# Check the environment variables
kubectl exec -it deployment/reward-service -n rwadurian -- env

# Check the database connection
kubectl exec -it deployment/reward-service -n rwadurian -- \
  npx prisma db pull
```

#### 2. Database connection problems

```bash
# Test the database connection from a throwaway Pod in the same namespace
kubectl run -it --rm debug --image=postgres:15-alpine --restart=Never -n rwadurian -- \
  psql -h postgres -U user -d reward_db

# Check network policies
kubectl get networkpolicy -n rwadurian
```

#### 3. Kafka connection problems

```bash
# List Kafka topics
kubectl exec -it kafka-0 -n rwadurian -- \
  kafka-topics --list --bootstrap-server localhost:9092

# Inspect the consumer group
kubectl exec -it kafka-0 -n rwadurian -- \
  kafka-consumer-groups --bootstrap-server localhost:9092 --describe --group reward-service-group
```

### Rolling back a deployment

```bash
# Show the revision history
kubectl rollout history deployment/reward-service -n rwadurian

# Roll back to the previous revision
kubectl rollout undo deployment/reward-service -n rwadurian

# Roll back to a specific revision
kubectl rollout undo deployment/reward-service -n rwadurian --to-revision=2
```

---

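## Security best practices

1. **Secret management**: Use Kubernetes Secrets or an external secret manager (Vault)
2. **Network isolation**: Use NetworkPolicy to restrict Pod-to-Pod traffic
3. **Image security**: Scan images for vulnerabilities regularly
4. **Least privilege**: Run containers as a non-root user
5. **TLS**: Enable mTLS between services
6. **Audit logging**: Record all sensitive operations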