Planting Service Deployment Guide
Deployment Overview
Deployment Architecture
                    ┌─────────────────┐
                    │  Load Balancer  │
                    │   (Nginx/ALB)   │
                    └────────┬────────┘
                             │
         ┌───────────────────┼───────────────────┐
         │                   │                   │
         ▼                   ▼                   ▼
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ Planting Service│ │ Planting Service│ │ Planting Service│
│   Instance 1    │ │   Instance 2    │ │   Instance 3    │
└────────┬────────┘ └────────┬────────┘ └────────┬────────┘
         │                   │                   │
         └───────────────────┼───────────────────┘
                             │
                    ┌────────▼────────┐
                    │   PostgreSQL    │
                    │  (Primary/RDS)  │
                    └─────────────────┘
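Deployment Options
| Option | Use case | Complexity |
|---|---|---|
| Docker Compose | Development/test environments | Low |
| Docker Swarm | Small-scale production | Medium |
| Kubernetes | Large-scale production | High |
| Cloud Run/ECS | Managed cloud services | Medium |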
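Environment Requirements
Minimum Production Configuration
| Resource | Minimum | Recommended |
|---|---|---|
| CPU | 2 cores | 4 cores |
| Memory | 2 GB | 4 GB |
| Disk | 20 GB SSD | 50 GB SSD |
| Network | 100 Mbps | 1 Gbps |
Database Configuration
| Environment | Instance type | Storage |
|---|---|---|
| Development | db.t3.micro | 20 GB |
| Test | db.t3.small | 50 GB |
| Production | db.r5.large | 200 GB |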
Configuration Management
Environment Variables
# Basic settings
NODE_ENV=production
PORT=3003
# Database
DATABASE_URL=postgresql://user:password@host:5432/dbname?schema=public
# JWT authentication
JWT_SECRET=<secure-random-string>
# External services
WALLET_SERVICE_URL=http://wallet-service:3002
IDENTITY_SERVICE_URL=http://identity-service:3001
REFERRAL_SERVICE_URL=http://referral-service:3004
# Logging
LOG_LEVEL=info
# Performance
MAX_CONNECTIONS=100
QUERY_TIMEOUT=30000
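A missing or malformed variable is easier to diagnose at startup than at the first failing request. The sketch below shows one way to validate the variables listed above. The names `loadConfig`, `requireVar`, and `intVar` are hypothetical, the fallback values are taken from the sample values above, and treating `QUERY_TIMEOUT` as milliseconds is an assumption.

```typescript
// Hypothetical startup validation; not part of the service's actual code.
type Env = Record<string, string | undefined>;

interface PlantingConfig {
  nodeEnv: string;
  port: number;
  databaseUrl: string;
  jwtSecret: string;
  logLevel: string;
  maxConnections: number;
  queryTimeoutMs: number; // assumes QUERY_TIMEOUT is expressed in milliseconds
}

// Returns the variable's value or throws if it is unset or empty.
function requireVar(env: Env, name: string): string {
  const value = env[name];
  if (value === undefined || value === '') {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Parses a non-negative integer variable, using the fallback when unset.
function intVar(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw === '') return fallback;
  if (!/^\d+$/.test(raw)) {
    throw new Error(`Environment variable ${name} must be an integer, got "${raw}"`);
  }
  return Number(raw);
}

function loadConfig(env: Env): PlantingConfig {
  return {
    nodeEnv: env['NODE_ENV'] ?? 'development',
    port: intVar(env, 'PORT', 3003),
    databaseUrl: requireVar(env, 'DATABASE_URL'),
    jwtSecret: requireVar(env, 'JWT_SECRET'),
    logLevel: env['LOG_LEVEL'] ?? 'info',
    maxConnections: intVar(env, 'MAX_CONNECTIONS', 100),
    queryTimeoutMs: intVar(env, 'QUERY_TIMEOUT', 30000),
  };
}
```

In a Node.js process this would typically be called once as `loadConfig(process.env)` before the application starts listening.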
Example Configuration File
.env.production
NODE_ENV=production
PORT=3003
DATABASE_URL=postgresql://planting:${DB_PASSWORD}@db.prod.internal:5432/rwadurian_planting?schema=public&connection_limit=50
JWT_SECRET=${JWT_SECRET}
WALLET_SERVICE_URL=http://wallet-service.prod.internal:3002
IDENTITY_SERVICE_URL=http://identity-service.prod.internal:3001
REFERRAL_SERVICE_URL=http://referral-service.prod.internal:3004
LOG_LEVEL=info
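The `connection_limit=50` parameter in `DATABASE_URL` applies to each service instance, so the total number of connections PostgreSQL sees grows with the replica count. The sketch below parses the URL and estimates that total. `describeDatabaseUrl` and `totalConnections` are hypothetical helper names, and the estimate assumes every instance opens exactly one connection pool.

```typescript
interface DatabaseTarget {
  host: string;
  port: number;
  database: string;
  schema: string | null;
  connectionLimit: number | null;
}

// Extracts the non-secret parts of a PostgreSQL connection URL.
function describeDatabaseUrl(databaseUrl: string): DatabaseTarget {
  const url = new URL(databaseUrl);
  const limit = url.searchParams.get('connection_limit');
  return {
    host: url.hostname,
    port: url.port === '' ? 5432 : Number(url.port), // 5432 is the PostgreSQL default
    database: url.pathname.replace(/^\//, ''),
    schema: url.searchParams.get('schema'),
    connectionLimit: limit === null ? null : Number(limit),
  };
}

// Upper bound on open connections, assuming one pool per instance.
function totalConnections(replicas: number, perInstanceLimit: number): number {
  return replicas * perInstanceLimit;
}

const target = describeDatabaseUrl(
  'postgresql://planting:placeholder@db.prod.internal:5432/rwadurian_planting?schema=public&connection_limit=50',
);
// Three instances with a limit of 50 each can hold up to 150 connections.
const upperBound = totalConnections(3, target.connectionLimit ?? 0);
```

Whatever the replica count, this upper bound needs to stay below the `max_connections` setting of the PostgreSQL instance.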
Managing Sensitive Information
Use a secrets management service:
# AWS Secrets Manager
aws secretsmanager get-secret-value --secret-id planting-service/prod
# Kubernetes Secrets
kubectl create secret generic planting-secrets \
  --from-literal=DATABASE_URL='...' \
  --from-literal=JWT_SECRET='...'
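The `get-secret-value` command prints a JSON document whose `SecretString` field holds the secret itself. The sketch below turns that output into key/value pairs, assuming the secret was stored as a JSON object of strings (the source does not say how `planting-service/prod` is structured). `parseSecretString` is a hypothetical helper name.

```typescript
// Parses the JSON printed by `aws secretsmanager get-secret-value`.
// Assumes SecretString is itself a JSON object mapping names to string values.
function parseSecretString(cliOutput: string): Record<string, string> {
  const response: unknown = JSON.parse(cliOutput);
  if (typeof response !== 'object' || response === null) {
    throw new Error('CLI output is not a JSON object');
  }
  const secretString = (response as { SecretString?: unknown }).SecretString;
  if (typeof secretString !== 'string') {
    throw new Error('Response has no SecretString field');
  }
  const parsed: unknown = JSON.parse(secretString);
  if (typeof parsed !== 'object' || parsed === null || Array.isArray(parsed)) {
    throw new Error('SecretString is not a JSON object');
  }
  const result: Record<string, string> = {};
  for (const [key, value] of Object.entries(parsed as Record<string, unknown>)) {
    if (typeof value !== 'string') {
      throw new Error(`Secret key ${key} is not a string`);
    }
    result[key] = value;
  }
  return result;
}
```

The returned object can then be merged into the process environment or passed to a configuration loader, so the secret values never need to be written to a file.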
Docker Deployment
Production Dockerfile
# Build stage
FROM node:20-alpine AS builder
WORKDIR /app
# Copy dependency manifests
COPY package*.json ./
# Install dependencies
RUN npm ci
# Copy the Prisma schema and generate the client
COPY prisma ./prisma/
RUN npx prisma generate
# Copy the source code
COPY . .
# Build
RUN npm run build
# Production stage
FROM node:20-alpine AS production
WORKDIR /app
# Create a non-root user
RUN addgroup -g 1001 -S nodejs && \
    adduser -S nestjs -u 1001 -G nodejs
# Copy dependency manifests
COPY package*.json ./
# Install production dependencies only (--omit=dev replaces the deprecated --only=production)
RUN npm ci --omit=dev
# Copy Prisma
COPY prisma ./prisma/
RUN npx prisma generate
# Copy the build output
COPY --from=builder /app/dist ./dist
# Switch to the non-root user
USER nestjs
# Expose the port
EXPOSE 3003
# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
    CMD wget --no-verbose --tries=1 --spider http://localhost:3003/api/v1/health || exit 1
# Start command
CMD ["node", "dist/main"]
Docker Compose Production Configuration
# docker-compose.prod.yml
version: '3.8'
services:
  planting-service:
    build:
      context: .
      dockerfile: Dockerfile
      target: production
    image: planting-service:${VERSION:-latest}
    container_name: planting-service
    restart: unless-stopped
    ports:
      - "3003:3003"
    environment:
      - NODE_ENV=production
      - DATABASE_URL=${DATABASE_URL}
      - JWT_SECRET=${JWT_SECRET}
      - WALLET_SERVICE_URL=${WALLET_SERVICE_URL}
      - IDENTITY_SERVICE_URL=${IDENTITY_SERVICE_URL}
      - REFERRAL_SERVICE_URL=${REFERRAL_SERVICE_URL}
    healthcheck:
      test: ["CMD", "wget", "--spider", "-q", "http://localhost:3003/api/v1/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
    deploy:
      resources:
        limits:
          cpus: '2'
          memory: 2G
        reservations:
          cpus: '1'
          memory: 1G
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"
    networks:
      - rwadurian-network
networks:
  rwadurian-network:
    external: true
Deployment Commands
# Build the image
docker build -t planting-service:v1.0.0 .
# Push to the registry
docker tag planting-service:v1.0.0 registry.example.com/planting-service:v1.0.0
docker push registry.example.com/planting-service:v1.0.0
# Deploy
docker-compose -f docker-compose.prod.yml up -d
# View logs
docker-compose -f docker-compose.prod.yml logs -f planting-service
# Restart
docker-compose -f docker-compose.prod.yml restart planting-service
# Stop
docker-compose -f docker-compose.prod.yml down
Kubernetes Deployment
Deployment
# k8s/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: planting-service
  labels:
    app: planting-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: planting-service
  template:
    metadata:
      labels:
        app: planting-service
    spec:
      containers:
        - name: planting-service
          image: registry.example.com/planting-service:v1.0.0
          ports:
            - containerPort: 3003
          env:
            - name: NODE_ENV
              value: production
            - name: PORT
              value: "3003"
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: planting-secrets
                  key: DATABASE_URL
            - name: JWT_SECRET
              valueFrom:
                secretKeyRef:
                  name: planting-secrets
                  key: JWT_SECRET
          resources:
            requests:
              cpu: "500m"
              memory: "512Mi"
            limits:
              cpu: "2000m"
              memory: "2Gi"
          livenessProbe:
            httpGet:
              path: /api/v1/health
              port: 3003
            initialDelaySeconds: 30
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /api/v1/health/ready
              port: 3003
            initialDelaySeconds: 5
            periodSeconds: 5
            timeoutSeconds: 3
            failureThreshold: 3
      imagePullSecrets:
        - name: registry-credentials
Service
# k8s/service.yaml
apiVersion: v1
kind: Service
metadata:
  name: planting-service
spec:
  selector:
    app: planting-service
  ports:
    - protocol: TCP
      port: 3003
      targetPort: 3003
  type: ClusterIP
Ingress
# k8s/ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: planting-service-ingress
  annotations:
    kubernetes.io/ingress.class: nginx
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  tls:
    - hosts:
        - api.rwadurian.com
      secretName: api-tls
  rules:
    - host: api.rwadurian.com
      http:
        paths:
          - path: /api/v1/planting
            pathType: Prefix
            backend:
              service:
                name: planting-service
                port:
                  number: 3003
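With `pathType: Prefix`, Kubernetes compares paths element by element after splitting on `/`, so `/api/v1/planting` matches `/api/v1/planting/orders` but not `/api/v1/plantingx`. The sketch below illustrates that rule. `matchesPrefixPath` is a hypothetical name and the function is a simplified model, not the ingress controller's code.

```typescript
// Simplified model of Ingress `pathType: Prefix` matching:
// every path element of the rule must equal the corresponding request element.
function matchesPrefixPath(rulePath: string, requestPath: string): boolean {
  const split = (p: string): string[] => p.split('/').filter((s) => s.length > 0);
  const rule = split(rulePath);
  const request = split(requestPath);
  if (rule.length > request.length) return false;
  return rule.every((segment, i) => segment === request[i]);
}

const exact = matchesPrefixPath('/api/v1/planting', '/api/v1/planting');
const nested = matchesPrefixPath('/api/v1/planting', '/api/v1/planting/orders');
const lookalike = matchesPrefixPath('/api/v1/planting', '/api/v1/plantingx');
```

Here `exact` and `nested` are `true` while `lookalike` is `false`, because `plantingx` is a different path element rather than a longer path.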
HPA (Horizontal Pod Autoscaler)
# k8s/hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: planting-service-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: planting-service
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
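The autoscaler derives a replica count from each metric as `ceil(currentReplicas * currentValue / targetValue)`, takes the largest proposal, and clamps it to `minReplicas`/`maxReplicas`. Utilization is measured against the container's resource requests, not its limits. The sketch below models that calculation for the targets above. `desiredReplicas` is a hypothetical name, the 0.1 tolerance is the Kubernetes default, and the model ignores stabilization windows and pods that are not ready.

```typescript
interface MetricReading {
  currentUtilization: number; // observed average, in percent of requests
  targetUtilization: number;  // averageUtilization from the HPA spec
}

// Simplified model of the HPA replica calculation.
function desiredReplicas(
  currentReplicas: number,
  metrics: readonly MetricReading[],
  minReplicas: number,
  maxReplicas: number,
  tolerance = 0.1,
): number {
  const proposals = metrics.map((m) => {
    const ratio = m.currentUtilization / m.targetUtilization;
    // Ratios close to 1.0 do not trigger scaling.
    if (Math.abs(ratio - 1) <= tolerance) return currentReplicas;
    return Math.ceil(currentReplicas * ratio);
  });
  // With several metrics, the largest proposal wins.
  const desired = proposals.length === 0 ? currentReplicas : Math.max(...proposals);
  return Math.min(maxReplicas, Math.max(minReplicas, desired));
}

// 3 replicas at 90% CPU (target 70) and 60% memory (target 80):
// CPU proposes ceil(3 * 90 / 70) = 4, memory proposes ceil(3 * 60 / 80) = 3.
const scaled = desiredReplicas(
  3,
  [
    { currentUtilization: 90, targetUtilization: 70 },
    { currentUtilization: 60, targetUtilization: 80 },
  ],
  3,
  10,
);
```

In this example `scaled` is 4: the CPU metric drives the scale-up and the memory metric alone would have kept the count at 3.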
Deployment Commands
# Create secrets
kubectl create secret generic planting-secrets \
  --from-literal=DATABASE_URL='postgresql://...' \
  --from-literal=JWT_SECRET='...'
# Apply the manifests
kubectl apply -f k8s/
# Check status
kubectl get pods -l app=planting-service
kubectl get svc planting-service
# View logs
kubectl logs -f -l app=planting-service
# Scale out
kubectl scale deployment planting-service --replicas=5
# Roll back
kubectl rollout undo deployment/planting-service
Database Migration
Migration Strategy
┌─────────────────────────────────────────────────────────────────┐
│                     Database Migration Flow                     │
├─────────────────────────────────────────────────────────────────┤
│ 1. Create the migration script (development)                    │
│    npx prisma migrate dev --name add_new_feature                │
│                                                                 │
│ 2. Review the migration script                                  │
│    Inspect the prisma/migrations/ directory                     │
│                                                                 │
│ 3. Verify in the test environment                               │
│    npx prisma migrate deploy                                    │
│                                                                 │
│ 4. Deploy to production                                         │
│    - Back up the database                                       │
│    - Run the migration (npx prisma migrate deploy)              │
│    - Deploy the new application version                         │
└─────────────────────────────────────────────────────────────────┘
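Migration Commands
# Development: create a migration
npx prisma migrate dev --name add_new_feature
# Production: apply pending migrations
npx prisma migrate deploy
# Show migration status
npx prisma migrate status
# Reset the database (development only)
npx prisma migrate reset
Migration Best Practices
- Backward compatibility: the new application version should remain compatible with the old database schema
- Incremental migration: split large changes into several small migrations
- Backup first: always back up before a production migration
- Rollback scripts: prepare the matching rollback SQL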
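Health Checks
Endpoints
| Endpoint | Purpose | What it checks |
|---|---|---|
| /api/v1/health | Liveness check | Whether the service is running |
| /api/v1/health/ready | Readiness check | Whether the service can accept requests |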
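Health Check Implementation
// src/api/controllers/health.controller.ts
import { Controller, Get } from '@nestjs/common';

@Controller('health')
export class HealthController {
  // Liveness: confirms the process is up and serving HTTP.
  @Get()
  check() {
    return {
      status: 'ok',
      timestamp: new Date().toISOString(),
      service: 'planting-service',
    };
  }

  // Readiness: confirms the service can accept requests.
  @Get('ready')
  async ready() {
    // A database connectivity check can be added here
    return {
      status: 'ready',
      timestamp: new Date().toISOString(),
    };
  }
}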
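The readiness handler above always reports `ready`. One way to add the database check it mentions is sketched below as a standalone function, so it can be exercised without NestJS. `checkReadiness` and its `pingDatabase` parameter are hypothetical; in the service the ping might be something like ``() => prisma.$queryRaw`SELECT 1` ``, assuming a Prisma client named `prisma` is available. The 2000 ms default is an assumption chosen to stay under the readiness probe's 3-second timeout.

```typescript
type ReadinessResult = {
  status: 'ready' | 'not_ready';
  timestamp: string;
  error?: string;
};

// Runs the supplied database ping and reports readiness.
// A ping that fails or exceeds timeoutMs yields 'not_ready'.
async function checkReadiness(
  pingDatabase: () => Promise<unknown>,
  timeoutMs = 2000,
): Promise<ReadinessResult> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`database ping timed out after ${timeoutMs} ms`)),
      timeoutMs,
    );
  });
  try {
    await Promise.race([pingDatabase(), timeout]);
    return { status: 'ready', timestamp: new Date().toISOString() };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { status: 'not_ready', timestamp: new Date().toISOString(), error: message };
  } finally {
    // Always clear the timer so it cannot keep the process alive.
    clearTimeout(timer);
  }
}
```

A controller using this would need to respond with a non-2xx status (for example 503) when the result is `not_ready`, since Kubernetes HTTP probes judge success by status code, not by the response body.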
Load Balancer Configuration
Nginx Configuration
upstream planting_service {
    server planting-service-1:3003;
    server planting-service-2:3003;
    server planting-service-3:3003;
}
server {
    listen 80;
    server_name api.rwadurian.com;
    location /api/v1/planting {
        proxy_pass http://planting_service;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_cache_bypass $http_upgrade;
        # Connection and read timeouts
        proxy_connect_timeout 5s;
        proxy_read_timeout 30s;
    }
    location /health {
        proxy_pass http://planting_service/api/v1/health;
        proxy_connect_timeout 5s;
        proxy_read_timeout 5s;
    }
}
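The `upstream` block names no balancing method, so Nginx falls back to its default, round-robin: requests go to the listed servers in turn. The sketch below models that rotation for equal-weight servers. `createRoundRobin` is a hypothetical name, and the model leaves out weights and failure handling.

```typescript
// Returns a picker that cycles through the servers in order.
function createRoundRobin<T>(servers: readonly T[]): () => T {
  if (servers.length === 0) {
    throw new Error('At least one server is required');
  }
  let next = 0;
  return () => {
    const server = servers[next] as T;
    next = (next + 1) % servers.length;
    return server;
  };
}

const pick = createRoundRobin([
  'planting-service-1:3003',
  'planting-service-2:3003',
  'planting-service-3:3003',
]);
// Four consecutive requests: instances 1, 2, 3, then 1 again.
const order = [pick(), pick(), pick(), pick()];
```

Because the rotation ignores how busy each instance is, slow requests can pile up on one server; Nginx offers other methods such as `least_conn` for that case.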
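Monitoring and Logging
Logging Configuration
// src/main.ts
import { Logger } from '@nestjs/common';
import { NestFactory } from '@nestjs/core';
// Assumed path: the conventional NestJS location of the root module.
import { AppModule } from './app.module';

async function bootstrap() {
  const app = await NestFactory.create(AppModule, {
    // Production omits the 'debug' and 'verbose' levels.
    logger: process.env.NODE_ENV === 'production'
      ? ['error', 'warn', 'log']
      : ['error', 'warn', 'log', 'debug', 'verbose'],
  });
  // ...
}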
Log Format
{
  "timestamp": "2024-11-30T10:00:00.000Z",
  "level": "info",
  "context": "PlantingApplicationService",
  "message": "Order created",
  "metadata": {
    "orderNo": "PO202411300001",
    "userId": "1",
    "treeCount": 5
  }
}
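A small helper can keep every log line in the shape shown above. The sketch below is one possible version. `formatLogEntry` and the `LogLevel` type are hypothetical names, and the set of levels is an assumption since the source only shows `info`.

```typescript
type LogLevel = 'error' | 'warn' | 'info' | 'debug';

interface LogEntry {
  timestamp: string;
  level: LogLevel;
  context: string;
  message: string;
  metadata?: Record<string, unknown>;
}

// Serializes one log entry as a single JSON line.
function formatLogEntry(
  level: LogLevel,
  context: string,
  message: string,
  metadata?: Record<string, unknown>,
  now: Date = new Date(),
): string {
  const entry: LogEntry = { timestamp: now.toISOString(), level, context, message };
  // Omit the metadata key entirely when there is nothing to record.
  if (metadata !== undefined) {
    entry.metadata = metadata;
  }
  return JSON.stringify(entry);
}

const line = formatLogEntry(
  'info',
  'PlantingApplicationService',
  'Order created',
  { orderNo: 'PO202411300001', userId: '1', treeCount: 5 },
  new Date('2024-11-30T10:00:00.000Z'),
);
```

Emitting one JSON object per line keeps the output compatible with the `json-file` Docker logging driver and with most log collectors.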
Prometheus Metrics
# prometheus/scrape_configs
- job_name: 'planting-service'
  static_configs:
    - targets: ['planting-service:3003']
  metrics_path: '/metrics'
Grafana Dashboard
Key metrics:
- Request rate (requests/second)
- Response time (p50, p95, p99)
- Error rate
- Database connection pool status
- Orders created
- Payment success rate
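The p50, p95, and p99 figures are percentiles: the response time that 50%, 95%, or 99% of requests stay at or below. The sketch below computes them from raw samples with the nearest-rank method. `percentile` is a hypothetical name, and this is only an illustration of the idea; Prometheus and Grafana estimate percentiles from histogram buckets instead.

```typescript
// Nearest-rank percentile: the smallest sample such that at least
// p percent of all samples are less than or equal to it.
function percentile(samples: readonly number[], p: number): number {
  if (samples.length === 0) {
    throw new Error('At least one sample is required');
  }
  if (p <= 0 || p > 100) {
    throw new Error('p must be in the range (0, 100]');
  }
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1] as number;
}

// Response times in milliseconds for ten requests (hypothetical data).
const latenciesMs = [40, 10, 90, 20, 100, 30, 70, 50, 80, 60];
const p50 = percentile(latenciesMs, 50); // rank ceil(5.0) = 5
const p95 = percentile(latenciesMs, 95); // rank ceil(9.5) = 10
```

With these ten samples `p50` is 50 and `p95` is 100, which shows how a single slow request dominates the high percentiles when the sample is small.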
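Troubleshooting
Common Issues
1. Service fails to start
# Check the logs
docker logs planting-service
# Common causes
# - Database connection failure
# - Missing environment variables
# - Port conflict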
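2. Database connection fails
# Check the connection (strip the query string first: psql rejects Prisma-specific parameters such as ?schema=public)
psql "${DATABASE_URL%%\?*}" -c "SELECT 1"
# Check through Prisma (--print writes the introspected schema to stdout instead of overwriting schema.prisma)
npx prisma db pull --print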
3. Out of memory
# Check memory usage
docker stats planting-service
# Adjust the Node.js heap limit (value in MB; keep it below the container memory limit)
NODE_OPTIONS="--max-old-space-size=4096" node dist/main
4. High latency
# Check database queries
# Enable Prisma query logging
# Check external service response times
curl -w "@curl-format.txt" http://wallet-service:3002/health
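The same external-service check can be done from inside the application to see what the service itself experiences. The sketch below times any asynchronous call. `timeCall` is a hypothetical helper, and the commented usage relies on the global `fetch` available in Node.js 18 and later.

```typescript
// Measures how long an asynchronous call takes, in milliseconds.
async function timeCall<T>(
  call: () => Promise<T>,
): Promise<{ elapsedMs: number; result: T }> {
  const start = Date.now();
  const result = await call();
  return { elapsedMs: Date.now() - start, result };
}

// Usage against the wallet service health endpoint from the curl example:
// const { elapsedMs, result } = await timeCall(() =>
//   fetch('http://wallet-service:3002/health'),
// );
// console.log(`wallet-service responded ${result.status} in ${elapsedMs} ms`);
```

Comparing this figure with the curl timing helps separate network latency between containers from slowness inside the wallet service.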
Debugging Commands
# Open a shell in the container
docker exec -it planting-service sh
# Check the network
docker exec planting-service ping db
# Check environment variables
docker exec planting-service env | grep DATABASE
# Follow the logs
docker logs -f --tail 100 planting-service
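Rollback Strategy
Docker Rollback
# Stop the current version
docker-compose -f docker-compose.prod.yml down
# Start the previous version (VERSION selects the image tag in docker-compose.prod.yml)
VERSION=<previous-version> docker-compose -f docker-compose.prod.yml up -d --no-build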
Kubernetes Rollback
# View the rollout history
kubectl rollout history deployment/planting-service
# Roll back to the previous revision
kubectl rollout undo deployment/planting-service
# Roll back to a specific revision
kubectl rollout undo deployment/planting-service --to-revision=2
# Check the rollback status
kubectl rollout status deployment/planting-service
Database Rollback
-- Prepare a rollback script
-- prisma/rollback/20241130_rollback.sql
-- Example: roll back an added column
ALTER TABLE "PlantingOrder" DROP COLUMN IF EXISTS "newColumn";
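Rollback Checklist
- Confirm the root cause
- Notify the relevant teams
- Perform the rollback
- Verify that the service has recovered
- Check data consistency
- Update the incident report
Deployment Checklist
Before Deployment
- Code review passed
- All tests passed
- Database migration tested
- Environment variables configured
- Backup completed
During Deployment
- Monitoring dashboards ready
- Log collection working
- Progressive rollout (canary/blue-green)
- Health checks passing
After Deployment
- Functional verification
- Performance verification
- Error rate monitoring
- User feedback collection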