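# Planting Service Deployment Guide

## Table of Contents

- [Deployment Overview](#deployment-overview)
- [Environment Requirements](#environment-requirements)
- [Configuration Management](#configuration-management)
- [Docker Deployment](#docker-deployment)
- [Kubernetes Deployment](#kubernetes-deployment)
- [Database Migration](#database-migration)
- [Health Checks](#health-checks)
- [Monitoring and Logging](#monitoring-and-logging)
- [Troubleshooting](#troubleshooting)
- [Rollback Strategy](#rollback-strategy)

---
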
## Deployment Overview

### Deployment Architecture

```
                    ┌─────────────────┐
                    │  Load Balancer  │
                    │   (Nginx/ALB)   │
                    └────────┬────────┘
                             │
         ┌───────────────────┼───────────────────┐
         │                   │                   │
         ▼                   ▼                   ▼
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ Planting Service│ │ Planting Service│ │ Planting Service│
│   Instance 1    │ │   Instance 2    │ │   Instance 3    │
└────────┬────────┘ └────────┬────────┘ └────────┬────────┘
         │                   │                   │
         └───────────────────┼───────────────────┘
                             │
                    ┌────────▼────────┐
                    │   PostgreSQL    │
                    │  (Primary/RDS)  │
                    └─────────────────┘
```

### Deployment Options

| Option | Use case | Complexity |
|--------|----------|------------|
| Docker Compose | Development and test environments | Low |
| Docker Swarm | Small-scale production | Medium |
| Kubernetes | Large-scale production | High |
| Cloud Run/ECS | Managed cloud services | Medium |

---

## Environment Requirements

### Minimum Production Configuration

| Resource | Minimum | Recommended |
|----------|---------|-------------|
| CPU | 2 cores | 4 cores |
| Memory | 2 GB | 4 GB |
| Disk | 20 GB SSD | 50 GB SSD |
| Network | 100 Mbps | 1 Gbps |

### Database Configuration

| Environment | Instance type | Storage |
|-------------|---------------|---------|
| Development | db.t3.micro | 20 GB |
| Test | db.t3.small | 50 GB |
| Production | db.r5.large | 200 GB |

---

## Configuration Management

### Environment Variables

```bash
# Basic settings
NODE_ENV=production
PORT=3003

# Database
DATABASE_URL=postgresql://user:password@host:5432/dbname?schema=public

# JWT authentication
JWT_SECRET=<secure-random-string>

# External services
WALLET_SERVICE_URL=http://wallet-service:3002
IDENTITY_SERVICE_URL=http://identity-service:3001
REFERRAL_SERVICE_URL=http://referral-service:3004

# Logging
LOG_LEVEL=info

# Performance
MAX_CONNECTIONS=100
QUERY_TIMEOUT=30000
```

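A service that starts with a missing or malformed variable tends to fail later and less clearly than one that refuses to start. The sketch below shows one possible way to validate the variables listed above at startup. `validateEnv`, `REQUIRED_VARS`, and the `AppConfig` shape are hypothetical names, not part of the service, and the fallback values simply mirror the example values above.

```typescript
// Hypothetical startup check; names are illustrative, not taken from the service code.
type Env = Record<string, string | undefined>;

interface AppConfig {
  port: number;
  databaseUrl: string;
  jwtSecret: string;
  maxConnections: number;
  queryTimeoutMs: number;
}

// Variables assumed to have no safe default.
const REQUIRED_VARS = ['DATABASE_URL', 'JWT_SECRET'];

// Reads a positive integer variable, using the fallback when it is unset.
function parseIntVar(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw === '') return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value <= 0) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return value;
}

function validateEnv(env: Env): AppConfig {
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(', ')}`);
  }
  return {
    port: parseIntVar(env, 'PORT', 3003),
    databaseUrl: env.DATABASE_URL as string,
    jwtSecret: env.JWT_SECRET as string,
    maxConnections: parseIntVar(env, 'MAX_CONNECTIONS', 100),
    queryTimeoutMs: parseIntVar(env, 'QUERY_TIMEOUT', 30000),
  };
}
```

In `main.ts` this could be called as `validateEnv(process.env)` before the application is created, so that a misconfigured container exits immediately instead of failing on its first request.
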
### Example Configuration File

**.env.production**

```env
NODE_ENV=production
PORT=3003
DATABASE_URL=postgresql://planting:${DB_PASSWORD}@db.prod.internal:5432/rwadurian_planting?schema=public&connection_limit=50
JWT_SECRET=${JWT_SECRET}
WALLET_SERVICE_URL=http://wallet-service.prod.internal:3002
IDENTITY_SERVICE_URL=http://identity-service.prod.internal:3001
REFERRAL_SERVICE_URL=http://referral-service.prod.internal:3004
LOG_LEVEL=info
```

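The `connection_limit=50` parameter in `DATABASE_URL` caps the connection pool of each instance, so the number of connections the database must accept grows with the instance count. The following sketch works through that arithmetic for 3 instances (as in the architecture diagram) and for 10 instances. `connectionBudget` and `connectionLimitFromUrl` are hypothetical helpers, the password in the sample URL is a placeholder, and `fallbackLimit` stands for whatever pool size applies when the parameter is absent.

```typescript
// Hypothetical helpers: estimate how many database connections a rollout may open.
function connectionLimitFromUrl(databaseUrl: string, fallbackLimit: number): number {
  const raw = new URL(databaseUrl).searchParams.get('connection_limit');
  if (raw === null) return fallbackLimit;
  const limit = Number(raw);
  if (!Number.isInteger(limit) || limit <= 0) {
    throw new Error(`Invalid connection_limit: "${raw}"`);
  }
  return limit;
}

function connectionBudget(databaseUrl: string, replicas: number, fallbackLimit: number) {
  const perInstance = connectionLimitFromUrl(databaseUrl, fallbackLimit);
  return { perInstance, total: perInstance * replicas };
}

// Sample URL modelled on .env.production; "secret" is a placeholder password.
const url =
  'postgresql://planting:secret@db.prod.internal:5432/rwadurian_planting?schema=public&connection_limit=50';

console.log(connectionBudget(url, 3, 10));  // total: 150
console.log(connectionBudget(url, 10, 10)); // total: 500
```

If the total exceeds what the PostgreSQL instance allows, either the per-instance limit or the maximum instance count needs to come down, or a connection pooler has to sit in between.
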
### Managing Sensitive Information

Store secrets in a secret management service rather than in the repository or the image:

```bash
# AWS Secrets Manager
aws secretsmanager get-secret-value --secret-id planting-service/prod

# Kubernetes Secrets
kubectl create secret generic planting-secrets \
  --from-literal=DATABASE_URL='...' \
  --from-literal=JWT_SECRET='...'
```

---

## Docker Deployment

### Production Dockerfile

```dockerfile
# Build stage
FROM node:20-alpine AS builder

WORKDIR /app

# Copy dependency manifests
COPY package*.json ./

# Install all dependencies (dev dependencies are needed for the build)
RUN npm ci

# Copy the Prisma schema and generate the client
COPY prisma ./prisma/
RUN npx prisma generate

# Copy the source code
COPY . .

# Build
RUN npm run build

# Production stage
FROM node:20-alpine AS production

WORKDIR /app

# Create a non-root user
RUN addgroup -g 1001 -S nodejs && \
    adduser -S nestjs -u 1001 -G nodejs

# Copy dependency manifests
COPY package*.json ./

# Install production dependencies only
# (--omit=dev replaces the deprecated --only=production flag)
RUN npm ci --omit=dev

# Copy the Prisma schema and generate the client
COPY prisma ./prisma/
RUN npx prisma generate

# Copy the build output
COPY --from=builder /app/dist ./dist

# Switch to the non-root user
USER nestjs

# Expose the service port
EXPOSE 3003

# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
  CMD wget --no-verbose --tries=1 --spider http://localhost:3003/api/v1/health || exit 1

# Start command
CMD ["node", "dist/main"]
```

### Docker Compose Production Configuration

```yaml
# docker-compose.prod.yml
version: '3.8'

services:
  planting-service:
    build:
      context: .
      dockerfile: Dockerfile
      target: production
    image: planting-service:${VERSION:-latest}
    container_name: planting-service
    restart: unless-stopped
    ports:
      - "3003:3003"
    environment:
      - NODE_ENV=production
      - DATABASE_URL=${DATABASE_URL}
      - JWT_SECRET=${JWT_SECRET}
      - WALLET_SERVICE_URL=${WALLET_SERVICE_URL}
      - IDENTITY_SERVICE_URL=${IDENTITY_SERVICE_URL}
      - REFERRAL_SERVICE_URL=${REFERRAL_SERVICE_URL}
    healthcheck:
      test: ["CMD", "wget", "--spider", "-q", "http://localhost:3003/api/v1/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
    deploy:
      resources:
        limits:
          cpus: '2'
          memory: 2G
        reservations:
          cpus: '1'
          memory: 1G
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"
    networks:
      - rwadurian-network

networks:
  rwadurian-network:
    external: true
```

### Deployment Commands

```bash
# Build the image
docker build -t planting-service:v1.0.0 .

# Push to the registry
docker tag planting-service:v1.0.0 registry.example.com/planting-service:v1.0.0
docker push registry.example.com/planting-service:v1.0.0

# Deploy
docker-compose -f docker-compose.prod.yml up -d

# View logs
docker-compose -f docker-compose.prod.yml logs -f planting-service

# Restart
docker-compose -f docker-compose.prod.yml restart planting-service

# Stop
docker-compose -f docker-compose.prod.yml down
```

---

## Kubernetes Deployment

### Deployment

```yaml
# k8s/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: planting-service
  labels:
    app: planting-service
spec:
  replicas: 3
  selector:
    matchLabels:
      app: planting-service
  template:
    metadata:
      labels:
        app: planting-service
    spec:
      containers:
        - name: planting-service
          image: registry.example.com/planting-service:v1.0.0
          ports:
            - containerPort: 3003
          env:
            - name: NODE_ENV
              value: production
            - name: PORT
              value: "3003"
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: planting-secrets
                  key: DATABASE_URL
            - name: JWT_SECRET
              valueFrom:
                secretKeyRef:
                  name: planting-secrets
                  key: JWT_SECRET
          resources:
            requests:
              cpu: "500m"
              memory: "512Mi"
            limits:
              cpu: "2000m"
              memory: "2Gi"
          livenessProbe:
            httpGet:
              path: /api/v1/health
              port: 3003
            initialDelaySeconds: 30
            periodSeconds: 10
            timeoutSeconds: 5
            failureThreshold: 3
          readinessProbe:
            httpGet:
              path: /api/v1/health/ready
              port: 3003
            initialDelaySeconds: 5
            periodSeconds: 5
            timeoutSeconds: 3
            failureThreshold: 3
      imagePullSecrets:
        - name: registry-credentials
```

### Service

```yaml
# k8s/service.yaml
apiVersion: v1
kind: Service
metadata:
  name: planting-service
spec:
  selector:
    app: planting-service
  ports:
    - protocol: TCP
      port: 3003
      targetPort: 3003
  type: ClusterIP
```

### Ingress

```yaml
# k8s/ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: planting-service-ingress
  annotations:
    kubernetes.io/ingress.class: nginx
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  tls:
    - hosts:
        - api.rwadurian.com
      secretName: api-tls
  rules:
    - host: api.rwadurian.com
      http:
        paths:
          - path: /api/v1/planting
            pathType: Prefix
            backend:
              service:
                name: planting-service
                port:
                  number: 3003
```

### HPA (Horizontal Pod Autoscaler)

```yaml
# k8s/hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: planting-service-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: planting-service
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
```

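To make the effect of these targets concrete, the sketch below applies the scaling rule from the Kubernetes documentation, `desiredReplicas = ceil(currentReplicas * currentMetric / targetMetric)`, where the largest proposal across metrics wins and the result is clamped to the `minReplicas`/`maxReplicas` bounds. It is a simplified illustration only: it ignores the controller's tolerance band and stabilization windows, and the utilization readings are made-up inputs.

```typescript
interface MetricReading {
  name: string;
  currentUtilization: number; // average across pods, as a percentage of the request
  targetUtilization: number;  // averageUtilization from the manifest
}

function desiredReplicas(
  currentReplicas: number,
  metrics: MetricReading[],
  minReplicas: number,
  maxReplicas: number,
): number {
  // Each metric proposes a replica count; the largest proposal wins.
  const proposals = metrics.map((m) =>
    Math.ceil(currentReplicas * (m.currentUtilization / m.targetUtilization)),
  );
  const wanted = Math.max(...proposals);
  return Math.min(maxReplicas, Math.max(minReplicas, wanted));
}

// 3 pods at 140% CPU (target 70) and 60% memory (target 80).
const replicas = desiredReplicas(
  3,
  [
    { name: 'cpu', currentUtilization: 140, targetUtilization: 70 },
    { name: 'memory', currentUtilization: 60, targetUtilization: 80 },
  ],
  3,
  10,
);
console.log(replicas); // 6
```

Here CPU proposes `ceil(3 * 140 / 70) = 6` pods and memory proposes `ceil(3 * 60 / 80) = 3`, so the autoscaler would aim for 6.
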
### Deployment Commands

```bash
# Create the secrets
kubectl create secret generic planting-secrets \
  --from-literal=DATABASE_URL='postgresql://...' \
  --from-literal=JWT_SECRET='...'

# Apply the manifests
kubectl apply -f k8s/

# Check status
kubectl get pods -l app=planting-service
kubectl get svc planting-service

# View logs
kubectl logs -f -l app=planting-service

# Scale out (the HPA may override a manually set replica count)
kubectl scale deployment planting-service --replicas=5

# Roll back
kubectl rollout undo deployment/planting-service
```

---

## Database Migration

### Migration Strategy

```
┌─────────────────────────────────────────────────────────────────┐
│                     Database Migration Flow                     │
├─────────────────────────────────────────────────────────────────┤
│  1. Create the migration script (development)                   │
│     npx prisma migrate dev --name add_new_feature               │
│                                                                 │
│  2. Review the migration script                                 │
│     Inspect the prisma/migrations/ directory                    │
│                                                                 │
│  3. Verify in the test environment                              │
│     npx prisma migrate deploy                                   │
│                                                                 │
│  4. Deploy to production                                        │
│     - Back up the database                                      │
│     - Run the migration (npx prisma migrate deploy)             │
│     - Deploy the new application version                        │
└─────────────────────────────────────────────────────────────────┘
```

### Migration Commands

```bash
# Development: create a migration
npx prisma migrate dev --name add_new_feature

# Production: apply pending migrations
npx prisma migrate deploy

# Show migration status
npx prisma migrate status

# Reset the database (development only; drops all data)
npx prisma migrate reset
```

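### Migration Best Practices

1. **Backward compatibility**: The new application version should remain compatible with the old database schema. Because the migration is applied before the new version is deployed, each migration should also leave the currently running version working.
2. **Incremental migrations**: Split large changes into several small migrations.
3. **Back up first**: Always back up the database before a production migration.
4. **Rollback scripts**: Prepare the matching rollback SQL for each migration.
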
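Before a production run it can help to know exactly which migrations `npx prisma migrate deploy` would apply. `npx prisma migrate status` reports this directly; the sketch below only illustrates the comparison behind it. It assumes the applied names have been read from the `_prisma_migrations` table that Prisma Migrate maintains. `pendingMigrations` and `unknownMigrations` are hypothetical helpers, and the migration names are examples rather than real migrations of this service.

```typescript
// Hypothetical pre-deploy check. Prisma names each migration directory
// "<timestamp>_<name>", so a plain string sort gives the apply order.
function pendingMigrations(inRepo: string[], applied: string[]): string[] {
  const done = new Set(applied);
  return inRepo.filter((name) => !done.has(name)).sort();
}

// Applied in the database but absent from the repo: a sign that the checkout
// is older than the schema, in which case a rollout should stop.
function unknownMigrations(inRepo: string[], applied: string[]): string[] {
  const known = new Set(inRepo);
  return applied.filter((name) => !known.has(name)).sort();
}

// Example names only.
const inRepo = ['20241201090000_add_new_feature', '20241130100000_init'];
const applied = ['20241130100000_init'];

console.log(pendingMigrations(inRepo, applied)); // [ '20241201090000_add_new_feature' ]
console.log(unknownMigrations(inRepo, applied)); // []
```
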
---

## Health Checks

### Endpoints

| Endpoint | Purpose | What it checks |
|----------|---------|----------------|
| `/api/v1/health` | Liveness check | Whether the service is running |
| `/api/v1/health/ready` | Readiness check | Whether the service can accept requests |

### Health Check Implementation

```typescript
// src/api/controllers/health.controller.ts
import { Controller, Get } from '@nestjs/common';

// Served at /api/v1/health, assuming the application sets a global prefix of api/v1
@Controller('health')
export class HealthController {
  @Get()
  check() {
    return {
      status: 'ok',
      timestamp: new Date().toISOString(),
      service: 'planting-service',
    };
  }

  @Get('ready')
  async ready() {
    // A database connectivity check can be added here
    return {
      status: 'ready',
      timestamp: new Date().toISOString(),
    };
  }
}
```

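The readiness handler above always reports `ready`. One possible way to add the database check mentioned in its comment is sketched below. `checkReadiness`, `withTimeout`, and the `Ping` type are hypothetical; `ping` stands in for a real dependency check, such as a function that runs `SELECT 1` through the service's database client. The 2000 ms default is an assumption chosen to stay under the 3 second readiness probe timeout.

```typescript
// Hypothetical readiness helper; `ping` stands in for a real dependency check.
type Ping = () => Promise<unknown>;

interface ReadinessResult {
  status: 'ready' | 'not_ready';
  timestamp: string;
  error?: string;
}

// Rejects if `work` has not settled within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

async function checkReadiness(ping: Ping, timeoutMs = 2000): Promise<ReadinessResult> {
  const timestamp = new Date().toISOString();
  try {
    await withTimeout(ping(), timeoutMs);
    return { status: 'ready', timestamp };
  } catch (err) {
    const error = err instanceof Error ? err.message : String(err);
    return { status: 'not_ready', timestamp, error };
  }
}
```

In the controller, `ready()` could await `checkReadiness(...)` and throw `ServiceUnavailableException` from `@nestjs/common` when the status is `not_ready`, so that the readiness probe receives a non-2xx response and traffic is withheld from the pod.
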
### Load Balancer Configuration

**Nginx configuration**

```nginx
upstream planting_service {
    server planting-service-1:3003;
    server planting-service-2:3003;
    server planting-service-3:3003;
}

server {
    listen 80;
    server_name api.rwadurian.com;

    location /api/v1/planting {
        proxy_pass http://planting_service;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection 'upgrade';
        proxy_set_header Host $host;
        proxy_cache_bypass $http_upgrade;

        # Upstream timeouts
        proxy_connect_timeout 5s;
        proxy_read_timeout 30s;
    }

    location /health {
        proxy_pass http://planting_service/api/v1/health;
        proxy_connect_timeout 5s;
        proxy_read_timeout 5s;
    }
}
```

---

## Monitoring and Logging

### Logging Configuration

```typescript
// src/main.ts
import { Logger } from '@nestjs/common';
import { NestFactory } from '@nestjs/core';
import { AppModule } from './app.module'; // path assumed; adjust to the project layout

async function bootstrap() {
  const app = await NestFactory.create(AppModule, {
    // Drop debug and verbose output in production
    logger: process.env.NODE_ENV === 'production'
      ? ['error', 'warn', 'log']
      : ['error', 'warn', 'log', 'debug', 'verbose'],
  });
  // ...
}
```

### Log Format

```json
{
  "timestamp": "2024-11-30T10:00:00.000Z",
  "level": "info",
  "context": "PlantingApplicationService",
  "message": "Order created",
  "metadata": {
    "orderNo": "PO202411300001",
    "userId": "1",
    "treeCount": 5
  }
}
```

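This guide does not show how the service produces the format above. As a minimal sketch of one way to emit it, the hypothetical `formatLog` function below serializes one entry per line with the same field names. The `LogLevel` values and the optional `now` parameter are assumptions made for illustration.

```typescript
// Hypothetical formatter that emits one JSON object per line in the shape shown above.
type LogLevel = 'error' | 'warn' | 'info' | 'debug';

interface LogEntry {
  timestamp: string;
  level: LogLevel;
  context: string;
  message: string;
  metadata?: Record<string, unknown>;
}

function formatLog(
  level: LogLevel,
  context: string,
  message: string,
  metadata?: Record<string, unknown>,
  now: Date = new Date(), // injectable so the output is reproducible in tests
): string {
  const entry: LogEntry = { timestamp: now.toISOString(), level, context, message };
  // Omit the metadata field entirely when there is nothing to report.
  if (metadata && Object.keys(metadata).length > 0) {
    entry.metadata = metadata;
  }
  return JSON.stringify(entry);
}

console.log(
  formatLog('info', 'PlantingApplicationService', 'Order created', {
    orderNo: 'PO202411300001',
    userId: '1',
    treeCount: 5,
  }),
);
```

Writing one JSON object per line keeps the output easy to parse for log collectors that read the container's stdout.
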
### Prometheus Metrics

```yaml
# prometheus/scrape_configs
- job_name: 'planting-service'
  static_configs:
    - targets: ['planting-service:3003']
  metrics_path: '/metrics'
```

### Grafana Dashboard

Key metrics:

- Request rate (requests/second)
- Response time (p50, p95, p99)
- Error rate
- Database connection pool status
- Orders created
- Payment success rate

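Dashboards normally compute these values in the metrics backend. The sketch below only illustrates what the p50/p95/p99 and error-rate panels mean, using the nearest-rank percentile definition over a made-up window of request durations. `percentile` and `summarize` are hypothetical helpers.

```typescript
// Nearest-rank percentile over request durations (milliseconds), sorted ascending.
function percentile(sortedMs: number[], p: number): number {
  if (sortedMs.length === 0) throw new Error('no samples');
  const rank = Math.ceil((p * sortedMs.length) / 100);
  return sortedMs[Math.max(rank, 1) - 1];
}

function summarize(durationsMs: number[], errorCount: number) {
  const sorted = durationsMs.slice().sort((a, b) => a - b);
  return {
    p50: percentile(sorted, 50),
    p95: percentile(sorted, 95),
    p99: percentile(sorted, 99),
    errorRate: errorCount / durationsMs.length,
  };
}

// Made-up sample: ten requests, one of which failed.
const sample = [12, 15, 11, 240, 18, 14, 13, 16, 17, 19];
console.log(summarize(sample, 1)); // { p50: 15, p95: 240, p99: 240, errorRate: 0.1 }
```
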
---

## Troubleshooting

### Common Issues

#### 1. Service Fails to Start

```bash
# Check the logs
docker logs planting-service

# Common causes:
# - Database connection failure
# - Missing environment variables
# - Port conflict
```

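#### 2. Database Connection Failure

```bash
# Check the connection. psql rejects Prisma-only URL parameters such as
# ?schema=public, so strip the query string first.
psql "${DATABASE_URL%%\?*}" -c "SELECT 1"

# Check through Prisma (read-only; `prisma db pull` would overwrite schema.prisma)
npx prisma migrate status
```
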
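#### 3. Out of Memory

```bash
# Check memory usage
docker stats planting-service

# Adjust the Node.js heap limit (value in MB). Keep it below the container
# memory limit (2G in the manifests above); a larger heap limit lets the
# container be OOM-killed before Node.js reaches its own limit.
NODE_OPTIONS="--max-old-space-size=1536" node dist/main
```
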
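#### 4. High Latency

```bash
# Check database queries:
# enable Prisma query logging

# Check external service response times
# (curl-format.txt is a curl --write-out template that must exist locally)
curl -w "@curl-format.txt" http://wallet-service:3002/health
```
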
### Debugging Commands

```bash
# Open a shell in the container
docker exec -it planting-service sh

# Check network connectivity
docker exec planting-service ping db

# Check environment variables
docker exec planting-service env | grep DATABASE

# Follow the logs
docker logs -f --tail 100 planting-service
```

---

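## Rollback Strategy

### Docker Rollback

```bash
# Stop the current version
docker-compose -f docker-compose.prod.yml down

# Start the previous version. The compose file reads the image tag from
# VERSION, so replace <previous-version> with the tag being rolled back to.
VERSION=<previous-version> docker-compose -f docker-compose.prod.yml up -d --no-build
```
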
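A Docker rollback depends on knowing which image tag preceded the current one. As a sketch, assuming tags follow the `vMAJOR.MINOR.PATCH` pattern used in this guide (for example `v1.0.0`), the hypothetical `previousVersion` helper below picks the newest tag that is older than the running one. The tag list is an example, not a list of real releases.

```typescript
// Hypothetical helper: picks the rollback target from a list of image tags.
function parseTag(tag: string): number[] | null {
  const match = /^v(\d+)\.(\d+)\.(\d+)$/.exec(tag);
  return match ? [Number(match[1]), Number(match[2]), Number(match[3])] : null;
}

// Numeric comparison, so v1.10.0 sorts after v1.2.0.
function compareTags(a: string, b: string): number {
  const pa = parseTag(a);
  const pb = parseTag(b);
  if (!pa || !pb) throw new Error(`Unsupported tag format: ${!pa ? a : b}`);
  for (let i = 0; i < 3; i++) {
    if (pa[i] !== pb[i]) return pa[i] - pb[i];
  }
  return 0;
}

function previousVersion(tags: string[], current: string): string | null {
  const older = tags
    .filter((tag) => parseTag(tag) !== null && compareTags(tag, current) < 0)
    .sort(compareTags);
  return older.length > 0 ? older[older.length - 1] : null;
}

// Example tags only; non-version tags such as "latest" are ignored.
console.log(previousVersion(['v1.0.0', 'v1.2.0', 'v1.10.0', 'latest'], 'v1.10.0')); // v1.2.0
```

The result can then be passed as `VERSION` to the compose file, which reads the image tag from `${VERSION:-latest}`.
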
### Kubernetes Rollback

```bash
# View the rollout history
kubectl rollout history deployment/planting-service

# Roll back to the previous revision
kubectl rollout undo deployment/planting-service

# Roll back to a specific revision
kubectl rollout undo deployment/planting-service --to-revision=2

# Check the rollback status
kubectl rollout status deployment/planting-service
```

### Database Rollback

```sql
-- Prepare a rollback script, for example:
-- prisma/rollback/20241130_rollback.sql

-- Example: roll back an added column
ALTER TABLE "PlantingOrder" DROP COLUMN IF EXISTS "newColumn";
```

### Rollback Checklist

- [ ] Root cause confirmed
- [ ] Relevant teams notified
- [ ] Rollback executed
- [ ] Service recovery verified
- [ ] Data consistency checked
- [ ] Incident report updated

---

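## Deployment Checklist

### Before Deployment

- [ ] Code review passed
- [ ] All tests passing
- [ ] Database migrations tested
- [ ] Environment variables configured
- [ ] Backup completed

### During Deployment

- [ ] Monitoring dashboards ready
- [ ] Log collection working
- [ ] Progressive rollout (canary or blue-green)
- [ ] Health checks passing

### After Deployment

- [ ] Functional verification
- [ ] Performance verification
- [ ] Error rate monitoring
- [ ] User feedback collection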