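# Admin Service Deployment Guide

## Table of Contents

- [1. Deployment Overview](#1-deployment-overview)
- [2. Environment Preparation](#2-environment-preparation)
- [3. Quick Start](#3-quick-start)
- [4. Local Deployment](#4-local-deployment)
- [5. Docker Deployment](#5-docker-deployment)
- [6. Production Deployment](#6-production-deployment)
- [7. Monitoring and Maintenance](#7-monitoring-and-maintenance)
- [8. Troubleshooting](#8-troubleshooting)
- [9. Security Hardening](#9-security-hardening)
- [10. Quick Reference](#10-quick-reference)

---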
## 1. Deployment Overview

### 1.1 Deployment Architecture

```
┌─────────────────────────────────────────────────┐
│                  Load Balancer                  │
│            (Nginx / AWS ALB / etc.)             │
└───────────────────┬─────────────────────────────┘
                    │
        ┌───────────┼───────────┐
        │           │           │
        ▼           ▼           ▼
 ┌──────────┐  ┌──────────┐  ┌──────────┐
 │  Admin   │  │  Admin   │  │  Admin   │
 │ Service  │  │ Service  │  │ Service  │
 │ Instance │  │ Instance │  │ Instance │
 └────┬─────┘  └────┬─────┘  └────┬─────┘
      │             │             │
      └─────────────┼─────────────┘
                    │
                    ▼
           ┌──────────────────┐
           │    PostgreSQL    │
           │    (Primary +    │
           │    Replicas)     │
           └──────────────────┘
```
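### 1.2 Deployment Environments

| Environment | Description | Database | Instances |
|-------------|-------------|----------|-----------|
| **Development** | Development environment | Local / Docker | 1 |
| **Staging** | Pre-release environment | Dedicated database | 1-2 |
| **Production** | Production environment | High-availability cluster | 3+ |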
### 1.3 System Requirements

#### Minimum Configuration

| Resource | Requirement |
|----------|-------------|
| **CPU** | 2 cores |
| **Memory** | 2 GB |
| **Disk** | 20 GB (SSD) |
| **Network** | 100 Mbps |

#### Recommended Configuration (Production)

| Resource | Requirement |
|----------|-------------|
| **CPU** | 4 cores |
| **Memory** | 4-8 GB |
| **Disk** | 50 GB (SSD) |
| **Network** | 1 Gbps |

---
## 2. Environment Preparation

### 2.1 Server Preparation

```bash
# Example for Ubuntu 22.04 LTS

# 1. Update the system
sudo apt update && sudo apt upgrade -y

# 2. Install base tools
sudo apt install -y \
  curl \
  wget \
  git \
  build-essential \
  ca-certificates \
  gnupg \
  lsb-release

# 3. Install Node.js 20.x
curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -
sudo apt install -y nodejs

# Verify
node --version  # v20.x.x
npm --version   # 10.x.x

# 4. Install PM2 (process manager)
sudo npm install -g pm2

# 5. Install PostgreSQL 16
# Note: apt-key is deprecated on recent Ubuntu releases; on 22.04 it still works but prints a warning.
sudo sh -c 'echo "deb http://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
sudo apt update
sudo apt install -y postgresql-16
```
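### 2.2 Database Configuration

```bash
# 1. Open psql as the postgres user
sudo -u postgres psql

# 2. Create the database and user (run these statements inside psql)
CREATE DATABASE admin_service_prod;
CREATE USER admin_service WITH ENCRYPTED PASSWORD 'your_secure_password';
GRANT ALL PRIVILEGES ON DATABASE admin_service_prod TO admin_service;
-- PostgreSQL 15+ no longer lets non-owners create objects in the public schema,
-- so make admin_service the database owner to let migrations create tables.
ALTER DATABASE admin_service_prod OWNER TO admin_service;

# 3. Exit psql
\q

# 4. Allow remote connections to PostgreSQL (only if needed)
sudo nano /etc/postgresql/16/main/postgresql.conf
# Set: listen_addresses = '*'

sudo nano /etc/postgresql/16/main/pg_hba.conf
# Add: host all all 0.0.0.0/0 md5
# (0.0.0.0/0 accepts any source address; prefer your internal subnet in production)

# 5. Restart PostgreSQL
sudo systemctl restart postgresql
```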
### 2.3 Firewall Configuration

```bash
# UFW firewall rules
sudo ufw allow 22/tcp    # SSH
sudo ufw allow 3010/tcp  # Admin Service (or expose it only through the Nginx reverse proxy)
sudo ufw allow 5432/tcp  # PostgreSQL (internal network only; as written, this rule accepts any source)

sudo ufw enable
sudo ufw status
```
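The `5432/tcp` rule above is meant for the internal network only, but UFW needs a source range to enforce that. A minimal sketch, assuming a hypothetical internal subnet of `192.168.1.0/24` (this guide does not state the real one); the helper name `pg_ufw_rule` is illustrative, and it only prints the command so the rule can be reviewed before it is run with `sudo`:

```bash
# Print (do not execute) a UFW rule that admits PostgreSQL traffic from one subnet only.
# Both the helper name and the example subnet are assumptions for illustration.
pg_ufw_rule() (
  cidr="$1"          # source range allowed to connect, e.g. the app servers' subnet
  port="${2:-5432}"  # PostgreSQL port; defaults to 5432
  printf 'ufw allow from %s to any port %s proto tcp\n' "$cidr" "$port"
)

# Review the generated rule, then run it with sudo in place of the blanket 5432/tcp rule.
pg_ufw_rule 192.168.1.0/24
```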
---

## 3. Quick Start

### 3.1 One-Command Startup (Recommended)

Use the `deploy.sh` script to start all services:

```bash
# Enter the project directory
cd backend/services/admin-service

# Start all services (including PostgreSQL and Redis)
./deploy.sh start

# Check service status
./deploy.sh status

# View logs
./deploy.sh logs

# Run the health check
./deploy.sh health
```
### 3.2 Verify the Deployment

```bash
# Health check
curl http://localhost:3010/api/v1/health

# Expected response
{
  "status": "ok",
  "service": "admin-service",
  "timestamp": "2025-12-02T12:00:00.000Z"
}
```
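For scripted checks it helps to test the `status` field rather than eyeball the JSON. The following is a sketch only: `health_status` and `check_health` are hypothetical helper names, and the `sed` pattern assumes the flat response shape shown above (use `jq` instead if it is available on your hosts):

```bash
# Read a health response on stdin and print the value of its "status" field.
# Uses sed so it works without jq; assumes the flat JSON shape shown above.
health_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Succeed only when the endpoint answers and reports status "ok".
# The default URL is the local endpoint used throughout this guide.
check_health() (
  url="${1:-http://localhost:3010/api/v1/health}"
  body="$(curl -fsS --max-time 5 "$url")" || exit 1
  [ "$(printf '%s\n' "$body" | health_status)" = "ok" ]
)
```

A typical call would be `check_health && echo "admin-service is healthy"`; a non-zero exit means the request failed or the status was not `ok`.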
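### 3.3 Environment Files

| File | Purpose | Notes |
|------|---------|-------|
| `.env.example` | Configuration template | Reference for every configuration key |
| `.env.development` | Local development | Uses a local database connection |
| `.env.production` | Production | Uses variable references that are injected at deploy time |
| `.env.test` | Testing | Separate test database |
| `.env` | Active configuration | Copied from one of the files above; not committed to Git |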
### 3.4 deploy.sh Command Reference

```bash
# Build
./deploy.sh build           # Build the Docker image
./deploy.sh build-no-cache  # Build without cache

# Lifecycle
./deploy.sh start           # Start all services
./deploy.sh stop            # Stop all services
./deploy.sh restart         # Restart services
./deploy.sh up              # Start in the foreground (shows logs)
./deploy.sh down            # Stop and remove containers and volumes

# Monitoring
./deploy.sh logs            # Follow logs
./deploy.sh logs-tail       # Last 100 log lines
./deploy.sh status          # Service status
./deploy.sh health          # Health check

# Database
./deploy.sh migrate         # Production migration
./deploy.sh migrate-dev     # Development migration
./deploy.sh prisma-studio   # Prisma GUI

# Development
./deploy.sh dev             # Development mode
./deploy.sh test            # Run tests
./deploy.sh shell           # Open a shell in the container

# Cleanup
./deploy.sh clean           # Remove containers
./deploy.sh clean-all       # Remove containers, volumes, and images

# Info
./deploy.sh info            # Show service information
```

---
## 4. Local Deployment

### 4.1 Clone the Code

```bash
cd /opt
sudo git clone https://github.com/your-org/rwa-durian.git
cd rwa-durian/backend/services/admin-service

# Set ownership
sudo chown -R $USER:$USER /opt/rwa-durian
```
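### 4.2 Install Dependencies

```bash
# Install all dependencies. The build in 4.5 typically needs devDependencies
# (Nest CLI, TypeScript, Prisma CLI), so --omit=dev would break it.
npm ci
```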
### 4.3 Environment Configuration

Create `.env.production`:

```env
# Application
NODE_ENV=production
APP_PORT=3010
API_PREFIX=api/v1

# Database (must match the database created in 2.2)
DATABASE_URL=postgresql://admin_service:your_secure_password@localhost:5432/admin_service_prod?schema=public

# Logging
LOG_LEVEL=info

# CORS
CORS_ORIGIN=https://admin.rwadurian.com,https://app.rwadurian.com

# Security (not yet implemented)
JWT_SECRET=your_super_secret_jwt_key_change_in_production
```
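A missing or empty key in `.env.production` usually surfaces only at startup, so a pre-flight check can save a failed deploy. This is a sketch under assumptions: `missing_env_keys` is a hypothetical helper, and the list of required keys is taken from the example file above rather than from the application code:

```bash
# Print each required key that is missing or empty in the given env file.
# A key counts as set only when "KEY=" is followed by at least one character.
missing_env_keys() (
  file="$1"
  shift
  for key in "$@"; do
    grep -Eq "^${key}=.+" "$file" || printf '%s\n' "$key"
  done
)

# Example (not executed here):
#   missing_env_keys .env.production NODE_ENV APP_PORT DATABASE_URL JWT_SECRET
```

Empty output means every listed key has a value; any printed name should be fixed before starting the service.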
### 4.4 Database Migration

```bash
# Generate the Prisma Client
npm run prisma:generate

# Apply migrations
npm run prisma:migrate:deploy

# (Optional) Run the initialization script
psql -U admin_service -d admin_service_prod -f database/init.sql
```
### 4.5 Build the Application

```bash
npm run build
```
### 4.6 Start with PM2

Create `ecosystem.config.js`:

```javascript
module.exports = {
  apps: [
    {
      name: 'admin-service',
      script: 'dist/main.js',
      instances: 2, // set to the number of CPU cores
      exec_mode: 'cluster',
      env: {
        NODE_ENV: 'production',
        APP_PORT: 3010,
      },
      env_file: '.env.production',
      error_file: 'logs/error.log',
      out_file: 'logs/out.log',
      log_date_format: 'YYYY-MM-DD HH:mm:ss',
      merge_logs: true,
      autorestart: true,
      max_memory_restart: '500M',
      watch: false,
    },
  ],
};
```

Start the service:

```bash
# Start
pm2 start ecosystem.config.js

# Show status
pm2 status

# View logs
pm2 logs admin-service

# Restart
pm2 restart admin-service

# Stop
pm2 stop admin-service

# Remove from PM2
pm2 delete admin-service
```
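The `instances` setting above is tied to the number of CPU cores. A small sketch of that sizing rule, assuming you may want to leave a core free for PostgreSQL or Nginx on the same host; `instances_for` is a hypothetical helper, and PM2 also accepts `instances: 'max'` if every core should be used:

```bash
# Compute a PM2 instance count from the core count, optionally reserving
# some cores for other processes. The result is never lower than 1.
instances_for() (
  cores="$1"          # total CPU cores, e.g. the output of nproc
  reserved="${2:-0}"  # cores to leave for other processes
  n=$((cores - reserved))
  [ "$n" -lt 1 ] && n=1
  printf '%s\n' "$n"
)

# Example (not executed here): size the running cluster, keeping one core free
#   pm2 scale admin-service "$(instances_for "$(nproc)" 1)"
```

Keep in mind that each instance opens its own database connection pool, so the instance count also multiplies database connections.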
### 4.7 Start on Boot

```bash
# Save the PM2 process list
pm2 save

# Generate the startup script
pm2 startup systemd

# Run the command it prints, which looks similar to:
# sudo env PATH=$PATH:/usr/bin /usr/lib/node_modules/pm2/bin/pm2 startup systemd -u your_user --hp /home/your_user
```
### 4.8 Verify the Deployment

```bash
# Check service health
curl http://localhost:3010/api/v1/health

# Expected response
{"status": "ok", "service": "admin-service", "timestamp": "..."}

# Check the version query endpoint
curl "http://localhost:3010/api/v1/versions/check-update?platform=android&currentVersionCode=1"

# PM2 status
pm2 status
```

---
## 5. Docker Deployment

### 5.1 Using deploy.sh (Recommended)

```bash
# Build the image
./deploy.sh build

# Start all services
./deploy.sh start

# Show status
./deploy.sh status

# Run database migrations
./deploy.sh migrate
```
### 5.2 Dockerfile

**Features of the configured Dockerfile**:
```dockerfile
# Build stage
FROM node:20-alpine AS builder

WORKDIR /app

# Install OpenSSL (required by Prisma)
RUN apk add --no-cache openssl

# Copy package.json and package-lock.json
COPY package*.json ./
COPY prisma ./prisma/

# Install dependencies
RUN npm ci

# Generate the Prisma Client
RUN npx prisma generate

# Copy the source code
COPY . .

# Build
RUN npm run build

# Production stage
FROM node:20-alpine

WORKDIR /app

# curl is needed by the HEALTHCHECK below; Alpine images do not include it
RUN apk add --no-cache openssl curl

# Copy dependencies and build output
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package*.json ./
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/prisma ./prisma

# Expose the service port
EXPOSE 3010

# Health check
HEALTHCHECK --interval=30s --timeout=10s --retries=3 --start-period=40s \
  CMD curl -f http://localhost:3010/api/v1/health || exit 1

# Start command
CMD ["node", "dist/main.js"]
```
### 5.3 Docker Compose

Service architecture defined in **docker-compose.yml**:

```
┌─────────────────────────────────────┐
│        admin-service (3010)         │
│         NestJS Application          │
└──────────────┬──────────────────────┘
               │
       ┌───────┴───────┐
       │               │
       ▼               ▼
┌──────────────┐ ┌──────────────┐
│  PostgreSQL  │ │    Redis     │
│    (5433)    │ │    (6380)    │
└──────────────┘ └──────────────┘
```

**Port mapping** (chosen to avoid conflicts with other services):
- admin-service: 3010
- PostgreSQL: 5433 (host) → 5432 (container)
- Redis: 6380 (host) → 6379 (container)

```yaml
services:
  admin-service:
    build: .
    container_name: rwa-admin-service
    ports:
      - "3010:3010"
    environment:
      - NODE_ENV=production
      - APP_PORT=3010
      - API_PREFIX=api/v1
      - DATABASE_URL=postgresql://postgres:password@postgres:5432/rwa_admin?schema=public
      - JWT_SECRET=your-admin-jwt-secret-change-in-production
      - REDIS_HOST=redis
      - REDIS_PORT=6379
    depends_on:
      postgres:
        condition: service_healthy
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3010/api/v1/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
    restart: unless-stopped

  postgres:
    image: postgres:16-alpine
    container_name: rwa-admin-postgres
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=password
      - POSTGRES_DB=rwa_admin
    ports:
      - "5433:5432"
    volumes:
      - postgres_data:/var/lib/postgresql/data
      - ./database/init.sql:/docker-entrypoint-initdb.d/init.sql:ro
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres -d rwa_admin"]
      interval: 5s
      timeout: 5s
      retries: 10
    restart: unless-stopped

  redis:
    image: redis:7-alpine
    container_name: rwa-admin-redis
    ports:
      - "6380:6379"
    volumes:
      - redis_data:/data
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      timeout: 5s
      retries: 10
    restart: unless-stopped

volumes:
  postgres_data:
    name: admin-service-postgres-data
  redis_data:
    name: admin-service-redis-data
```
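Because of the port mapping above, the database address depends on where the client runs: containers on the Compose network reach `postgres:5432`, while tools on the host reach `localhost:5433`. The sketch below illustrates this; `compose_db_url` is a hypothetical helper, and the credentials are the placeholder values from the Compose file above:

```bash
# Print the DATABASE_URL that matches the caller's location.
#   container: another service on the Compose network
#   host:      a tool running on the Docker host (psql, Prisma Studio, ...)
compose_db_url() (
  case "$1" in
    container) host=postgres;  port=5432 ;;
    host)      host=localhost; port=5433 ;;
    *) echo "usage: compose_db_url container|host" >&2; exit 1 ;;
  esac
  printf 'postgresql://postgres:password@%s:%s/rwa_admin?schema=public\n' "$host" "$port"
)

# Example (not executed here): open Prisma Studio from the host
#   DATABASE_URL="$(compose_db_url host)" npx prisma studio
```

Mixing the two up is a common cause of `P1001` connection errors when running migrations from the host.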
### 5.4 Docker Deployment Steps

```bash
# Using deploy.sh (recommended)
./deploy.sh build    # Build the image
./deploy.sh start    # Start services
./deploy.sh migrate  # Run migrations
./deploy.sh logs     # View logs
./deploy.sh status   # Show status
./deploy.sh stop     # Stop services
./deploy.sh down     # Clean up (including data)

# Or using docker compose directly
docker compose build
docker compose up -d
docker compose exec admin-service npx prisma migrate deploy
docker compose logs -f admin-service
docker compose ps
docker compose down     # Stop and remove containers
docker compose down -v  # Also remove volumes (deletes data)
```
### 5.5 Docker Health Checks

```bash
# Check container health status
docker ps

# Inspect the health check log
docker inspect rwa-admin-service | jq '.[0].State.Health'

# Run the health check manually
docker exec rwa-admin-service curl -f http://localhost:3010/api/v1/health
```

---
## 6. Production Deployment

### 6.1 Nginx Reverse Proxy

**Install Nginx**:
```bash
sudo apt install -y nginx
```

**Integrate with the RWA API gateway** (`/etc/nginx/sites-available/rwaapi.szaiai.com.conf`):
```nginx
upstream admin_service {
    least_conn;
    server 192.168.1.111:3010;
    # server 192.168.1.112:3010;  # additional instances for load balancing
}

# Add the admin-service routes to the main server block
server {
    listen 443 ssl http2;
    server_name rwaapi.szaiai.com;

    # SSL (already set up in the main configuration)
    ssl_certificate /etc/letsencrypt/live/rwaapi.szaiai.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/rwaapi.szaiai.com/privkey.pem;
    include /etc/nginx/snippets/ssl-params.conf;

    # Admin Service route: version management
    location /api/v1/versions {
        include /etc/nginx/snippets/proxy-params.conf;
        include /etc/nginx/snippets/cors-params.conf;
        proxy_pass http://admin_service;
    }

    # Admin Service route: admin API (reserved)
    location /api/v1/admin {
        include /etc/nginx/snippets/proxy-params.conf;
        include /etc/nginx/snippets/cors-params.conf;
        proxy_pass http://admin_service;
    }

    # ... routes for other services (identity, wallet, etc.)
}
```

**Enable the configuration**:
```bash
# Test the configuration
sudo nginx -t

# Reload the configuration
sudo systemctl reload nginx
```
### 6.2 SSL Certificates (Let's Encrypt)

```bash
# Install Certbot
sudo apt install -y certbot python3-certbot-nginx

# Obtain a certificate (use the domain served by your Nginx server block)
sudo certbot --nginx -d admin-api.rwadurian.com

# Test automatic renewal
sudo certbot renew --dry-run

# Schedule automatic renewal (crontab)
sudo crontab -e
# Add: 0 3 * * * certbot renew --quiet
```
### 6.3 Log Management

#### Log Rotation

Create `/etc/logrotate.d/admin-service`:

```
/opt/rwa-durian/backend/services/admin-service/logs/*.log {
    daily
    rotate 30
    compress
    delaycompress
    notifempty
    create 0640 your_user your_user
    sharedscripts
    postrotate
        pm2 reloadLogs
    endscript
}
```

#### Viewing Logs

```bash
# PM2 logs
pm2 logs admin-service

# Show the last 100 lines, then follow
pm2 logs admin-service --lines 100

# Error logs only
pm2 logs admin-service --err

# Nginx logs
sudo tail -f /var/log/nginx/admin-service-access.log
sudo tail -f /var/log/nginx/admin-service-error.log
```
### 6.4 Database Backup

#### Automatic Backup Script

Create `/opt/scripts/backup-admin-db.sh`:

```bash
#!/bin/bash
# Stop on the first error so a failed dump is never reported as a success
set -euo pipefail

# Configuration
DB_NAME="admin_service_prod"
DB_USER="admin_service"
BACKUP_DIR="/opt/backups/admin-service"
DATE=$(date +%Y%m%d_%H%M%S)
RETENTION_DAYS=30

# Create the backup directory
mkdir -p "$BACKUP_DIR"

# Run the backup (custom format, including large objects)
pg_dump -U "$DB_USER" -d "$DB_NAME" -F c -b -v -f "$BACKUP_DIR/admin_service_$DATE.backup"

# Compress
gzip "$BACKUP_DIR/admin_service_$DATE.backup"

# Delete backups older than the retention period
find "$BACKUP_DIR" -name "*.backup.gz" -mtime +"$RETENTION_DAYS" -delete

echo "Backup completed: admin_service_$DATE.backup.gz"
```

#### Schedule the Backup

```bash
chmod +x /opt/scripts/backup-admin-db.sh

# Add to crontab
crontab -e
# Back up every day at 02:00
0 2 * * * /opt/scripts/backup-admin-db.sh >> /var/log/admin-service-backup.log 2>&1
```

#### Restore the Database

```bash
# Decompress the backup
gunzip admin_service_20250103_020000.backup.gz

# Restore
pg_restore -U admin_service -d admin_service_prod -v admin_service_20250103_020000.backup
```
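Before a restore it is worth confirming which backup is newest and that its archive is not truncated. A minimal sketch, assuming the file naming produced by the backup script above; `latest_backup` and `backup_is_intact` are hypothetical helpers, and the integrity check covers only the gzip layer, not the contents of the dump:

```bash
# Print the newest backup in a directory (empty output if none exist).
# Timestamps are YYYYmmdd_HHMMSS, so lexical order equals chronological order.
latest_backup() (
  dir="${1:-/opt/backups/admin-service}"
  ls -1 "$dir"/admin_service_*.backup.gz 2>/dev/null | sort | tail -n 1
)

# Succeed only if the gzip archive passes its built-in integrity test.
backup_is_intact() {
  gzip -t "$1" 2>/dev/null
}

# Example (not executed here):
#   f="$(latest_backup)" && backup_is_intact "$f" && gunzip "$f"
```

Running this from the backup cron job right after `gzip` would catch a truncated archive on the day it is written rather than on the day it is needed.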
### 6.5 Monitoring and Alerting

#### PM2 Monitoring

```bash
# Install PM2 modules (optional)
pm2 install pm2-logrotate
pm2 install pm2-server-monit

# Open the monitor
pm2 monit
```

#### Health Check Script

Create `/opt/scripts/health-check.sh`:

```bash
#!/bin/bash

HEALTH_URL="http://localhost:3010/api/v1/health"
ALERT_EMAIL="admin@rwadurian.com"

# Capture only the HTTP status code of the health endpoint
response=$(curl -s -o /dev/null -w "%{http_code}" "$HEALTH_URL")

if [ "$response" != "200" ]; then
    echo "Admin Service health check failed! HTTP code: $response" | \
        mail -s "Admin Service Alert" "$ALERT_EMAIL"

    # Automatic restart (optional)
    pm2 restart admin-service
fi
```

#### Schedule the Health Check

```bash
crontab -e
# Check every 5 minutes
*/5 * * * * /opt/scripts/health-check.sh
```
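The script above alerts and restarts on a single failed request, which can trigger on a brief network blip or during a reload. One possible refinement is to retry a few times before acting. The sketch below assumes three attempts five seconds apart are acceptable; `probe_with_retries` is a hypothetical helper, not part of the existing script:

```bash
# Run a probe command up to $1 times, sleeping $2 seconds between attempts.
# Returns 0 on the first success and 1 if every attempt fails.
probe_with_retries() (
  attempts="$1"
  delay="$2"
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" >/dev/null 2>&1 && exit 0
    # Sleep only between attempts, not after the last one
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$((i + 1))
  done
  exit 1
)

# Example (not executed here): restart only after three consecutive failures
#   probe_with_retries 3 5 curl -fsS http://localhost:3010/api/v1/health \
#     || pm2 restart admin-service
```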
---

## 7. Monitoring and Maintenance

### 7.1 Performance Monitoring

#### Application Metrics

```bash
# CPU and memory usage
pm2 monit

# Detailed metrics
pm2 describe admin-service

# Process list
pm2 list
```

#### Database Monitoring

```bash
# Connection count
sudo -u postgres psql -c "SELECT count(*) FROM pg_stat_activity WHERE datname = 'admin_service_prod';"

# Slow queries (requires the pg_stat_statements extension;
# PostgreSQL 13+ names the columns total_exec_time and mean_exec_time)
sudo -u postgres psql -d admin_service_prod -c "SELECT query, calls, total_exec_time, mean_exec_time FROM pg_stat_statements ORDER BY mean_exec_time DESC LIMIT 10;"

# Database size
sudo -u postgres psql -c "SELECT pg_size_pretty(pg_database_size('admin_service_prod'));"
```
### 7.2 Routine Maintenance

#### Updating the Application

```bash
cd /opt/rwa-durian/backend/services/admin-service

# 1. Back up the current build
cp -r dist dist.backup.$(date +%Y%m%d)

# 2. Pull the latest code
git pull origin main

# 3. Install dependencies (the build step typically needs devDependencies)
npm ci

# 4. Apply migrations
npm run prisma:migrate:deploy

# 5. Build
npm run build

# 6. Restart the service
pm2 restart admin-service

# 7. Verify
curl http://localhost:3010/api/v1/health

# 8. Check the logs
pm2 logs admin-service --lines 50
```

#### Database Maintenance

```bash
# 1. Update table statistics
sudo -u postgres psql -d admin_service_prod -c "ANALYZE;"

# 2. Reclaim dead tuples
sudo -u postgres psql -d admin_service_prod -c "VACUUM ANALYZE;"

# 3. Rebuild indexes
sudo -u postgres psql -d admin_service_prod -c "REINDEX DATABASE admin_service_prod;"
```
### 7.3 Scaling

#### Vertical Scaling (More Resources)

```bash
# 1. Adjust the number of PM2 instances
pm2 scale admin-service 4  # scale to 4 instances

# 2. Raise the memory limit
# Edit ecosystem.config.js:
#   max_memory_restart: '1G'  // raise to 1 GB

pm2 restart admin-service
```

#### Horizontal Scaling (More Servers)

1. Repeat the local deployment steps on the new server
2. Configure Nginx load balancing:

```nginx
upstream admin_service {
    least_conn;
    server 192.168.1.111:3010 weight=1;
    server 192.168.1.112:3010 weight=1;
    server 192.168.1.113:3010 weight=1;
}
```

3. Reload Nginx:
```bash
sudo nginx -t
sudo systemctl reload nginx
```
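When servers are added or removed often, generating the `upstream` block from a host list avoids hand-editing mistakes. The following is a sketch, not part of the existing tooling: `render_upstream` is a hypothetical helper, and the addresses are the example hosts used above:

```bash
# Render an Nginx upstream block for admin_service from a port and a host list.
# Output goes to stdout so it can be reviewed before it replaces the real config.
render_upstream() (
  port="$1"
  shift
  printf 'upstream admin_service {\n    least_conn;\n'
  for host in "$@"; do
    printf '    server %s:%s weight=1;\n' "$host" "$port"
  done
  printf '}\n'
)

# Example (not executed here): print the block for three servers,
# then validate with "sudo nginx -t" before reloading
#   render_upstream 3010 192.168.1.111 192.168.1.112 192.168.1.113
```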
---

## 8. Troubleshooting

### 8.1 Common Problems

#### Problem 1: The Service Fails to Start

**Symptom**:
```bash
pm2 logs admin-service
# Error: Cannot find module '@prisma/client'
```

**Solution**:
```bash
npm run prisma:generate
pm2 restart admin-service
```

#### Problem 2: Database Connection Failure

**Symptom**:
```
Error: P1001: Can't reach database server
```

**Diagnostic steps**:
```bash
# 1. Check the PostgreSQL service status
sudo systemctl status postgresql

# 2. Test the database connection
psql -U admin_service -h localhost -d admin_service_prod

# 3. Check the DATABASE_URL setting
cat .env.production | grep DATABASE_URL

# 4. Check the firewall
sudo ufw status
```
#### Problem 3: Memory Leak

**Symptom**:
```bash
pm2 list
# admin-service memory usage keeps growing
```

**Diagnostic steps**:
```bash
# 1. Inspect memory usage
pm2 describe admin-service

# 2. Analyze the heap
node --inspect dist/main.js
# Then open chrome://inspect

# 3. Temporary fix: restart
pm2 restart admin-service

# 4. Adjust the memory limit in ecosystem.config.js:
#   max_memory_restart: '500M'
```
#### Problem 4: Performance Degrades Under High Concurrency

**Symptom**: Response times grow and timeouts become more frequent.

**Optimizations**:

1. **Add instances**:
```bash
pm2 scale admin-service +2
```

2. **Database connection pool**:
```javascript
// prisma/schema.prisma
datasource db {
  provider = "postgresql"
  url      = env("DATABASE_URL")
}

generator client {
  provider        = "prisma-client-js"
  previewFeatures = ["metrics"]
}

// src/infrastructure/prisma/prisma.service.ts
import { Injectable } from '@nestjs/common';
import { PrismaClient } from '@prisma/client';

@Injectable()
export class PrismaService extends PrismaClient {
  constructor() {
    super({
      datasources: {
        db: {
          // PrismaClient has no `connectionLimit` option; the pool size is set
          // with the `connection_limit` query parameter of the URL, for example
          // ...?schema=public&connection_limit=10
          url: process.env.DATABASE_URL,
        },
      },
    });
  }
}
```

3. **Add caching** (Redis):
```bash
# Install Redis
sudo apt install -y redis-server

# Install the cache packages
npm install @nestjs/cache-manager cache-manager cache-manager-redis-store
```
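For the connection pool item above, Prisma reads the pool size from the `connection_limit` query parameter of the connection URL. A sketch of adding it to an existing `DATABASE_URL`, assuming the URL does not already carry that parameter; `with_connection_limit` is a hypothetical helper:

```bash
# Append connection_limit=N to a PostgreSQL URL, using "&" when the URL
# already has a query string and "?" otherwise.
# Limitation: it does not detect an existing connection_limit parameter.
with_connection_limit() (
  url="$1"
  limit="$2"
  case "$url" in
    *\?*) sep='&' ;;
    *)    sep='?' ;;
  esac
  printf '%s%sconnection_limit=%s\n' "$url" "$sep" "$limit"
)

# Example (not executed here):
#   with_connection_limit "$DATABASE_URL" 10
```

In PM2 cluster mode every instance opens its own pool, so the total number of connections is roughly the instance count multiplied by this limit; keep that product below PostgreSQL's `max_connections`.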
### 8.2 Log Analysis

#### Error Logs

```bash
# Show recent errors
pm2 logs admin-service --err --lines 100

# Search for a specific error
pm2 logs admin-service --err | grep "Error"

# Nginx error log
sudo tail -100 /var/log/nginx/admin-service-error.log
```

#### Performance Analysis

```bash
# PM2 performance monitor
pm2 monit

# Node.js profiler
node --prof dist/main.js
# Produces isolate-*.log

# Process the profile
node --prof-process isolate-*.log > profile.txt
```
### 8.3 Rollback Strategy

#### Application Rollback

```bash
# 1. Stop the service
pm2 stop admin-service

# 2. Restore the backed-up build
rm -rf dist
mv dist.backup.20250103 dist

# 3. Mark the migration as rolled back (use with caution!)
# Note: `migrate resolve` only updates Prisma's migration history;
# it does not revert the schema changes themselves.
DATABASE_URL="..." npx prisma migrate resolve --rolled-back 20250103100000_add_new_field

# 4. Start the service again
pm2 start admin-service

# 5. Verify
curl http://localhost:3010/api/v1/health
```
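Step 2 above deletes `dist` before the backup is moved into place, so a mistyped backup name leaves the service with no build at all. A more defensive variant is sketched below; `rollback_dist` is a hypothetical helper that assumes it runs from the service directory and keeps the failed build for later inspection:

```bash
# Swap a backup directory into place as dist/.
# Aborts without touching anything if the backup does not exist.
rollback_dist() (
  backup="$1"
  [ -d "$backup" ] || { echo "backup not found: $backup" >&2; exit 1; }
  # Keep the failed build instead of deleting it
  [ -d dist ] && mv dist "dist.failed.$(date +%Y%m%d_%H%M%S)"
  mv "$backup" dist
)

# Example (not executed here), between "pm2 stop" and "pm2 start":
#   rollback_dist dist.backup.20250103
```

Once the rollback is verified, the `dist.failed.*` directory can be removed by hand.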
#### Database Rollback

```bash
# Restore the database from a backup (-c drops existing objects first)
pg_restore -U admin_service -d admin_service_prod -c admin_service_20250103_020000.backup
```

---
## 9. Security Hardening

### 9.1 Application Security

```bash
# 1. Restrict the privileges of the Node.js process
# Create a dedicated system user
sudo useradd -r -s /bin/false admin_service

# 2. Set file ownership and permissions
sudo chown -R admin_service:admin_service /opt/rwa-durian/backend/services/admin-service
sudo chmod -R 750 /opt/rwa-durian/backend/services/admin-service

# 3. Keep secrets in environment variables
# Restrict access to .env.production
chmod 600 .env.production
```
### 9.2 Database Security

```bash
# 1. Change the default password (run the ALTER statement inside psql)
sudo -u postgres psql
ALTER USER admin_service WITH PASSWORD 'new_strong_password';

# 2. Restrict network access
# /etc/postgresql/16/main/pg_hba.conf
host admin_service_prod admin_service 127.0.0.1/32 md5

# 3. Enable SSL
# postgresql.conf
ssl = on
```
### 9.3 Nginx Security

```nginx
# Hide the version number
server_tokens off;

# Rate limiting
limit_req_zone $binary_remote_addr zone=api_limit:10m rate=10r/s;

server {
    # ...
    location /api/v1/ {
        limit_req zone=api_limit burst=20;
        proxy_pass http://admin_service;
    }
}
```

---
## 10. Quick Reference

### 10.1 Common Commands

```bash
# PM2 management
pm2 start admin-service
pm2 stop admin-service
pm2 restart admin-service
pm2 reload admin-service  # zero-downtime restart
pm2 delete admin-service
pm2 logs admin-service
pm2 monit

# Database
npm run prisma:migrate:deploy
npm run prisma:generate
npm run prisma:studio

# Build
npm run build
npm run start:prod

# Health check
curl http://localhost:3010/api/v1/health

# Nginx
sudo systemctl status nginx
sudo systemctl reload nginx
sudo nginx -t
```
### 10.2 deploy.sh Command Reference

```bash
# Build
./deploy.sh build           # Build the image
./deploy.sh build-no-cache  # Build without cache

# Lifecycle
./deploy.sh start           # Start services
./deploy.sh stop            # Stop services
./deploy.sh restart         # Restart services
./deploy.sh up              # Start in the foreground
./deploy.sh down            # Stop and clean up

# Monitoring
./deploy.sh logs            # Follow logs
./deploy.sh logs-tail       # Recent logs
./deploy.sh status          # Service status
./deploy.sh health          # Health check

# Database
./deploy.sh migrate         # Production migration
./deploy.sh migrate-dev     # Development migration
./deploy.sh prisma-studio   # Prisma GUI

# Development
./deploy.sh dev             # Development mode
./deploy.sh test            # Run tests
./deploy.sh shell           # Open a shell in the container

# Cleanup
./deploy.sh clean           # Remove containers
./deploy.sh clean-all       # Full cleanup

# Info
./deploy.sh info            # Service information
```
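### 10.3 Checklists

#### Before Deployment

- [ ] Code has passed all tests
- [ ] Environment variables are configured correctly
- [ ] Database migrations are prepared
- [ ] SSL certificates are configured
- [ ] Backup strategy is in place
- [ ] Monitoring and alerting are configured

#### After Deployment

- [ ] Service health check passes
- [ ] Database connection works
- [ ] API endpoints are reachable
- [ ] Logs are being written
- [ ] Performance metrics look normal
- [ ] Backups run automatically

---

**Last updated**: 2025-12-03
**Version**: 1.0.0
**Maintainers**: RWA Durian Team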