Commit Graph

2 Commits

Author SHA1 Message Date
hailin e16ec7930d feat(knowledge): add file upload with text extraction for knowledge base
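Supports uploading files (PDF, Word, TXT, Markdown) on the knowledge base page of the admin console.
Text is extracted automatically, and the administrator previews and edits it before saving it as a knowledge base article.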

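## Backend (knowledge-service)

- Add TextExtractionService, a service that extracts text from files
  - PDF extraction: uses pdf-parse v2 (PDFParse class API)
  - Word (.docx) extraction: uses mammoth.extractRawText()
  - TXT/Markdown: decoded directly as UTF-8
  - Word counting supports mixed Chinese and English text
  - File size limit of 200MB and type validation (MIME whitelist)
  - PDFs with no extractable text (scans/images) return a friendly error message

- Add upload endpoint: POST /knowledge/articles/upload
  - Uses NestJS FileInterceptor to handle multipart/form-data
  - Only extracts and returns the text; does not create the article directly (two-step flow)
  - Returns: extractedText, suggestedTitle, wordCount, pageCount

- Add ExtractedTextResponse DTO
- Register TextExtractionService in KnowledgeModule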
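
The validation, word-counting, and response-building rules listed above can be sketched without any framework. This is a minimal illustration, not the service's actual code: the exact MIME strings, the counting rule (one word per CJK ideograph, one per run of Latin letters or digits), the title derived from the file name, and the names `validateUpload`, `countWords`, `suggestTitle`, `buildPayload`, and `ExtractedTextPayload` are all assumptions.

```typescript
// Limit taken from the commit message: 200 MB.
const MAX_UPLOAD_BYTES = 200 * 1024 * 1024;

// Assumed whitelist; the commit does not list the exact MIME strings.
const ALLOWED_MIME_TYPES: string[] = [
  "application/pdf",
  "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
  "text/plain",
  "text/markdown",
];

// Field names come from the commit; types and optionality are assumptions.
interface ExtractedTextPayload {
  extractedText: string;
  suggestedTitle: string;
  wordCount: number;
  pageCount?: number; // assumed to be set only for paginated formats such as PDF
}

// Reject files whose type is not whitelisted or whose size exceeds the limit.
function validateUpload(mimeType: string, sizeBytes: number): void {
  if (ALLOWED_MIME_TYPES.indexOf(mimeType) === -1) {
    throw new Error("Unsupported file type: " + mimeType);
  }
  if (sizeBytes > MAX_UPLOAD_BYTES) {
    throw new Error("File exceeds the 200 MB limit");
  }
}

// Assumed rule: each CJK ideograph is one word, each Latin/digit run is one word.
function countWords(text: string): number {
  const cjk = text.match(/[\u4e00-\u9fff]/g) || [];
  const latin = text.match(/[A-Za-z0-9]+(?:['-][A-Za-z0-9]+)*/g) || [];
  return cjk.length + latin.length;
}

// Assumed rule: the suggested title is the file name without directory or extension.
function suggestTitle(originalName: string): string {
  const base = originalName.replace(/^.*[\\/]/, "");
  const dot = base.lastIndexOf(".");
  return (dot > 0 ? base.slice(0, dot) : base).trim();
}

// Build the response for step one of the two-step flow; nothing is persisted here.
function buildPayload(
  originalName: string,
  rawText: string,
  pageCount?: number
): ExtractedTextPayload {
  const extractedText = rawText.trim();
  if (extractedText.length === 0) {
    // Covers scanned or image-only PDFs that contain no text layer.
    throw new Error("No extractable text found; the file may be a scanned or image-only PDF.");
  }
  const payload: ExtractedTextPayload = {
    extractedText,
    suggestedTitle: suggestTitle(originalName),
    wordCount: countWords(extractedText),
  };
  if (pageCount !== undefined) {
    payload.pageCount = pageCount;
  }
  return payload;
}
```

In a setup like the one described, a controller would call `validateUpload` on the uploaded file's metadata, hand the buffer to the extractor, and return the result of `buildPayload` to the admin client.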

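## Frontend (admin-client)

- knowledge.api.ts: add uploadFile() method (FormData + 120s timeout)
- useKnowledge.ts: add useUploadKnowledgeFile hook
- KnowledgePage.tsx:
  - Add a Segmented switcher (manual input / file upload), shown only when creating a new article
  - File upload mode shows an Upload.Dragger drag-and-drop area
  - After upload, text is extracted automatically and fills the title and content fields
  - When extraction finishes, the form switches back to manual mode so the administrator can preview and edit before saving
  - Show extraction results (word count, page count)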
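
The mode switching described for KnowledgePage.tsx can be modelled as a small state reducer. This is a framework-free sketch, assuming the form state is reduced to a mode, two fields, and the extraction statistics; `InputMode`, `FormState`, `FormAction`, and `formReducer` are hypothetical names, not identifiers from the page component.

```typescript
type InputMode = "manual" | "upload";

interface FormState {
  mode: InputMode;
  title: string;
  content: string;
  // Extraction result shown to the administrator; null until a file is processed.
  stats: { wordCount: number; pageCount?: number } | null;
}

type FormAction =
  | { type: "switchMode"; mode: InputMode }
  | {
      type: "extracted";
      suggestedTitle: string;
      extractedText: string;
      wordCount: number;
      pageCount?: number;
    };

const initialFormState: FormState = {
  mode: "manual",
  title: "",
  content: "",
  stats: null,
};

function formReducer(state: FormState, action: FormAction): FormState {
  switch (action.type) {
    case "switchMode":
      // The Segmented switcher toggles between manual input and file upload.
      return { ...state, mode: action.mode };
    case "extracted":
      // Fill title and content, then switch back to manual mode for review.
      return {
        mode: "manual",
        title: action.suggestedTitle,
        content: action.extractedText,
        stats: { wordCount: action.wordCount, pageCount: action.pageCount },
      };
    default:
      return state;
  }
}
```

Keeping the switch back to manual mode inside the `extracted` transition means the administrator always lands on an editable form, which matches the two-step flow where saving is a separate, explicit action.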

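## User flow

New article → switch to "File upload" → drag in/select a file → system extracts text
→ title and content are filled in automatically → administrator edits and confirms → click Save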

## Dependencies

- pdf-parse@^2.4.5 (PDF text extraction)
- mammoth@^1.8.0 (Word document text extraction)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 22:58:19 -08:00
hailin afd707d15f refactor(services): implement 4-layer Clean Architecture for all backend services
Refactored all 6 backend services to a 4-layer Clean Architecture pattern,
following knowledge-service as the reference implementation.

## Architecture Pattern (4-Layer)

```
src/
├── domain/              # Pure business entities and interfaces
│   ├── entities/        # Domain entities (no ORM decorators)
│   ├── repositories/    # Repository interfaces + Symbol tokens
│   └── value-objects/   # Enums and value types
├── application/
│   ├── dtos/            # Data transfer objects
│   └── services/        # Application services (use case orchestration)
├── adapters/
│   ├── inbound/         # Controllers, gateways (API endpoints)
│   └── outbound/
│       ├── persistence/ # Repository implementations
│       ├── clients/     # External service clients
│       └── storage/     # File storage adapters
└── infrastructure/
    └── database/postgres/
        └── entities/    # ORM entities with decorators
```

## Services Refactored

### user-service
- adapters/inbound: AuthController, UserController
- adapters/outbound/persistence: UserPostgresRepository, VerificationCodePostgresRepository
- application/services: AuthService, UserService
- application/dtos: AuthDto, UserDto

### payment-service
- adapters/inbound: OrderController, PaymentController
- adapters/outbound/persistence: OrderPostgresRepository, PaymentPostgresRepository
- adapters/outbound/payment-methods: AlipayAdapter, WechatPayAdapter, StripeAdapter
- application/services: OrderService, PaymentService
- application/dtos: OrderDto, PaymentDto

### file-service
- adapters/inbound: FileController
- adapters/outbound/persistence: FilePostgresRepository
- adapters/outbound/storage: MinioStorageAdapter
- application/services: FileService
- application/dtos: UploadFileDto

### conversation-service
- adapters/inbound: ConversationController, InternalController, ConversationGateway
- adapters/outbound/persistence: ConversationPostgresRepository, MessagePostgresRepository, TokenUsagePostgresRepository
- application/services: ConversationService
- application/dtos: ConversationDto

### knowledge-service
- adapters/inbound: KnowledgeController, MemoryController, InternalMemoryController
- adapters/outbound/persistence: KnowledgePostgresRepository, MemoryPostgresRepository
- application/services: KnowledgeService, MemoryService
- application/dtos: KnowledgeDto, MemoryDto

### evolution-service
- domain/entities: AdminEntity
- domain/repositories: IAdminRepository (Symbol-based DI)
- domain/value-objects: AdminRole enum
- adapters/inbound: AdminController, EvolutionController
- adapters/outbound/persistence: AdminPostgresRepository
- adapters/outbound/clients: ConversationClient, KnowledgeClient
- application/services: AdminService, EvolutionService
- application/dtos: AdminDto, EvolutionDto
- infrastructure/database/postgres/entities: AdminORM
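
The evolution-service layout above can be illustrated with a small, framework-free sketch of Symbol-based injection. It is only a sketch under stated assumptions: the `AdminRole` values, the method names on `IAdminRepository`, the `ADMIN_REPOSITORY` token, the in-memory repository (standing in for `AdminPostgresRepository`), and the `Map` used as a container (standing in for the framework's DI) are all hypothetical, and the methods are synchronous for brevity where a real repository would likely be asynchronous.

```typescript
// domain/value-objects: role values are assumptions.
enum AdminRole {
  SuperAdmin = "SUPER_ADMIN",
  Operator = "OPERATOR",
}

// domain/entities: a plain class with no ORM decorators.
class AdminEntity {
  constructor(
    readonly id: string,
    readonly username: string,
    readonly role: AdminRole
  ) {}
}

// domain/repositories: the interface plus the Symbol token used to look it up.
interface IAdminRepository {
  findByUsername(username: string): AdminEntity | undefined;
  save(admin: AdminEntity): void;
}
const ADMIN_REPOSITORY = Symbol("IAdminRepository");

// adapters/outbound/persistence: in-memory stand-in for a database-backed repository.
class InMemoryAdminRepository implements IAdminRepository {
  private readonly rows = new Map<string, AdminEntity>();

  findByUsername(username: string): AdminEntity | undefined {
    return this.rows.get(username);
  }

  save(admin: AdminEntity): void {
    this.rows.set(admin.username, admin);
  }
}

// application/services: depends only on the domain interface.
class AdminService {
  constructor(private readonly admins: IAdminRepository) {}

  register(id: string, username: string, role: AdminRole): AdminEntity {
    if (this.admins.findByUsername(username)) {
      throw new Error("Username already taken: " + username);
    }
    const admin = new AdminEntity(id, username, role);
    this.admins.save(admin);
    return admin;
  }
}

// Minimal container: the Symbol token maps the interface to an implementation.
const container = new Map<symbol, unknown>();
container.set(ADMIN_REPOSITORY, new InMemoryAdminRepository());

const adminService = new AdminService(
  container.get(ADMIN_REPOSITORY) as IAdminRepository
);
```

Because TypeScript interfaces do not exist at runtime, a Symbol gives the container a unique key for the interface, and the application service never imports the persistence adapter directly.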

## Key Improvements
- Symbol-based dependency injection for repository interfaces
- ORM entities separated from domain entities
- Consistent 4-layer structure across all services
- DTOs for API contracts
- Clear separation: domain logic vs infrastructure concerns
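
Separating ORM entities from domain entities usually implies a mapping step in the persistence adapter. The following sketch shows one way that mapping could look; the field names, the `AdminRow` shape (a decorator-free stand-in for an ORM entity such as AdminORM), the role values, and the functions `toDomain` and `toRow` are assumptions for illustration.

```typescript
// domain/value-objects: role values are assumptions.
enum AdminRole {
  SuperAdmin = "SUPER_ADMIN",
  Operator = "OPERATOR",
}

// domain/entities: the shape application services work with.
interface AdminEntity {
  id: string;
  username: string;
  role: AdminRole;
  createdAt: Date;
}

// Persistence shape. A real ORM entity would carry decorators;
// they are omitted here so the sketch runs without an ORM.
interface AdminRow {
  id: string;
  username: string;
  role: string;
  created_at: Date;
}

// Convert a stored role string into the domain enum, rejecting unknown values.
function parseRole(value: string): AdminRole {
  const known: string[] = [AdminRole.SuperAdmin, AdminRole.Operator];
  if (known.indexOf(value) === -1) {
    throw new Error("Unknown admin role: " + value);
  }
  return value as AdminRole;
}

// Row -> domain: rename columns and validate enum values at the boundary.
function toDomain(row: AdminRow): AdminEntity {
  return {
    id: row.id,
    username: row.username,
    role: parseRole(row.role),
    createdAt: row.created_at,
  };
}

// Domain -> row: the inverse mapping used when saving.
function toRow(entity: AdminEntity): AdminRow {
  return {
    id: entity.id,
    username: entity.username,
    role: entity.role,
    created_at: entity.createdAt,
  };
}
```

With the mapping confined to the repository implementation, a column rename or an ORM swap touches only the infrastructure and adapter layers.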

## Configuration
- Updated turbo.json: renamed "pipeline" to "tasks" for Turbo 2.0+
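
The key rename can be expressed as a small transformation over a parsed turbo.json object. This is a sketch only: `TurboConfig` and `migratePipelineToTasks` are hypothetical names, and the function covers just the `pipeline` to `tasks` rename, not any other changes a Turbo upgrade may require.

```typescript
type TurboConfig = Record<string, unknown>;

// Return a copy of the config with the top-level "pipeline" key renamed to "tasks".
function migratePipelineToTasks(config: TurboConfig): TurboConfig {
  if (!("pipeline" in config)) {
    return config; // nothing to migrate
  }
  if ("tasks" in config) {
    throw new Error('Config defines both "pipeline" and "tasks"');
  }
  const { pipeline, ...rest } = config;
  return { ...rest, tasks: pipeline };
}
```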

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-24 22:18:22 -08:00