# Creating a 2-of-3 Wallet: Flow Analysis and Potential Bugs

## Theoretical flow (how it should work)

### Environment: 2 phones + 1 server-party-co-managed

```
Phone 1 (initiator)
↓
1. Calls createNewSession(walletName="测试", t=2, n=3)
↓
2. The server creates the session and returns sessionId + inviteCode
↓
3. Phone 1 displays the invite code as a QR code
↓
4. server-party-co-managed detects the new session and joins automatically (1st participant)
↓
5. Phone 2 scans the QR code and calls validateInviteCode + joinKeygenViaGrpc (2nd participant)
↓
6. The server detects that the participant count has reached thresholdT (2)
↓
7. The server broadcasts the "session_started" event to all participants (Phone 1, Phone 2, server)
↓
8. Every participant receives the event and calls startKeygenAsInitiator/startKeygenAsJoiner
↓
9. The TSS keygen protocol runs (9 rounds of communication)
↓
10. Done; every participant saves its own key share
```

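
The trigger in steps 6 and 7 can be modelled in a few lines. This is a minimal sketch, not the server's actual code: the class and method names are hypothetical, and it assumes the server starts the session once `participants.size >= thresholdT` and broadcasts `session_started` only once.

```kotlin
// Hypothetical server-side model; names are illustrative, not the real server API.
class KeygenSession(val thresholdT: Int, val thresholdN: Int) {
    private val participants = linkedSetOf<String>()

    var started = false
        private set

    /** Registers a party; returns true only for the join that triggers "session_started". */
    fun join(partyId: String): Boolean {
        require(participants.size < thresholdN || partyId in participants) { "session is full" }
        participants += partyId
        // Assumption: the event fires once, when the count first reaches thresholdT.
        if (!started && participants.size >= thresholdT) {
            started = true
            return true
        }
        return false
    }

    fun participantCount(): Int = participants.size
}
```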

---

## Actual code flow (current implementation)

### Phone 1 (initiator)

#### Step 1: Create the session
**Code location**: `MainViewModel.kt:253-330`

```kotlin
fun createNewSession(walletName: String, thresholdT: Int, thresholdN: Int, participantName: String) {
    safeLaunch { // ← [Potential issue 1] any exception thrown in here is caught by safeLaunch
        _uiState.update { it.copy(isLoading = true, error = null) }

        val result = repository.createSession(walletName, thresholdT, thresholdN)

        result.fold(
            onSuccess = { sessionResult ->
                _currentSessionId.value = sessionResult.sessionId
                _createdInviteCode.value = sessionResult.inviteCode

                // [Key] Fetch the session status to check the participant count
                val statusResult = repository.getSessionStatus(sessionResult.sessionId)
                statusResult.fold(
                    onSuccess = { status ->
                        _sessionParticipants.value = status.participants.map { ... }
                        // ✅ The participant list is displayed correctly
                    },
                    onFailure = { e ->
                        // ⚠️ On failure, fall back to listing only ourselves
                        _sessionParticipants.value = listOf(participantName)
                    }
                )
            },
            onFailure = { e ->
                _uiState.update { it.copy(isLoading = false, error = e.message) }
            }
        )
    }
}
```

**Potential issues**:
- ✅ The `Result` is handled correctly
- ⚠️ If `getSessionStatus` fails, the displayed participant list is inaccurate
- ⚠️ This does not affect whether keygen actually starts, though


---

#### Step 2: Wait for the session_started event
**Code location**: `MainViewModel.kt:382-406`

```kotlin
repository.setSessionEventCallback { event ->
    when (event.eventType) {
        "session_started" -> {
            val currentSessionId = _currentSessionId.value
            if (currentSessionId != null && event.sessionId == currentSessionId) {
                android.util.Log.d("MainViewModel", "Session started event for keygen initiator, triggering keygen")

                safeLaunch { // ← [Key problem!]
                    startKeygenAsInitiator(
                        sessionId = currentSessionId,
                        thresholdT = event.thresholdT,
                        thresholdN = event.thresholdN,
                        selectedParties = event.selectedParties
                    )
                }
            }
        }
    }
}
```

**This is the root of the bug!**

Analysis:
1. The callback registered with `setSessionEventCallback` is invoked on **another thread** (the WebSocket event thread)
2. The callback uses `safeLaunch` to start a coroutine
3. **If `startKeygenAsInitiator` throws**, `safeLaunch` catches the exception and writes it to `_uiState.error`
4. However, **the user may never see the error message**, because:
   - The UI may still be showing the "waiting for participants" screen
   - The update to `_uiState.error` may be ignored by that screen
   - There is no explicit error feedback path

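
One way to make such a failure impossible to miss is to model the keygen phase as a closed set of states, so that a failure replaces the waiting state instead of sitting in a side field. The sketch below is illustrative only: `KeygenState`, `onKeygenResult`, and `screenFor` are hypothetical names, and the real app keeps `error` and `isLoading` in a data class.

```kotlin
// Hypothetical state model; not the app's actual UI state.
sealed interface KeygenState {
    object WaitingForParticipants : KeygenState
    object Running : KeygenState
    data class Failed(val reason: String) : KeygenState
    data class Done(val publicKey: String) : KeygenState
}

// Pure transition: any keygen outcome replaces the waiting state.
fun onKeygenResult(result: Result<String>): KeygenState =
    result.fold(
        onSuccess = { KeygenState.Done(it) },
        onFailure = { KeygenState.Failed(it.message ?: "unknown keygen error") }
    )

// Each state maps to exactly one screen, so Failed cannot hide behind the waiting screen.
fun screenFor(state: KeygenState): String = when (state) {
    KeygenState.WaitingForParticipants -> "waiting"
    KeygenState.Running -> "progress"
    is KeygenState.Failed -> "error: ${state.reason}"
    is KeygenState.Done -> "success"
}
```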

---

#### Step 3: Run keygen
**Code location**: `MainViewModel.kt:537-570`

```kotlin
private suspend fun startKeygenAsInitiator(
    sessionId: String,
    thresholdT: Int,
    thresholdN: Int,
    selectedParties: List<String>
) {
    android.util.Log.d("MainViewModel", "Starting keygen as initiator: sessionId=$sessionId, t=$thresholdT, n=$thresholdN")

    val result = repository.startKeygenAsInitiator(
        sessionId = sessionId,
        thresholdT = thresholdT,
        thresholdN = thresholdN,
        password = ""
    )

    result.fold(
        onSuccess = { share ->
            _publicKey.value = share.publicKey
            _uiState.update {
                it.copy(
                    lastCreatedAddress = share.address,
                    successMessage = "钱包创建成功!" // "Wallet created successfully!"
                )
            }
        },
        onFailure = { e ->
            // ⚠️ The error is recorded in _uiState.error
            _uiState.update { it.copy(error = e.message) }
        }
    )
}
```

**Potential issues**:
- ✅ The `Result` is handled correctly
- ⚠️ If the function itself throws (as opposed to returning `Result.failure`), the outer `safeLaunch` catches the exception
- ⚠️ This leaves **two separate error-handling paths**:
  1. `startKeygenAsInitiator` updates `_uiState.error` (for a `Result.failure`)
  2. `safeLaunch` also updates `_uiState.error` (for a thrown exception)

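
The two paths can be collapsed into one by converting thrown exceptions into a `Result` before reporting. The following is a sketch under assumptions: `unified` and `describe` are hypothetical helpers, not functions from the app, and the repository call is replaced by a plain lambda.

```kotlin
// Hypothetical helper: funnels thrown exceptions and Result.failure into one Result.
// Caveat: inside a coroutine, runCatching also captures CancellationException,
// which should normally be rethrown so that cancellation keeps working.
fun <T> unified(block: () -> Result<T>): Result<T> =
    runCatching { block().getOrThrow() }

// The single place that turns an outcome into a user-visible message.
fun describe(outcome: Result<String>): String =
    outcome.fold(
        onSuccess = { "ok: $it" },
        onFailure = { "error: ${it.message}" }
    )
```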

---

### Phone 2 (joiner)

#### Step 1: Scan the invite code
**Code location**: `MainViewModel.kt:609-641`

```kotlin
fun validateInviteCode(inviteCode: String) {
    safeLaunch {
        _uiState.update { it.copy(isLoading = true, error = null) }

        val result = repository.validateInviteCode(inviteCode)

        result.fold(
            onSuccess = { validateResult ->
                _joinSessionInfo.value = JoinKeygenSessionInfo(...)
                _uiState.update { it.copy(isLoading = false) }
            },
            onFailure = { e ->
                _uiState.update { it.copy(isLoading = false, error = e.message) }
            }
        )
    }
}
```

**Status**: ✅ Handled correctly


---

#### Step 2: Join the session
**Code location**: `MainViewModel.kt:648-706`

```kotlin
fun joinKeygen(inviteCode: String, password: String) {
    safeLaunch {
        _uiState.update { it.copy(isLoading = true, error = null) }

        val result = repository.joinKeygenViaGrpc(
            inviteCode = pendingInviteCode,
            joinToken = pendingJoinToken,
            password = password
        )

        result.fold(
            onSuccess = { joinResult ->
                // [Key] Save joinResult for the keygen that follows
                pendingJoinKeygenInfo = JoinKeygenInfo(
                    sessionId = joinResult.sessionId,
                    partyIndex = joinResult.partyIndex,
                    partyId = joinResult.partyId,
                    participantIds = joinResult.participantIds
                )

                // ✅ Now wait for the session_started event
                _uiState.update { it.copy(isLoading = false) }
            },
            onFailure = { e ->
                _uiState.update { it.copy(isLoading = false, error = e.message) }
            }
        )
    }
}
```

**Status**: ✅ Handled correctly


---

#### Step 3: Wait for the session_started event
**Code location**: `MainViewModel.kt:408-413`

```kotlin
// Check if this is for keygen joiner (JoinKeygen)
val joinKeygenInfo = pendingJoinKeygenInfo
if (joinKeygenInfo != null && event.sessionId == joinKeygenInfo.sessionId) {
    android.util.Log.d("MainViewModel", "Session started event for keygen joiner, triggering keygen")
    startKeygenAsJoiner() // ← [Note] not wrapped in safeLaunch!
}
```

**Key finding!**

Comparing initiator and joiner:
- **Initiator**: `safeLaunch { startKeygenAsInitiator(...) }` ← wrapped in safeLaunch
- **Joiner**: `startKeygenAsJoiner()` ← not wrapped in safeLaunch

**The two are inconsistent!**


---

#### Step 4: Run keygen (joiner)
**Code location**: `MainViewModel.kt:714-764`

```kotlin
private suspend fun startKeygenAsJoiner() {
    safeLaunch { // ← [Note] safeLaunch is used here as well
        // Written with a label: a bare `return` is not allowed inside a non-inline lambda
        val joinInfo = pendingJoinKeygenInfo ?: return@safeLaunch

        _uiState.update { it.copy(isLoading = true, error = null) }

        val result = repository.startKeygenAsJoiner(
            sessionId = joinInfo.sessionId,
            partyIndex = joinInfo.partyIndex,
            participantIds = joinInfo.participantIds,
            password = pendingPassword
        )

        result.fold(
            onSuccess = { share ->
                _joinKeygenPublicKey.value = share.publicKey
                _uiState.update {
                    it.copy(
                        isLoading = false,
                        successMessage = "成功加入钱包!" // "Joined the wallet successfully!"
                    )
                }
            },
            onFailure = { e ->
                _uiState.update { it.copy(isLoading = false, error = e.message) }
            }
        )
    }
}
```

**Problem**:
- `startKeygenAsJoiner` already uses `safeLaunch` internally
- When the event callback calls it, there is **no** additional `safeLaunch` around the call
- This differs from how the initiator is handled!

**Summary of the inconsistency**:

| Role | In the event callback | Inside the function | Total wrapper layers |
|------|-----------------------|---------------------|----------------------|
| Initiator | `safeLaunch { startKeygenAsInitiator() }` | no `safeLaunch` | 1 |
| Joiner | `startKeygenAsJoiner()` | `safeLaunch { ... }` | 1 |

Both have exactly one layer, but **in different places**!

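
Why the position of the wrapper matters can be shown with a synchronous stand-in. This is a sketch under assumptions: `safeRun` only models the try/catch part of `safeLaunch` (the real one launches a coroutine), and the `precheckFails` flag is a hypothetical placeholder for any code that runs before the wrapper.

```kotlin
// Hypothetical stand-in for safeLaunch: runs the block and records any exception.
class ErrorSink { var lastError: String? = null }

fun safeRun(sink: ErrorSink, block: () -> Unit) {
    try {
        block()
    } catch (e: Exception) {
        sink.lastError = e.message
    }
}

// Initiator style: no wrapper inside, so the caller's wrapper covers the whole function.
fun startAsInitiator(precheckFails: Boolean) {
    if (precheckFails) throw IllegalStateException("initiator precheck failed")
}

// Joiner style: the wrapper is inside, so anything placed before it is unprotected.
fun startAsJoiner(sink: ErrorSink, precheckFails: Boolean) {
    if (precheckFails) throw IllegalStateException("joiner precheck failed") // outside the wrapper
    safeRun(sink) { /* keygen work would go here */ }
}
```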
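
---

## 🐛 Bugs found so far

### Bug 1: Inconsistent exception handling in the event callback ⚠️

**Location**: `MainViewModel.kt:398-413`

**Problem**:
- Initiator: the event callback wraps the call in `safeLaunch`
- Joiner: the event callback calls the function directly (the function uses `safeLaunch` internally)

**Impact**:
- If the initiator's `startKeygenAsInitiator` throws at any point, including early work such as parameter validation, the surrounding `safeLaunch` catches the exception
- The joiner's `startKeygenAsJoiner` is called directly from the event callback, so an exception thrown by the call itself (outside its internal `safeLaunch`) is not caught

**Recommendation**: handle both roles the same way
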

---

### Bug 2: The safeLaunch wrapping can lead to silent failures 🚨

**Location**: `MainViewModel.kt:398-405` + `MainViewModel.kt:537-570`

**Problem flow**:
```
Event callback
↓
safeLaunch {                       // ← layer 1 of exception handling
    startKeygenAsInitiator()
    ↓
    if it throws exception X
    ↓
}
↓
} catch (e: Exception) {           // ← inside safeLaunch: catches exception X
    _uiState.update { it.copy(error = ...) }   // ← records the error
}
```

However:
1. `startKeygenAsInitiator` already handles `Result.failure` internally
2. The outer `safeLaunch` can only catch **exceptions thrown at runtime**
3. If `repository.startKeygenAsInitiator` returns `Result.failure`, nothing is thrown
4. **So the outer safeLaunch does very little in practice**

**The more serious problem**:
If `startKeygenAsInitiator` handles the error internally (by updating `_uiState.error`) but the UI has already moved to another state, **the user may never see the error**!

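
---

### Bug 3: No explicit error when there are too few participants ⚠️

**Scenario**:
- A 2-of-3 session is created
- server-party-co-managed does not join automatically (misconfiguration)
- Only Phone 1 (the initiator) is present
- **The server never broadcasts the session_started event**

**Current behavior**:
- Phone 1 keeps showing "等待参与者加入..." ("Waiting for participants to join...") indefinitely
- **There is no timeout notice**
- **There is no explicit error message**

**Recommendation**: add a timeout and a user-friendly message
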

---

### Bug 4: Participant list is inaccurate when getSessionStatus fails ⚠️

**Location**: `MainViewModel.kt:302-321`

```kotlin
val statusResult = repository.getSessionStatus(sessionResult.sessionId)
statusResult.fold(
    onSuccess = { status ->
        _sessionParticipants.value = status.participants.map { ... }
    },
    onFailure = { e ->
        // ⚠️ On failure, only ourselves are shown
        _sessionParticipants.value = listOf(participantName)
    }
)
```

**Problem**:
- If `getSessionStatus` fails, the participant list shows a single entry
- In reality several participants may already have joined (for example server-party-co-managed)
- **This misleads the user** into thinking nobody has joined

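
A possible way to avoid the misleading fallback is to keep "status unavailable" distinct from "only one participant". The sketch below is an assumption about how that could look; `ParticipantsView`, `toParticipantsView`, and `label` are hypothetical names, and the status is simplified to a list of names.

```kotlin
// Hypothetical model: a failed status request is not presented as a one-person list.
sealed interface ParticipantsView {
    data class Known(val names: List<String>) : ParticipantsView
    data class Unknown(val self: String, val reason: String) : ParticipantsView
}

fun toParticipantsView(status: Result<List<String>>, self: String): ParticipantsView =
    status.fold(
        onSuccess = { ParticipantsView.Known(it) },
        onFailure = { ParticipantsView.Unknown(self, it.message ?: "status request failed") }
    )

// The label tells the user when the list could not be loaded.
fun label(view: ParticipantsView): String = when (view) {
    is ParticipantsView.Known -> "${view.names.size} participant(s) joined"
    is ParticipantsView.Unknown -> "${view.self} (participant list unavailable: ${view.reason})"
}
```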

---

### Bug 5: The early return on the event-callback path is not handled ⚠️

**Location**: `MainViewModel.kt:714` (startKeygenAsJoiner)

```kotlin
private suspend fun startKeygenAsJoiner() {
    safeLaunch {
        // ← This return only leaves the lambda
        // (a bare `return` is not allowed in a non-inline lambda, hence the label)
        val joinInfo = pendingJoinKeygenInfo ?: return@safeLaunch
        // ...
    }
}
```

**Problem**:
- The early return only exits the `safeLaunch` lambda
- It does not update the UI state or show an error
- **The user has no idea why keygen did not start**

**Recommendation**: if `joinInfo` is null, log the error and notify the user

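
The labeled return and the missing notification can both be seen in a small standalone example. It is a sketch with hypothetical names: `runGuarded` is a non-inline stand-in for `safeLaunch`, and `report` stands in for whatever updates the UI state.

```kotlin
data class JoinInfo(val sessionId: String, val partyIndex: Int)

// Non-inline stand-in for safeLaunch (hypothetical; the real one launches a coroutine).
fun runGuarded(block: () -> Unit) = block()

fun startJoiner(pending: JoinInfo?, report: (String) -> Unit): Boolean {
    var started = false
    runGuarded {
        val info = pending ?: run {
            report("Join info is missing, please retry") // tell the user before bailing out
            return@runGuarded                            // leaves only the lambda
        }
        report("starting keygen for ${info.sessionId}")
        started = true
    }
    // Execution continues here even after the early return inside the lambda.
    return started
}
```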

---

## 🔍 Why does wallet creation fail?

### Most likely causes

#### Cause 1: server-party-co-managed did not join correctly 🔴

**Check**:
1. Is server-party-co-managed running?
2. Is auto-join enabled in its configuration file?
3. Do the server logs contain a join record?

**Verification command**:
```bash
# Inspect the server-party-co-managed log (matches both "join" and "session" lines)
tail -f /path/to/server-party-co-managed/logs/server.log | grep -iE "join|session"
```

**Expected log output**:
```
[INFO] Detected new session: sessionId=xxx
[INFO] Auto-joining session as backup party
[INFO] Successfully joined session, partyId=backup-party-1
```

If **none of these lines appear**, server-party-co-managed did not join!


---

#### Cause 2: The session_started event was never triggered 🔴

**Condition**:
- The server broadcasts `session_started` only when `participants.size >= thresholdT`
- A 2-of-3 session needs at least 2 participants

**Check**:
1. How many participants does the server-side participant list contain?
2. Does Phone 1's log contain "Session started event"?

**Expected log output (Phone 1)**:
```
MainViewModel: === MainViewModel received session event ===
MainViewModel: eventType: session_started
MainViewModel: sessionId: xxxxxxxx
MainViewModel: Session started event for keygen initiator, triggering keygen
```

If **these lines are missing**, the event was never triggered!


---

#### Cause 3: startKeygenAsInitiator fails internally but no error is shown 🔴

**Scenario**:
1. The `session_started` event fires
2. `startKeygenAsInitiator` is called
3. `repository.startKeygenAsInitiator` returns `Result.failure`
4. The error is recorded in `_uiState.error`
5. **The UI does not display it** (because it is still on the "waiting for participants" screen)

**Check the log**:
```
MainViewModel: Session started event for keygen initiator, triggering keygen
MainViewModel: Starting keygen as initiator: sessionId=xxx, t=2, n=3
TssRepository: Starting keygen as initiator
TssRepository: Error: [specific error message]  ← look here!
```

If this error line is present, keygen failed to start!

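
The three causes map onto how far the initiator's log gets, which can be checked mechanically. The helper below is only a sketch: the markers are the log strings quoted in this document, while `Stage` and `triage` are hypothetical names, and the error marker assumes `TssRepository` logs failures with an `Error:` prefix as shown above.

```kotlin
// Stage names are illustrative; markers are the log strings quoted in this document.
enum class Stage { NO_EVENT, EVENT_RECEIVED, KEYGEN_STARTED, KEYGEN_ERROR }

fun triage(logLines: List<String>): Stage {
    val sawEvent = logLines.any { "Session started event" in it }
    val sawStart = logLines.any { "Starting keygen as initiator" in it }
    val sawError = logLines.any { "TssRepository" in it && "Error:" in it }
    return when {
        sawError -> Stage.KEYGEN_ERROR      // cause 3: keygen started but failed
        sawStart -> Stage.KEYGEN_STARTED    // keygen is running or finished
        sawEvent -> Stage.EVENT_RECEIVED    // event arrived, keygen not started
        else -> Stage.NO_EVENT              // causes 1 and 2: no session_started
    }
}
```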

---

## 🛠️ Debugging steps

### Step 1: Check server-party-co-managed

```bash
# 1. Check that the process is running (the [s] keeps grep from matching itself)
ps aux | grep "[s]erver-party-co-managed"

# 2. Check the configuration file
cat /path/to/server-party-co-managed/config.yml | grep -A 10 "auto_join"

# 3. Follow the most recent log output
tail -f /path/to/server-party-co-managed/logs/server.log
```

### Step 2: Capture the Phone 1 (initiator) log

If both phones are attached to the same machine, add `-s <serial>` to each `adb` command to select the device.

```bash
adb logcat -c
adb logcat -v time | grep -E "MainViewModel|TssRepository|GrpcClient|session_started"
```

**What to look for**:
1. "Creating new session" → session creation started
2. "Session created successfully" → session creation succeeded
3. "Session status fetched: X participants" → participant count
4. "Session started event" → the event fired
5. "Starting keygen as initiator" → keygen started

### Step 3: Capture the Phone 2 (joiner) log

```bash
adb logcat -c
adb logcat -v time | grep -E "MainViewModel|TssRepository|GrpcClient|session_started"
```

**What to look for**:
1. "Validate success: sessionId=" → invite code validated
2. "Join keygen success: partyIndex=" → joined successfully
3. "Session started event for keygen joiner" → event received


---

## 🚀 Recommended fixes

### Fix 1: Unify exception handling in the event callback

```kotlin
repository.setSessionEventCallback { event ->
    when (event.eventType) {
        "session_started" -> {
            // Wrap every start function in safeLaunch, consistently
            val currentSessionId = _currentSessionId.value
            if (currentSessionId != null && event.sessionId == currentSessionId) {
                android.util.Log.d("MainViewModel", "Session started event for keygen initiator")
                safeLaunch {
                    startKeygenAsInitiator(...)
                }
            }

            val joinKeygenInfo = pendingJoinKeygenInfo
            if (joinKeygenInfo != null && event.sessionId == joinKeygenInfo.sessionId) {
                android.util.Log.d("MainViewModel", "Session started event for keygen joiner")
                safeLaunch { // ← add safeLaunch
                    startKeygenAsJoiner()
                }
            }
        }
    }
}
```

### Fix 2: Remove the safeLaunch inside startKeygenAsJoiner

```kotlin
private suspend fun startKeygenAsJoiner() {
    // No internal safeLaunch: the caller is responsible for exception handling
    val joinInfo = pendingJoinKeygenInfo
    if (joinInfo == null) {
        android.util.Log.e("MainViewModel", "startKeygenAsJoiner: joinInfo is null!")
        _uiState.update { it.copy(error = "加入信息丢失,请重试") } // "Join info lost, please retry"
        return
    }

    _uiState.update { it.copy(isLoading = true, error = null) }

    val result = repository.startKeygenAsJoiner(...)
    // ...
}
```

### Fix 3: Add a timeout

Start a timeout timer after `createNewSession`. Keep the returned job and cancel it once `session_started` arrives; otherwise a keygen that is still running after 5 minutes would be reported as a timeout.

```kotlin
// 5-minute timeout
val timeoutJob = viewModelScope.launch {
    delay(300_000) // 5 minutes
    if (_currentSessionId.value != null && _publicKey.value == null) {
        _uiState.update {
            it.copy(
                // "Timed out waiting: not enough participants or the server is not responding"
                error = "等待超时:参与者数量不足或服务器未响应",
                isLoading = false
            )
        }
    }
}
```

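
The cancel-on-start behaviour can be tried outside Android with plain `java.util.concurrent`. This is a framework-free sketch, not the ViewModel code: `WaitTimeout` and its members are hypothetical, and in the app the same role would be played by a coroutine `Job` that is cancelled when `session_started` arrives.

```kotlin
import java.util.concurrent.Executors
import java.util.concurrent.ScheduledFuture
import java.util.concurrent.TimeUnit
import java.util.concurrent.atomic.AtomicReference

// Hypothetical helper: reports an error unless cancel() is called before the deadline.
class WaitTimeout(private val timeoutMs: Long) {
    // Daemon thread so a pending timer never keeps the process alive.
    private val scheduler = Executors.newSingleThreadScheduledExecutor { r ->
        Thread(r).apply { isDaemon = true }
    }
    private var pending: ScheduledFuture<*>? = null
    val error = AtomicReference<String?>(null)

    fun start() {
        pending = scheduler.schedule(
            Runnable { error.set("Timed out waiting for participants") },
            timeoutMs,
            TimeUnit.MILLISECONDS
        )
    }

    // Call when session_started arrives so a late timeout cannot overwrite a running keygen.
    fun cancel() {
        pending?.cancel(false)
    }
}
```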
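
---

## 📊 Summary

### Most likely causes of failure (ordered by estimated probability)

1. **🔴 server-party-co-managed did not join automatically** (70%)
   - Check its configuration and logs

2. **🔴 The session_started event was never triggered** (20%)
   - Too few participants
   - WebSocket connection problems

3. **🟡 startKeygenAsInitiator failed but the error was ignored** (8%)
   - Look for exceptions in the phone logs

4. **🟢 The safeLaunch wrapping issue** (2%)
   - In theory it cannot cause a complete failure
   - It can make error messages unclear

### Immediate action items

1. **Check the status of server-party-co-managed** ← most important!
2. **Capture the phone logs and search for "session_started"**
3. **Search the logs for "Caught exception" or "Error:"**
4. **Send me the logs for a detailed analysis**

---

**Please capture the logs following the "Debugging steps" section first; then we can pinpoint the problem precisely!**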