# Benchmarks

This section records benchmark results for a selection of models:

:::{toctree}
:maxdepth: 1

mmlu.md
:::