# Benchmarking

Here are the benchmarking results for some models:

:::{toctree}
:maxdepth: 1

mmlu.md
:::