# Benchmarking
Here are the benchmarking results for some models:
:::{toctree}
:maxdepth: 1

mmlu.md
:::