sglang0.4.5.post1

Run evaluation

Host the VLM:

python -m sglang.launch_server --model-path Qwen/Qwen2-VL-7B-Instruct --chat-template qwen2-vl --port 30000
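
Loading the model can take some time, so a script that chains the launch and benchmark steps may want to wait until the server port is open before starting the benchmark. The following is a minimal sketch using only the Python standard library. The helper name `wait_for_port` and the timeout values are illustrative and are not part of sglang.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 600.0, interval: float = 2.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect only shows that something is listening on the port;
            # it does not prove that the model has finished loading.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example call (port 30000 matches the --port value in the launch command):
#   wait_for_port("127.0.0.1", 30000)
```

An open port is only a coarse readiness signal, so the first benchmark request may still be slow while the server warms up.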

Benchmark:

python benchmark/mmmu/bench_sglang.py --port 30000

It's recommended to reduce memory usage by appending something like --mem-fraction-static 0.6 to the command above.
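
As a sketch of what the adjusted command could look like, the snippet below assembles the server launch command with the memory flag appended. It assumes the flag is meant for `sglang.launch_server`, which accepts `--mem-fraction-static`. The helper `build_launch_cmd` is hypothetical, and 0.6 is only the example value from the text, not a tuned setting.

```python
import shlex
from typing import List, Optional


def build_launch_cmd(model_path: str, chat_template: str, port: int,
                     mem_fraction_static: Optional[float] = None) -> List[str]:
    """Assemble the launch_server argv (hypothetical helper, not part of sglang)."""
    cmd = [
        "python", "-m", "sglang.launch_server",
        "--model-path", model_path,
        "--chat-template", chat_template,
        "--port", str(port),
    ]
    if mem_fraction_static is not None:
        # Assumption: the value is a fraction of GPU memory, so it must lie in (0, 1).
        if not 0.0 < mem_fraction_static < 1.0:
            raise ValueError("mem_fraction_static must be between 0 and 1")
        cmd += ["--mem-fraction-static", str(mem_fraction_static)]
    return cmd


cmd = build_launch_cmd("Qwen/Qwen2-VL-7B-Instruct", "qwen2-vl", 30000,
                       mem_fraction_static=0.6)
# Print the command as a single shell-safe line.
print(shlex.join(cmd))
```

The returned list can be passed directly to `subprocess.Popen`. Keeping the arguments as a list avoids shell quoting issues.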

Benchmark the same model with Hugging Face (bench_hf.py):

python benchmark/mmmu/bench_hf.py --model-path Qwen/Qwen2-VL-7B-Instruct

Some popular model results: