Run benchmark

Benchmark sglang

Launch the server with Llama-3.1-8B-Instruct

python3 -m sglang.launch_server --model-path meta-llama/Llama-3.1-8B-Instruct --port 30000
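The server can take a while to load the model, so it may help to wait until it responds before starting the benchmark. The following is a minimal sketch of such a wait loop using only the Python standard library. The helper names `is_ready` and `wait_for_server` are hypothetical, and the `/health` route on port 30000 is an assumption about the server; adjust the URL if your setup differs.

```python
import time
import urllib.request


def is_ready(url: str, timeout_s: float = 2.0) -> bool:
    """Return True if a GET request to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError, refused connections and timeouts are all
        # OSError subclasses; treat any of them as "not ready yet".
        return False


def wait_for_server(url: str, max_wait_s: float = 300.0, interval_s: float = 2.0) -> bool:
    """Poll `url` until it is ready or `max_wait_s` seconds have elapsed."""
    deadline = time.monotonic() + max_wait_s
    while time.monotonic() < deadline:
        if is_ready(url):
            return True
        time.sleep(interval_s)
    return False


# Example call (assumed URL, not taken from this README):
#   wait_for_server("http://127.0.0.1:30000/health")
```

The function returns `False` instead of raising when the deadline passes, so a wrapper script can decide whether to retry or abort.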

Run the benchmark once the server is up

python3 bench_sglang.py
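For orientation, the sketch below shows what a single schema-constrained request to the server might look like. It is not taken from `bench_sglang.py`. The schema, the prompt, and the helper `build_generate_payload` are hypothetical, and the payload layout (a `/generate` route that accepts `sampling_params.json_schema` as a JSON-encoded string) is an assumption about the server's native API that should be checked against the sglang documentation for your version.

```python
import json

# Hypothetical schema, for illustration only; the benchmark uses its own data.
SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "population": {"type": "integer"},
    },
    "required": ["name", "population"],
}


def build_generate_payload(prompt: str, schema: dict, max_new_tokens: int = 128) -> dict:
    """Build a request body for an assumed /generate route."""
    return {
        "text": prompt,
        "sampling_params": {
            "temperature": 0,
            "max_new_tokens": max_new_tokens,
            # Assumption: the schema is passed as a JSON-encoded string.
            "json_schema": json.dumps(schema),
        },
    }


payload = build_generate_payload("Describe the capital of France as JSON.", SCHEMA)
print(json.dumps(payload, indent=2))
```

The resulting dictionary could then be sent as the JSON body of a POST request to the server launched above, for example at `http://127.0.0.1:30000/generate` under the same assumptions.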