# QwQ-32B-Preview

> QwQ-32B-Preview is an experimental research model developed by the Qwen team, aimed at enhancing the reasoning capabilities of artificial intelligence.

[Model Link](https://modelscope.cn/models/Qwen/QwQ-32B-Preview/summary)

The Speed Benchmark tool was used to test the GPU memory usage and inference speed of the QwQ-32B-Preview model under different configurations. The following tests measure speed and memory usage when generating 2048 tokens, with input lengths of 1, 6144, 14336, and 30720 tokens.

## Local Transformers Inference Speed

### Test Environment

- NVIDIA A100 80GB * 1
- CUDA 12.1
- PyTorch 2.3.1
- Flash Attention 2.5.8
- Transformers 4.46.0
- EvalScope 0.7.0

### Stress Testing Command

```shell
pip install evalscope[perf] -U
```

```shell
CUDA_VISIBLE_DEVICES=0 evalscope perf \
 --parallel 1 \
 --model Qwen/QwQ-32B-Preview \
 --attn-implementation flash_attention_2 \
 --log-every-n-query 1 \
 --connect-timeout 60000 \
 --read-timeout 60000 \
 --max-tokens 2048 \
 --min-tokens 2048 \
 --api local \
 --dataset speed_benchmark
```

### Test Results

```text
+---------------+-----------------+----------------+
| Prompt Tokens | Speed(tokens/s) | GPU Memory(GB) |
+---------------+-----------------+----------------+
| 1             | 17.92           | 61.58          |
| 6144          | 12.61           | 63.72          |
| 14336         | 9.01            | 67.31          |
| 30720         | 5.61            | 74.47          |
+---------------+-----------------+----------------+
```

## vLLM Inference Speed

### Test Environment

- NVIDIA A100 80GB * 2
- CUDA 12.1
- vLLM 0.6.3
- PyTorch 2.4.0
- Flash Attention 2.6.3
- Transformers 4.46.0

### Test Command

```shell
CUDA_VISIBLE_DEVICES=0,1 evalscope perf \
 --parallel 1 \
 --model Qwen/QwQ-32B-Preview \
 --log-every-n-query 1 \
 --connect-timeout 60000 \
 --read-timeout 60000 \
 --max-tokens 2048 \
 --min-tokens 2048 \
 --api local_vllm \
 --dataset speed_benchmark
```

### Test Results

```text
+---------------+-----------------+
| Prompt Tokens | Speed(tokens/s) |
+---------------+-----------------+
| 1             | 38.17           |
| 6144          | 36.63           |
| 14336         | 35.01           |
| 30720         | 31.68           |
+---------------+-----------------+
```
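
For reference, below is a minimal sketch of running QwQ-32B-Preview directly with the two inference backends benchmarked above, outside the EvalScope harness. The prompt text and sampling settings are illustrative assumptions and are not part of the benchmark; only the model name, dtype/attention backend, and GPU count mirror the test environments.

```python
# Sketch: local Transformers inference with flash_attention_2 on a single A100,
# matching the first test environment. Requires transformers, torch, and flash-attn.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/QwQ-32B-Preview"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",
    device_map="auto",
)

# Illustrative prompt; the benchmark itself uses the speed_benchmark dataset.
messages = [{"role": "user", "content": "How many r's are in the word 'strawberry'?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=2048)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

A corresponding sketch for vLLM, assuming tensor parallelism across the two A100s used in the second test environment:

```python
# Sketch: offline generation with vLLM across 2 GPUs (tensor parallel).
from vllm import LLM, SamplingParams

llm = LLM(model="Qwen/QwQ-32B-Preview", tensor_parallel_size=2)
params = SamplingParams(temperature=0.7, max_tokens=2048)

outputs = llm.generate(["How many r's are in the word 'strawberry'?"], params)
print(outputs[0].outputs[0].text)
```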