KleidiAI benchmark tool
Building
From the KleidiAI root directory:
Linux®-target
$ mkdir -p build && cd build
$ cmake -DCMAKE_C_COMPILER=/path/to/aarch64-none-linux-gnu-gcc -DCMAKE_CXX_COMPILER=/path/to/aarch64-none-linux-gnu-g++ -DKLEIDIAI_BUILD_BENCHMARK=ON -DCMAKE_BUILD_TYPE=Release ../
Android™-target
$ mkdir -p build && cd build
$ cmake -DCMAKE_TOOLCHAIN_FILE=/path/to/android-ndk/build/cmake/android.toolchain.cmake -DANDROID_ABI=arm64-v8a -DANDROID_PLATFORM=30 -DKLEIDIAI_BUILD_BENCHMARK=ON -DCMAKE_BUILD_TYPE=Release ../
Usage
The dimensions of the LHS and RHS matrices need to be specified with the -m, -n and -k options.
The shape of the LHS matrix is MxK, and the shape of the RHS matrix is KxN.
$ ./kleidiai_benchmark -m 13 -n 17 -k 18
Run on (8 X 1800 MHz CPU s)
Load Average: 10.01, 10.06, 10.06
-----------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations
-----------------------------------------------------------------------------------------------------
matmul_clamp_f32_qai8dxp1x8_qsi4cxp4x8_1x4x32_neon_dotprod 123 ns 123 ns 1234567
matmul_clamp_f32_qai8dxp1x8_qsi4cxp8x8_1x8x32_neon_dotprod 123 ns 123 ns 1234567
matmul_clamp_f32_qai8dxp4x8_qsi4cxp4x8_4x4x32_neon_i8mm 123 ns 123 ns 1234567
matmul_clamp_f32_qai8dxp4x8_qsi4cxp4x8_8x4x32_neon_i8mm 123 ns 123 ns 1234567
matmul_clamp_f32_qai8dxp4x8_qsi4cxp8x8_4x8x32_neon_i8mm 123 ns 123 ns 1234567
matmul_clamp_f32_qai8dxp4x8_qsi4cxp8x8_8x8x32_neon_i8mm 123 ns 123 ns 1234567
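To put a reported time in context, the problem size can be worked out from the three options. The sketch below computes the output shape and the number of multiply-accumulate (MAC) operations for the example dimensions. The throughput step assumes that one benchmark iteration corresponds to one matrix multiplication, and it uses a made-up time of 1000 ns (not a measurement) purely to show the arithmetic.

```shell
#!/bin/sh
# Problem size from the example above: LHS is MxK, RHS is KxN.
m=13; n=17; k=18

# The output matrix is MxN.
out_elems=$((m * n))

# Each output element needs K multiply-accumulates.
macs=$((m * n * k))

# Hypothetical per-iteration time in ns; NOT a real measurement.
time_ns=1000

# MACs per ns equals GMAC/s; scale by 1000 to keep three decimals in integer math.
gmacs_milli=$((macs * 1000 / time_ns))

echo "output elements: $out_elems"
echo "MACs per matmul: $macs"
echo "throughput at ${time_ns} ns: $((gmacs_milli / 1000)).$(printf '%03d' $((gmacs_milli % 1000))) GMAC/s"
```

Replace time_ns with the value from the Time column of a real run to estimate the throughput of a specific kernel.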
Filtering
Test cases can be filtered with the --benchmark_filter option, which accepts a regex. (Note: the measurement results shown are placeholders.)
To run only the dotprod test cases:
$ ./kleidiai_benchmark --benchmark_filter=dotprod -m 13 -n 17 -k 18
Run on (8 X 1800 MHz CPU s)
Load Average: 10.09, 10.13, 10.09
-----------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations
-----------------------------------------------------------------------------------------------------
matmul_clamp_f32_qai8dxp1x8_qsi4cxp4x8_1x4x32_neon_dotprod 123 ns 123 ns 1234567
matmul_clamp_f32_qai8dxp1x8_qsi4cxp8x8_1x8x32_neon_dotprod 123 ns 123 ns 1234567
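The effect of a filter can be previewed without running any benchmark by applying the same pattern to the benchmark names. The sketch below uses grep -E as a stand-in for Google Benchmark's regex matching. That is an assumption: the two regex dialects agree for simple patterns like these but may differ for more advanced syntax. The names are the ones from the listing in the Usage section.

```shell
#!/bin/sh
# Benchmark names copied from the Usage listing.
names='matmul_clamp_f32_qai8dxp1x8_qsi4cxp4x8_1x4x32_neon_dotprod
matmul_clamp_f32_qai8dxp1x8_qsi4cxp8x8_1x8x32_neon_dotprod
matmul_clamp_f32_qai8dxp4x8_qsi4cxp4x8_4x4x32_neon_i8mm
matmul_clamp_f32_qai8dxp4x8_qsi4cxp4x8_8x4x32_neon_i8mm
matmul_clamp_f32_qai8dxp4x8_qsi4cxp8x8_4x8x32_neon_i8mm
matmul_clamp_f32_qai8dxp4x8_qsi4cxp8x8_8x8x32_neon_i8mm'

# Same pattern as --benchmark_filter=dotprod: unanchored, so it matches anywhere in the name.
dotprod=$(printf '%s\n' "$names" | grep -E 'dotprod')

# A narrower pattern: only the i8mm kernel with the 8x8x32 shape.
i8mm_8x8=$(printf '%s\n' "$names" | grep -E '8x8x32_neon_i8mm$')

printf 'dotprod matches:\n%s\n' "$dotprod"
printf '8x8x32 i8mm matches:\n%s\n' "$i8mm_8x8"
```

Once a pattern selects the intended names, it can be passed to the tool as --benchmark_filter='<pattern>'; quoting the pattern keeps the shell from interpreting characters such as $ or |.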
This application uses Google Benchmark, so all options that Google Benchmark provides can be used.
To list the available options, use the --help flag or refer to the Google Benchmark user guide.
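As an illustration of passing further Google Benchmark options, the sketch below repeats each benchmark several times, reports only the aggregates, and writes the results to a JSON file. The flags are standard Google Benchmark options. The repetition count and the file name results.json are arbitrary choices, and the script assumes it is started from the directory that contains the kleidiai_benchmark binary.

```shell
#!/bin/sh
# Matrix dimensions, as in the Usage example.
dims="-m 13 -n 17 -k 18"

# Standard Google Benchmark flags; results.json is an arbitrary file name.
gb_opts="--benchmark_repetitions=5 --benchmark_report_aggregates_only=true --benchmark_out=results.json --benchmark_out_format=json"

if [ -x ./kleidiai_benchmark ]; then
    # Word splitting of $dims and $gb_opts is intentional here.
    ./kleidiai_benchmark $dims $gb_opts
else
    echo "kleidiai_benchmark not found in the current directory" >&2
fi
```

Repetitions help when the machine is under load, since the aggregate rows (mean, median, standard deviation) show how stable the measurements are.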