

# Quantization

Quantization trades off model precision for a smaller memory footprint, allowing large models to run on a wider range of devices.
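As a quick taste of what the pages below cover, a pre-quantized checkpoint can be loaded directly through vLLM's `LLM` entry point. This is a minimal sketch assuming an AWQ-quantized model; the repository name is only an example, and the per-method pages listed under Contents give the authoritative usage for each format.

```python
from vllm import LLM, SamplingParams

# Minimal sketch: load an AWQ-quantized checkpoint. The model name is
# illustrative; any AWQ checkpoint works. `quantization="awq"` selects
# the AWQ kernels explicitly (vLLM can usually also infer the method
# from the checkpoint's config).
llm = LLM(model="TheBloke/Llama-2-7B-Chat-AWQ", quantization="awq")

params = SamplingParams(temperature=0.8, max_tokens=64)
outputs = llm.generate(["What does quantization trade away, and for what?"], params)
print(outputs[0].outputs[0].text)
```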

Contents:

- [Supported Hardware](supported_hardware.md)
- [AutoAWQ](auto_awq.md)
- [BitBLAS](bitblas.md)
- [BitsAndBytes](bnb.md)
- [FP8 W8A8](fp8.md)
- [GGUF](gguf.md)
- [GPTQModel](gptqmodel.md)
- [FP8 INC](inc.md)
- [INT4 W4A16](int4.md)
- [INT8 W8A8](int8.md)
- [NVIDIA TensorRT Model Optimizer](modelopt.md)
- [Quantized KV Cache](quantized_kvcache.md)
- [AMD Quark](quark.md)
- [TorchAO](torchao.md)