- README.md
- auto_awq.md
- bitblas.md
- bnb.md
- fp8.md
- gguf.md
- gptqmodel.md
- inc.md
- int4.md
- int8.md
- modelopt.md
- quantized_kvcache.md
- quark.md
- supported_hardware.md
- torchao.md
README.md

# Quantization
Quantization trades off model precision for a smaller memory footprint, allowing large models to run on a wider range of devices.
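To make the precision-for-memory trade-off concrete, here is a minimal NumPy sketch of symmetric per-tensor int8 quantization — the names (`quantize_int8`, `dequantize`) are illustrative and not part of any API documented in the pages listed here:

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric per-tensor quantization: map floats onto int8 values in [-127, 127]."""
    scale = float(np.abs(x).max()) / 127.0  # one scale factor for the whole tensor
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximation of the original floats."""
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)  # stand-in for a weight tensor
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
# int8 storage is 4x smaller than float32; rounding error is bounded by scale / 2.
```

The schemes covered by the pages below (AWQ, GPTQ, FP8, etc.) are considerably more sophisticated — per-channel or per-group scales, activation-aware calibration — but all revisit this same idea of storing low-precision values plus scale factors.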
Contents: