Technique | Training | Inference | Bit Width (bits) | PyTorch Support | Code Difficulty |
---|---|---|---|---|---|
Dynamic Quantization | No | Yes | 8 | Built-in | Easy |
Static Quantization | No | Yes | 8 | Built-in | Medium |
QAT | Yes | Yes | 8 | Built-in | Medium-Hard |
Mixed Precision (AMP) | Yes | Yes | 16 | Built-in | Easy |
BitsAndBytes/4-bit | Yes | Yes | 8, 4 | Community | Medium |
Custom/Research | Yes | Yes | ≤4 | Custom/Community | Hard |
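
To make the "Bit width" column concrete, here is a minimal pure-Python sketch of what mapping float weights to 8 bits can look like. It assumes symmetric per-tensor quantization with a single scale factor, which is only one of several schemes. The helper names `quantize_int8` and `dequantize` are hypothetical and are not PyTorch APIs; the built-in PyTorch workflows listed above handle observers, zero points, and kernels that this sketch ignores.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization of floats to signed 8-bit integers.

    Hypothetical helper for illustration only (not a PyTorch function).
    Returns the integer codes and the scale needed to map them back.
    """
    max_abs = max(abs(v) for v in values)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero input.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp to the symmetric int8 range.
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate float values."""
    return [c * scale for c in codes]


weights = [0.5, -1.27, 0.03, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Each restored value differs from the original by at most half a step.
max_error = max(abs(w, ) if False else abs(w - r) for w, r in zip(weights, restored))
print(q, scale, max_error)
```

Lower bit widths follow the same idea with a smaller integer range (for example, 7 instead of 127 for signed 4-bit), which is why the rounding error grows as the bit width shrinks.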
Created
August 21, 2025 17:23