TensorRT supports two approaches to preparing a model for quantization: calibration or training.
First, we need to add quantized layers from the TensorRT pytorch_quantization.nn module, or replace the model's regular torch.nn layers with them. These quantization layers gather the statistics required for quantization.
Once the model is modified, we can use either of the following approaches to gather statistics before quantization:
- Calibrate the pre-trained model
- Train the pre-trained model (1 epoch)
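
To make the calibration option more concrete, here is a minimal, library-free sketch of the kind of statistic a quantization layer might gather and how it could be turned into a scale. It assumes symmetric int8 quantization with max calibration. The names `MaxObserver` and `fake_quantize` are hypothetical and are not part of pytorch_quantization; the real layers do this on tensors inside the model.

```python
class MaxObserver:
    """Hypothetical stand-in for the statistics a quantization layer gathers."""

    def __init__(self):
        self.amax = 0.0  # largest absolute value seen so far

    def observe(self, values):
        # Update the running absolute maximum with one batch of activations.
        self.amax = max(self.amax, max(abs(v) for v in values))


def fake_quantize(values, amax, num_bits=8):
    """Symmetric quantize-dequantize using a scale derived from amax."""
    if amax <= 0:
        raise ValueError("amax must be positive; run calibration first")
    qmax = 2 ** (num_bits - 1) - 1  # 127 for int8
    scale = amax / qmax
    out = []
    for v in values:
        q = round(v / scale)
        q = max(-qmax, min(qmax, q))  # clamp to the representable range
        out.append(q * scale)
    return out


# Calibration pass: feed a few batches through the observer only.
observer = MaxObserver()
calibration_batches = [[0.1, -0.5, 0.3], [1.27, -0.2, 0.9], [-0.7, 0.4, 0.0]]
for batch in calibration_batches:
    observer.observe(batch)

# After calibration, the collected amax fixes the quantization scale.
quantized = fake_quantize([0.5, -1.27, 2.0], observer.amax)
```

In this sketch, values inside the calibrated range are rounded to the nearest quantization step, while values beyond `amax` (such as 2.0) are clamped to it. This is why the calibration data should be representative of real inputs.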