Get one Neural Compressor example up and running with an existing trained model.
We target a quantization example that quantizes a model with the eager-mode version of PyTorch. At the time of writing, the INC version is v1.9 (git commit aaad4c35).
- Create a conda environment
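For example (the environment name and Python version here are arbitrary choices, not prescribed anywhere):
conda create -n inc python=3.8
conda activate inc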
- Set up per the README on the landing page.
git clone https://github.com/intel/neural-compressor.git #TODO use this fork, which has a minor error patched
cd neural-compressor
git submodule sync
git submodule update --init --recursive
pip install -r requirements.txt
python setup.py develop # we use develop instead of install
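A quick sanity check that the develop-mode install is importable (an optional check, not part of the original steps):
python -c "import neural_compressor; print(neural_compressor.__version__)"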
- Install transformers (the codebase is part of the clone; do not clone it from the public transformers repo)
cd neural-compressor/examples/pytorch/nlp/huggingface_models/common
python setup.py develop # we use develop instead of install
- Install the dependencies of the GLUE task
cd neural-compressor/examples/pytorch/nlp/huggingface_models/text-classification/quantization/ptq_static/eager
pip install -r requirements.txt
- Install PyTorch 1.7.1 (please stick to this version to avoid surprises)
conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=10.2 -c pytorch
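To confirm the pinned version is the one active in the environment:
python -c "import torch; print(torch.__version__)"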
- Evaluate the pretrained FP32 bert-base/MNLI model (a hedged command sketch follows this step)
cd neural-compressor/examples/pytorch/nlp/huggingface_models/common/examples/text-classification
###TODO
...
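The exact command is still TODO above; a sketch of what it likely looks like, assuming the bundled script follows the standard HuggingFace run_glue.py interface (the checkpoint path and hyperparameters are placeholders, not from the original notes):
python run_glue.py \
  --model_name_or_path <path-to-fp32-bert-base-mnli-checkpoint> \
  --task_name mnli \
  --do_eval \
  --max_seq_length 128 \
  --per_device_eval_batch_size 32 \
  --output_dir ./fp32_eval_output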
- Tune (quantize) bert-base/MNLI with INC PTQ (an API sketch follows below)
cd neural-compressor/examples/pytorch/nlp/huggingface_models/text-classification/quantization/ptq_static/eager
###TODO
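The tuning command is still TODO above; for orientation, a minimal sketch of the INC v1.x Python API that the example scripts drive, assuming a YAML tuning config (conf.yaml, model, calib_loader, and eval_func below are placeholders, not from the original notes):
# Sketch of INC v1.x post-training static quantization; all names are illustrative.
from neural_compressor.experimental import Quantization, common

quantizer = Quantization("conf.yaml")      # YAML config: framework, approach, accuracy criterion
quantizer.model = common.Model(model)      # the fine-tuned FP32 PyTorch model (placeholder)
quantizer.calib_dataloader = calib_loader  # calibration data for static PTQ (placeholder)
quantizer.eval_func = eval_func            # callable returning a scalar accuracy (placeholder)
q_model = quantizer.fit()                  # tune until the accuracy criterion is met
q_model.save("./saved_results")            # persist the quantized model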