Run int8 quantization examples using docker, tensorrt or torch-tensorrt on NVIDIA GPU cards
Install torch-tensorrt by docker
The recommended way is to use the prebuilt docker image; see https://github.com/pytorch/TensorRT
The torch-tensorrt version shipped in each NGC PyTorch image is listed in the release notes, e.g. https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/rel-23-12.html
docker pull nvcr.io/nvidia/pytorch:22.05-py3
docker run --gpus device=0 -it --rm nvcr.io/nvidia/pytorch:22.05-py3
Run the example
The example is at https://pytorch.org/TensorRT/_notebooks/vgg-qat.html (notebook source: https://github.com/pytorch/TensorRT/blob/main/notebooks/vgg-qat.ipynb)
The vgg16.py used in the example is at https://github.com/pytorch/TensorRT/blob/main/examples/int8/training/vgg16/vgg16.py
To copy vgg16.py into the container: docker cp ./vgg16.py a072427cbc3e:/workspace, where a072427cbc3e is the container id shown by "docker ps".
Output of the example:
Jit: Average batch time: 4.17 ms
Trt: Average batch time: 0.68 ms
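For context, the deployment step inside that notebook boils down to compiling the QAT TorchScript model with torch_tensorrt and int8 enabled. Below is a minimal sketch of that step plus the kind of timing loop behind the numbers above; the checkpoint name trained_vgg16_qat.jit.pt is an assumption here, and the 32x3x32x32 input shape follows the notebook's CIFAR10 setup.

# Minimal sketch of the notebook's deployment step: compile the QAT
# TorchScript model with int8 enabled, then time it.
# Assumes a checkpoint named trained_vgg16_qat.jit.pt (hypothetical name)
# and CIFAR10-shaped inputs, as in the notebook.
import time
import torch
import torch_tensorrt

qat_model = torch.jit.load("trained_vgg16_qat.jit.pt").eval().cuda()
trt_model = torch_tensorrt.compile(
    qat_model,
    inputs=[torch_tensorrt.Input((32, 3, 32, 32))],
    enabled_precisions={torch.int8},   # lower the Q/DQ graph to int8 kernels
)

x = torch.randn(32, 3, 32, 32, device="cuda")
for _ in range(10):                    # warm-up
    trt_model(x)
torch.cuda.synchronize()
start = time.perf_counter()
for _ in range(100):
    trt_model(x)
torch.cuda.synchronize()
print("Trt: Average batch time: %.2f ms" % ((time.perf_counter() - start) / 100 * 1e3))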
Install tensorrt by docker
Quantize resnet50 using pytorch-quantization (https://docs.nvidia.com/deeplearning/tensorrt/pytorch-quantization-toolkit/docs/index.html#document-tutorials/quant_resnet50) and save it as a quantized *.onnx model; a sketch of this step follows below.
A Chinese walkthrough of the same flow: https://blog.csdn.net/sdhdsf132452/article/details/130136330
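The sketch below condenses the linked tutorial: monkey-patch torchvision's resnet50 with quantized layers, calibrate, then export a Q/DQ ONNX file. It assumes a CUDA GPU and a calibration DataLoader named calib_loader, which is hypothetical and not defined here.

# Minimal sketch of the pytorch-quantization flow from the linked tutorial:
# calibrate a monkey-patched resnet50, then export a Q/DQ ONNX model.
# calib_loader is a hypothetical DataLoader of ImageNet-style images.
import torch
import torchvision
from pytorch_quantization import nn as quant_nn
from pytorch_quantization import quant_modules

quant_modules.initialize()          # replace nn.Conv2d/nn.Linear with quantized versions
model = torchvision.models.resnet50(pretrained=True).eval().cuda()

# Collect activation statistics in fp32, then load the computed amax ranges.
# With the default max calibrator, load_calib_amax() takes no arguments.
with torch.no_grad():
    for module in model.modules():
        if isinstance(module, quant_nn.TensorQuantizer):
            module.disable_quant()
            module.enable_calib()
    for images, _ in calib_loader:  # a few hundred images are typically enough
        model(images.cuda())
    for module in model.modules():
        if isinstance(module, quant_nn.TensorQuantizer):
            module.load_calib_amax()
            module.enable_quant()
            module.disable_calib()

# Export QuantizeLinear/DequantizeLinear nodes that trtexec can parse (opset >= 13).
quant_nn.TensorQuantizer.use_fb_fake_quant = True
dummy = torch.randn(1, 3, 224, 224, device="cuda")
torch.onnx.export(model, dummy, "quant_resnet50.onnx", opset_version=13)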
# pull the image
docker pull nvidia/cuda:11.8.0-cudnn8-devel-ubuntu22.04 # available tags: https://gitlab.com/nvidia/container-images/cuda/blob/master/doc/supported-tags.md
docker run --gpus device=0 -it --rm nvidia/cuda:11.8.0-cudnn8-devel-ubuntu22.04
# get trtexec (copy the TensorRT tarball into the container, then unpack it)
docker cp ./TensorRT-8.6.1.6.Linux.x86_64-gnu.cuda-11.8.tar.gz 07ec8d27504f:/home
tar -xvzf TensorRT-8.6.1.6.Linux.x86_64-gnu.cuda-11.8.tar.gz
# convert onnx to tensorrt engine
docker cp ./quant_resnet50.onnx 07ec8d27504f:/home
./TensorRT-8.6.1.6/bin/trtexec --int8 --onnx=./quant_resnet50.onnx --saveEngine=quant_resnet50.engine
# run inference on the engine using python
apt install python3 python3-pip
pip install opencv-python
docker cp ./cat.jpg 07ec8d27504f:/home
mkdir images && mv ./cat.jpg ./images/
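The main.py linked at the end of this gist drives the engine through the TensorRT Python API. Below is a minimal sketch of that flow, assuming pycuda and numpy are also installed (pip install tensorrt==8.6.1 pycuda numpy; see the Error section below for the version pin) and an engine built from a static 1x3x224x224 ONNX input; the preprocessing constants are the usual ImageNet values and are assumptions here.

# Minimal sketch of engine inference with the TensorRT Python API.
# Assumes TensorRT 8.6 plus pycuda, and an engine with a static 1x3x224x224 input.
import cv2
import numpy as np
import tensorrt as trt
import pycuda.autoinit          # creates a CUDA context on import
import pycuda.driver as cuda

logger = trt.Logger(trt.Logger.WARNING)
with open("quant_resnet50.engine", "rb") as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())
context = engine.create_execution_context()

# Preprocess: BGR -> RGB, resize, normalize with ImageNet stats, NCHW layout.
img = cv2.imread("images/cat.jpg")
img = cv2.resize(img, (224, 224))[:, :, ::-1].astype(np.float32) / 255.0
img = (img - [0.485, 0.456, 0.406]) / [0.229, 0.224, 0.225]
h_input = np.ascontiguousarray(img.transpose(2, 0, 1)[None], dtype=np.float32)
h_output = np.empty((1, 1000), dtype=np.float32)

# Copy in, run, copy out. execute_v2 is synchronous.
d_input = cuda.mem_alloc(h_input.nbytes)
d_output = cuda.mem_alloc(h_output.nbytes)
cuda.memcpy_htod(d_input, h_input)
context.execute_v2([int(d_input), int(d_output)])
cuda.memcpy_dtoh(h_output, d_output)
print("predicted class id:", int(h_output.argmax()))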
Error
[stdArchiveReader.cpp::nvinfer1::rt::StdArchiveReader::StdArchiveReader::30] Error Code 1: Serialization (Serialization assertion magicTagRead == magicTag failed.Magic tag does not match)
This error is caused by loading the engine with a different TensorRT version than the one used to build it. Check your environment to see if multiple TensorRT or cuDNN libs are involved.
pip install tensorrt==8.6.1 solves the issue, since the engine was generated with TensorRT-8.6.1.6.
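A quick sanity check that the Python runtime matches the builder version:

# The runtime version must match the trtexec build that produced the engine.
import tensorrt as trt
print(trt.__version__)   # expect 8.6.1.x for an engine built with TensorRT-8.6.1.6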
With batch_size=4, the quantized resnet50 runs in 2.2 ms per batch.
The prediction for the cat image is indeed a cat.
main.py: https://www.cnblogs.com/chentiao/p/16671459.html