@ZHAOZHIHAO
ZHAOZHIHAO / tensorrt and torch-tensorrt nvidia-docker examples
Created January 20, 2024 00:19
Run int8 quantization examples using Docker with TensorRT or Torch-TensorRT on NVIDIA GPU cards
Install Torch-TensorRT via Docker
The recommended way is to use the prebuilt Docker image: https://github.com/pytorch/TensorRT
Torch-TensorRT image versions are listed at https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/rel-23-12.html
docker pull nvcr.io/nvidia/pytorch:22.05-py3
docker run --gpus device=0 -it --rm nvcr.io/nvidia/pytorch:22.05-py3
Run the example
The example is at https://pytorch.org/TensorRT/_notebooks/vgg-qat.html / https://github.com/pytorch/TensorRT/blob/main/notebooks/vgg-qat.ipynb
The vgg16.py used in the example is at https://github.com/pytorch/TensorRT/blob/main/examples/int8/training/vgg16/vgg16.py
ChatGPT:
introduction https://www.semrush.com/blog/how-does-chatgpt-work/#
Model quantization:
Basics
A White Paper on Neural Network Quantization
How to implement quantization-aware training from scratch:
"EFFICIENT QUANTIZATION-AWARE TRAINING WITH ADAPTIVE CORESET SELECTION"
How to implement post-training quantization from scratch using PyTorch:
https://jermmy.github.io/2020/07/04/2020-7-4-post-training-quantization-2/
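As a minimal sketch of the idea behind post-training quantization (an asymmetric uint8 scheme in plain Python; the helper names are mine, not from the linked posts):

```python
# Minimal asymmetric uint8 post-training quantization sketch.
def quant_params(xs, n_bits=8):
    """Compute scale and zero-point from the observed min/max of the data."""
    qmin, qmax = 0, 2 ** n_bits - 1
    lo, hi = min(min(xs), 0.0), max(max(xs), 0.0)  # range must cover 0.0
    scale = (hi - lo) / (qmax - qmin)
    zero_point = round(qmin - lo / scale)
    return scale, zero_point

def quantize(xs, scale, zp):
    # round to the nearest integer level and clamp into [0, 255]
    return [max(0, min(255, round(x / scale + zp))) for x in xs]

def dequantize(qs, scale, zp):
    return [(q - zp) * scale for q in qs]

xs = [-1.0, -0.5, 0.0, 0.7, 2.0]
s, zp = quant_params(xs)
rt = dequantize(quantize(xs, s, zp), s, zp)  # round-trip error <= scale / 2
```

Real float 0.0 maps exactly to the zero-point, which is why the range is forced to cover 0 above.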
ZHAOZHIHAO / One sentence describes a technique
Last active March 21, 2022 23:47
One sentence describes a technique
1. Adaboost:
i) An AdaBoost classifier is a powerful ensemble of classifiers which is formed by successively refitting a weak classifier to different weighted realizations of a data set.
<i> In the “boosting” approach to machine learning, a powerful ensemble of classifiers is formed by
successively refitting a weak classifier to different weighted realizations of a data set. </i>
Source: Explaining the Success of AdaBoost and Random Forests as Interpolating Classifiers
ii) A good explanation of AdaBoost is on pages 338-339 of The Elements of Statistical Learning, volume 2, Springer.
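The one-sentence description above can be sketched as code: AdaBoost with 1-D decision stumps, reweighting the data after each round (a toy illustration in plain Python; the function names are mine):

```python
import math

def adaboost_stumps(X, y, rounds=5):
    """AdaBoost with 1-D decision stumps: successively refit a weak
    classifier to reweighted realizations of the data set."""
    n = len(X)
    w = [1.0 / n] * n                          # uniform weights to start
    pts = sorted(set(X))
    cuts = ([min(X) - 1.0] + [(a + b) / 2 for a, b in zip(pts, pts[1:])]
            + [max(X) + 1.0])
    ensemble = []                              # (alpha, threshold, polarity)
    for _ in range(rounds):
        best = None
        for t in cuts:                         # pick the stump with the
            for pol in (1, -1):                # lowest weighted error
                preds = [pol if x > t else -pol for x in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, t, pol, preds)
        err, t, pol, preds = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, t, pol))
        # upweight the misclassified points, then renormalize
        w = [wi * math.exp(-alpha * yi * p) for wi, yi, p in zip(w, y, preds)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, x):
    score = sum(a * (pol if x > t else -pol) for a, t, pol in ensemble)
    return 1 if score >= 0 else -1

# No single stump separates this labeling, but the weighted ensemble does.
X, y = [0.0, 1.0, 2.0], [1, -1, 1]
ens = adaboost_stumps(X, y, rounds=5)
```

Each round's weight update forces the next stump to focus on the points the current ensemble still gets wrong.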
2. Jensen's inequality
In its simplest form the inequality states that the convex transformation of a mean is less than or equal to the mean applied after convex transformation.
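The statement is easy to check numerically: for a convex f, f(E[X]) <= E[f(X)] (a quick sketch with f(x) = x^2, which is convex):

```python
import random

# Numeric check of Jensen's inequality: for convex f, f(E[X]) <= E[f(X)].
random.seed(0)
xs = [random.uniform(-2.0, 2.0) for _ in range(10000)]

def f(x):          # x^2 is convex
    return x * x

mean_x = sum(xs) / len(xs)                 # E[X], near 0 here
mean_fx = sum(f(x) for x in xs) / len(xs)  # E[X^2], near 4/3 for U(-2, 2)
```

Here f(mean_x) is close to 0 while mean_fx is close to 4/3, so the inequality holds with a wide margin.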
ZHAOZHIHAO / Attentions
Last active September 18, 2020 21:46
Papers on attentions
1. Short walk-through on attention papers
1.1 Attention for CNNs
Squeeze-and-Excitation Networks 2017, Fig. 2
Generates a single attention value for each channel.
CBAM: Convolutional Block Attention Module 2018, Figs. 1 and 2
For CNNs. Combines channel attention and spatial attention: channel attention is applied first, then spatial attention.
Non-local Neural Networks 2017, Fig. 2
The output of a convolutional layer consists of T*H*W*C (time, height, width, channels) units.
For each unit, calculate the effect of all other units on it. This results in T*H*W*C values.
Add these values to the original output units.
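The squeeze-and-excitation idea above (one attention value per channel) can be sketched in plain Python on a C x H x W feature map; the random weights stand in for the two learned FC layers, and the function name is mine:

```python
import math
import random

def se_block(x, reduction=2, seed=0):
    """Squeeze-and-Excitation sketch on a C x H x W feature map (nested
    lists). Random weights stand in for the learned FC layers."""
    C = len(x)
    rng = random.Random(seed)
    # squeeze: global average pool -> one descriptor per channel
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]
    # excitation: FC -> ReLU -> FC -> sigmoid, giving one value per channel
    hdim = max(1, C // reduction)
    W1 = [[rng.gauss(0, 1) for _ in range(C)] for _ in range(hdim)]
    W2 = [[rng.gauss(0, 1) for _ in range(hdim)] for _ in range(C)]
    h = [max(0.0, sum(w * zi for w, zi in zip(row, z))) for row in W1]
    s = [1 / (1 + math.exp(-sum(w * hi for w, hi in zip(row, h))))
         for row in W2]
    # scale: reweight every unit of channel c by its attention value s[c]
    return [[[v * s[c] for v in row] for row in x[c]] for c in range(C)], s

x = [[[1.0, 2.0], [3.0, 4.0]],    # channel 0
     [[0.5, 0.5], [0.5, 0.5]]]    # channel 1
y, s = se_block(x)                # s holds one gate in (0, 1) per channel
```

CBAM then adds a second, spatial gate of shape H x W on top of this channel gate.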
ZHAOZHIHAO / Short notes on papers
Last active November 16, 2018 04:27
Short notes on papers
1. Mobilenet
Thanks to:
https://blog.csdn.net/t800ghb/article/details/78879612
https://blog.csdn.net/wfei101/article/details/78310226
(Mobilenet V2) https://blog.csdn.net/u011995719/article/details/79135818
Suppose there are M input channels and N kernels; then there are N output channels.
Padding is used so that the output feature map keeps the same spatial size as the input.
Standard convolution: D_K*D_K*M*N*D_F*D_F multiply-adds, where D_K is the kernel size and D_F the feature-map size; a depthwise separable convolution replaces this with D_K*D_K*M*D_F*D_F + M*N*D_F*D_F, a reduction by a factor of 1/N + 1/D_K^2.
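A quick sketch of the multiply-add counts using the MobileNet paper's notation (D_K kernel size, D_F output feature-map size, M input channels, N output channels; the function names are mine):

```python
def standard_conv_cost(Dk, M, N, Df):
    # Dk*Dk*M*N multiply-adds at each of the Df*Df output positions
    return Dk * Dk * M * N * Df * Df

def separable_conv_cost(Dk, M, N, Df):
    depthwise = Dk * Dk * M * Df * Df   # one Dk x Dk filter per input channel
    pointwise = M * N * Df * Df         # 1x1 conv mixes the M channels into N
    return depthwise + pointwise

# e.g. 3x3 kernels, 512 in, 512 out, 14x14 feature map
std = standard_conv_cost(3, 512, 512, 14)
sep = separable_conv_cost(3, 512, 512, 14)
ratio = sep / std   # algebraically 1/N + 1/Dk^2, about 0.11 here
```

For 3x3 kernels the 1/D_K^2 term dominates, so the separable version is roughly 8-9x cheaper.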
1. Most neural networks are essentially very large correlation engines that will hone in on any statis-
tical, potentially spurious pattern that allows them to model the observed data more accurately.
2. Generative adversarial networks (GANs) have been extremely effective in approximating complex distributions
of high-dimensional, input data samples.
3. The learning objective is to learn the best parameterization of those embeddings such that the correct answer
has higher likelihood among all possible answers.
4. In contrast with past approaches in AI, modern deep learning methods (LeCun et al., 2015; Schmidhuber, 2015; Goodfellow et al.,
Papers:
When Saliency Meets Sentiment: Understanding How Image Content Invokes Emotion and Sentiment
1. Very very good result, code available on github
Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields
2. Deep Feature Flow for Video Recognition
code: https://github.com/msracver/Deep-Feature-Flow
Fast object detection and segmentation in a video.
22.25 fps reported on the Cityscapes and ImageNet VID datasets with an NVIDIA K40 GPU and Intel Core i7-4790 CPU.
1. Convert images to a video via ffmpeg
Example: ffmpeg -framerate 4 -i 000%3d.jpg -vcodec mpeg4 -b 800k output.mp4
Detail: https://trac.ffmpeg.org/wiki/Slideshow
The output video may be blurred compared to the original images; here is a solution: https://stackoverflow.com/questions/3158235/image-sequence-to-video-quality
2. Capture a video on ubuntu via ffmpeg
ffmpeg -f alsa -ac 2 -i pulse -f x11grab -r 15 -s $(xdpyinfo|grep 'dimensions:'|cut -c14-26) -i :0.0+0,0 -acodec flac -vcodec mpeg4 -qscale 0 -y ~/out.mkv ;
ffmpeg -video_size 1024x768 -framerate 25 -f x11grab -i :0.0+0,0 -f alsa -ac 2 -i hw:0 -qscale 0 output.mkv;
3. Check CPU info, including supported instructions: cat /proc/cpuinfo (the "flags" field lists the supported instruction-set extensions), or lscpu.