Install torch-tensorrt via Docker
The recommended way is to use the prebuilt Docker images: https://github.com/pytorch/TensorRT
The torch-tensorrt version shipped in each image is listed in the PyTorch container release notes: https://docs.nvidia.com/deeplearning/frameworks/pytorch-release-notes/rel-23-12.html
docker pull nvcr.io/nvidia/pytorch:22.05-py3
docker run --gpus device=0 -it --rm nvcr.io/nvidia/pytorch:22.05-py3
Run the example
The example is at https://pytorch.org/TensorRT/_notebooks/vgg-qat.html / https://github.com/pytorch/TensorRT/blob/main/notebooks/vgg-qat.ipynb
The vgg16.py used in the example is at https://github.com/pytorch/TensorRT/blob/main/examples/int8/training/vgg16/vgg16.py
ChatGPT:
Introduction: https://www.semrush.com/blog/how-does-chatgpt-work/#
Model quantization:
Basics: "A White Paper on Neural Network Quantization"
How to implement quantization-aware training from scratch: "Efficient Quantization-Aware Training with Adaptive Coreset Selection"
How to implement post-training quantization from scratch using PyTorch: https://jermmy.github.io/2020/07/04/2020-7-4-post-training-quantization-2/
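As a minimal sketch of what both PTQ and QAT build on, here is the uniform affine quantize -> dequantize round trip in NumPy (the tensor values, range, and 8-bit choice are illustrative):

```python
import numpy as np

# Uniform affine "fake quantization": map floats to 8-bit integers and back.
# QAT inserts exactly this round trip into the forward pass during training.
x = np.array([-1.0, -0.3, 0.0, 0.42, 1.0])
qmin, qmax = 0, 255                                  # unsigned 8-bit range
scale = (x.max() - x.min()) / (qmax - qmin)          # float step per integer
zero_point = round(qmin - x.min() / scale)           # integer that represents 0.0

q = np.clip(np.round(x / scale) + zero_point, qmin, qmax)  # quantize
x_hat = (q - zero_point) * scale                            # dequantize
max_err = np.abs(x - x_hat).max()                    # bounded by scale / 2
```

The quantization error of each element is at most half a quantization step, which is why widening the clipping range (larger scale) trades clipping error for rounding error.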
1. Adaboost:
i) An AdaBoost classifier is a powerful ensemble of classifiers formed by successively refitting a weak classifier to differently weighted realizations of a data set.
<i> In the "boosting" approach to machine learning, a powerful ensemble of classifiers is formed by
successively refitting a weak classifier to different weighted realizations of a data set. </i>
Source: Explaining the Success of AdaBoost and Random Forests as Interpolating Classifiers
ii) A good explanation of AdaBoost is at pages 338-339, The Elements of Statistical Learning, 2nd edition, Springer.
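The refit-with-reweighting loop in i) can be sketched for a single boosting round (toy 1-D data and a hand-picked decision stump; this example is mine, not from either source):

```python
import math

# Toy data: 1-D points with +/-1 labels, uniform initial weights
X = [-2.0, -1.0, 1.0, 2.0]
y = [-1, -1, 1, 1]
w = [0.25] * 4

# A deliberately imperfect weak learner: a decision stump at x > -1.5
def stump(x):
    return 1 if x > -1.5 else -1

# Weighted error of the stump (it misclassifies only x = -1.0)
err = sum(wi for xi, yi, wi in zip(X, y, w) if stump(xi) != yi)

# AdaBoost: classifier weight, then re-weight the examples
alpha = 0.5 * math.log((1 - err) / err)
w = [wi * math.exp(-alpha * yi * stump(xi)) for xi, yi, wi in zip(X, y, w)]
Z = sum(w)
w = [wi / Z for wi in w]   # after normalizing, the misclassified point carries weight 0.5
```

After one update the misclassified example holds half the total weight, so the next weak classifier is forced to pay attention to it.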
2. Jensen's inequality
In its simplest form, the inequality states that the convex transformation of a mean is less than or equal to the mean applied after the convex transformation: for convex f, f(E[X]) <= E[f(X)].
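A quick numeric check of the inequality with the convex function f(x) = x**2 (arbitrary toy values):

```python
# Jensen's inequality for a convex function: f(E[X]) <= E[f(X)]
xs = [0.5, 1.0, 2.0, 4.0]

def f(x):          # convex
    return x * x

f_of_mean = f(sum(xs) / len(xs))             # f(E[X]) = f(1.875) = 3.515625
mean_of_f = sum(f(x) for x in xs) / len(xs)  # E[f(X)] = 5.3125
assert f_of_mean <= mean_of_f
```

For a concave function the inequality flips, which is how it is used in e.g. the EM-algorithm lower bound via log.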
1. Short walk-through on attention papers
1.1 Attention for CNNs
Squeeze-and-Excitation Networks, 2017, figure 2
Generates a single-value attention weight for each channel.
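The squeeze-and-excitation idea can be sketched in a few lines of NumPy (shapes and the reduction ratio r are illustrative; the two matrices stand in for the learned FC layers):

```python
import numpy as np

np.random.seed(0)
x = np.random.randn(8, 8, 16)            # feature map (H, W, C); toy shapes
C, r = 16, 4                             # channels and reduction ratio
W1 = np.random.randn(C, C // r) * 0.1    # first FC layer (would be learned)
W2 = np.random.randn(C // r, C) * 0.1    # second FC layer (would be learned)

z = x.mean(axis=(0, 1))                  # squeeze: global average pool -> (C,)
h = np.maximum(z @ W1, 0)                # excitation: FC + ReLU
s = 1 / (1 + np.exp(-(h @ W2)))          # FC + sigmoid -> one weight per channel
y = x * s                                # rescale each channel by its attention value
```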
CBAM: Convolutional Block Attention Module, 2018, figures 1 and 2
For CNNs. Combines channel attention and spatial attention: channel attention is applied first, then spatial attention.
Non-local Neural Networks, 2017, figure 2
The output of a convolutional layer consists of T*H*W*C (time, height, width, channels) units.
For each unit, compute the effect of every other unit on it; this yields T*H*W*C values.
Add these values to the original output units (a residual connection).
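This all-pairs computation is essentially self-attention; a minimal NumPy sketch of the idea (toy shapes, and random matrices standing in for the learned 1x1-convolution embeddings):

```python
import numpy as np

np.random.seed(0)
T, H, W, C = 2, 4, 4, 8
x = np.random.randn(T, H, W, C).reshape(-1, C)      # flatten to N = T*H*W units
Wq, Wk, Wv = (np.random.randn(C, C) * 0.1 for _ in range(3))

q, k, v = x @ Wq, x @ Wk, x @ Wv
att = q @ k.T                                       # effect of every unit on every other
att = np.exp(att - att.max(axis=1, keepdims=True))
att = att / att.sum(axis=1, keepdims=True)          # softmax over all N units
y = (x + att @ v).reshape(T, H, W, C)               # add back to the original units
```

Because every unit attends to all T*H*W others, the cost is quadratic in the number of units, which is why the paper applies the block to downsampled feature maps.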
1. MobileNet
Thanks to:
https://blog.csdn.net/t800ghb/article/details/78879612
https://blog.csdn.net/wfei101/article/details/78310226
(MobileNet V2) https://blog.csdn.net/u011995719/article/details/79135818
Suppose there are M input channels and N kernels; then the output has N channels.
Padding is used so that the output spatial size matches the input.
Standard convolution:
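The note breaks off here; the multiply-accumulate comparison from the MobileNet paper can be checked with a few lines (DK = kernel size, DF = feature-map size; the concrete numbers are illustrative):

```python
# Cost of producing a DF x DF output with DK x DK kernels,
# M input channels and N output channels.
DK, DF, M, N = 3, 112, 32, 64

standard  = DK * DK * M * N * DF * DF   # every kernel sees all M channels
depthwise = DK * DK * M * DF * DF       # one DK x DK filter per input channel
pointwise = M * N * DF * DF             # 1x1 convolution mixes the channels

ratio = (depthwise + pointwise) / standard   # algebraically 1/N + 1/DK**2
```

With 3x3 kernels the depthwise-separable factorization costs roughly 1/9 to 1/8 of a standard convolution, which is the source of MobileNet's speedup.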
1. Most neural networks are essentially very large correlation engines that will hone in on any statistical, potentially spurious pattern that allows them to model the observed data more accurately.
2. Generative adversarial networks (GANs) have been extremely effective in approximating complex distributions of high-dimensional input data samples.
3. The learning objective is to learn the best parameterization of those embeddings such that the correct answer | |
has higher likelihood among all possible answers. | |
4. In contrast with past approaches in AI, modern deep learning methods (LeCun et al., 2015; Schmidhuber, 2015; Goodfellow et al., |
Papers:
When Saliency Meets Sentiment: Understanding How Image Content Invokes Emotion and Sentiment
1. Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields
Very good results; code available on GitHub.
2. Deep Feature Flow for Video Recognition
code: https://github.com/msracver/Deep-Feature-Flow
Fast object detection and segmentation in video.
22.25 fps reported on the Cityscapes and ImageNet VID datasets with an NVIDIA K40 GPU and Intel Core i7-4790 CPU.
1. Convert images to a video via ffmpeg
Example: ffmpeg -framerate 4 -i 000%3d.jpg -vcodec mpeg4 -b 800k output.mp4
Detail: https://trac.ffmpeg.org/wiki/Slideshow
The output video may be blurred compared to the original images; a solution is discussed here: https://stackoverflow.com/questions/3158235/image-sequence-to-video-quality
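A sketch of the usual fix from that thread: drive MPEG-4 quality with -qscale:v instead of a fixed bitrate (the flag values here are illustrative, not tuned):

```shell
# Quality-based encode: -qscale:v ranges 1 (best) to 31 (worst) for mpeg4.
# Replacing the fixed "-b 800k" bitrate is usually what removes the blur.
ffmpeg -framerate 4 -i 000%3d.jpg -vcodec mpeg4 -qscale:v 2 output.mp4
```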
2. Capture a video on Ubuntu via ffmpeg
ffmpeg -f alsa -ac 2 -i pulse -f x11grab -r 15 -s $(xdpyinfo|grep 'dimensions:'|cut -c14-26) -i :0.0+0,0 -acodec flac -vcodec mpeg4 -qscale 0 -y ~/out.mkv
ffmpeg -video_size 1024x768 -framerate 25 -f x11grab -i :0.0+0,0 -f alsa -ac 2 -i hw:0 -qscale 0 output.mkv
3. Check CPU info, including supported instructions