ZHAOZHIHAO · January 11, 2024 21:57
diff --git a/gistfile1.txt b/gistfile1.txt
 ChatGPT:
    introduction https://www.semrush.com/blog/how-does-chatgpt-work/#
    
 Model quantization:
    Basics
        A White Paper on Neural Network Quantization
    how to implement quantization-aware training from scratch
        "EFFICIENT QUANTIZATION-AWARE TRAINING WITH ADAPTIVE CORESET SELECTION"
    how to implement post-training quantization from scratch using PyTorch
        https://jermmy.github.io/2020/07/04/2020-7-4-post-training-quantization-2/
    torch.round gradient
        https://discuss.pytorch.org/t/torch-round-gradient/28628/6
        https://www.reddit.com/r/pytorch/comments/lqi315/how_does_quantize_per_tensor_work_in_relation/
    why doesn't PyTorch support quantized model inference on GPU?
        https://stackoverflow.com/questions/69718379/running-pytorch-quantized-model-on-cuda-gpu
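
Sketch for the post-training quantization and quantize_per_tensor notes above: the per-tensor affine arithmetic (scale and zero point), written in plain Python so each step is visible. The helper names (calibrate, quantize, dequantize), the min/max calibration, and the unsigned 8-bit range are assumptions for illustration, not taken from the linked posts.

```python
def calibrate(values, qmin=0, qmax=255):
    """Derive (scale, zero_point) from the observed min/max of `values`."""
    # Widen the range to include 0.0 so that zero maps to an exact integer.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    if hi == lo:
        return 1.0, 0  # degenerate all-zero input
    scale = (hi - lo) / (qmax - qmin)
    zero_point = int(round(qmin - lo / scale))
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point


def quantize(values, scale, zero_point, qmin=0, qmax=255):
    """q = clamp(round(x / scale) + zero_point, qmin, qmax)"""
    return [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]


def dequantize(q_values, scale, zero_point):
    """x_hat = (q - zero_point) * scale"""
    return [(q - zero_point) * scale for q in q_values]


weights = [-1.0, -0.5, 0.0, 0.3, 1.5]  # made-up example values
scale, zero_point = calibrate(weights)
q = quantize(weights, scale, zero_point)
restored = dequantize(q, scale, zero_point)
print(scale, zero_point, q)
print(restored)
```

The round() step is what makes the torch.round gradient question matter for quantization-aware training: rounding is piecewise constant, so its true gradient is zero almost everywhere, and a common workaround is to treat it as the identity in the backward pass (a straight-through estimator).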