jerryzh168’s gists

jerryzh168 / c2_conv_transpose.py

Created November 10, 2018 01:59

caffe2 conv transpose

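	from caffe2.python.functional import Functional as F
	import numpy as np
	# 1x3x3x1 input in NHWC layout holding the values 1..9.
	x = np.arange(1, 10).reshape([1, 3, 3, 1]).astype(np.float32)
	# All-zero 3x3 filter and zero bias, so the result should be all zeros.
	w = np.zeros([1, 3, 3, 1]).astype(np.float32)
	b = np.zeros([1]).astype(np.float32)
	# Stride 1, no padding: (3 - 1) * 1 + 3 = 5, hence the 5x5 spatial output.
	y = F.ConvTranspose(x, w, b, kernels=[3, 3], pads=[0, 0, 0, 0], strides=[1, 1], output_shape=[1, 5, 5, 1], order="NHWC")
	print(y)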
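
For reference, the 5x5 spatial size in the snippet above matches the usual transposed-convolution size relation, out = (in - 1) * stride - 2 * pad + kernel. The following is a minimal pure-Python sketch of that relation and of a naive single-channel transposed convolution. It does not use Caffe2, and the function names are illustrative rather than part of any library.

```python
def conv_transpose_out_size(in_size, kernel, stride=1, pad=0):
    """Spatial output size of a transposed convolution (no output padding)."""
    return (in_size - 1) * stride - 2 * pad + kernel


def conv_transpose2d_naive(x, w, stride=1):
    """Single-channel, zero-padding transposed convolution on nested lists.

    Each input element scatters a scaled copy of the kernel into the output.
    """
    in_h, in_w = len(x), len(x[0])
    k_h, k_w = len(w), len(w[0])
    out_h = conv_transpose_out_size(in_h, k_h, stride)
    out_w = conv_transpose_out_size(in_w, k_w, stride)
    y = [[0.0] * out_w for _ in range(out_h)]
    for i in range(in_h):
        for j in range(in_w):
            for p in range(k_h):
                for q in range(k_w):
                    y[i * stride + p][j * stride + q] += x[i][j] * w[p][q]
    return y


# Same values as the gist input, as a plain 3x3 grid.
x = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
# All-zero kernel, as in the gist, so every output element stays 0.
w_zero = [[0.0] * 3 for _ in range(3)]
y = conv_transpose2d_naive(x, w_zero)
print(len(y), len(y[0]))  # 5 5
```

Replacing the zero kernel with a non-zero one (for example all ones) gives an output that is easier to inspect by eye.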

jerryzh168 / resnet18-qdq-placement-example

Created July 22, 2021 21:31

	GraphModule(
	  (conv1): ConvReLU2d(
	    (0): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3))
	    (1): ReLU()
	  )
	  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
	  (layer1): Module(
	    (0): Module(
	      (conv1): ConvReLU2d(
	        (0): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))

jerryzh168 / quantized resnet18 model in PyTorch with quantized reference patterns

Last active July 30, 2021 22:38

	quantized: GraphModule(
	  (conv1): ConvReLU2d(
	    (0): Conv2d(
	      3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3)
	      (weight_post_process): MinMaxObserver(min_val=-0.10300777107477188, max_val=0.09756611287593842)
	    )
	    (1): ReLU(inplace=True)
	  )
	  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
	  (layer1): Module(

jerryzh168 / quantizing resnet18 and produce reference quantized model

Created August 3, 2021 00:51

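	import torchvision.models as models
	import torch
	from torch.quantization.quantize_fx import prepare_fx, convert_fx

	# FX graph mode quantization expects the model in eval mode.
	rn18 = models.resnet18().eval()

	# The empty-string key applies the default qconfig to the whole model.
	qconfig_dict = {"": torch.quantization.default_qconfig}
	# Insert observers; note that no calibration data is run through the model here.
	rn18 = prepare_fx(rn18, qconfig_dict)
	# Convert to the reference quantized form and print the resulting graph module.
	rn18 = convert_fx(rn18, is_reference=True)
	print(rn18)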

jerryzh168 / gist:1c762e51ddec8c38e642380839c8303c

Created August 3, 2021 21:30

	def forward(self, input):
	    input_1 = input
	    _0_conv1_input_scale_0 = getattr(self, "0_conv1_input_scale_0")
	    _0_conv1_input_zero_point_0 = getattr(self, "0_conv1_input_zero_point_0")
	    quantize_per_tensor = torch.quantize_per_tensor(input_1, _0_conv1_input_scale_0, _0_conv1_input_zero_point_0, torch.qint8);  input_1 = _0_conv1_input_scale_0 = _0_conv1_input_zero_point_0 = None
	    dequantize = quantize_per_tensor.dequantize();  quantize_per_tensor = None
	    _0_conv1 = getattr(self, "0").conv1(dequantize)
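
The quantize/dequantize pair at the start of this forward function rounds the input onto an int8 grid and maps it back to floating point before the convolution runs. Below is a minimal pure-Python sketch of that arithmetic for an affine per-tensor int8 scheme. It emulates the behaviour rather than calling PyTorch, and the scale and zero point are made-up illustration values, not the ones stored on the module.

```python
def quantize_per_tensor(values, scale, zero_point, qmin=-128, qmax=127):
    """Affine quantization: round(v / scale) + zero_point, clamped to int8 range."""
    return [min(qmax, max(qmin, round(v / scale) + zero_point)) for v in values]


def dequantize(qvalues, scale, zero_point):
    """Map integer codes back to floats: (q - zero_point) * scale."""
    return [(q - zero_point) * scale for q in qvalues]


# Hypothetical parameters chosen so the float arithmetic is exact.
scale, zero_point = 0.5, 0
inputs = [0.0, 0.26, -1.0, 100.0]

codes = quantize_per_tensor(inputs, scale, zero_point)
restored = dequantize(codes, scale, zero_point)
print(codes)     # [0, 1, -2, 127]
print(restored)  # [0.0, 0.5, -1.0, 63.5]
```

The last input shows the clamping effect: values outside the representable range saturate at the int8 limit, so the restored value differs substantially from the original.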

jerryzh168 / gist:09897c4574132e8c11eee4a43bbe5b6d

Created August 3, 2021 21:39

	r* 7) [Quantize]_output.dequant.scale
	[TensorRT] VERBOSE: Removing (Unnamed Layer* 10) [Convolution]_output.quant.scale
	[TensorRT] VERBOSE: Removing (Unnamed Layer* 12) [Quantize]_output.dequant.scale
	[TensorRT] VERBOSE: Removing (Unnamed Layer* 16) [Activation]_output.quant.scale
	[TensorRT] VERBOSE: Removing (Unnamed Layer* 18) [Quantize]_output.dequant.scale
	[TensorRT] VERBOSE: Removing (Unnamed Layer* 22) [Activation]_output.quant.scale
	[TensorRT] VERBOSE: Removing (Unnamed Layer* 24) [Quantize]_output.dequant.scale
	[TensorRT] VERBOSE: Removing (Unnamed Layer* 27) [Convolution]_output.quant.scale
	[TensorRT] VERBOSE: Removing (Unnamed Layer* 29) [Quantize]_output.dequant.scale
	[TensorRT] VERBOSE: Removing (Unnamed Layer* 33) [Activation]_output.quant.scale

jerryzh168 / gist:76f5e0aeeb96229367b10fa1ebbc0dd8

Created August 9, 2024 16:16

cache: {
	(<class 'torchao.quantization.autoquant.AQFloatLinearWeight'>, torch.Size([2, 256]), torch.Size([1152, 256]), torch.Size([1152]), torch.bfloat16): 0.014147199876606464,
	(<class 'torchao.quantization.autoquant.AQWeightOnlyQuantizedLinearWeight'>, torch.Size([2, 256]), torch.Size([1152, 256]), torch.Size([1152]), torch.bfloat16): 0.017664000391960144,
	(<class 'torchao.quantization.autoquant.AQWeightOnlyQuantizedLinearWeight2'>, torch.Size([2, 256]), torch.Size([1152, 256]), torch.Size([1152]), torch.bfloat16): 0.012953600008040666,
	(<class 'torchao.quantization.autoquant.AQInt8DynamicallyQuantizedLinearWeight'>, torch.Size([2, 256]), torch.Size([1152, 256]), torch.Size([1152]), torch.bfloat16): 0.013567999936640263,
	(<class 'torchao.quantization.autoquant.AQFloatLinearWeight'>, torch.Size([2, 1152]), torch.Size([1152, 1152]), torch.Size([1152]), torch.bfloat16): 0.017347200028598308,
	(<class 'torchao.quantization.autoquant.AQWeightOnlyQuantizedLinearWeight'>, torch.Size([2, 1152]), torch.Size([1152, 115
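
Each complete entry in the cache above pairs a candidate weight class and a set of tensor shapes with a measured time. The sketch below ranks the four complete entries for the `[2, 256]` activation shape. The class names are reduced to plain strings, and picking the minimum is only an assumption about how such a cache might be read, not a description of torchao's selection logic. The gist does not state the time unit, so none is assumed here.

```python
# Timings copied from the four complete cache entries for activation shape [2, 256].
timings = {
    "AQFloatLinearWeight": 0.014147199876606464,
    "AQWeightOnlyQuantizedLinearWeight": 0.017664000391960144,
    "AQWeightOnlyQuantizedLinearWeight2": 0.012953600008040666,
    "AQInt8DynamicallyQuantizedLinearWeight": 0.013567999936640263,
}

# Smallest measured time wins under the "pick the minimum" assumption.
fastest = min(timings, key=timings.get)
baseline = timings["AQFloatLinearWeight"]

# List candidates from fastest to slowest, with their ratio to the float baseline.
for name, t in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{name}: {t:.6f} ({baseline / t:.2f}x vs float)")
print("fastest:", fastest)
```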

jerryzh168 / gist:58f5afc3e8884be7e3f55025a6187fa2

Created August 12, 2024 20:07

	/home/jerryzh/anaconda3/envs/ao_new/lib/python3.9/site-packages/torchao-0.4.0+gitd3b8d43-py3.9-linux-x86_64.egg/torchao/ops.py:12: FutureWarning: `torch.library.impl_abstract` was renamed to `torch.library.register_fake`. Please use that instead; we will remove `torch.library.impl_abstract` in a future version of PyTorch.
	return torch.library.impl_abstract(f"{name}")(func)
	W0812 13:06:21.489861 139713931614016 torch/_logging/_internal.py:416] Using TORCH_LOGS environment variable for log settings, ignoring call to set_logs
	V0812 13:06:21.510456 139713931614016 torch/_dynamo/convert_frame.py:776] [0/0] torchdynamo start compiling _quantized_linear_op /home/jerryzh/anaconda3/envs/ao_new/lib/python3.9/site-packages/torchao-0.4.0+gitd3b8d43-py3.9-linux-x86_64.egg/torchao/quantization/autoquant.py:382, stack (elided 6 frames):
	V0812 13:06:21.510456 139713931614016 torch/_dynamo/convert_frame.py:776] [0/0] File "/home/jerryzh/ao/test/integration/test_integration.py", line 1561, in <module>
	V0812 13:06:21.510456

jerryzh168 / gist:ea53bc9ad6eb61dc8d9be0298eebbfd9

Created August 12, 2024 23:50

	_int8da_int8w_api
	(2, 1152, 256): elapsed time: 0.2791945648193359, bf16 elapsed time: 0.3701180648803711
	(2, 1152, 1152): elapsed time: 0.31040128707885745, bf16 elapsed time: 0.3642512130737305
	(2, 6912, 1152): elapsed time: 0.2989423942565918, bf16 elapsed time: 0.3584345626831055
	(600, 1152, 4096): elapsed time: 0.31704416275024416, bf16 elapsed time: 0.344552001953125
	(8192, 1152, 1152): elapsed time: 0.3063484764099121, bf16 elapsed time: 0.3529404830932617
	(8192, 4608, 1152): elapsed time: 0.2982643127441406, bf16 elapsed time: 0.36193439483642575
	(8192, 32, 1152): elapsed time: 0.30923904418945314, bf16 elapsed time: 0.35560993194580076
	_int8wo_api
	(2, 1152, 256): elapsed time: 0.31670495986938474, bf16 elapsed time: 0.35319679260253906
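
Output in this format is easier to compare once each line is reduced to a ratio of bf16 time to quantized time. The sketch below parses a few lines copied from the output above. The parser and its names are hypothetical helpers, not part of torchao. It treats any line that is not a timing line as the name of the API being benchmarked, and it makes no assumption about the time unit.

```python
import re

# Matches lines such as "(2, 1152, 256): elapsed time: 0.27, bf16 elapsed time: 0.37".
LINE_RE = re.compile(
    r"\((?P<shape>[\d, ]+)\): elapsed time: (?P<quant>[\d.]+), "
    r"bf16 elapsed time: (?P<bf16>[\d.]+)"
)


def parse_benchmark(text):
    """Return (api, shape, quantized_time, bf16_time, bf16/quantized) tuples."""
    results = []
    api = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        m = LINE_RE.fullmatch(line)
        if m is None:
            api = line  # a non-timing line names the API for the lines that follow
            continue
        shape = tuple(int(d) for d in m.group("shape").split(","))
        quant, bf16 = float(m.group("quant")), float(m.group("bf16"))
        results.append((api, shape, quant, bf16, bf16 / quant))
    return results


sample = """\
_int8da_int8w_api
(2, 1152, 256): elapsed time: 0.2791945648193359, bf16 elapsed time: 0.3701180648803711
(8192, 4608, 1152): elapsed time: 0.2982643127441406, bf16 elapsed time: 0.36193439483642575
_int8wo_api
(2, 1152, 256): elapsed time: 0.31670495986938474, bf16 elapsed time: 0.35319679260253906
"""

for api, shape, quant, bf16, ratio in parse_benchmark(sample):
    print(f"{api} {shape}: bf16/quantized = {ratio:.2f}")
```

A ratio above 1 means the quantized path took less time than bf16 for that shape, and a ratio below 1 means it took more.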

jerryzh168 / gist:08d0732128cef0b85d43168091d7e675

Created August 14, 2024 02:20

	_int8da_int8w_api
	(2, 1152, 256): elapsed time: 0.3003932762145996, bf16 elapsed time: 0.21411584854125976
	(2, 1152, 1152): elapsed time: 0.3486528015136719, bf16 elapsed time: 0.22110015869140626
	(2, 6912, 1152): elapsed time: 0.2898115158081055, bf16 elapsed time: 0.20723712921142579
	(600, 1152, 4096): elapsed time: 0.27465951919555665, bf16 elapsed time: 0.2206876754760742
	(8192, 1152, 1152): elapsed time: 0.3263916778564453, bf16 elapsed time: 0.2734268760681152
	(8192, 4608, 1152): elapsed time: 0.6204579162597657, bf16 elapsed time: 0.44056800842285154
	(8192, 32, 1152): elapsed time: 0.3365078353881836, bf16 elapsed time: 0.3866156768798828
	_int8wo_api
	(2, 1152, 256): elapsed time: 0.3044166374206543, bf16 elapsed time: 0.21549856185913085

Jerry Zhang jerryzh168