Python function common dtype

Ops to test on the Python side

If nothing is specified, all argument combinations should be considered.
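For illustration only (not from the original list), a minimal sketch of what "all argument combinations" can look like for one of these ops, here gather: the plain call plus the out= variant with a mismatching dtype. The op, shapes and dtypes are arbitrary choices.

```python
import torch

# Hypothetical check: run one op (gather here) with inputs of one dtype and an
# out= tensor of another, and record whether the kernel promotes or rejects.
src = torch.randn(3, 4, dtype=torch.float64)
index = torch.tensor([[0, 1, 2, 0]])

print(torch.gather(src, 0, index).dtype)        # plain call -> float64

out = torch.empty(1, 4, dtype=torch.float32)    # out= with a different dtype
try:
    torch.gather(src, 0, index, out=out)
    print(out.dtype)
except RuntimeError as err:
    print("rejected:", err)
```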

CPU and GPU

  • copy_ no_sparse && no_quantize && self!=source && not_copy_transpose
  • gather
  • gather(out=)
  • scatter_(Tensor)
  • scatter(Tensor)
  • scatter_(value)
  • scatter(value)
  • index_put_ all calls
  • index
  • index(out=)
  • where
  • logical_not_ CPU + GPU
  • logical_not CPU + GPU
  • logical_not(out=) CPU + GPU
  • backward of unfold with step >= size
  • backward of unfold with step < size
  • sum / sum(out=)
  • prod / prod(out=)
  • mean / mean(out=)
  • norm / norm(out=) Except without dim and sparse.
  • all / all(out=)
  • any / any(out=)
  • min_values
  • max_values
  • argmax / argmax(out=)
  • argmin / argmin(out=)
  • var / var(out=)
  • std / std(out=)
  • var_mean / var_mean(out=)
  • std_mean / std_mean(out=)

CPU only

  • cumsum normal + dimname
  • cumsum(out=) normal + dimname
  • cumprod normal + dimname
  • cumprod(out=) normal + dimname
  • scatter_add_(Tensor)
  • scatter_add(Tensor)
  • _min
  • _min(out=)
  • min
  • min(out=)
  • _max
  • _max(out=)
  • max
  • max(out=)
  • index_select
  • index_select(out=)
  • masked_fill_(Tensor)
  • masked_fill_(value)
  • masked_select / masked_select(out=) byte mask
  • masked_select / masked_select(out=) boolean mask

GPU only

  • _fused_dropout_backward (backward of _fused_dropout used for dropout when train + cuda + 0<p<1 + numel!=0 + inplace=false)
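A minimal sketch (assumed, not from the gist) of a call that satisfies all of the listed conditions and so, on builds where dropout dispatches to _fused_dropout, exercises its backward:

```python
import torch
import torch.nn.functional as F

# Hypothetical repro: training-mode dropout on a non-empty CUDA tensor with
# 0 < p < 1 and inplace=False, followed by a backward pass.
if torch.cuda.is_available():
    x = torch.randn(16, 16, device="cuda", requires_grad=True)
    y = F.dropout(x, p=0.5, training=True, inplace=False)
    y.sum().backward()           # the dropout backward kernel runs here
    print(x.grad.dtype)
```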

Quantized

  • int_repr
  • Created as output of quantized ops like qadd / qmul / qconv
  • quant/dequant in most quantization ops
  • More quant for "fake_*" functions
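A minimal sketch (assumed) touching the quantized entries above: per-tensor quantization, int_repr, and dequantize.

```python
import torch

# Hypothetical example: quantize a float tensor, inspect its integer
# representation, then dequantize it back to float.
x = torch.randn(2, 3)
q = torch.quantize_per_tensor(x, scale=0.1, zero_point=0, dtype=torch.quint8)
print(q.int_repr().dtype)    # torch.uint8
print(q.dequantize().dtype)  # torch.float32
```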

Raw data:

dont_compute_common_dtype

copy_impl

  • copy_
  • copy_ no_sparse + no_quantize + self!=source + not_copy_transpose

cpu_cum_base_kernel

  • cumsum_cpu_kernel
  • _cumsum_cpu
  • native: _cumsum if CPU
  • _cumsum_out_cpu
  • native: _cumsum.out if CPU
  • cumprod_cpu_kernel
  • same as above
  • cumsum CPU + dimname
  • cumsum(out=) CPU + dimname
  • cumprod CPU + dimname
  • cumprod(out=) CPU + dimname

cpu_scatter_gather_base_kernel

  • gather_cpu_kernel
  • scatter_cpu_kernel
  • scatter_fill_cpu_kernel
  • scatter_add_cpu_kernel

cuda_scatter_gather_base_kernel

  • gather_cuda_kernel
  • scatter_cuda_kernel

cuda_scatter_fill_base_kernel

  • scatter_fill_cuda_kernel
  • gather CPU + GPU
  • gather(out=) CPU + GPU
  • scatter_(Tensor) CPU + GPU
  • scatter(Tensor) CPU + GPU
  • scatter_(value) CPU + GPU
  • scatter(value) CPU + GPU
  • scatter_add_(Tensor) CPU
  • scatter_add(Tensor) CPU

compare_base_kernel

  • min_kernel_impl
  • _min_out_cpu + _min(out=)
  • _min_cpu + _min
  • max_kernel_impl
  • _min CPU
  • _min(out=) CPU
  • min CPU
  • min(out=) CPU
  • _max CPU
  • _max(out=) CPU
  • max CPU
  • max(out=) CPU

masked_scale_kernel

  • masked_scale_cuda
  • native: _masked_scale
  • _fused_dropout_backward (backward of _fused_dropout used for dropout when train + cuda + 0<p<1 + numel!=0 + inplace=false)

index_put_impl

  • index_put_
  • index_put_ all calls

index index_out

  • index
  • index(out=)

index_select_out_cpu_ index_select_cpu_

  • index_select CPU
  • index_select(out=) CPU

masked_fill_impl_cpu

  • masked_fill__cpu
  • masked_fill_(Tensor) CPU
  • masked_fill_(value) CPU

_s_where

  • where
  • where CPU + GPU

logical_not_out

  • logical_not
  • logical_not_
  • logical_not_ CPU + GPU
  • logical_not CPU + GPU
  • logical_not(out=) CPU + GPU

_make_unfold_backward_iter_over_grad_out

  • unfold_backward_cpu_kernel
  • unfold_backward_cuda_kernel
  • backward of unfold with step >= size

_make_unfold_backward_iter_over_grad_in

  • unfold_backward_cpu_kernel
  • unfold_backward_cuda_kernel
  • backward of unfold with step < size
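A minimal sketch (assumed) of the two unfold backward paths described in the two entries above: non-overlapping windows (step >= size) versus overlapping windows (step < size).

```python
import torch

# Hypothetical example hitting both unfold backward cases.
x = torch.randn(10, requires_grad=True)

x.unfold(0, 2, 3).sum().backward()   # step (3) >= size (2): iter_over_grad_out path
x.grad = None
x.unfold(0, 3, 2).sum().backward()   # step (2) <  size (3): iter_over_grad_in path
```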

int_repr_quant_cpu int_repr_quant_cuda

  • int_repr CPU + GPU

quantize_tensor_per_tensor_affine_cuda

  • quantize_tensor_per_tensor_affine GPU
  • PerTensorAffineQuantizer::quantize()

dequantize_tensor_per_tensor_affine_cuda

  • dequantize_tensor_per_tensor_affine GPU
  • PerTensorAffineQuantizer::dequantize()
  • Created as output of quantized ops like qadd / qmul / qconv
  • quant/dequant in most quantization ops

fake_quantize_tensor_kernel_cuda fake_quantize_grad_tensor_kernel_cuda make_per_tensor_quantized_tensor_cuda fake_quantize_per_channel_affine fake_quantize_per_channel_affine_backward

  • More quant for "fake_*" functions

compute_common_dtype_only_for_inputs

comparison_op

reduce_op

  • make_reduction
  • sum / sum(out=)
  • prod / prod(out=)
  • mean / mean(out=)
  • norm / norm(out=) Except without dim and sparse.
  • all / all(out=)
  • any / any(out=)
  • min_values
  • max_values
  • argmax / argmax(out=)
  • argmin / argmin(out=)
  • var / var(out=) with dim provided
  • std / std(out=) with dim provided
  • var_mean / var_mean(out=)
  • std_mean / std_mean(out=)
  • two_pass_reduction
  • parallel_reduce if output.nelement() == 1
  • prelu backward with weight.nelement() == 1
  • binary_kernel_reduce_vec
    • sum_kernel_impl CPU
    • prod_kernel_impl CPU
    • and_kernel_impl CPU
    • or_kernel_impl CPU
    • min_values_kernel_impl CPU
    • max_values_kernel_impl CPU
  • min_all_kernel_impl if not bool + min() Without dim provided CPU
  • max_all_kernel_impl if not bool + max() Without dim provided CPU
  • TH_TENSOR_APPLY_REDUCTION_SUM_PARALLEL
    • THTensor_(sumall)() (NOT THE THC VERSION!)
    • THTensor_(maskedSelect)() -> _th_masked_select
      • masked_select / masked_select(out=) byte mask CPU
    • THTensor_(maskedSelectBool)() -> _th_masked_select_bool
      • masked_select / masked_select(out=) boolean mask CPU
    • THTensor_(meanall)() -> Not bound to TH
    • var_all -> _th_var
      • var no dim
    • std_all -> _th_std
      • std no dim
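As a rough illustration (assumed, not from the gist) of what compute_common_dtype_only_for_inputs means for the reductions above: the inputs are promoted for the computation, while an out= tensor may carry a dtype different from the inputs.

```python
import torch

# Hypothetical example: int8 input is promoted for the reduction, and an out=
# tensor whose dtype differs from the input may be accepted.
x = torch.ones(4, 4, dtype=torch.int8)
print(torch.sum(x).dtype)                 # integer inputs accumulate to int64

out = torch.empty(4, dtype=torch.float64)
try:
    torch.sum(x, dim=0, out=out)          # out dtype differs from the input
    print(out.dtype)
except RuntimeError as err:
    print("rejected:", err)
```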