David Berard davidberard98

  • PyTorch
  • Menlo Park, CA
test__softmax_function (__main__.TestCudaFuser) ... ok
test__softmax_function_half_to_float (__main__.TestCudaFuser) ... ok
test_addcmul_ops (__main__.TestCudaFuser) ... ok
test_alias_pass_fix (__main__.TestCudaFuser) ... ERROR
test_autocast_1 (__main__.TestCudaFuser) ... skipped 'Failing windows test - see 73620'
test_autocast_1_bfloat (__main__.TestCudaFuser) ... skipped 'device does not support BFloat16'
test_autocast_2 (__main__.TestCudaFuser) ... skipped 'Failing windows test - see 73620'
test_autocast_2_bfloat (__main__.TestCudaFuser) ... skipped 'device does not support BFloat16'
test_backward_type (__main__.TestCudaFuser) ... ERROR
test_batch_norm_half (__main__.TestCudaFuser) ... C:\actions-runner\_work\pytorch\pytorch\build\win_tmp\build\torch\nn\modules\module.py:1383: UserWarning: positional arguments and argument "destination" are deprecated. nn.Module.state_dict will not accept them in the future. Refer to https://pytorch.org/docs/master/generated/torch.nn.Module.html#torch.nn.Module.st
$ gpui python3 test/test_jit_fuser_te.py -k test_binary_tensor_scalar_ops
EE
======================================================================
ERROR: test_binary_tensor_scalar_ops (__main__.TestTEFuserDynamic)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/fsx/users/dberard/pytorch/torch/testing/_comparison.py", line 981, in originate_pairs
return [pair_type(actual, expected, id=id, **options)]
File "/fsx/users/dberard/pytorch/torch/testing/_internal/common_utils.py", line 1520, in __init__
super().__init__(actual, expected, check_dtype=False, **other_parameters)
$ PYTORCH_JIT_LOG_LEVEL=">>graph_fuser" LTC_TS_CUDA=1 gpui python3 check_lazy.py --check_model resnet18 --test eval --output_file results_ltc/resnet18_eval &> lazy_output.txt
srun: error: ioctl(TIOCGWINSZ): Inappropriate ioctl for device
srun: error: Not using a pseudo-terminal, disregarding --pty option
[DUMP graph_fuser.cpp:2249] Before Fusion:
[DUMP graph_fuser.cpp:2249] graph(%p0 : Tensor):
[DUMP graph_fuser.cpp:2249] return (%p0)
[DEBUG graph_fuser.cpp:2256] insert conditional constant from profile_ivalue: graph(%p0 : Tensor):
[DEBUG graph_fuser.cpp:2256] return (%p0)
[DEBUG graph_fuser.cpp:2261] After Profiling Nodes Removed: graph(%p0 : Tensor):
[DEBUG graph_fuser.cpp:2261] return (%p0)
$ python ../../../test/test_jit_cuda_fuser.py -v
test__softmax_function (__main__.TestCudaFuser) ... ok
test__softmax_function_half_to_float (__main__.TestCudaFuser) ... ok
test_addcmul_ops (__main__.TestCudaFuser) ... ok
test_alias_pass_fix (__main__.TestCudaFuser) ... ERROR
test_autocast_1 (__main__.TestCudaFuser) ... ok
test_autocast_1_bfloat (__main__.TestCudaFuser) ... skipped 'device does not support BFloat16'
test_autocast_2 (__main__.TestCudaFuser) ... ok
test_autocast_2_bfloat (__main__.TestCudaFuser) ... skipped 'device does not support BFloat16'
test_backward_type (__main__.TestCudaFuser) ... ok
import torch
from torchvision.models import regnet_y_128gf


def run(model, iters: int = 20, bs: int = 64, device="cuda") -> None:
    print("Warm up ...")
    with torch.no_grad():
        for i in range(5):
            model(torch.rand(bs, 3, 224, 224, device=device))
    print("Start benchmarking...")
import torch
import torch.utils.jit.log_extract as log_extract
ir = """graph(%0 : Double(204, 204, 26, strides=[5304, 26, 1], requires_grad=0, device=cuda:0),
%1 : Double(204, 204, 26, strides=[5304, 26, 1], requires_grad=0, device=cuda:0),
%2 : Double(204, 204, 26, strides=[5304, 26, 1], requires_grad=0, device=cuda:0),
%3 : Double(204, 204, 26, strides=[5304, 26, 1], requires_grad=0, device=cuda:0),
%4 : Double(204, 204, 26, strides=[5304, 26, 1], requires_grad=0, device=cuda:0),
%5 : Double(204, 204, 26, strides=[5304, 26, 1], requires_grad=0, device=cuda:0),
%6 : Double(requires_grad=0, device=cuda:0),
@davidberard98
davidberard98 / nvfuser-microbenchmarks-apr19.csv
Last active April 19, 2022 21:01
nvfuser microbenchmark results from apr19
name,eager (ms),nnc static (ms),nnc dynamic (ms),nvfuser
autogen-0,0.290,0.281,0.283,0.279
autogen-1,0.177,0.176,0.176,0.175
autogen-2,0.489,0.464,0.491,0.276
autogen-3,4.090,0.875,0.919,1.002
batchnorm-silu,0.289,0.285,0.285,0.221
autogen-4,0.372,0.372,0.368,0.431
autogen-5,0.599,0.597,0.619,0.313
autogen-6,5.152,1.212,1.169,1.384
autogen-7,0.185,0.185,0.183,0.184
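The raw timings are easier to compare as speedups over eager mode. The following is a minimal sketch, not part of the original gist: it assumes the nvfuser column is also in milliseconds (its header carries no unit), and the shortened column names and the helper `speedups_over_eager` are hypothetical names chosen for illustration.

```python
import csv
import io

# Rows copied from the benchmark preview. All times are assumed to be in
# milliseconds, including the nvfuser column, whose header has no unit.
BENCHMARKS = """\
name,eager,nnc_static,nnc_dynamic,nvfuser
autogen-0,0.290,0.281,0.283,0.279
autogen-1,0.177,0.176,0.176,0.175
autogen-2,0.489,0.464,0.491,0.276
autogen-3,4.090,0.875,0.919,1.002
batchnorm-silu,0.289,0.285,0.285,0.221
autogen-4,0.372,0.372,0.368,0.431
autogen-5,0.599,0.597,0.619,0.313
autogen-6,5.152,1.212,1.169,1.384
autogen-7,0.185,0.185,0.183,0.184
"""

BACKENDS = ("nnc_static", "nnc_dynamic", "nvfuser")


def speedups_over_eager(csv_text):
    """Return {benchmark name: {backend: eager time / backend time}}."""
    result = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        eager = float(row["eager"])
        result[row["name"]] = {b: eager / float(row[b]) for b in BACKENDS}
    return result


if __name__ == "__main__":
    for name, ratios in speedups_over_eager(BENCHMARKS).items():
        cells = "  ".join(f"{b}={ratios[b]:.2f}x" for b in BACKENDS)
        print(f"{name:<16}{cells}")
```

A ratio above 1 means the fuser ran faster than eager mode on that benchmark, and a ratio below 1 means it ran slower.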
srun: error: ioctl(TIOCGWINSZ): Inappropriate ioctl for device
srun: error: Not using a pseudo-terminal, disregarding --pty option
[DUMP graph_fuser.cpp:2323] Before Fusion:
[DUMP graph_fuser.cpp:2323] graph(%t1.1 : Tensor,
[DUMP graph_fuser.cpp:2323] %t2.1 : Tensor,
[DUMP graph_fuser.cpp:2323] %t3.1 : Tensor,
[DUMP graph_fuser.cpp:2323] %t4.1 : Tensor,
[DUMP graph_fuser.cpp:2323] %i1.1 : int,
[DUMP graph_fuser.cpp:2323] %i2.1 : int):
[DUMP graph_fuser.cpp:2323] %9 : int = prim::Constant[value=-1]() # /fsx/users/dberard/pytorch/33-repro.py:8:28
davidberard98 / extremal.txt
Last active April 28, 2022 21:02
extremal nvfuser opinfo failures - logs
2022-04-28T18:21:31.8743911Z ======================================================================
2022-04-28T18:21:31.8745075Z FAIL [0.152s]: test_nvfuser_extremal_values_nn_functional_binary_cross_entropy_with_logits_cuda_bfloat16 (__main__.TestCudaFuserOpInfoCUDA)
2022-04-28T18:21:31.8746363Z ----------------------------------------------------------------------
2022-04-28T18:21:31.8746833Z Traceback (most recent call last):
2022-04-28T18:21:31.8747861Z File "/opt/conda/lib/python3.9/site-packages/torch/testing/_internal/common_utils.py", line 1796, in wrapper
2022-04-28T18:21:31.8748515Z method(*args, **kwargs)
2022-04-28T18:21:31.8749414Z File "/opt/conda/lib/python3.9/site-packages/torch/testing/_internal/common_utils.py", line 1796, in wrapper
2022-04-28T18:21:31.8749935Z method(*args, **kwargs)
2022-04-28T18:21:31.8750485Z File "/opt/conda/lib/python3.9/site-packages/torch/testing/_internal/common_device_type.py", line 376, in instantiated_test
2022-04-28T18:21:31.8751089Z result =
davidberard98 / backup.sh
Last active April 27, 2022 03:21
scratch tools
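# Every 5 minutes, archive the pytorch checkout under /scratch with tar + lz4
# and move the archive into the home directory.
while true
do
    sleep 300
    tar cf - --directory /scratch/$USER/local pytorch | lz4 - -f /scratch/$USER/local-pytorch.lz4 -q
    mv /scratch/$USER/local-pytorch.lz4 /data/home/$USER/scratch_tools/
done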