@davidberard98
Created March 11, 2025 02:00
========= COMPUTE-SANITIZER
/home/dberard/local/triton-env2/pytorch/torch/backends/cudnn/__init__.py:108: UserWarning: PyTorch was compiled without cuDNN/MIOpen support. To use cuDNN/MIOpen, rebuild PyTorch making sure the library is visible to the build system.
warnings.warn(
test_triton_kernel_tma_descriptor_1d_dynamic_False_cuda (__main__.AOTInductorTestABICompatibleGpu) ... /home/dberard/local/triton-env2/pytorch/torch/backends/mkldnn/__init__.py:78: UserWarning: TF32 acceleration on top of oneDNN is available for Intel GPUs. The current Torch version does not have Intel GPU Support. (Triggered internally at /home/dberard/local/triton-env2/pytorch/aten/src/ATen/Context.cpp:148.)
torch._C._set_onednn_allow_tf32(_allow_tf32)
W0310 18:32:06.819000 1957237 torch/_export/__init__.py:67] +============================+
W0310 18:32:06.820000 1957237 torch/_export/__init__.py:68] | !!! WARNING !!! |
W0310 18:32:06.820000 1957237 torch/_export/__init__.py:69] +============================+
W0310 18:32:06.820000 1957237 torch/_export/__init__.py:70] torch._export.aot_compile()/torch._export.aot_load() is being deprecated, please switch to directly calling torch._inductor.aoti_compile_and_package(torch.export.export())/torch._inductor.aoti_load_package() instead.
========= Warp illegal address
========= at add_kernel_with_tma_1d+0x670 in /tmp/torchinductor_dberard/3w/c3wd64nniniqpvwxdltbjqy6ol2xsxnrdguqw3p4ebs43yodayte.py:31
========= by thread (0,0,0) in block (0,0,0)
========= Saved host backtrace up to driver entry point at kernel launch time
========= Host Frame: [0x2f16cf]
========= in /lib64/libcuda.so.1
========= Host Frame:launchKernel(CUfunc_st*, unsigned int, unsigned int, unsigned int, unsigned int, unsigned int, void**, CUstream_st*) [0x182d6]
========= in /tmp/torchinductor_dberard/cnaqnxfbucoh67qyjikbptjfxgj4aynyndekhhk4aiw2genv4htq/cnkzhrx5apdu2es5s7d6mhqnd6gb6e3qn7kthhesaolvkt7gyw75.wrapper.so
========= Host Frame:torch::aot_inductor::AOTInductorModel::run_impl(AtenTensorOpaque**, AtenTensorOpaque**, CUstream_st*, AOTIProxyExecutorOpaque*) [0x19d80]
========= in /tmp/torchinductor_dberard/cnaqnxfbucoh67qyjikbptjfxgj4aynyndekhhk4aiw2genv4htq/cnkzhrx5apdu2es5s7d6mhqnd6gb6e3qn7kthhesaolvkt7gyw75.wrapper.so
========= Host Frame:torch::aot_inductor::AOTInductorModelBase<torch::aot_inductor::AOTInductorModel>::run(AtenTensorOpaque**, AtenTensorOpaque**, CUstream_st*, AOTIProxyExecutorOpaque*) [0x1eb35]
========= in /tmp/torchinductor_dberard/cnaqnxfbucoh67qyjikbptjfxgj4aynyndekhhk4aiw2genv4htq/cnkzhrx5apdu2es5s7d6mhqnd6gb6e3qn7kthhesaolvkt7gyw75.wrapper.so
========= Host Frame:torch::aot_inductor::AOTInductorModelContainer::run(AtenTensorOpaque**, AtenTensorOpaque**, CUstream_st*, AOTIProxyExecutorOpaque*) [0x26c59]
========= in /tmp/torchinductor_dberard/cnaqnxfbucoh67qyjikbptjfxgj4aynyndekhhk4aiw2genv4htq/cnkzhrx5apdu2es5s7d6mhqnd6gb6e3qn7kthhesaolvkt7gyw75.wrapper.so
========= Host Frame:AOTInductorModelContainerRun [0x1ad8c]
========= in /tmp/torchinductor_dberard/cnaqnxfbucoh67qyjikbptjfxgj4aynyndekhhk4aiw2genv4htq/cnkzhrx5apdu2es5s7d6mhqnd6gb6e3qn7kthhesaolvkt7gyw75.wrapper.so
========= Host Frame:torch::inductor::AOTIModelContainerRunner::run_impl(std::vector<AtenTensorOpaque*, std::allocator<AtenTensorOpaque*> >&, void*) [0x4f99ee5]
========= in /home/dberard/local/triton-env2/pytorch/torch/lib/libtorch_cpu.so
========= Host Frame:torch::inductor::AOTIModelContainerRunnerCuda::run_impl(std::vector<AtenTensorOpaque*, std::allocator<AtenTensorOpaque*> >&, void*) [0x3a4b32d]
========= in /home/dberard/local/triton-env2/pytorch/torch/lib/libtorch_cuda.so
========= Host Frame:torch::inductor::AOTIModelContainerRunner::run(std::vector<at::Tensor, std::allocator<at::Tensor> > const&, void*) [0x4f99ca4]
========= in /home/dberard/local/triton-env2/pytorch/torch/lib/libtorch_cpu.so
========= Host Frame:pybind11::cpp_function::initialize<pybind11::cpp_function::initialize<std::vector<at::Tensor, std::allocator<at::Tensor> >, torch::inductor::AOTIModelContainerRunnerCuda, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, void*, pybind11::name, pybind11::is_method, pybind11::sibling, pybind11::arg, pybind11::arg_v>(std::vector<at::Tensor, std::allocator<at::Tensor> > (torch::inductor::AOTIModelContainerRunnerCuda::*)(std::vector<at::Tensor, std::allocator<at::Tensor> > const&, void*), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, pybind11::arg const&, pybind11::arg_v const&)::{lambda(torch::inductor::AOTIModelContainerRunnerCuda*, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, void*)#1}, std::vector<at::Tensor, std::allocator<at::Tensor> >, torch::inductor::AOTIModelContainerRunnerCuda*, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, void*, pybind11::name, pybind11::is_method, pybind11::sibling, pybind11::arg, pybind11::arg_v>(pybind11::cpp_function::initialize<std::vector<at::Tensor, std::allocator<at::Tensor> >, torch::inductor::AOTIModelContainerRunnerCuda, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, void*, pybind11::name, pybind11::is_method, pybind11::sibling, pybind11::arg, pybind11::arg_v>(std::vector<at::Tensor, std::allocator<at::Tensor> > (torch::inductor::AOTIModelContainerRunnerCuda::*)(std::vector<at::Tensor, std::allocator<at::Tensor> > const&, void*), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, pybind11::arg const&, pybind11::arg_v const&)::{lambda(torch::inductor::AOTIModelContainerRunnerCuda*, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, void*)#1}&&, std::vector<at::Tensor, std::allocator<at::Tensor> > (*)(torch::inductor::AOTIModelContainerRunnerCuda*, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, void*), pybind11::name const&, pybind11::is_method const&, pybind11::sibling 
const&, pybind11::arg const&, pybind11::arg_v const&)::{lambda(pybind11::detail::function_call&)#3}::_FUN(pybind11::detail::function_call) [0x80f235]
========= in /home/dberard/local/triton-env2/pytorch/torch/lib/libtorch_python.so
========= Host Frame:pybind11::cpp_function::dispatcher(_object*, _object*, _object*) [0x38803d]
========= in /home/dberard/local/triton-env2/pytorch/torch/lib/libtorch_python.so
========= Host Frame:cfunction_call in /usr/local/src/conda/python-3.10.16/Objects/methodobject.c:543 [0xfdcf6]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_MakeTpCall in /usr/local/src/conda/python-3.10.16/Objects/call.c:215 [0xf747a]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:method_vectorcall in /usr/local/src/conda/python-3.10.16/Objects/classobject.c:53 [0x109d6e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4181 [0xf2c55]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4277 [0xf0ca8]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4181 [0xf2c55]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:method_vectorcall in /usr/local/src/conda/python-3.10.16/Objects/classobject.c:53 [0x109a7d]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4231 [0xef4e2]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:PyObject_Call in /usr/local/src/conda/python-3.10.16/Objects/call.c:317 [0x10a5a7]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4277 [0xf0ca8]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4213 [0xee44e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:method_vectorcall in /usr/local/src/conda/python-3.10.16/Objects/classobject.c:61 [0x109d06]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4277 [0xf0ca8]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:method_vectorcall in /usr/local/src/conda/python-3.10.16/Objects/classobject.c:53 [0x109a7d]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4213 [0xee44e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4198 [0xee860]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:method_vectorcall in /usr/local/src/conda/python-3.10.16/Objects/classobject.c:53 [0x109a7d]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4231 [0xef4e2]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:method_vectorcall in /usr/local/src/conda/python-3.10.16/Objects/classobject.c:53 [0x109a7d]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4231 [0xef4e2]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:method_vectorcall in /usr/local/src/conda/python-3.10.16/Objects/classobject.c:83 [0x109bd5]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4277 [0xf0ca8]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_FastCallDictTstate in /usr/local/src/conda/python-3.10.16/Objects/call.c:142 [0xf67cc]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_Call_Prepend in /usr/local/src/conda/python-3.10.16/Objects/call.c:431 [0x107f65]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:slot_tp_call in /usr/local/src/conda/python-3.10.16/Objects/typeobject.c:7494 [0x1d0152]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_MakeTpCall in /usr/local/src/conda/python-3.10.16/Objects/call.c:215 [0xf747a]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4213 [0xf2f0d]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:method_vectorcall in /usr/local/src/conda/python-3.10.16/Objects/classobject.c:83 [0x109bd5]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4277 [0xf0ca8]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_FastCallDictTstate in /usr/local/src/conda/python-3.10.16/Objects/call.c:142 [0xf67cc]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_Call_Prepend in /usr/local/src/conda/python-3.10.16/Objects/call.c:431 [0x107f65]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:slot_tp_call in /usr/local/src/conda/python-3.10.16/Objects/typeobject.c:7494 [0x1d0152]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_MakeTpCall in /usr/local/src/conda/python-3.10.16/Objects/call.c:215 [0xf747a]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4213 [0xf2f0d]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:method_vectorcall in /usr/local/src/conda/python-3.10.16/Objects/classobject.c:83 [0x109bd5]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4277 [0xf0ca8]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_FastCallDictTstate in /usr/local/src/conda/python-3.10.16/Objects/call.c:142 [0xf67cc]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_Call_Prepend in /usr/local/src/conda/python-3.10.16/Objects/call.c:431 [0x107f65]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:slot_tp_call in /usr/local/src/conda/python-3.10.16/Objects/typeobject.c:7494 [0x1d0152]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_MakeTpCall in /usr/local/src/conda/python-3.10.16/Objects/call.c:215 [0xf747a]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4213 [0xf2f0d]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4198 [0xee860]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4198 [0xee860]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_FastCallDictTstate in /usr/local/src/conda/python-3.10.16/Objects/call.c:153 [0xf687c]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:slot_tp_init in /usr/local/src/conda/python-3.10.16/Objects/typeobject.c:7734 [0x1075b7]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_MakeTpCall in /usr/local/src/conda/python-3.10.16/Objects/call.c:215 [0xf74ca]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4231 [0xf3801]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4213 [0xee44e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4213 [0xee44e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4231 [0xef4e2]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_Vector in /usr/local/src/conda/python-3.10.16/Python/ceval.c:5067 [0x1953a1]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:PyEval_EvalCode in /usr/local/src/conda/python-3.10.16/Python/ceval.c:1134 [0x1952e6]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:run_eval_code_obj in /usr/local/src/conda/python-3.10.16/Python/pythonrun.c:1291 [0x1c6736]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:run_mod in /usr/local/src/conda/python-3.10.16/Python/pythonrun.c:1312 [0x1c186f]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:pyrun_file.cold in /usr/local/src/conda/python-3.10.16/Python/pythonrun.c:1208 [0x59838]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyRun_SimpleFileObject in /usr/local/src/conda/python-3.10.16/Python/pythonrun.c:456 [0x1bbdfe]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyRun_AnyFileObject in /usr/local/src/conda/python-3.10.16/Python/pythonrun.c:90 [0x1bbb62]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:Py_RunMain in /usr/local/src/conda/python-3.10.16/Modules/main.c:674 [0x1b891c]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:Py_BytesMain in /usr/local/src/conda/python-3.10.16/Modules/main.c:1094 [0x1885d8]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:__libc_start_call_main [0x295cf]
========= in /lib64/libc.so.6
========= Host Frame:__libc_start_main [0x2967f]
========= in /lib64/libc.so.6
========= Host Frame: [0x18848d]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
=========
========= Warp illegal address
========= at add_kernel_with_tma_1d+0x670 in /tmp/torchinductor_dberard/3w/c3wd64nniniqpvwxdltbjqy6ol2xsxnrdguqw3p4ebs43yodayte.py:31
========= by thread (0,0,0) in block (1,0,0)
========= Saved host backtrace up to driver entry point at kernel launch time
========= Host Frame: [0x2f16cf]
========= in /lib64/libcuda.so.1
========= Host Frame:launchKernel(CUfunc_st*, unsigned int, unsigned int, unsigned int, unsigned int, unsigned int, void**, CUstream_st*) [0x182d6]
========= in /tmp/torchinductor_dberard/cnaqnxfbucoh67qyjikbptjfxgj4aynyndekhhk4aiw2genv4htq/cnkzhrx5apdu2es5s7d6mhqnd6gb6e3qn7kthhesaolvkt7gyw75.wrapper.so
========= Host Frame:torch::aot_inductor::AOTInductorModel::run_impl(AtenTensorOpaque**, AtenTensorOpaque**, CUstream_st*, AOTIProxyExecutorOpaque*) [0x19d80]
========= in /tmp/torchinductor_dberard/cnaqnxfbucoh67qyjikbptjfxgj4aynyndekhhk4aiw2genv4htq/cnkzhrx5apdu2es5s7d6mhqnd6gb6e3qn7kthhesaolvkt7gyw75.wrapper.so
========= Host Frame:torch::aot_inductor::AOTInductorModelBase<torch::aot_inductor::AOTInductorModel>::run(AtenTensorOpaque**, AtenTensorOpaque**, CUstream_st*, AOTIProxyExecutorOpaque*) [0x1eb35]
========= in /tmp/torchinductor_dberard/cnaqnxfbucoh67qyjikbptjfxgj4aynyndekhhk4aiw2genv4htq/cnkzhrx5apdu2es5s7d6mhqnd6gb6e3qn7kthhesaolvkt7gyw75.wrapper.so
========= Host Frame:torch::aot_inductor::AOTInductorModelContainer::run(AtenTensorOpaque**, AtenTensorOpaque**, CUstream_st*, AOTIProxyExecutorOpaque*) [0x26c59]
========= in /tmp/torchinductor_dberard/cnaqnxfbucoh67qyjikbptjfxgj4aynyndekhhk4aiw2genv4htq/cnkzhrx5apdu2es5s7d6mhqnd6gb6e3qn7kthhesaolvkt7gyw75.wrapper.so
========= Host Frame:AOTInductorModelContainerRun [0x1ad8c]
========= in /tmp/torchinductor_dberard/cnaqnxfbucoh67qyjikbptjfxgj4aynyndekhhk4aiw2genv4htq/cnkzhrx5apdu2es5s7d6mhqnd6gb6e3qn7kthhesaolvkt7gyw75.wrapper.so
========= Host Frame:torch::inductor::AOTIModelContainerRunner::run_impl(std::vector<AtenTensorOpaque*, std::allocator<AtenTensorOpaque*> >&, void*) [0x4f99ee5]
========= in /home/dberard/local/triton-env2/pytorch/torch/lib/libtorch_cpu.so
========= Host Frame:torch::inductor::AOTIModelContainerRunnerCuda::run_impl(std::vector<AtenTensorOpaque*, std::allocator<AtenTensorOpaque*> >&, void*) [0x3a4b32d]
========= in /home/dberard/local/triton-env2/pytorch/torch/lib/libtorch_cuda.so
========= Host Frame:torch::inductor::AOTIModelContainerRunner::run(std::vector<at::Tensor, std::allocator<at::Tensor> > const&, void*) [0x4f99ca4]
========= in /home/dberard/local/triton-env2/pytorch/torch/lib/libtorch_cpu.so
========= Host Frame:pybind11::cpp_function::initialize<pybind11::cpp_function::initialize<std::vector<at::Tensor, std::allocator<at::Tensor> >, torch::inductor::AOTIModelContainerRunnerCuda, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, void*, pybind11::name, pybind11::is_method, pybind11::sibling, pybind11::arg, pybind11::arg_v>(std::vector<at::Tensor, std::allocator<at::Tensor> > (torch::inductor::AOTIModelContainerRunnerCuda::*)(std::vector<at::Tensor, std::allocator<at::Tensor> > const&, void*), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, pybind11::arg const&, pybind11::arg_v const&)::{lambda(torch::inductor::AOTIModelContainerRunnerCuda*, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, void*)#1}, std::vector<at::Tensor, std::allocator<at::Tensor> >, torch::inductor::AOTIModelContainerRunnerCuda*, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, void*, pybind11::name, pybind11::is_method, pybind11::sibling, pybind11::arg, pybind11::arg_v>(pybind11::cpp_function::initialize<std::vector<at::Tensor, std::allocator<at::Tensor> >, torch::inductor::AOTIModelContainerRunnerCuda, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, void*, pybind11::name, pybind11::is_method, pybind11::sibling, pybind11::arg, pybind11::arg_v>(std::vector<at::Tensor, std::allocator<at::Tensor> > (torch::inductor::AOTIModelContainerRunnerCuda::*)(std::vector<at::Tensor, std::allocator<at::Tensor> > const&, void*), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, pybind11::arg const&, pybind11::arg_v const&)::{lambda(torch::inductor::AOTIModelContainerRunnerCuda*, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, void*)#1}&&, std::vector<at::Tensor, std::allocator<at::Tensor> > (*)(torch::inductor::AOTIModelContainerRunnerCuda*, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, void*), pybind11::name const&, pybind11::is_method const&, pybind11::sibling 
const&, pybind11::arg const&, pybind11::arg_v const&)::{lambda(pybind11::detail::function_call&)#3}::_FUN(pybind11::detail::function_call) [0x80f235]
========= in /home/dberard/local/triton-env2/pytorch/torch/lib/libtorch_python.so
========= Host Frame:pybind11::cpp_function::dispatcher(_object*, _object*, _object*) [0x38803d]
========= in /home/dberard/local/triton-env2/pytorch/torch/lib/libtorch_python.so
========= Host Frame:cfunction_call in /usr/local/src/conda/python-3.10.16/Objects/methodobject.c:543 [0xfdcf6]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_MakeTpCall in /usr/local/src/conda/python-3.10.16/Objects/call.c:215 [0xf747a]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:method_vectorcall in /usr/local/src/conda/python-3.10.16/Objects/classobject.c:53 [0x109d6e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4181 [0xf2c55]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4277 [0xf0ca8]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4181 [0xf2c55]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:method_vectorcall in /usr/local/src/conda/python-3.10.16/Objects/classobject.c:53 [0x109a7d]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4231 [0xef4e2]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:PyObject_Call in /usr/local/src/conda/python-3.10.16/Objects/call.c:317 [0x10a5a7]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4277 [0xf0ca8]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4213 [0xee44e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:method_vectorcall in /usr/local/src/conda/python-3.10.16/Objects/classobject.c:61 [0x109d06]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4277 [0xf0ca8]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:method_vectorcall in /usr/local/src/conda/python-3.10.16/Objects/classobject.c:53 [0x109a7d]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4213 [0xee44e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4198 [0xee860]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:method_vectorcall in /usr/local/src/conda/python-3.10.16/Objects/classobject.c:53 [0x109a7d]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4231 [0xef4e2]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:method_vectorcall in /usr/local/src/conda/python-3.10.16/Objects/classobject.c:53 [0x109a7d]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4231 [0xef4e2]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:method_vectorcall in /usr/local/src/conda/python-3.10.16/Objects/classobject.c:83 [0x109bd5]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4277 [0xf0ca8]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_FastCallDictTstate in /usr/local/src/conda/python-3.10.16/Objects/call.c:142 [0xf67cc]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_Call_Prepend in /usr/local/src/conda/python-3.10.16/Objects/call.c:431 [0x107f65]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:slot_tp_call in /usr/local/src/conda/python-3.10.16/Objects/typeobject.c:7494 [0x1d0152]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_MakeTpCall in /usr/local/src/conda/python-3.10.16/Objects/call.c:215 [0xf747a]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4213 [0xf2f0d]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:method_vectorcall in /usr/local/src/conda/python-3.10.16/Objects/classobject.c:83 [0x109bd5]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4277 [0xf0ca8]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_FastCallDictTstate in /usr/local/src/conda/python-3.10.16/Objects/call.c:142 [0xf67cc]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_Call_Prepend in /usr/local/src/conda/python-3.10.16/Objects/call.c:431 [0x107f65]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:slot_tp_call in /usr/local/src/conda/python-3.10.16/Objects/typeobject.c:7494 [0x1d0152]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_MakeTpCall in /usr/local/src/conda/python-3.10.16/Objects/call.c:215 [0xf747a]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4213 [0xf2f0d]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:method_vectorcall in /usr/local/src/conda/python-3.10.16/Objects/classobject.c:83 [0x109bd5]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4277 [0xf0ca8]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_FastCallDictTstate in /usr/local/src/conda/python-3.10.16/Objects/call.c:142 [0xf67cc]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_Call_Prepend in /usr/local/src/conda/python-3.10.16/Objects/call.c:431 [0x107f65]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:slot_tp_call in /usr/local/src/conda/python-3.10.16/Objects/typeobject.c:7494 [0x1d0152]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_MakeTpCall in /usr/local/src/conda/python-3.10.16/Objects/call.c:215 [0xf747a]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4213 [0xf2f0d]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4198 [0xee860]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4198 [0xee860]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_FastCallDictTstate in /usr/local/src/conda/python-3.10.16/Objects/call.c:153 [0xf687c]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:slot_tp_init in /usr/local/src/conda/python-3.10.16/Objects/typeobject.c:7734 [0x1075b7]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_MakeTpCall in /usr/local/src/conda/python-3.10.16/Objects/call.c:215 [0xf74ca]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4231 [0xf3801]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4213 [0xee44e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4213 [0xee44e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4231 [0xef4e2]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_Vector in /usr/local/src/conda/python-3.10.16/Python/ceval.c:5067 [0x1953a1]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:PyEval_EvalCode in /usr/local/src/conda/python-3.10.16/Python/ceval.c:1134 [0x1952e6]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:run_eval_code_obj in /usr/local/src/conda/python-3.10.16/Python/pythonrun.c:1291 [0x1c6736]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:run_mod in /usr/local/src/conda/python-3.10.16/Python/pythonrun.c:1312 [0x1c186f]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:pyrun_file.cold in /usr/local/src/conda/python-3.10.16/Python/pythonrun.c:1208 [0x59838]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyRun_SimpleFileObject in /usr/local/src/conda/python-3.10.16/Python/pythonrun.c:456 [0x1bbdfe]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyRun_AnyFileObject in /usr/local/src/conda/python-3.10.16/Python/pythonrun.c:90 [0x1bbb62]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:Py_RunMain in /usr/local/src/conda/python-3.10.16/Modules/main.c:674 [0x1b891c]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:Py_BytesMain in /usr/local/src/conda/python-3.10.16/Modules/main.c:1094 [0x1885d8]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:__libc_start_call_main [0x295cf]
========= in /lib64/libc.so.6
========= Host Frame:__libc_start_main [0x2967f]
========= in /lib64/libc.so.6
========= Host Frame: [0x18848d]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
=========
========= Program hit CUDA_ERROR_ILLEGAL_ADDRESS (error 700) due to "an illegal memory access was encountered" on CUDA API call to cuLaunchKernel.
========= Saved host backtrace up to driver entry point at error
========= Host Frame: [0x2f1526]
========= in /lib64/libcuda.so.1
========= Host Frame:launchKernel(CUfunc_st*, unsigned int, unsigned int, unsigned int, unsigned int, unsigned int, void**, CUstream_st*) [0x182d6]
========= in /tmp/torchinductor_dberard/cnaqnxfbucoh67qyjikbptjfxgj4aynyndekhhk4aiw2genv4htq/cnkzhrx5apdu2es5s7d6mhqnd6gb6e3qn7kthhesaolvkt7gyw75.wrapper.so
========= Host Frame:torch::aot_inductor::AOTInductorModel::run_impl(AtenTensorOpaque**, AtenTensorOpaque**, CUstream_st*, AOTIProxyExecutorOpaque*) [0x19d80]
========= in /tmp/torchinductor_dberard/cnaqnxfbucoh67qyjikbptjfxgj4aynyndekhhk4aiw2genv4htq/cnkzhrx5apdu2es5s7d6mhqnd6gb6e3qn7kthhesaolvkt7gyw75.wrapper.so
========= Host Frame:torch::aot_inductor::AOTInductorModelBase<torch::aot_inductor::AOTInductorModel>::run(AtenTensorOpaque**, AtenTensorOpaque**, CUstream_st*, AOTIProxyExecutorOpaque*) [0x1eb35]
========= in /tmp/torchinductor_dberard/cnaqnxfbucoh67qyjikbptjfxgj4aynyndekhhk4aiw2genv4htq/cnkzhrx5apdu2es5s7d6mhqnd6gb6e3qn7kthhesaolvkt7gyw75.wrapper.so
========= Host Frame:torch::aot_inductor::AOTInductorModelContainer::run(AtenTensorOpaque**, AtenTensorOpaque**, CUstream_st*, AOTIProxyExecutorOpaque*) [0x26c59]
========= in /tmp/torchinductor_dberard/cnaqnxfbucoh67qyjikbptjfxgj4aynyndekhhk4aiw2genv4htq/cnkzhrx5apdu2es5s7d6mhqnd6gb6e3qn7kthhesaolvkt7gyw75.wrapper.so
========= Host Frame:AOTInductorModelContainerRun [0x1ad8c]
========= in /tmp/torchinductor_dberard/cnaqnxfbucoh67qyjikbptjfxgj4aynyndekhhk4aiw2genv4htq/cnkzhrx5apdu2es5s7d6mhqnd6gb6e3qn7kthhesaolvkt7gyw75.wrapper.so
========= Host Frame:torch::inductor::AOTIModelContainerRunner::run_impl(std::vector<AtenTensorOpaque*, std::allocator<AtenTensorOpaque*> >&, void*) [0x4f99ee5]
========= in /home/dberard/local/triton-env2/pytorch/torch/lib/libtorch_cpu.so
========= Host Frame:torch::inductor::AOTIModelContainerRunnerCuda::run_impl(std::vector<AtenTensorOpaque*, std::allocator<AtenTensorOpaque*> >&, void*) [0x3a4b32d]
========= in /home/dberard/local/triton-env2/pytorch/torch/lib/libtorch_cuda.so
========= Host Frame:torch::inductor::AOTIModelContainerRunner::run(std::vector<at::Tensor, std::allocator<at::Tensor> > const&, void*) [0x4f99ca4]
========= in /home/dberard/local/triton-env2/pytorch/torch/lib/libtorch_cpu.so
========= Host Frame:pybind11::cpp_function::initialize<pybind11::cpp_function::initialize<std::vector<at::Tensor, std::allocator<at::Tensor> >, torch::inductor::AOTIModelContainerRunnerCuda, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, void*, pybind11::name, pybind11::is_method, pybind11::sibling, pybind11::arg, pybind11::arg_v>(std::vector<at::Tensor, std::allocator<at::Tensor> > (torch::inductor::AOTIModelContainerRunnerCuda::*)(std::vector<at::Tensor, std::allocator<at::Tensor> > const&, void*), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, pybind11::arg const&, pybind11::arg_v const&)::{lambda(torch::inductor::AOTIModelContainerRunnerCuda*, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, void*)#1}, std::vector<at::Tensor, std::allocator<at::Tensor> >, torch::inductor::AOTIModelContainerRunnerCuda*, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, void*, pybind11::name, pybind11::is_method, pybind11::sibling, pybind11::arg, pybind11::arg_v>(pybind11::cpp_function::initialize<std::vector<at::Tensor, std::allocator<at::Tensor> >, torch::inductor::AOTIModelContainerRunnerCuda, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, void*, pybind11::name, pybind11::is_method, pybind11::sibling, pybind11::arg, pybind11::arg_v>(std::vector<at::Tensor, std::allocator<at::Tensor> > (torch::inductor::AOTIModelContainerRunnerCuda::*)(std::vector<at::Tensor, std::allocator<at::Tensor> > const&, void*), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, pybind11::arg const&, pybind11::arg_v const&)::{lambda(torch::inductor::AOTIModelContainerRunnerCuda*, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, void*)#1}&&, std::vector<at::Tensor, std::allocator<at::Tensor> > (*)(torch::inductor::AOTIModelContainerRunnerCuda*, std::vector<at::Tensor, std::allocator<at::Tensor> > const&, void*), pybind11::name const&, pybind11::is_method const&, pybind11::sibling const&, pybind11::arg const&, pybind11::arg_v const&)::{lambda(pybind11::detail::function_call&)#3}::_FUN(pybind11::detail::function_call) [0x80f235]
========= in /home/dberard/local/triton-env2/pytorch/torch/lib/libtorch_python.so
========= Host Frame:pybind11::cpp_function::dispatcher(_object*, _object*, _object*) [0x38803d]
========= in /home/dberard/local/triton-env2/pytorch/torch/lib/libtorch_python.so
========= Host Frame:cfunction_call in /usr/local/src/conda/python-3.10.16/Objects/methodobject.c:543 [0xfdcf6]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_MakeTpCall in /usr/local/src/conda/python-3.10.16/Objects/call.c:215 [0xf747a]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:method_vectorcall in /usr/local/src/conda/python-3.10.16/Objects/classobject.c:53 [0x109d6e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4181 [0xf2c55]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4277 [0xf0ca8]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4181 [0xf2c55]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:method_vectorcall in /usr/local/src/conda/python-3.10.16/Objects/classobject.c:53 [0x109a7d]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4231 [0xef4e2]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:PyObject_Call in /usr/local/src/conda/python-3.10.16/Objects/call.c:317 [0x10a5a7]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4277 [0xf0ca8]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4213 [0xee44e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:method_vectorcall in /usr/local/src/conda/python-3.10.16/Objects/classobject.c:61 [0x109d06]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4277 [0xf0ca8]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:method_vectorcall in /usr/local/src/conda/python-3.10.16/Objects/classobject.c:53 [0x109a7d]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4213 [0xee44e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4198 [0xee860]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:method_vectorcall in /usr/local/src/conda/python-3.10.16/Objects/classobject.c:53 [0x109a7d]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4231 [0xef4e2]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:method_vectorcall in /usr/local/src/conda/python-3.10.16/Objects/classobject.c:53 [0x109a7d]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4231 [0xef4e2]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:method_vectorcall in /usr/local/src/conda/python-3.10.16/Objects/classobject.c:83 [0x109bd5]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4277 [0xf0ca8]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_FastCallDictTstate in /usr/local/src/conda/python-3.10.16/Objects/call.c:142 [0xf67cc]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_Call_Prepend in /usr/local/src/conda/python-3.10.16/Objects/call.c:431 [0x107f65]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:slot_tp_call in /usr/local/src/conda/python-3.10.16/Objects/typeobject.c:7494 [0x1d0152]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_MakeTpCall in /usr/local/src/conda/python-3.10.16/Objects/call.c:215 [0xf747a]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4213 [0xf2f0d]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:method_vectorcall in /usr/local/src/conda/python-3.10.16/Objects/classobject.c:83 [0x109bd5]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4277 [0xf0ca8]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_FastCallDictTstate in /usr/local/src/conda/python-3.10.16/Objects/call.c:142 [0xf67cc]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_Call_Prepend in /usr/local/src/conda/python-3.10.16/Objects/call.c:431 [0x107f65]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:slot_tp_call in /usr/local/src/conda/python-3.10.16/Objects/typeobject.c:7494 [0x1d0152]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_MakeTpCall in /usr/local/src/conda/python-3.10.16/Objects/call.c:215 [0xf747a]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4213 [0xf2f0d]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:method_vectorcall in /usr/local/src/conda/python-3.10.16/Objects/classobject.c:83 [0x109bd5]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4277 [0xf0ca8]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_FastCallDictTstate in /usr/local/src/conda/python-3.10.16/Objects/call.c:142 [0xf67cc]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_Call_Prepend in /usr/local/src/conda/python-3.10.16/Objects/call.c:431 [0x107f65]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:slot_tp_call in /usr/local/src/conda/python-3.10.16/Objects/typeobject.c:7494 [0x1d0152]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_MakeTpCall in /usr/local/src/conda/python-3.10.16/Objects/call.c:215 [0xf747a]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4213 [0xf2f0d]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4198 [0xee860]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4198 [0xee860]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_FastCallDictTstate in /usr/local/src/conda/python-3.10.16/Objects/call.c:153 [0xf687c]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:slot_tp_init in /usr/local/src/conda/python-3.10.16/Objects/typeobject.c:7734 [0x1075b7]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_MakeTpCall in /usr/local/src/conda/python-3.10.16/Objects/call.c:215 [0xf74ca]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4231 [0xf3801]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4213 [0xee44e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4213 [0xee44e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4231 [0xef4e2]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_Vector in /usr/local/src/conda/python-3.10.16/Python/ceval.c:5067 [0x1953a1]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:PyEval_EvalCode in /usr/local/src/conda/python-3.10.16/Python/ceval.c:1134 [0x1952e6]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:run_eval_code_obj in /usr/local/src/conda/python-3.10.16/Python/pythonrun.c:1291 [0x1c6736]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:run_mod in /usr/local/src/conda/python-3.10.16/Python/pythonrun.c:1312 [0x1c186f]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:pyrun_file.cold in /usr/local/src/conda/python-3.10.16/Python/pythonrun.c:1208 [0x59838]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyRun_SimpleFileObject in /usr/local/src/conda/python-3.10.16/Python/pythonrun.c:456 [0x1bbdfe]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyRun_AnyFileObject in /usr/local/src/conda/python-3.10.16/Python/pythonrun.c:90 [0x1bbb62]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:Py_RunMain in /usr/local/src/conda/python-3.10.16/Modules/main.c:674 [0x1b891c]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:Py_BytesMain in /usr/local/src/conda/python-3.10.16/Modules/main.c:1094 [0x1885d8]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:__libc_start_call_main [0x295cf]
========= in /lib64/libc.so.6
========= Host Frame:__libc_start_main [0x2967f]
========= in /lib64/libc.so.6
========= Host Frame: [0x18848d]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
=========
Error: CUDA driver error: an illegal memory access was encountered
ERROR
========= Program hit cudaErrorIllegalAddress (error 700) due to "an illegal memory access was encountered" on CUDA API call to cudaEventDestroy.
========= Saved host backtrace up to driver entry point at error
========= Host Frame: [0x445b05]
========= in /lib64/libcuda.so.1
========= Host Frame:cudaEventDestroy [0x53a10]
========= in /usr/local/cuda-12.6/lib64/libcudart.so.12
========= Host Frame:torch::aot_inductor::AOTInductorModelBase<torch::aot_inductor::AOTInductorModel>::~AOTInductorModelBase() [0x1d62f]
========= in /tmp/torchinductor_dberard/cnaqnxfbucoh67qyjikbptjfxgj4aynyndekhhk4aiw2genv4htq/cnkzhrx5apdu2es5s7d6mhqnd6gb6e3qn7kthhesaolvkt7gyw75.wrapper.so
========= Host Frame:AOTInductorModelContainerDelete [0x1880d]
========= in /tmp/torchinductor_dberard/cnaqnxfbucoh67qyjikbptjfxgj4aynyndekhhk4aiw2genv4htq/cnkzhrx5apdu2es5s7d6mhqnd6gb6e3qn7kthhesaolvkt7gyw75.wrapper.so
========= Host Frame:torch::inductor::AOTIModelContainerRunner::~AOTIModelContainerRunner() [0x4f9880d]
========= in /home/dberard/local/triton-env2/pytorch/torch/lib/libtorch_cpu.so
========= Host Frame:torch::inductor::AOTIModelContainerRunnerCuda::~AOTIModelContainerRunnerCuda() [0x3a4b238]
========= in /home/dberard/local/triton-env2/pytorch/torch/lib/libtorch_cuda.so
========= Host Frame:pybind11::class_<torch::inductor::AOTIModelContainerRunnerCuda>::dealloc(pybind11::detail::value_and_holder&) [0x80bd21]
========= in /home/dberard/local/triton-env2/pytorch/torch/lib/libtorch_python.so
========= Host Frame:pybind11::detail::clear_instance(_object*) [0x38652f]
========= in /home/dberard/local/triton-env2/pytorch/torch/lib/libtorch_python.so
========= Host Frame:pybind11_object_dealloc [0x386b70]
========= in /home/dberard/local/triton-env2/pytorch/torch/lib/libtorch_python.so
========= Host Frame:cell_dealloc in /usr/local/src/conda/python-3.10.16/Objects/cellobject.c:82 [0x1141c2]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:tupledealloc in /usr/local/src/conda/python-3.10.16/Objects/tupleobject.c:276 [0xea28a]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:func_dealloc in /usr/local/src/conda/python-3.10.16/Objects/funcobject.c:648 [0xfc017]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:frame_dealloc in /usr/local/src/conda/python-3.10.16/Objects/frameobject.c:591 [0xf7087]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:tb_dealloc in /usr/local/src/conda/python-3.10.16/Python/traceback.c:168 [0x107067]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:tb_dealloc in /usr/local/src/conda/python-3.10.16/Python/traceback.c:167 [0x1070fe]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:tb_dealloc in /usr/local/src/conda/python-3.10.16/Python/traceback.c:167 [0x1070fe]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:tb_dealloc in /usr/local/src/conda/python-3.10.16/Python/traceback.c:167 [0x1070fe]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:tb_dealloc in /usr/local/src/conda/python-3.10.16/Python/traceback.c:167 [0x1070fe]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:tb_dealloc in /usr/local/src/conda/python-3.10.16/Python/traceback.c:167 [0x1070fe]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:tb_dealloc in /usr/local/src/conda/python-3.10.16/Python/traceback.c:167 [0x1070fe]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:tb_dealloc in /usr/local/src/conda/python-3.10.16/Python/traceback.c:167 [0x1070fe]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:tb_dealloc in /usr/local/src/conda/python-3.10.16/Python/traceback.c:167 [0x1070fe]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:BaseException_dealloc in /usr/local/src/conda/python-3.10.16/Objects/exceptions.c:93 [0x10cbca]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:tupledealloc in /usr/local/src/conda/python-3.10.16/Objects/tupleobject.c:276 [0xea28a]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:tupledealloc in /usr/local/src/conda/python-3.10.16/Objects/tupleobject.c:276 [0xea28a]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_list_clear in /usr/local/src/conda/python-3.10.16/Objects/listobject.c:612 [0x14c3a2]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:list_clear in /usr/local/src/conda/python-3.10.16/Objects/clinic/listobject.c.h:61 [0x1e0d25]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:method_vectorcall_NOARGS in /usr/local/src/conda/python-3.10.16/Objects/descrobject.c:432 [0x100b13]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4198 [0xee860]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:method_vectorcall in /usr/local/src/conda/python-3.10.16/Objects/classobject.c:53 [0x109a7d]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4231 [0xef4e2]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:method_vectorcall in /usr/local/src/conda/python-3.10.16/Objects/classobject.c:53 [0x109a7d]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4231 [0xef4e2]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:method_vectorcall in /usr/local/src/conda/python-3.10.16/Objects/classobject.c:83 [0x109bd5]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4277 [0xf0ca8]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_FastCallDictTstate in /usr/local/src/conda/python-3.10.16/Objects/call.c:142 [0xf67cc]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_Call_Prepend in /usr/local/src/conda/python-3.10.16/Objects/call.c:431 [0x107f65]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:slot_tp_call in /usr/local/src/conda/python-3.10.16/Objects/typeobject.c:7494 [0x1d0152]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_MakeTpCall in /usr/local/src/conda/python-3.10.16/Objects/call.c:215 [0xf747a]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4213 [0xf2f0d]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:method_vectorcall in /usr/local/src/conda/python-3.10.16/Objects/classobject.c:83 [0x109bd5]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4277 [0xf0ca8]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_FastCallDictTstate in /usr/local/src/conda/python-3.10.16/Objects/call.c:142 [0xf67cc]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_Call_Prepend in /usr/local/src/conda/python-3.10.16/Objects/call.c:431 [0x107f65]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:slot_tp_call in /usr/local/src/conda/python-3.10.16/Objects/typeobject.c:7494 [0x1d0152]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_MakeTpCall in /usr/local/src/conda/python-3.10.16/Objects/call.c:215 [0xf747a]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4213 [0xf2f0d]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:method_vectorcall in /usr/local/src/conda/python-3.10.16/Objects/classobject.c:83 [0x109bd5]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4277 [0xf0ca8]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_FastCallDictTstate in /usr/local/src/conda/python-3.10.16/Objects/call.c:142 [0xf67cc]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_Call_Prepend in /usr/local/src/conda/python-3.10.16/Objects/call.c:431 [0x107f65]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:slot_tp_call in /usr/local/src/conda/python-3.10.16/Objects/typeobject.c:7494 [0x1d0152]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_MakeTpCall in /usr/local/src/conda/python-3.10.16/Objects/call.c:215 [0xf747a]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4213 [0xf2f0d]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4198 [0xee860]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4198 [0xee860]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_FastCallDictTstate in /usr/local/src/conda/python-3.10.16/Objects/call.c:153 [0xf687c]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:slot_tp_init in /usr/local/src/conda/python-3.10.16/Objects/typeobject.c:7734 [0x1075b7]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_MakeTpCall in /usr/local/src/conda/python-3.10.16/Objects/call.c:215 [0xf74ca]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4231 [0xf3801]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4213 [0xee44e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4213 [0xee44e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4231 [0xef4e2]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_Vector in /usr/local/src/conda/python-3.10.16/Python/ceval.c:5067 [0x1953a1]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:PyEval_EvalCode in /usr/local/src/conda/python-3.10.16/Python/ceval.c:1134 [0x1952e6]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:run_eval_code_obj in /usr/local/src/conda/python-3.10.16/Python/pythonrun.c:1291 [0x1c6736]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:run_mod in /usr/local/src/conda/python-3.10.16/Python/pythonrun.c:1312 [0x1c186f]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:pyrun_file.cold in /usr/local/src/conda/python-3.10.16/Python/pythonrun.c:1208 [0x59838]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyRun_SimpleFileObject in /usr/local/src/conda/python-3.10.16/Python/pythonrun.c:456 [0x1bbdfe]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyRun_AnyFileObject in /usr/local/src/conda/python-3.10.16/Python/pythonrun.c:90 [0x1bbb62]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:Py_RunMain in /usr/local/src/conda/python-3.10.16/Modules/main.c:674 [0x1b891c]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:Py_BytesMain in /usr/local/src/conda/python-3.10.16/Modules/main.c:1094 [0x1885d8]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:__libc_start_call_main [0x295cf]
========= in /lib64/libc.so.6
========= Host Frame:__libc_start_main [0x2967f]
========= in /lib64/libc.so.6
========= Host Frame: [0x18848d]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
=========
Failed to destroy CUDA event in AOTInductor model: an illegal memory access was encountered
========= Program hit cudaErrorIllegalAddress (error 700) due to "an illegal memory access was encountered" on CUDA API call to cudaDeviceSynchronize.
========= Saved host backtrace up to driver entry point at error
========= Host Frame: [0x445b05]
========= in /lib64/libcuda.so.1
========= Host Frame:cudaDeviceSynchronize [0x4af09]
========= in /usr/local/cuda-12.6/lib64/libcudart.so.12
========= Host Frame:c10::cuda::device_synchronize() [0x550c7]
========= in /home/dberard/local/triton-env2/pytorch/torch/lib/libc10_cuda.so
========= Host Frame:THCPModule_cudaSynchronize(_object*, _object*) [0xbd0ca4]
========= in /home/dberard/local/triton-env2/pytorch/torch/lib/libtorch_python.so
========= Host Frame:cfunction_vectorcall_NOARGS in /usr/local/src/conda/python-3.10.16/Objects/methodobject.c:489 [0xfbc94]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4181 [0xf2c55]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4181 [0xf2c55]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4198 [0xee860]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:method_vectorcall in /usr/local/src/conda/python-3.10.16/Objects/classobject.c:53 [0x109a7d]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4231 [0xef4e2]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:method_vectorcall in /usr/local/src/conda/python-3.10.16/Objects/classobject.c:83 [0x109bd5]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4277 [0xf0ca8]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_FastCallDictTstate in /usr/local/src/conda/python-3.10.16/Objects/call.c:142 [0xf67cc]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_Call_Prepend in /usr/local/src/conda/python-3.10.16/Objects/call.c:431 [0x107f65]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:slot_tp_call in /usr/local/src/conda/python-3.10.16/Objects/typeobject.c:7494 [0x1d0152]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_MakeTpCall in /usr/local/src/conda/python-3.10.16/Objects/call.c:215 [0xf747a]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4213 [0xf2f0d]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:method_vectorcall in /usr/local/src/conda/python-3.10.16/Objects/classobject.c:83 [0x109bd5]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4277 [0xf0ca8]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_FastCallDictTstate in /usr/local/src/conda/python-3.10.16/Objects/call.c:142 [0xf67cc]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_Call_Prepend in /usr/local/src/conda/python-3.10.16/Objects/call.c:431 [0x107f65]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:slot_tp_call in /usr/local/src/conda/python-3.10.16/Objects/typeobject.c:7494 [0x1d0152]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_MakeTpCall in /usr/local/src/conda/python-3.10.16/Objects/call.c:215 [0xf747a]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4213 [0xf2f0d]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:method_vectorcall in /usr/local/src/conda/python-3.10.16/Objects/classobject.c:83 [0x109bd5]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4277 [0xf0ca8]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_FastCallDictTstate in /usr/local/src/conda/python-3.10.16/Objects/call.c:142 [0xf67cc]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_Call_Prepend in /usr/local/src/conda/python-3.10.16/Objects/call.c:431 [0x107f65]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:slot_tp_call in /usr/local/src/conda/python-3.10.16/Objects/typeobject.c:7494 [0x1d0152]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_MakeTpCall in /usr/local/src/conda/python-3.10.16/Objects/call.c:215 [0xf747a]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4213 [0xf2f0d]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4198 [0xee860]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4198 [0xee860]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_FastCallDictTstate in /usr/local/src/conda/python-3.10.16/Objects/call.c:153 [0xf687c]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:slot_tp_init in /usr/local/src/conda/python-3.10.16/Objects/typeobject.c:7734 [0x1075b7]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_MakeTpCall in /usr/local/src/conda/python-3.10.16/Objects/call.c:215 [0xf74ca]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4231 [0xf3801]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4213 [0xee44e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4213 [0xee44e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4231 [0xef4e2]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_Vector in /usr/local/src/conda/python-3.10.16/Python/ceval.c:5067 [0x1953a1]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:PyEval_EvalCode in /usr/local/src/conda/python-3.10.16/Python/ceval.c:1134 [0x1952e6]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:run_eval_code_obj in /usr/local/src/conda/python-3.10.16/Python/pythonrun.c:1291 [0x1c6736]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:run_mod in /usr/local/src/conda/python-3.10.16/Python/pythonrun.c:1312 [0x1c186f]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:pyrun_file.cold in /usr/local/src/conda/python-3.10.16/Python/pythonrun.c:1208 [0x59838]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyRun_SimpleFileObject in /usr/local/src/conda/python-3.10.16/Python/pythonrun.c:456 [0x1bbdfe]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyRun_AnyFileObject in /usr/local/src/conda/python-3.10.16/Python/pythonrun.c:90 [0x1bbb62]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:Py_RunMain in /usr/local/src/conda/python-3.10.16/Modules/main.c:674 [0x1b891c]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:Py_BytesMain in /usr/local/src/conda/python-3.10.16/Modules/main.c:1094 [0x1885d8]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:__libc_start_call_main [0x295cf]
========= in /lib64/libc.so.6
========= Host Frame:__libc_start_main [0x2967f]
========= in /lib64/libc.so.6
========= Host Frame: [0x18848d]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
=========
========= Program hit cudaErrorIllegalAddress (error 700) due to "an illegal memory access was encountered" on CUDA API call to cudaGetLastError.
========= Saved host backtrace up to driver entry point at error
========= Host Frame: [0x445b05]
========= in /lib64/libcuda.so.1
========= Host Frame:cudaGetLastError [0x4dd26]
========= in /usr/local/cuda-12.6/lib64/libcudart.so.12
========= Host Frame:c10::cuda::c10_cuda_check_implementation(int, char const*, char const*, int, bool) [0x54c0c]
========= in /home/dberard/local/triton-env2/pytorch/torch/lib/libc10_cuda.so
========= Host Frame:c10::cuda::device_synchronize() [0x550e7]
========= in /home/dberard/local/triton-env2/pytorch/torch/lib/libc10_cuda.so
========= Host Frame:THCPModule_cudaSynchronize(_object*, _object*) [0xbd0ca4]
========= in /home/dberard/local/triton-env2/pytorch/torch/lib/libtorch_python.so
========= Host Frame:cfunction_vectorcall_NOARGS in /usr/local/src/conda/python-3.10.16/Objects/methodobject.c:489 [0xfbc94]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4181 [0xf2c55]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4181 [0xf2c55]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4198 [0xee860]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:method_vectorcall in /usr/local/src/conda/python-3.10.16/Objects/classobject.c:53 [0x109a7d]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4231 [0xef4e2]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:method_vectorcall in /usr/local/src/conda/python-3.10.16/Objects/classobject.c:83 [0x109bd5]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4277 [0xf0ca8]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_FastCallDictTstate in /usr/local/src/conda/python-3.10.16/Objects/call.c:142 [0xf67cc]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_Call_Prepend in /usr/local/src/conda/python-3.10.16/Objects/call.c:431 [0x107f65]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:slot_tp_call in /usr/local/src/conda/python-3.10.16/Objects/typeobject.c:7494 [0x1d0152]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_MakeTpCall in /usr/local/src/conda/python-3.10.16/Objects/call.c:215 [0xf747a]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4213 [0xf2f0d]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:method_vectorcall in /usr/local/src/conda/python-3.10.16/Objects/classobject.c:83 [0x109bd5]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4277 [0xf0ca8]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_FastCallDictTstate in /usr/local/src/conda/python-3.10.16/Objects/call.c:142 [0xf67cc]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_Call_Prepend in /usr/local/src/conda/python-3.10.16/Objects/call.c:431 [0x107f65]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:slot_tp_call in /usr/local/src/conda/python-3.10.16/Objects/typeobject.c:7494 [0x1d0152]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_MakeTpCall in /usr/local/src/conda/python-3.10.16/Objects/call.c:215 [0xf747a]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4213 [0xf2f0d]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:method_vectorcall in /usr/local/src/conda/python-3.10.16/Objects/classobject.c:83 [0x109bd5]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4277 [0xf0ca8]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_FastCallDictTstate in /usr/local/src/conda/python-3.10.16/Objects/call.c:142 [0xf67cc]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_Call_Prepend in /usr/local/src/conda/python-3.10.16/Objects/call.c:431 [0x107f65]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:slot_tp_call in /usr/local/src/conda/python-3.10.16/Objects/typeobject.c:7494 [0x1d0152]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_MakeTpCall in /usr/local/src/conda/python-3.10.16/Objects/call.c:215 [0xf747a]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4213 [0xf2f0d]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4198 [0xee860]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4198 [0xee860]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_FastCallDictTstate in /usr/local/src/conda/python-3.10.16/Objects/call.c:153 [0xf687c]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:slot_tp_init in /usr/local/src/conda/python-3.10.16/Objects/typeobject.c:7734 [0x1075b7]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyObject_MakeTpCall in /usr/local/src/conda/python-3.10.16/Objects/call.c:215 [0xf74ca]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4231 [0xf3801]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4213 [0xee44e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4213 [0xee44e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyFunction_Vectorcall in /usr/local/src/conda/python-3.10.16/Objects/call.c:342 [0xfe13e]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_EvalFrameDefault in /usr/local/src/conda/python-3.10.16/Python/ceval.c:4231 [0xef4e2]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyEval_Vector in /usr/local/src/conda/python-3.10.16/Python/ceval.c:5067 [0x1953a1]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:PyEval_EvalCode in /usr/local/src/conda/python-3.10.16/Python/ceval.c:1134 [0x1952e6]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:run_eval_code_obj in /usr/local/src/conda/python-3.10.16/Python/pythonrun.c:1291 [0x1c6736]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:run_mod in /usr/local/src/conda/python-3.10.16/Python/pythonrun.c:1312 [0x1c186f]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:pyrun_file.cold in /usr/local/src/conda/python-3.10.16/Python/pythonrun.c:1208 [0x59838]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyRun_SimpleFileObject in /usr/local/src/conda/python-3.10.16/Python/pythonrun.c:456 [0x1bbdfe]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:_PyRun_AnyFileObject in /usr/local/src/conda/python-3.10.16/Python/pythonrun.c:90 [0x1bbb62]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:Py_RunMain in /usr/local/src/conda/python-3.10.16/Modules/main.c:674 [0x1b891c]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:Py_BytesMain in /usr/local/src/conda/python-3.10.16/Modules/main.c:1094 [0x1885d8]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
========= Host Frame:__libc_start_call_main [0x295cf]
========= in /lib64/libc.so.6
========= Host Frame:__libc_start_main [0x2967f]
========= in /lib64/libc.so.6
========= Host Frame: [0x18848d]
========= in /home/dberard/local/miniconda3/envs/triton-env2/bin/python
=========
TEST SUITE EARLY TERMINATION due to torch.cuda.synchronize() failure
CUDA error: an illegal memory access was encountered
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
======================================================================
ERROR: test_triton_kernel_tma_descriptor_1d_dynamic_False_cuda (__main__.AOTInductorTestABICompatibleGpu)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/dberard/local/triton-env2/pytorch/torch/testing/_internal/common_utils.py", line 3150, in wrapper
method(*args, **kwargs)
File "/home/dberard/local/triton-env2/pytorch/test/inductor/test_torchinductor.py", line 12849, in new_test
return value(self)
File "/home/dberard/local/triton-env2/pytorch/torch/testing/_internal/common_utils.py", line 552, in instantiated_test
test(self, **param_kwargs)
File "/home/dberard/local/triton-env2/pytorch/test/inductor/test_aot_inductor.py", line 2568, in test_triton_kernel_tma_descriptor_1d
self.check_model(
File "/home/dberard/local/triton-env2/pytorch/test/inductor/test_aot_inductor_utils.py", line 198, in check_model
actual = AOTIRunnerUtil.run(
File "/home/dberard/local/triton-env2/pytorch/test/inductor/test_aot_inductor_utils.py", line 139, in run
return optimized(*example_inputs)
File "/home/dberard/local/triton-env2/pytorch/torch/_export/__init__.py", line 178, in optimized
flat_outputs = runner.run(flat_inputs) # type: ignore[attr-defined]
RuntimeError: run_func_( container_handle_, input_handles.data(), input_handles.size(), output_handles.data(), output_handles.size(), reinterpret_cast<AOTInductorStreamHandle>(stream_handle), proxy_executor_handle_) API call failed at /home/dberard/local/triton-env2/pytorch/torch/csrc/inductor/aoti_runner/model_container_runner.cpp, line 106
To execute this test, run the following from the base repo dir:
python test/inductor/test_aot_inductor.py AOTInductorTestABICompatibleGpu.test_triton_kernel_tma_descriptor_1d_dynamic_False_cuda
This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
----------------------------------------------------------------------
Ran 1 test in 7.552s
FAILED (errors=1)
inline_call []
unimplemented []
stats [('calls_captured', 2), ('unique_graphs', 1)]
inductor [('extern_calls', 4), ('async_compile_cache_miss', 2), ('benchmarking.InductorBenchmarker.benchmark_gpu', 2), ('pattern_matcher_count', 1), ('pattern_matcher_nodes', 1)]
graph_break []
aten_mm_info []
========= Target application returned an error
========= ERROR SUMMARY: 6 errors