@valgur
Created August 25, 2024 15:59
  • CCTag-1.0.3.tar.gz
    • CCTag-1.0.3/src/cctag/cuda/assist.cu
    • CCTag-1.0.3/src/cctag/cuda/cctag_cuda_runtime.h
    • CCTag-1.0.3/src/cctag/cuda/cmp_list.cu
    • CCTag-1.0.3/src/cctag/cuda/debug_image.cu
    • CCTag-1.0.3/src/cctag/cuda/debug_is_on_edge.cu
    • CCTag-1.0.3/src/cctag/cuda/frame.cu
    • CCTag-1.0.3/src/cctag/cuda/frame_01_tex.cu
    • CCTag-1.0.3/src/cctag/cuda/frame_02_gaussian.cu
    • CCTag-1.0.3/src/cctag/cuda/frame_03_magmap.cu
    • CCTag-1.0.3/src/cctag/cuda/frame_04_hyst.cu
    • CCTag-1.0.3/src/cctag/cuda/frame_05_thin.cu
    • CCTag-1.0.3/src/cctag/cuda/frame_06_graddesc.cu
    • CCTag-1.0.3/src/cctag/cuda/frame_07_vote.cu
    • CCTag-1.0.3/src/cctag/cuda/frame_07_vote.h
    • CCTag-1.0.3/src/cctag/cuda/frame_07a_vote_line.cu
    • CCTag-1.0.3/src/cctag/cuda/frame_07b_vote_sort_uniq_thrust.cu
    • CCTag-1.0.3/src/cctag/cuda/frame_07c_eval.cu
    • CCTag-1.0.3/src/cctag/cuda/frame_07d_vote_if.cu
    • CCTag-1.0.3/src/cctag/cuda/frame_07e_download.cu
    • CCTag-1.0.3/src/cctag/cuda/frame_alloc.cu
    • CCTag-1.0.3/src/cctag/cuda/frame_debug.cu
    • CCTag-1.0.3/src/cctag/cuda/frame_export.cu
    • CCTag-1.0.3/src/cctag/cuda/framemeta.cu
    • CCTag-1.0.3/src/cctag/cuda/frameparam.cu
    • CCTag-1.0.3/src/cctag/cuda/geom_ellipse.cu
    • CCTag-1.0.3/src/cctag/cuda/geom_matrix.cu
    • CCTag-1.0.3/src/cctag/cuda/keep_time.cu
    • CCTag-1.0.3/src/cctag/cuda/pinned_counters.cu
    • CCTag-1.0.3/src/cctag/cuda/ptrstep.cu
    • CCTag-1.0.3/src/cctag/cuda/recursive_sweep.cu
    • CCTag-1.0.3/src/cctag/cuda/tag.cu
    • CCTag-1.0.3/src/cctag/cuda/tag_identify.cu
    • CCTag-1.0.3/src/cctag/cuda/tag_threads.cu
    • CCTag-1.0.3/src/cctag/cuda/triple_point.cu
  • GraphBLAS-9.2.0.tar.gz
    • GraphBLAS-9.2.0/CUDA/GB_cuda_AxB_dot3_branch.cpp
    • GraphBLAS-9.2.0/CUDA/GB_cuda_get_device_count.cu
    • GraphBLAS-9.2.0/CUDA/GB_cuda_get_device_properties.cu
    • GraphBLAS-9.2.0/CUDA/GB_cuda_warmup.cu
    • GraphBLAS-9.2.0/CUDA/JitKernels/GB_jit_kernel_cuda_AxB_dot3.cu
    • GraphBLAS-9.2.0/CUDA/JitKernels/GB_jit_kernel_cuda_apply_bind1st.cu
    • GraphBLAS-9.2.0/CUDA/JitKernels/GB_jit_kernel_cuda_apply_bind2nd.cu
    • GraphBLAS-9.2.0/CUDA/JitKernels/GB_jit_kernel_cuda_apply_unop.cu
    • GraphBLAS-9.2.0/CUDA/JitKernels/GB_jit_kernel_cuda_colscale.cu
    • GraphBLAS-9.2.0/CUDA/JitKernels/GB_jit_kernel_cuda_reduce.cu
    • GraphBLAS-9.2.0/CUDA/JitKernels/GB_jit_kernel_cuda_rowscale.cu
    • GraphBLAS-9.2.0/CUDA/Template/GB_cuda_atomics.cuh
    • GraphBLAS-9.2.0/CUDA/Template/GB_cuda_ek_slice.cuh
    • GraphBLAS-9.2.0/CUDA/Template/GB_cuda_jit_AxB_dot3_dense_phase1.cuh
    • GraphBLAS-9.2.0/CUDA/Template/GB_cuda_jit_AxB_dot3_phase1.cuh
    • GraphBLAS-9.2.0/CUDA/Template/GB_cuda_jit_AxB_dot3_phase2.cuh
    • GraphBLAS-9.2.0/CUDA/Template/GB_cuda_jit_AxB_dot3_phase2end.cuh
    • GraphBLAS-9.2.0/CUDA/Template/GB_cuda_jit_AxB_dot3_phase3_dndn.cuh
    • GraphBLAS-9.2.0/CUDA/Template/GB_cuda_jit_AxB_dot3_phase3_mp.cuh
    • GraphBLAS-9.2.0/CUDA/Template/GB_cuda_jit_AxB_dot3_phase3_spdn.cuh
    • GraphBLAS-9.2.0/CUDA/Template/GB_cuda_jit_AxB_dot3_phase3_vsdn.cuh
    • GraphBLAS-9.2.0/CUDA/Template/GB_cuda_jit_AxB_dot3_phase3_vsvs.cuh
    • GraphBLAS-9.2.0/CUDA/Template/GB_cuda_kernel.cuh
    • GraphBLAS-9.2.0/CUDA/Template/GB_cuda_shfl_down.cuh
    • GraphBLAS-9.2.0/CUDA/Template/GB_cuda_timer.hpp
    • GraphBLAS-9.2.0/CUDA/Template/GB_h_subset.cuh
    • GraphBLAS-9.2.0/CUDA/Template/GraphBLAS_h_subset.cuh
    • GraphBLAS-9.2.0/CUDA/unused/GB_cuda_cumsum.cu
    • GraphBLAS-9.2.0/rmm_wrap/rmm_wrap.h
    • GraphBLAS-9.2.0/rmm_wrap/rmm_wrap.hpp
  • HiGHS-1.7.2.tar.gz
    • HiGHS-1.7.2/src/pdlp/cupdlp/glbopts.h
  • LightGBM-4.3.0.tar.gz
    • LightGBM-4.3.0/include/LightGBM/cuda/cuda_algorithms.hpp
    • LightGBM-4.3.0/include/LightGBM/cuda/cuda_random.hpp
    • LightGBM-4.3.0/include/LightGBM/cuda/cuda_utils.hu
    • LightGBM-4.3.0/include/LightGBM/cuda/vector_cudahost.h
    • LightGBM-4.3.0/src/boosting/cuda/cuda_score_updater.cu
    • LightGBM-4.3.0/src/cuda/cuda_algorithms.cu
    • LightGBM-4.3.0/src/io/cuda/cuda_column_data.cu
    • LightGBM-4.3.0/src/io/cuda/cuda_tree.cu
    • LightGBM-4.3.0/src/metric/cuda/cuda_pointwise_metric.cu
    • LightGBM-4.3.0/src/objective/cuda/cuda_binary_objective.cu
    • LightGBM-4.3.0/src/objective/cuda/cuda_multiclass_objective.cu
    • LightGBM-4.3.0/src/objective/cuda/cuda_rank_objective.cu
    • LightGBM-4.3.0/src/objective/cuda/cuda_regression_objective.cu
    • LightGBM-4.3.0/src/treelearner/cuda/cuda_best_split_finder.cu
    • LightGBM-4.3.0/src/treelearner/cuda/cuda_data_partition.cu
    • LightGBM-4.3.0/src/treelearner/cuda/cuda_gradient_discretizer.cu
    • LightGBM-4.3.0/src/treelearner/cuda/cuda_histogram_constructor.cu
    • LightGBM-4.3.0/src/treelearner/cuda/cuda_leaf_splits.cu
    • LightGBM-4.3.0/src/treelearner/cuda/cuda_single_gpu_tree_learner.cu
    • LightGBM-4.3.0/src/treelearner/kernels/histogram_16_64_256.cu
  • NvCloth-1.1.6.zip
    • NvCloth-1.1.6/NvCloth/samples/SampleBase/scene/SceneController.h
    • NvCloth-1.1.6/NvCloth/samples/SampleBase/utils/CallbackImplementations.h
    • NvCloth-1.1.6/NvCloth/src/cuda/CuCheckSuccess.h
    • NvCloth-1.1.6/NvCloth/src/cuda/CuFactory.cpp
    • NvCloth-1.1.6/NvCloth/src/cuda/CuFactory.h
    • NvCloth-1.1.6/NvCloth/src/cuda/CuSolverKernel.cu
  • OpenSubdiv-3_6_0.tar.gz
    • OpenSubdiv-3_6_0/opensubdiv/osd/cudaD3D11VertexBuffer.cpp
    • OpenSubdiv-3_6_0/opensubdiv/osd/cudaEvaluator.cpp
    • OpenSubdiv-3_6_0/opensubdiv/osd/cudaGLVertexBuffer.cpp
    • OpenSubdiv-3_6_0/opensubdiv/osd/cudaGLVertexBuffer.h
    • OpenSubdiv-3_6_0/opensubdiv/osd/cudaKernel.cu
    • OpenSubdiv-3_6_0/opensubdiv/osd/cudaPatchTable.cpp
    • OpenSubdiv-3_6_0/opensubdiv/osd/cudaVertexBuffer.cpp
  • PhysX-a2c0428acab643e60618c681b501e86f7fd558cc.zip
    • PhysX-a2c0428acab643e60618c681b501e86f7fd558cc/physx/samples/sampleframework/renderer/src/d3d11/D3D11RendererIndexBuffer.cpp
    • PhysX-a2c0428acab643e60618c681b501e86f7fd558cc/physx/samples/sampleframework/renderer/src/d3d11/D3D11RendererInstanceBuffer.cpp
    • PhysX-a2c0428acab643e60618c681b501e86f7fd558cc/physx/samples/sampleframework/renderer/src/d3d11/D3D11RendererVertexBuffer.cpp
    • PhysX-a2c0428acab643e60618c681b501e86f7fd558cc/physx/samples/sampleframework/renderer/src/d3d9/D3D9RendererIndexBuffer.cpp
    • PhysX-a2c0428acab643e60618c681b501e86f7fd558cc/physx/samples/sampleframework/renderer/src/d3d9/D3D9RendererInstanceBuffer.cpp
    • PhysX-a2c0428acab643e60618c681b501e86f7fd558cc/physx/samples/sampleframework/renderer/src/d3d9/D3D9RendererVertexBuffer.cpp
  • SuiteSparse-7.7.0.tar.gz
    • SuiteSparse-7.7.0/CHOLMOD/Config/cholmod.h.in
    • SuiteSparse-7.7.0/CHOLMOD/GPU/cholmod_gpu.c
    • SuiteSparse-7.7.0/CHOLMOD/GPU/cholmod_gpu_kernels.cu
    • SuiteSparse-7.7.0/CHOLMOD/GPU/t_cholmod_gpu.c
    • SuiteSparse-7.7.0/CHOLMOD/Include/cholmod.h
    • SuiteSparse-7.7.0/GraphBLAS/CUDA/GB_cuda_AxB_dot3_branch.cpp
    • SuiteSparse-7.7.0/GraphBLAS/CUDA/GB_cuda_get_device_count.cu
    • SuiteSparse-7.7.0/GraphBLAS/CUDA/GB_cuda_get_device_properties.cu
    • SuiteSparse-7.7.0/GraphBLAS/CUDA/GB_cuda_warmup.cu
    • SuiteSparse-7.7.0/GraphBLAS/CUDA/JitKernels/GB_jit_kernel_cuda_AxB_dot3.cu
    • SuiteSparse-7.7.0/GraphBLAS/CUDA/JitKernels/GB_jit_kernel_cuda_reduce.cu
    • SuiteSparse-7.7.0/GraphBLAS/CUDA/Template/GB_cuda_atomics.cuh
    • SuiteSparse-7.7.0/GraphBLAS/CUDA/Template/GB_cuda_ek_slice.cuh
    • SuiteSparse-7.7.0/GraphBLAS/CUDA/Template/GB_cuda_jit_AxB_dot3_dense_phase1.cuh
    • SuiteSparse-7.7.0/GraphBLAS/CUDA/Template/GB_cuda_jit_AxB_dot3_phase1.cuh
    • SuiteSparse-7.7.0/GraphBLAS/CUDA/Template/GB_cuda_jit_AxB_dot3_phase2.cuh
    • SuiteSparse-7.7.0/GraphBLAS/CUDA/Template/GB_cuda_jit_AxB_dot3_phase2end.cuh
    • SuiteSparse-7.7.0/GraphBLAS/CUDA/Template/GB_cuda_jit_AxB_dot3_phase3_dndn.cuh
    • SuiteSparse-7.7.0/GraphBLAS/CUDA/Template/GB_cuda_jit_AxB_dot3_phase3_mp.cuh
    • SuiteSparse-7.7.0/GraphBLAS/CUDA/Template/GB_cuda_jit_AxB_dot3_phase3_spdn.cuh
    • SuiteSparse-7.7.0/GraphBLAS/CUDA/Template/GB_cuda_jit_AxB_dot3_phase3_vsdn.cuh
    • SuiteSparse-7.7.0/GraphBLAS/CUDA/Template/GB_cuda_jit_AxB_dot3_phase3_vsvs.cuh
    • SuiteSparse-7.7.0/GraphBLAS/CUDA/Template/GB_cuda_kernel.cuh
    • SuiteSparse-7.7.0/GraphBLAS/CUDA/Template/GB_cuda_shfl_down.cuh
    • SuiteSparse-7.7.0/GraphBLAS/CUDA/Template/GB_cuda_timer.hpp
    • SuiteSparse-7.7.0/GraphBLAS/CUDA/Template/GB_h_subset.cuh
    • SuiteSparse-7.7.0/GraphBLAS/CUDA/Template/GraphBLAS_h_subset.cuh
    • SuiteSparse-7.7.0/GraphBLAS/CUDA/unused/GB_cuda_cumsum.cu
    • SuiteSparse-7.7.0/GraphBLAS/CUDA/unused/GB_search_for_vector_device.cuh
    • SuiteSparse-7.7.0/GraphBLAS/rmm_wrap/rmm_wrap.h
    • SuiteSparse-7.7.0/GraphBLAS/rmm_wrap/rmm_wrap.hpp
    • SuiteSparse-7.7.0/SPQR/GPUQREngine/Include/Kernel/Apply/block_apply.cu
    • SuiteSparse-7.7.0/SPQR/GPUQREngine/Include/Kernel/Apply/block_apply_1.cu
    • SuiteSparse-7.7.0/SPQR/GPUQREngine/Include/Kernel/Apply/block_apply_1_by_1.cu
    • SuiteSparse-7.7.0/SPQR/GPUQREngine/Include/Kernel/Apply/block_apply_2.cu
    • SuiteSparse-7.7.0/SPQR/GPUQREngine/Include/Kernel/Apply/block_apply_2_by_1.cu
    • SuiteSparse-7.7.0/SPQR/GPUQREngine/Include/Kernel/Apply/block_apply_3.cu
    • SuiteSparse-7.7.0/SPQR/GPUQREngine/Include/Kernel/Apply/block_apply_3_by_1.cu
    • SuiteSparse-7.7.0/SPQR/GPUQREngine/Include/Kernel/Apply/block_apply_chunk.cu
    • SuiteSparse-7.7.0/SPQR/GPUQREngine/Include/Kernel/Apply/cevta_tile.cu
    • SuiteSparse-7.7.0/SPQR/GPUQREngine/Include/Kernel/Apply/pipelined_rearrange.cu
    • SuiteSparse-7.7.0/SPQR/GPUQREngine/Include/Kernel/Assemble/packAssemble.cu
    • SuiteSparse-7.7.0/SPQR/GPUQREngine/Include/Kernel/Assemble/sAssemble.cu
    • SuiteSparse-7.7.0/SPQR/GPUQREngine/Include/Kernel/Factorize/factorize_3_by_1.cu
    • SuiteSparse-7.7.0/SPQR/GPUQREngine/Include/Kernel/Factorize/factorize_vt.cu
    • SuiteSparse-7.7.0/SPQR/GPUQREngine/Include/Kernel/Factorize/factorize_vt_1_by_1.cu
    • SuiteSparse-7.7.0/SPQR/GPUQREngine/Include/Kernel/Factorize/factorize_vt_1_by_1_edge.cu
    • SuiteSparse-7.7.0/SPQR/GPUQREngine/Include/Kernel/Factorize/factorize_vt_2_by_1.cu
    • SuiteSparse-7.7.0/SPQR/GPUQREngine/Include/Kernel/Factorize/factorize_vt_2_by_1_edge.cu
    • SuiteSparse-7.7.0/SPQR/GPUQREngine/Include/Kernel/Factorize/factorize_vt_3_by_1.cu
    • SuiteSparse-7.7.0/SPQR/GPUQREngine/Include/Kernel/Factorize/factorize_vt_3_by_1_edge.cu
    • SuiteSparse-7.7.0/SPQR/GPUQREngine/Include/Kernel/qrKernel.cu
    • SuiteSparse-7.7.0/SPQR/GPUQREngine/Include/Kernel/uberKernel.cu
    • SuiteSparse-7.7.0/SPQR/GPUQREngine/Source/GPUQREngine_UberKernel.cu
  • amgcl-1.4.3.tar.gz
    • amgcl-1.4.3/tutorial/1.poisson3Db/poisson3Db_cuda.cu
  • apache-arrow-16.1.0.tar.gz
    • apache-arrow-16.1.0/cpp/src/arrow/gpu/cuda_context.h
    • apache-arrow-16.1.0/cpp/src/arrow/gpu/cuda_internal.h
    • apache-arrow-16.1.0/cpp/src/arrow/gpu/cuda_memory.cc
  • boost_1_85_0.tar.gz
    • boost_1_85_0/boost/fiber/cuda/waitfor.hpp
    • boost_1_85_0/libs/compute/perf/perf_thrust_accumulate.cu
    • boost_1_85_0/libs/compute/perf/perf_thrust_count.cu
    • boost_1_85_0/libs/compute/perf/perf_thrust_exclusive_scan.cu
    • boost_1_85_0/libs/compute/perf/perf_thrust_find.cu
    • boost_1_85_0/libs/compute/perf/perf_thrust_inner_product.cu
    • boost_1_85_0/libs/compute/perf/perf_thrust_merge.cu
    • boost_1_85_0/libs/compute/perf/perf_thrust_partial_sum.cu
    • boost_1_85_0/libs/compute/perf/perf_thrust_partition.cu
    • boost_1_85_0/libs/compute/perf/perf_thrust_reduce_by_key.cu
    • boost_1_85_0/libs/compute/perf/perf_thrust_reverse.cu
    • boost_1_85_0/libs/compute/perf/perf_thrust_reverse_copy.cu
    • boost_1_85_0/libs/compute/perf/perf_thrust_rotate.cu
    • boost_1_85_0/libs/compute/perf/perf_thrust_saxpy.cu
    • boost_1_85_0/libs/compute/perf/perf_thrust_set_difference.cu
    • boost_1_85_0/libs/compute/perf/perf_thrust_sort.cu
    • boost_1_85_0/libs/compute/perf/perf_thrust_unique.cu
  • cccl-2.4.0.tar.gz
    • cccl-2.4.0/cub/cub/agent/agent_adjacent_difference.cuh
    • cccl-2.4.0/cub/cub/agent/agent_batch_memcpy.cuh
    • cccl-2.4.0/cub/cub/agent/agent_for.cuh
    • cccl-2.4.0/cub/cub/agent/agent_histogram.cuh
    • cccl-2.4.0/cub/cub/agent/agent_merge_sort.cuh
    • cccl-2.4.0/cub/cub/agent/agent_radix_sort_downsweep.cuh
    • cccl-2.4.0/cub/cub/agent/agent_radix_sort_histogram.cuh
    • cccl-2.4.0/cub/cub/agent/agent_radix_sort_onesweep.cuh
    • cccl-2.4.0/cub/cub/agent/agent_radix_sort_upsweep.cuh
    • cccl-2.4.0/cub/cub/agent/agent_reduce.cuh
    • cccl-2.4.0/cub/cub/agent/agent_reduce_by_key.cuh
    • cccl-2.4.0/cub/cub/agent/agent_rle.cuh
    • cccl-2.4.0/cub/cub/agent/agent_scan.cuh
    • cccl-2.4.0/cub/cub/agent/agent_scan_by_key.cuh
    • cccl-2.4.0/cub/cub/agent/agent_segment_fixup.cuh
    • cccl-2.4.0/cub/cub/agent/agent_segmented_radix_sort.cuh
    • cccl-2.4.0/cub/cub/agent/agent_select_if.cuh
    • cccl-2.4.0/cub/cub/agent/agent_spmv_orig.cuh
    • cccl-2.4.0/cub/cub/agent/agent_sub_warp_merge_sort.cuh
    • cccl-2.4.0/cub/cub/agent/agent_three_way_partition.cuh
    • cccl-2.4.0/cub/cub/agent/agent_unique_by_key.cuh
    • cccl-2.4.0/cub/cub/agent/single_pass_scan_operators.cuh
    • cccl-2.4.0/cub/cub/block/block_adjacent_difference.cuh
    • cccl-2.4.0/cub/cub/block/block_discontinuity.cuh
    • cccl-2.4.0/cub/cub/block/block_exchange.cuh
    • cccl-2.4.0/cub/cub/block/block_histogram.cuh
    • cccl-2.4.0/cub/cub/block/block_load.cuh
    • cccl-2.4.0/cub/cub/block/block_merge_sort.cuh
    • cccl-2.4.0/cub/cub/block/block_radix_rank.cuh
    • cccl-2.4.0/cub/cub/block/block_radix_sort.cuh
    • cccl-2.4.0/cub/cub/block/block_raking_layout.cuh
    • cccl-2.4.0/cub/cub/block/block_reduce.cuh
    • cccl-2.4.0/cub/cub/block/block_run_length_decode.cuh
    • cccl-2.4.0/cub/cub/block/block_scan.cuh
    • cccl-2.4.0/cub/cub/block/block_shuffle.cuh
    • cccl-2.4.0/cub/cub/block/block_store.cuh
    • cccl-2.4.0/cub/cub/block/radix_rank_sort_operations.cuh
    • cccl-2.4.0/cub/cub/block/specializations/block_histogram_atomic.cuh
    • cccl-2.4.0/cub/cub/block/specializations/block_histogram_sort.cuh
    • cccl-2.4.0/cub/cub/block/specializations/block_reduce_raking.cuh
    • cccl-2.4.0/cub/cub/block/specializations/block_reduce_raking_commutative_only.cuh
    • cccl-2.4.0/cub/cub/block/specializations/block_reduce_warp_reductions.cuh
    • cccl-2.4.0/cub/cub/block/specializations/block_scan_raking.cuh
    • cccl-2.4.0/cub/cub/block/specializations/block_scan_warp_scans.cuh
    • cccl-2.4.0/cub/cub/config.cuh
    • cccl-2.4.0/cub/cub/cub.cuh
    • cccl-2.4.0/cub/cub/detail/choose_offset.cuh
    • cccl-2.4.0/cub/cub/detail/cpp_compatibility.cuh
    • cccl-2.4.0/cub/cub/detail/detect_cuda_runtime.cuh
    • cccl-2.4.0/cub/cub/detail/device_double_buffer.cuh
    • cccl-2.4.0/cub/cub/detail/device_synchronize.cuh
    • cccl-2.4.0/cub/cub/detail/strong_load.cuh
    • cccl-2.4.0/cub/cub/detail/strong_store.cuh
    • cccl-2.4.0/cub/cub/detail/temporary_storage.cuh
    • cccl-2.4.0/cub/cub/detail/type_traits.cuh
    • cccl-2.4.0/cub/cub/detail/uninitialized_copy.cuh
    • cccl-2.4.0/cub/cub/device/device_adjacent_difference.cuh
    • cccl-2.4.0/cub/cub/device/device_copy.cuh
    • cccl-2.4.0/cub/cub/device/device_for.cuh
    • cccl-2.4.0/cub/cub/device/device_histogram.cuh
    • cccl-2.4.0/cub/cub/device/device_memcpy.cuh
    • cccl-2.4.0/cub/cub/device/device_merge_sort.cuh
    • cccl-2.4.0/cub/cub/device/device_partition.cuh
    • cccl-2.4.0/cub/cub/device/device_radix_sort.cuh
    • cccl-2.4.0/cub/cub/device/device_reduce.cuh
    • cccl-2.4.0/cub/cub/device/device_run_length_encode.cuh
    • cccl-2.4.0/cub/cub/device/device_scan.cuh
    • cccl-2.4.0/cub/cub/device/device_segmented_radix_sort.cuh
    • cccl-2.4.0/cub/cub/device/device_segmented_reduce.cuh
    • cccl-2.4.0/cub/cub/device/device_segmented_sort.cuh
    • cccl-2.4.0/cub/cub/device/device_select.cuh
    • cccl-2.4.0/cub/cub/device/device_spmv.cuh
    • cccl-2.4.0/cub/cub/device/dispatch/dispatch_adjacent_difference.cuh
    • cccl-2.4.0/cub/cub/device/dispatch/dispatch_batch_memcpy.cuh
    • cccl-2.4.0/cub/cub/device/dispatch/dispatch_for.cuh
    • cccl-2.4.0/cub/cub/device/dispatch/dispatch_histogram.cuh
    • cccl-2.4.0/cub/cub/device/dispatch/dispatch_merge_sort.cuh
    • cccl-2.4.0/cub/cub/device/dispatch/dispatch_radix_sort.cuh
    • cccl-2.4.0/cub/cub/device/dispatch/dispatch_reduce.cuh
    • cccl-2.4.0/cub/cub/device/dispatch/dispatch_reduce_by_key.cuh
    • cccl-2.4.0/cub/cub/device/dispatch/dispatch_rle.cuh
    • cccl-2.4.0/cub/cub/device/dispatch/dispatch_scan.cuh
    • cccl-2.4.0/cub/cub/device/dispatch/dispatch_scan_by_key.cuh
    • cccl-2.4.0/cub/cub/device/dispatch/dispatch_segmented_sort.cuh
    • cccl-2.4.0/cub/cub/device/dispatch/dispatch_select_if.cuh
    • cccl-2.4.0/cub/cub/device/dispatch/dispatch_spmv_orig.cuh
    • cccl-2.4.0/cub/cub/device/dispatch/dispatch_three_way_partition.cuh
    • cccl-2.4.0/cub/cub/device/dispatch/dispatch_unique_by_key.cuh
    • cccl-2.4.0/cub/cub/device/dispatch/tuning/tuning_for.cuh
    • cccl-2.4.0/cub/cub/device/dispatch/tuning/tuning_histogram.cuh
    • cccl-2.4.0/cub/cub/device/dispatch/tuning/tuning_reduce_by_key.cuh
    • cccl-2.4.0/cub/cub/device/dispatch/tuning/tuning_run_length_encode.cuh
    • cccl-2.4.0/cub/cub/device/dispatch/tuning/tuning_scan.cuh
    • cccl-2.4.0/cub/cub/device/dispatch/tuning/tuning_scan_by_key.cuh
    • cccl-2.4.0/cub/cub/device/dispatch/tuning/tuning_select_if.cuh
    • cccl-2.4.0/cub/cub/device/dispatch/tuning/tuning_three_way_partition.cuh
    • cccl-2.4.0/cub/cub/device/dispatch/tuning/tuning_unique_by_key.cuh
    • cccl-2.4.0/cub/cub/grid/grid_barrier.cuh
    • cccl-2.4.0/cub/cub/grid/grid_even_share.cuh
    • cccl-2.4.0/cub/cub/grid/grid_mapping.cuh
    • cccl-2.4.0/cub/cub/grid/grid_queue.cuh
    • cccl-2.4.0/cub/cub/host/mutex.cuh
    • cccl-2.4.0/cub/cub/iterator/arg_index_input_iterator.cuh
    • cccl-2.4.0/cub/cub/iterator/cache_modified_input_iterator.cuh
    • cccl-2.4.0/cub/cub/iterator/cache_modified_output_iterator.cuh
    • cccl-2.4.0/cub/cub/iterator/constant_input_iterator.cuh
    • cccl-2.4.0/cub/cub/iterator/counting_input_iterator.cuh
    • cccl-2.4.0/cub/cub/iterator/discard_output_iterator.cuh
    • cccl-2.4.0/cub/cub/iterator/tex_obj_input_iterator.cuh
    • cccl-2.4.0/cub/cub/iterator/tex_ref_input_iterator.cuh
    • cccl-2.4.0/cub/cub/iterator/transform_input_iterator.cuh
    • cccl-2.4.0/cub/cub/thread/thread_load.cuh
    • cccl-2.4.0/cub/cub/thread/thread_operators.cuh
    • cccl-2.4.0/cub/cub/thread/thread_reduce.cuh
    • cccl-2.4.0/cub/cub/thread/thread_scan.cuh
    • cccl-2.4.0/cub/cub/thread/thread_search.cuh
    • cccl-2.4.0/cub/cub/thread/thread_sort.cuh
    • cccl-2.4.0/cub/cub/thread/thread_store.cuh
    • cccl-2.4.0/cub/cub/util_allocator.cuh
    • cccl-2.4.0/cub/cub/util_arch.cuh
    • cccl-2.4.0/cub/cub/util_compiler.cuh
    • cccl-2.4.0/cub/cub/util_cpp_dialect.cuh
    • cccl-2.4.0/cub/cub/util_debug.cuh
    • cccl-2.4.0/cub/cub/util_deprecated.cuh
    • cccl-2.4.0/cub/cub/util_device.cuh
    • cccl-2.4.0/cub/cub/util_macro.cuh
    • cccl-2.4.0/cub/cub/util_math.cuh
    • cccl-2.4.0/cub/cub/util_namespace.cuh
    • cccl-2.4.0/cub/cub/util_ptx.cuh
    • cccl-2.4.0/cub/cub/util_temporary_storage.cuh
    • cccl-2.4.0/cub/cub/util_type.cuh
    • cccl-2.4.0/cub/cub/version.cuh
    • cccl-2.4.0/cub/cub/warp/specializations/warp_exchange_shfl.cuh
    • cccl-2.4.0/cub/cub/warp/specializations/warp_exchange_smem.cuh
    • cccl-2.4.0/cub/cub/warp/specializations/warp_reduce_shfl.cuh
    • cccl-2.4.0/cub/cub/warp/specializations/warp_reduce_smem.cuh
    • cccl-2.4.0/cub/cub/warp/specializations/warp_scan_shfl.cuh
    • cccl-2.4.0/cub/cub/warp/specializations/warp_scan_smem.cuh
    • cccl-2.4.0/cub/cub/warp/warp_exchange.cuh
    • cccl-2.4.0/cub/cub/warp/warp_load.cuh
    • cccl-2.4.0/cub/cub/warp/warp_merge_sort.cuh
    • cccl-2.4.0/cub/cub/warp/warp_reduce.cuh
    • cccl-2.4.0/cub/cub/warp/warp_scan.cuh
    • cccl-2.4.0/cub/cub/warp/warp_store.cuh
    • cccl-2.4.0/libcudacxx/docs/extended_api/asynchronous_operations/memcpy_async.md
    • cccl-2.4.0/libcudacxx/docs/extended_api/asynchronous_operations/memcpy_async_tx.md
    • cccl-2.4.0/libcudacxx/docs/extended_api/functional/proclaim_return_type.md
    • cccl-2.4.0/libcudacxx/docs/extended_api/memory_access_properties/access_property.md
    • cccl-2.4.0/libcudacxx/docs/extended_api/memory_access_properties/associate_access_property.md
    • cccl-2.4.0/libcudacxx/docs/extended_api/memory_access_properties/discard_memory.md
    • cccl-2.4.0/libcudacxx/docs/extended_api/shapes/aligned_size_t.md
    • cccl-2.4.0/libcudacxx/docs/extended_api/synchronization_primitives/atomic.md
    • cccl-2.4.0/libcudacxx/docs/extended_api/synchronization_primitives/atomic/atomic_thread_fence.md
    • cccl-2.4.0/libcudacxx/docs/extended_api/synchronization_primitives/atomic/fetch_max.md
    • cccl-2.4.0/libcudacxx/docs/extended_api/synchronization_primitives/atomic/fetch_min.md
    • cccl-2.4.0/libcudacxx/docs/extended_api/synchronization_primitives/atomic_ref.md
    • cccl-2.4.0/libcudacxx/docs/extended_api/synchronization_primitives/barrier.md
    • cccl-2.4.0/libcudacxx/docs/extended_api/synchronization_primitives/barrier/barrier_arrive_tx.md
    • cccl-2.4.0/libcudacxx/docs/extended_api/synchronization_primitives/barrier/barrier_expect_tx.md
    • cccl-2.4.0/libcudacxx/docs/extended_api/synchronization_primitives/barrier/barrier_native_handle.md
    • cccl-2.4.0/libcudacxx/docs/extended_api/synchronization_primitives/barrier/init.md
    • cccl-2.4.0/libcudacxx/docs/extended_api/synchronization_primitives/binary_semaphore.md
    • cccl-2.4.0/libcudacxx/docs/extended_api/synchronization_primitives/counting_semaphore.md
    • cccl-2.4.0/libcudacxx/docs/extended_api/synchronization_primitives/latch.md
    • cccl-2.4.0/libcudacxx/docs/extended_api/synchronization_primitives/make_pipeline.md
    • cccl-2.4.0/libcudacxx/docs/extended_api/synchronization_primitives/pipeline.md
    • cccl-2.4.0/libcudacxx/docs/extended_api/synchronization_primitives/pipeline_consumer_wait_prior.md
    • cccl-2.4.0/libcudacxx/docs/extended_api/synchronization_primitives/pipeline_producer_commit.md
    • cccl-2.4.0/libcudacxx/docs/extended_api/synchronization_primitives/pipeline_shared_state.md
    • cccl-2.4.0/libcudacxx/docs/extended_api/synchronization_primitives/pipeline_shared_state/constructor.md
    • cccl-2.4.0/libcudacxx/docs/extended_api/thread_groups.md
    • cccl-2.4.0/libcudacxx/docs/overview.md
    • cccl-2.4.0/libcudacxx/docs/ptx/instructions/mbarrier.arrive.md
    • cccl-2.4.0/libcudacxx/include/cuda/memory_resource
    • cccl-2.4.0/libcudacxx/include/cuda/std/detail/libcxx/include/__type_traits/promote.h
    • cccl-2.4.0/libcudacxx/include/cuda/stream_ref
    • cccl-2.4.0/thrust/cmake/detect_compute_archs.cu
    • cccl-2.4.0/thrust/thrust/detail/config/config.h
    • cccl-2.4.0/thrust/thrust/detail/config/cpp_compatibility.h
    • cccl-2.4.0/thrust/thrust/detail/type_traits.h
    • cccl-2.4.0/thrust/thrust/detail/type_traits/has_trivial_assign.h
    • cccl-2.4.0/thrust/thrust/device_new_allocator.h
    • cccl-2.4.0/thrust/thrust/iterator/detail/tuple_of_iterator_references.h
    • cccl-2.4.0/thrust/thrust/pair.h
    • cccl-2.4.0/thrust/thrust/tuple.h
    • cccl-2.4.0/thrust/thrust/type_traits/integer_sequence.h
    • cccl-2.4.0/thrust/thrust/version.h
  • cminpack-1.3.8.tar.gz
    • cminpack-1.3.8/cuda/chkder.cu
    • cminpack-1.3.8/cuda/covar.cu
    • cminpack-1.3.8/cuda/covar1.cu
    • cminpack-1.3.8/cuda/dogleg.cu
    • cminpack-1.3.8/cuda/dpmpar.cu
    • cminpack-1.3.8/cuda/enorm.cu
    • cminpack-1.3.8/cuda/fdjac1.cu
    • cminpack-1.3.8/cuda/fdjac2.cu
    • cminpack-1.3.8/cuda/hybrd.cu
    • cminpack-1.3.8/cuda/hybrd1.cu
    • cminpack-1.3.8/cuda/hybrj.cu
    • cminpack-1.3.8/cuda/hybrj1.cu
    • cminpack-1.3.8/cuda/lmder.cu
    • cminpack-1.3.8/cuda/lmder1.cu
    • cminpack-1.3.8/cuda/lmdif.cu
    • cminpack-1.3.8/cuda/lmdif1.cu
    • cminpack-1.3.8/cuda/lmpar.cu
    • cminpack-1.3.8/cuda/lmstr.cu
    • cminpack-1.3.8/cuda/lmstr1.cu
    • cminpack-1.3.8/cuda/qform.cu
    • cminpack-1.3.8/cuda/qrfac.cu
    • cminpack-1.3.8/cuda/qrsolv.cu
    • cminpack-1.3.8/cuda/r1mpyq.cu
    • cminpack-1.3.8/cuda/r1updt.cu
    • cminpack-1.3.8/cuda/rwupdt.cu
  • colmap-3.9.1.tar.gz
    • colmap-3.9.1/cmake/SelectCudaComputeArch.cmake
    • colmap-3.9.1/src/colmap/mvs/cuda_flip.h
    • colmap-3.9.1/src/colmap/mvs/cuda_rotate.h
    • colmap-3.9.1/src/colmap/mvs/cuda_texture.h
    • colmap-3.9.1/src/colmap/mvs/cuda_transpose.h
    • colmap-3.9.1/src/colmap/mvs/gpu_mat.h
    • colmap-3.9.1/src/colmap/mvs/gpu_mat_prng.cu
    • colmap-3.9.1/src/colmap/mvs/gpu_mat_ref_image.cu
    • colmap-3.9.1/src/colmap/mvs/patch_match_cuda.cu
    • colmap-3.9.1/src/colmap/mvs/patch_match_cuda.h
    • colmap-3.9.1/src/colmap/util/cuda.cc
    • colmap-3.9.1/src/colmap/util/cudacc.h
    • colmap-3.9.1/src/thirdparty/SiftGPU/CuTexImage.cpp
    • colmap-3.9.1/src/thirdparty/SiftGPU/CuTexImage.h
    • colmap-3.9.1/src/thirdparty/SiftGPU/ProgramCU.cu
    • colmap-3.9.1/src/thirdparty/SiftGPU/SiftMatchCU.cpp
  • cuda-api-wrappers-0.7.0-b2.tar.gz
    • cuda-api-wrappers-0.7.0-b2/src/cuda/api/array.hpp
    • cuda-api-wrappers-0.7.0-b2/src/cuda/api/current_device.hpp
    • cuda-api-wrappers-0.7.0-b2/src/cuda/api/device.hpp
    • cuda-api-wrappers-0.7.0-b2/src/cuda/api/device_properties.hpp
    • cuda-api-wrappers-0.7.0-b2/src/cuda/api/error.hpp
    • cuda-api-wrappers-0.7.0-b2/src/cuda/api/graph/template.hpp
    • cuda-api-wrappers-0.7.0-b2/src/cuda/api/kernel_launch.hpp
    • cuda-api-wrappers-0.7.0-b2/src/cuda/api/kernels/apriori_compiled.hpp
    • cuda-api-wrappers-0.7.0-b2/src/cuda/api/memory.hpp
    • cuda-api-wrappers-0.7.0-b2/src/cuda/api/multi_wrapper_impls/kernel_launch.hpp
    • cuda-api-wrappers-0.7.0-b2/src/cuda/api/multi_wrapper_impls/memory.hpp
    • cuda-api-wrappers-0.7.0-b2/src/cuda/api/types.hpp
    • cuda-api-wrappers-0.7.0-b2/src/cuda/api/virtual_memory.hpp
    • cuda-api-wrappers-0.7.0-b2/src/cuda/nvtx/profiling.hpp
    • cuda-api-wrappers-0.7.0-b2/src/cuda/rtc/error.hpp
    • cuda-api-wrappers-0.7.0-b2/src/cuda/rtc/types.hpp
    • cuda-api-wrappers-0.7.0-b2/src/cuda/rtc/versions.hpp
  • cuda-kat-0.2.tar.gz
    • cuda-kat-0.2/src/cuda-kat.cuh
    • cuda-kat-0.2/src/kat/detail/pointers.cuh
    • cuda-kat-0.2/src/kat/on_device/atomics.cuh
    • cuda-kat-0.2/src/kat/on_device/builtins.cuh
    • cuda-kat-0.2/src/kat/on_device/c_standard_library/string.cuh
    • cuda-kat-0.2/src/kat/on_device/collaboration/block.cuh
    • cuda-kat-0.2/src/kat/on_device/collaboration/grid.cuh
    • cuda-kat-0.2/src/kat/on_device/collaboration/warp.cuh
    • cuda-kat-0.2/src/kat/on_device/common.cuh
    • cuda-kat-0.2/src/kat/on_device/constexpr_math.cuh
    • cuda-kat-0.2/src/kat/on_device/detail/atomics.cuh
    • cuda-kat-0.2/src/kat/on_device/detail/atomics/missing_in_cuda.cuh
    • cuda-kat-0.2/src/kat/on_device/detail/builtins.cuh
    • cuda-kat-0.2/src/kat/on_device/detail/itoa.cuh
    • cuda-kat-0.2/src/kat/on_device/detail/shuffle.cuh
    • cuda-kat-0.2/src/kat/on_device/grid_info.cuh
    • cuda-kat-0.2/src/kat/on_device/math.cuh
    • cuda-kat-0.2/src/kat/on_device/miscellany.cuh
    • cuda-kat-0.2/src/kat/on_device/non-builtins.cuh
    • cuda-kat-0.2/src/kat/on_device/ptx.cuh
    • cuda-kat-0.2/src/kat/on_device/ptx/detail/define_macros.cuh
    • cuda-kat-0.2/src/kat/on_device/ptx/detail/undefine_macros.cuh
    • cuda-kat-0.2/src/kat/on_device/ptx/miscellany.cuh
    • cuda-kat-0.2/src/kat/on_device/ptx/special_registers.cuh
    • cuda-kat-0.2/src/kat/on_device/ptx/video_instructions.cuh
    • cuda-kat-0.2/src/kat/on_device/ranges.cuh
    • cuda-kat-0.2/src/kat/on_device/sequence_ops/block.cuh
    • cuda-kat-0.2/src/kat/on_device/sequence_ops/common.cuh
    • cuda-kat-0.2/src/kat/on_device/sequence_ops/grid.cuh
    • cuda-kat-0.2/src/kat/on_device/sequence_ops/warp.cuh
    • cuda-kat-0.2/src/kat/on_device/shared_memory.cuh
    • cuda-kat-0.2/src/kat/on_device/shared_memory/basic.cuh
    • cuda-kat-0.2/src/kat/on_device/shared_memory/operations.cuh
    • cuda-kat-0.2/src/kat/on_device/shuffle.cuh
    • cuda-kat-0.2/src/kat/on_device/time.cuh
  • cuda-samples-12.2.tar.gz
    • cuda-samples-12.2/Common/UtilNPP/ImageAllocatorsNPP.h
    • cuda-samples-12.2/Common/UtilNPP/ImagesNPP.h
    • cuda-samples-12.2/Common/UtilNPP/SignalAllocatorsNPP.h
    • cuda-samples-12.2/Common/UtilNPP/SignalsNPP.h
    • cuda-samples-12.2/Common/helper_cusolver.h
    • cuda-samples-12.2/Common/nvrtc_helper.h
  • darknet-61c9d02ec461e30d55762ec7669d6a1d3c356fb2.tar.gz
    • darknet-61c9d02ec461e30d55762ec7669d6a1d3c356fb2/src/activation_kernels.cu
    • darknet-61c9d02ec461e30d55762ec7669d6a1d3c356fb2/src/avgpool_layer_kernels.cu
    • darknet-61c9d02ec461e30d55762ec7669d6a1d3c356fb2/src/blas_kernels.cu
    • darknet-61c9d02ec461e30d55762ec7669d6a1d3c356fb2/src/col2im_kernels.cu
    • darknet-61c9d02ec461e30d55762ec7669d6a1d3c356fb2/src/convolutional_kernels.cu
    • darknet-61c9d02ec461e30d55762ec7669d6a1d3c356fb2/src/crop_layer_kernels.cu
    • darknet-61c9d02ec461e30d55762ec7669d6a1d3c356fb2/src/deconvolutional_kernels.cu
    • darknet-61c9d02ec461e30d55762ec7669d6a1d3c356fb2/src/dropout_layer_kernels.cu
    • darknet-61c9d02ec461e30d55762ec7669d6a1d3c356fb2/src/im2col_kernels.cu
    • darknet-61c9d02ec461e30d55762ec7669d6a1d3c356fb2/src/maxpool_layer_kernels.cu
  • dlib-19.24.2.tar.gz
    • dlib-19.24.2/dlib/cuda/cuda_dlib.cu
    • dlib-19.24.2/dlib/cuda/cuda_utils.h
    • dlib-19.24.2/dlib/cuda/cusolver_dlibapi.cu
    • dlib-19.24.2/dlib/cuda/gpu_data.cpp
  • ffmpeg-6.1.1.tar.bz2
    • ffmpeg-6.1.1/libavfilter/vf_chromakey_cuda.cu
    • ffmpeg-6.1.1/libavfilter/vf_yadif_cuda.cu
    • ffmpeg-6.1.1/libavfilter/vf_scale_cuda.cu
    • ffmpeg-6.1.1/libavfilter/cuda/vector_helpers.cuh
    • ffmpeg-6.1.1/libavfilter/vf_thumbnail_cuda.cu
    • ffmpeg-6.1.1/libavfilter/vf_bwdif_cuda.cu
    • ffmpeg-6.1.1/libavfilter/vf_overlay_cuda.cu
    • ffmpeg-6.1.1/libavfilter/vf_colorspace_cuda.cu
    • ffmpeg-6.1.1/libavfilter/vf_bilateral_cuda.cu
    • ffmpeg-6.1.1/libavutil/hwcontext_cuda.h
  • ginkgo-1.7.0.tar.gz
    • ginkgo-1.7.0/core/base/extended_float.hpp
    • ginkgo-1.7.0/cuda/base/batch_multi_vector_kernels.cu
    • ginkgo-1.7.0/cuda/base/cusparse_bindings.hpp
    • ginkgo-1.7.0/cuda/base/cusparse_block_bindings.hpp
    • ginkgo-1.7.0/cuda/base/cusparse_handle.hpp
    • ginkgo-1.7.0/cuda/base/device.cpp
    • ginkgo-1.7.0/cuda/base/device_matrix_data_kernels.cu
    • ginkgo-1.7.0/cuda/base/exception.cpp
    • ginkgo-1.7.0/cuda/base/executor.cpp
    • ginkgo-1.7.0/cuda/base/kernel_config.hpp
    • ginkgo-1.7.0/cuda/base/kernel_launch.cuh
    • ginkgo-1.7.0/cuda/base/kernel_launch_reduction.cuh
    • ginkgo-1.7.0/cuda/base/kernel_launch_solver.cuh
    • ginkgo-1.7.0/cuda/base/memory.cpp
    • ginkgo-1.7.0/cuda/base/nvtx.cpp
    • ginkgo-1.7.0/cuda/base/pointer_mode_guard.hpp
    • ginkgo-1.7.0/cuda/base/scoped_device_id.cpp
    • ginkgo-1.7.0/cuda/base/stream.cpp
    • ginkgo-1.7.0/cuda/base/thrust.cuh
    • ginkgo-1.7.0/cuda/base/timer.cpp
    • ginkgo-1.7.0/cuda/base/types.hpp
    • ginkgo-1.7.0/cuda/components/atomic.cuh
    • ginkgo-1.7.0/cuda/components/cooperative_groups.cuh
    • ginkgo-1.7.0/cuda/components/diagonal_block_manipulation.cuh
    • ginkgo-1.7.0/cuda/components/format_conversion.cuh
    • ginkgo-1.7.0/cuda/components/intrinsics.cuh
    • ginkgo-1.7.0/cuda/components/memory.cuh
    • ginkgo-1.7.0/cuda/components/merging.cuh
    • ginkgo-1.7.0/cuda/components/prefix_sum.cuh
    • ginkgo-1.7.0/cuda/components/prefix_sum_kernels.cu
    • ginkgo-1.7.0/cuda/components/reduction.cuh
    • ginkgo-1.7.0/cuda/components/searching.cuh
    • ginkgo-1.7.0/cuda/components/segment_scan.cuh
    • ginkgo-1.7.0/cuda/components/sorting.cuh
    • ginkgo-1.7.0/cuda/components/syncfree.cuh
    • ginkgo-1.7.0/cuda/components/thread_ids.cuh
    • ginkgo-1.7.0/cuda/components/warp_blas.cuh
    • ginkgo-1.7.0/cuda/distributed/matrix_kernels.cu
    • ginkgo-1.7.0/cuda/distributed/partition_helpers_kernels.cu
    • ginkgo-1.7.0/cuda/distributed/partition_kernels.cu
    • ginkgo-1.7.0/cuda/distributed/vector_kernels.cu
    • ginkgo-1.7.0/cuda/factorization/cholesky_kernels.cu
    • ginkgo-1.7.0/cuda/factorization/factorization_kernels.cu
    • ginkgo-1.7.0/cuda/factorization/ic_kernels.cu
    • ginkgo-1.7.0/cuda/factorization/ilu_kernels.cu
    • ginkgo-1.7.0/cuda/factorization/lu_kernels.cu
    • ginkgo-1.7.0/cuda/factorization/par_ic_kernels.cu
    • ginkgo-1.7.0/cuda/factorization/par_ict_kernels.cu
    • ginkgo-1.7.0/cuda/factorization/par_ilu_kernels.cu
    • ginkgo-1.7.0/cuda/factorization/par_ilut_approx_filter_kernel.cu
    • ginkgo-1.7.0/cuda/factorization/par_ilut_filter_kernel.cu
    • ginkgo-1.7.0/cuda/factorization/par_ilut_select_common.cu
    • ginkgo-1.7.0/cuda/factorization/par_ilut_select_common.cuh
    • ginkgo-1.7.0/cuda/factorization/par_ilut_select_kernel.cu
    • ginkgo-1.7.0/cuda/factorization/par_ilut_spgeam_kernel.cu
    • ginkgo-1.7.0/cuda/factorization/par_ilut_sweep_kernel.cu
    • ginkgo-1.7.0/cuda/log/batch_logger.cuh
    • ginkgo-1.7.0/cuda/matrix/batch_dense_kernels.cu
    • ginkgo-1.7.0/cuda/matrix/batch_ell_kernels.cu
    • ginkgo-1.7.0/cuda/matrix/coo_kernels.cu
    • ginkgo-1.7.0/cuda/matrix/csr_kernels.instantiate.cu
    • ginkgo-1.7.0/cuda/matrix/csr_kernels.template.cu
    • ginkgo-1.7.0/cuda/matrix/dense_kernels.cu
    • ginkgo-1.7.0/cuda/matrix/diagonal_kernels.cu
    • ginkgo-1.7.0/cuda/matrix/ell_kernels.cu
    • ginkgo-1.7.0/cuda/matrix/fbcsr_kernels.instantiate.cu
    • ginkgo-1.7.0/cuda/matrix/fbcsr_kernels.template.cu
    • ginkgo-1.7.0/cuda/matrix/fft_kernels.cu
    • ginkgo-1.7.0/cuda/matrix/sellp_kernels.cu
    • ginkgo-1.7.0/cuda/matrix/sparsity_csr_kernels.cu
    • ginkgo-1.7.0/cuda/multigrid/pgm_kernels.cu
    • ginkgo-1.7.0/cuda/preconditioner/batch_preconditioners.cuh
    • ginkgo-1.7.0/cuda/preconditioner/isai_kernels.cu
    • ginkgo-1.7.0/cuda/preconditioner/jacobi_advanced_apply_instantiate.inc.cu
    • ginkgo-1.7.0/cuda/preconditioner/jacobi_advanced_apply_kernel.cu
    • ginkgo-1.7.0/cuda/preconditioner/jacobi_generate_instantiate.inc.cu
    • ginkgo-1.7.0/cuda/preconditioner/jacobi_generate_kernel.cu
    • ginkgo-1.7.0/cuda/preconditioner/jacobi_kernels.cu
    • ginkgo-1.7.0/cuda/preconditioner/jacobi_simple_apply_instantiate.inc.cu
    • ginkgo-1.7.0/cuda/preconditioner/jacobi_simple_apply_kernel.cu
    • ginkgo-1.7.0/cuda/reorder/rcm_kernels.cu
    • ginkgo-1.7.0/cuda/solver/batch_bicgstab_kernels.cu
    • ginkgo-1.7.0/cuda/solver/cb_gmres_kernels.cu
    • ginkgo-1.7.0/cuda/solver/common_trs_kernels.cuh
    • ginkgo-1.7.0/cuda/solver/idr_kernels.cu
    • ginkgo-1.7.0/cuda/solver/lower_trs_kernels.cu
    • ginkgo-1.7.0/cuda/solver/multigrid_kernels.cu
    • ginkgo-1.7.0/cuda/solver/upper_trs_kernels.cu
    • ginkgo-1.7.0/cuda/stop/batch_criteria.cuh
    • ginkgo-1.7.0/cuda/stop/criterion_kernels.cu
    • ginkgo-1.7.0/cuda/stop/residual_norm_kernels.cu
    • ginkgo-1.7.0/dev_tools/oneapi/fake_interface/cooperative_groups.cuh
    • ginkgo-1.7.0/dev_tools/oneapi/working_directory/trick/reduction.hpp
    • ginkgo-1.7.0/third_party/identify_stream_usage/identify_stream_usage.cpp
  • gst-plugins-bad-1.19.1.tar.bz2
    • gst-plugins-bad-1.19.1/sys/nvcodec/cuviddec.h
    • gst-plugins-bad-1.19.1/sys/nvcodec/gstcudacontext.h
  • gtsam_points-1.0.0.tar.gz
    • gtsam_points-1.0.0/include/gtsam_points/cuda/check_error.cuh
    • gtsam_points-1.0.0/include/gtsam_points/cuda/check_error_curand.cuh
    • gtsam_points-1.0.0/include/gtsam_points/cuda/check_error_cusolver.cuh
    • gtsam_points-1.0.0/include/gtsam_points/cuda/cuda_graph.cuh
    • gtsam_points-1.0.0/include/gtsam_points/cuda/kernels/inlier_access_kernel.cuh
    • gtsam_points-1.0.0/include/gtsam_points/cuda/kernels/linearized_system.cuh
    • gtsam_points-1.0.0/include/gtsam_points/cuda/kernels/lookup_voxels.cuh
    • gtsam_points-1.0.0/include/gtsam_points/cuda/kernels/pose.cuh
    • gtsam_points-1.0.0/include/gtsam_points/cuda/kernels/untie.cuh
    • gtsam_points-1.0.0/include/gtsam_points/cuda/kernels/vector3_hash.cuh
    • gtsam_points-1.0.0/include/gtsam_points/cuda/kernels/vgicp_derivatives.cuh
    • gtsam_points-1.0.0/include/gtsam_points/factors/integrated_vgicp_derivatives.cuh
    • gtsam_points-1.0.0/src/gtsam_points/cuda/check_error.cu
    • gtsam_points-1.0.0/src/gtsam_points/cuda/check_error_curand.cu
    • gtsam_points-1.0.0/src/gtsam_points/cuda/check_error_cusolver.cu
    • gtsam_points-1.0.0/src/gtsam_points/cuda/cuda_buffer.cu
    • gtsam_points-1.0.0/src/gtsam_points/cuda/cuda_device_prop.cu
    • gtsam_points-1.0.0/src/gtsam_points/cuda/cuda_device_sync.cu
    • gtsam_points-1.0.0/src/gtsam_points/cuda/cuda_graph.cu
    • gtsam_points-1.0.0/src/gtsam_points/cuda/cuda_graph_exec.cu
    • gtsam_points-1.0.0/src/gtsam_points/cuda/cuda_memory.cu
    • gtsam_points-1.0.0/src/gtsam_points/cuda/cuda_stream.cu
    • gtsam_points-1.0.0/src/gtsam_points/cuda/gl_buffer_map.cu
    • gtsam_points-1.0.0/src/gtsam_points/cuda/nonlinear_factor_set_gpu.cu
    • gtsam_points-1.0.0/src/gtsam_points/cuda/nonlinear_factor_set_gpu_create.cu
    • gtsam_points-1.0.0/src/gtsam_points/cuda/stream_roundrobin.cu
    • gtsam_points-1.0.0/src/gtsam_points/cuda/stream_temp_buffer_roundrobin.cu
    • gtsam_points-1.0.0/src/gtsam_points/factors/integrated_vgicp_derivatives.cu
    • gtsam_points-1.0.0/src/gtsam_points/factors/integrated_vgicp_derivatives_compute.cu
    • gtsam_points-1.0.0/src/gtsam_points/factors/integrated_vgicp_derivatives_inliers.cu
    • gtsam_points-1.0.0/src/gtsam_points/factors/integrated_vgicp_derivatives_linearize.cu
    • gtsam_points-1.0.0/src/gtsam_points/factors/integrated_vgicp_factor_gpu.cu
    • gtsam_points-1.0.0/src/gtsam_points/types/gaussian_voxelmap_gpu.cu
    • gtsam_points-1.0.0/src/gtsam_points/types/gaussian_voxelmap_gpu_funcs.cu
    • gtsam_points-1.0.0/src/gtsam_points/types/point_cloud.cu
    • gtsam_points-1.0.0/src/gtsam_points/types/point_cloud_gpu.cu
    • gtsam_points-1.0.0/src/gtsam_points/util/easy_profiler_cuda.cu
  • hwloc-2.10.0.tar.bz2
    • hwloc-2.10.0/hwloc/topology-cuda.c
    • hwloc-2.10.0/configure
    • hwloc-2.10.0/config/hwloc.m4
    • hwloc-2.10.0/include/hwloc/cudart.h
    • hwloc-2.10.0/include/hwloc/cuda.h
  • libfabric-1.18.1.tar.bz2
    • libfabric-1.18.1/prov/efa/src/efa_mr.c
    • libfabric-1.18.1/prov/psm3/psm3/psm_user.h
    • libfabric-1.18.1/src/hmem_cuda.c
    • libfabric-1.18.1/include/ofi_hmem.h
  • libfreenect2-0.2.1.tar.gz
    • libfreenect2-0.2.1/src/cuda_depth_packet_processor.cu
    • libfreenect2-0.2.1/src/cuda_kde_depth_packet_processor.cu
  • librealsense-2.53.1.tar.gz
    • librealsense-2.53.1/src/cuda/cuda-conversion.cu
    • librealsense-2.53.1/src/cuda/cuda-conversion.cuh
    • librealsense-2.53.1/src/cuda/cuda-pointcloud.cu
    • librealsense-2.53.1/src/cuda/cuda-pointcloud.cuh
    • librealsense-2.53.1/src/cuda/rscuda_utils.cuh
    • librealsense-2.53.1/src/proc/cuda/cuda-align.cu
    • librealsense-2.53.1/src/proc/cuda/cuda-align.cuh
  • llama.cpp-b3040.tar.gz
    • llama.cpp-b3040/ggml-common.h
    • llama.cpp-b3040/ggml-cuda.cu
    • llama.cpp-b3040/ggml-cuda/acc.cu
    • llama.cpp-b3040/ggml-cuda/acc.cuh
    • llama.cpp-b3040/ggml-cuda/arange.cu
    • llama.cpp-b3040/ggml-cuda/arange.cuh
    • llama.cpp-b3040/ggml-cuda/argsort.cu
    • llama.cpp-b3040/ggml-cuda/argsort.cuh
    • llama.cpp-b3040/ggml-cuda/binbcast.cu
    • llama.cpp-b3040/ggml-cuda/binbcast.cuh
    • llama.cpp-b3040/ggml-cuda/clamp.cu
    • llama.cpp-b3040/ggml-cuda/clamp.cuh
    • llama.cpp-b3040/ggml-cuda/common.cuh
    • llama.cpp-b3040/ggml-cuda/concat.cu
    • llama.cpp-b3040/ggml-cuda/concat.cuh
    • llama.cpp-b3040/ggml-cuda/convert.cu
    • llama.cpp-b3040/ggml-cuda/convert.cuh
    • llama.cpp-b3040/ggml-cuda/cpy.cu
    • llama.cpp-b3040/ggml-cuda/cpy.cuh
    • llama.cpp-b3040/ggml-cuda/dequantize.cuh
    • llama.cpp-b3040/ggml-cuda/diagmask.cu
    • llama.cpp-b3040/ggml-cuda/diagmask.cuh
    • llama.cpp-b3040/ggml-cuda/dmmv.cu
    • llama.cpp-b3040/ggml-cuda/dmmv.cuh
    • llama.cpp-b3040/ggml-cuda/fattn-common.cuh
    • llama.cpp-b3040/ggml-cuda/fattn-tile-f16.cu
    • llama.cpp-b3040/ggml-cuda/fattn-tile-f16.cuh
    • llama.cpp-b3040/ggml-cuda/fattn-tile-f32.cu
    • llama.cpp-b3040/ggml-cuda/fattn-tile-f32.cuh
    • llama.cpp-b3040/ggml-cuda/fattn-vec-f16.cu
    • llama.cpp-b3040/ggml-cuda/fattn-vec-f16.cuh
    • llama.cpp-b3040/ggml-cuda/fattn-vec-f32.cu
    • llama.cpp-b3040/ggml-cuda/fattn-vec-f32.cuh
    • llama.cpp-b3040/ggml-cuda/fattn.cu
    • llama.cpp-b3040/ggml-cuda/fattn.cuh
    • llama.cpp-b3040/ggml-cuda/getrows.cu
    • llama.cpp-b3040/ggml-cuda/getrows.cuh
    • llama.cpp-b3040/ggml-cuda/im2col.cu
    • llama.cpp-b3040/ggml-cuda/im2col.cuh
    • llama.cpp-b3040/ggml-cuda/mmq.cu
    • llama.cpp-b3040/ggml-cuda/mmq.cuh
    • llama.cpp-b3040/ggml-cuda/mmvq.cu
    • llama.cpp-b3040/ggml-cuda/mmvq.cuh
    • llama.cpp-b3040/ggml-cuda/norm.cu
    • llama.cpp-b3040/ggml-cuda/norm.cuh
    • llama.cpp-b3040/ggml-cuda/pad.cu
    • llama.cpp-b3040/ggml-cuda/pad.cuh
    • llama.cpp-b3040/ggml-cuda/pool2d.cu
    • llama.cpp-b3040/ggml-cuda/pool2d.cuh
    • llama.cpp-b3040/ggml-cuda/quantize.cu
    • llama.cpp-b3040/ggml-cuda/quantize.cuh
    • llama.cpp-b3040/ggml-cuda/rope.cu
    • llama.cpp-b3040/ggml-cuda/rope.cuh
    • llama.cpp-b3040/ggml-cuda/scale.cu
    • llama.cpp-b3040/ggml-cuda/scale.cuh
    • llama.cpp-b3040/ggml-cuda/softmax.cu
    • llama.cpp-b3040/ggml-cuda/softmax.cuh
    • llama.cpp-b3040/ggml-cuda/sumrows.cu
    • llama.cpp-b3040/ggml-cuda/sumrows.cuh
    • llama.cpp-b3040/ggml-cuda/tsembd.cu
    • llama.cpp-b3040/ggml-cuda/tsembd.cuh
    • llama.cpp-b3040/ggml-cuda/unary.cu
    • llama.cpp-b3040/ggml-cuda/unary.cuh
    • llama.cpp-b3040/ggml-cuda/upscale.cu
    • llama.cpp-b3040/ggml-cuda/upscale.cuh
    • llama.cpp-b3040/ggml-cuda/vecdotq.cuh
  • nsimd-3.0.1.tar.gz
    • nsimd-3.0.1/include/nsimd/nsimd.h
  • oneDNN-373e65b660c0ba274631cf30c422f10606de1618.tar.gz
    • oneDNN-373e65b660c0ba274631cf30c422f10606de1618/src/gpu/nvidia/sycl_cuda_compat.hpp
    • oneDNN-373e65b660c0ba274631cf30c422f10606de1618/src/gpu/nvidia/sycl_cuda_stream.hpp
    • oneDNN-373e65b660c0ba274631cf30c422f10606de1618/src/gpu/nvidia/sycl_cuda_utils.hpp
  • oneDNN-37f48519b87cf8b5e5ef2209340a1948c3e87d72.tar.gz
    • oneDNN-37f48519b87cf8b5e5ef2209340a1948c3e87d72/src/gpu/nvidia/sycl_cuda_compat.hpp
    • oneDNN-37f48519b87cf8b5e5ef2209340a1948c3e87d72/src/gpu/nvidia/sycl_cuda_stream.hpp
    • oneDNN-37f48519b87cf8b5e5ef2209340a1948c3e87d72/src/gpu/nvidia/sycl_cuda_utils.hpp
  • onnxruntime-1.16.3.tar.gz
    • onnxruntime-1.16.3/include/onnxruntime/core/providers/cuda/cuda_context.h
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/activation/activations_impl.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/add_bias_transpose.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/attention_impl.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/attention_impl.h
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/attention_kv_cache.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/attention_prepare_qkv.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/attention_softmax.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/attention_strided_copy.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/attention_transpose.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/bert_padding.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/bert_padding.h
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/cutlass_fmha/fmha_sm50.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/cutlass_fmha/fmha_sm70.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/cutlass_fmha/fmha_sm75.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/cutlass_fmha/fmha_sm80.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/cutlass_fmha/memory_efficient_attention.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/decoder_attention_impl.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/embed_layer_norm_impl.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/fast_gelu_impl.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/fastertransformer_decoder_attention/decoder_masked_multihead_attention_128.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/fastertransformer_decoder_attention/decoder_masked_multihead_attention_32.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/fastertransformer_decoder_attention/decoder_masked_multihead_attention_64.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/fastertransformer_decoder_attention/decoder_masked_multihead_attention_impl.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash.h
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_fwd_hdim128_fp16_sm80.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_fwd_hdim160_fp16_sm80.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_fwd_hdim192_fp16_sm80.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_fwd_hdim224_fp16_sm80.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_fwd_hdim256_fp16_sm80.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_fwd_hdim32_fp16_sm80.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_fwd_hdim64_fp16_sm80.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_fwd_hdim96_fp16_sm80.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_fwd_split_hdim128_fp16_sm80.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_fwd_split_hdim160_fp16_sm80.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_fwd_split_hdim192_fp16_sm80.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_fwd_split_hdim224_fp16_sm80.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_fwd_split_hdim256_fp16_sm80.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_fwd_split_hdim32_fp16_sm80.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_fwd_split_hdim64_fp16_sm80.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/flash_attention/flash_fwd_split_hdim96_fp16_sm80.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/flash_attention/utils.h
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/group_query_attention_impl.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/group_query_attention_impl.h
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/layer_norm.cuh
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/longformer_attention_impl.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/longformer_attention_softmax.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/longformer_global_impl.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/ngram_repeat_block_impl.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/packed_attention_impl.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/packed_attention_impl.h
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/packed_multihead_attention_impl.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/packed_multihead_attention_impl.h
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/relative_attn_bias_impl.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/rotary_embedding_impl.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/skip_layer_norm_impl.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/tensorrt_fused_multihead_attention/cudaDriverWrapper.h
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/tensorrt_fused_multihead_attention/flash_attention/sharedCubinLoader.h
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/tensorrt_fused_multihead_attention/fused_multihead_attention.h
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/tensorrt_fused_multihead_attention/mha_runner.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/bert/utils.cuh
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/diffusion/bias_add_impl.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/diffusion/bias_add_impl.h
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/diffusion/bias_split_gelu_impl.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/diffusion/bias_split_gelu_impl.h
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/diffusion/group_norm_impl.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/diffusion/group_norm_impl.h
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/grid_sample_impl.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/math/bias_dropout_impl.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/math/bias_gelu_impl.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/math/bias_softmax_impl.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/math/complex_mul_impl.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/math/fft_ops_impl.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/math/isfinite.cuh
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/math/isfinite_impl.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/math/isfinite_impl.h
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/quantization/attention_quantization_impl.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/quantization/attention_quantization_impl.cuh
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/quantization/dequantize_blockwise.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/quantization/dequantize_blockwise.cuh
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/quantization/dequantize_blockwise_bnb4.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/quantization/dequantize_blockwise_bnb4.cuh
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/quantization/matmul_bnb4.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/quantization/matmul_bnb4.cuh
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/quantization/matmul_nbits.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/quantization/matmul_nbits.cuh
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/quantization/qordered_ops/qordered_attention_impl.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/quantization/qordered_ops/qordered_common.cuh
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/quantization/qordered_ops/qordered_layer_norm_impl.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/quantization/qordered_ops/qordered_qdq_impl.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/quantization/qordered_ops/qordered_unary_ops_impl.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/tensor/crop_impl.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/tensor/image_scaler_impl.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/transformers/beam_search_topk.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/transformers/beam_search_topk.h
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/transformers/dump_cuda_tensor.cc
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/transformers/generation_cuda_impl.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/transformers/generation_cuda_impl.h
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/transformers/generation_device_helper.cc
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/transformers/greedy_search_top_one.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/cuda/transformers/greedy_search_top_one.h
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/rocm/bert/attention.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/rocm/bert/attention_impl.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/rocm/bert/batched_gemm_permute_pipelines.cuh
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/rocm/bert/batched_gemm_softmax_gemm_permute_ck_impl/impl.cuh
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/rocm/bert/batched_gemm_softmax_gemm_permute_ck_impl/impl_fp16.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/rocm/bert/batched_gemm_softmax_gemm_permute_ck_impl/impl_fp16_biased.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/rocm/bert/batched_gemm_softmax_gemm_permute_ck_impl/impl_fp16_biased_biased.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/rocm/bert/batched_gemm_softmax_gemm_permute_pipelines.cuh
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/rocm/bert/elementwise_impl/impl.cuh
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/rocm/bert/elementwise_impl/impl_fastgelu.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/rocm/bert/elementwise_impl/impl_gelu.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/rocm/bert/elementwise_impl/impl_relu.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/rocm/bert/gemm_fast_gelu_ck.cuh
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/rocm/bert/gemm_fast_gelu_impl.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/rocm/bert/gemm_fast_gelu_tunable.cuh
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/rocm/bert/layer_norm.cuh
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/rocm/bert/multihead_attention.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/rocm/bert/skip_layer_norm_impl.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/rocm/diffusion/group_norm_ck.cuh
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/rocm/diffusion/group_norm_ck_impl/impl.cuh
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/rocm/diffusion/group_norm_ck_impl/impl_fp16.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/rocm/diffusion/group_norm_ck_impl/impl_fp32.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/rocm/diffusion/group_norm_impl.cu
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/rocm/diffusion/group_norm_impl_kernel.cuh
    • onnxruntime-1.16.3/onnxruntime/contrib_ops/rocm/diffusion/group_norm_triton.cuh
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/activation/activations_impl.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/atomic/common.cuh
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/cu_inc/binary_elementwise_impl.cuh
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/cu_inc/bitmask.cuh
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/cu_inc/common.cuh
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/cu_inc/elementwise_impl.cuh
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/cu_inc/unary_elementwise_impl.cuh
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/cu_inc/variadic_elementwise_impl.cuh
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/cuda_graph.cc
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/cuda_pch.h
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/cuda_utils.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/fpgeneric.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/generator/random_impl.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/generator/range_impl.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/math/binary_elementwise_ops_impl.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/math/binary_elementwise_ops_impl_functors.cuh
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/math/clip_impl.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/math/cumsum_impl.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/math/einsum_utils/einsum_auxiliary_ops_diagonal.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/math/matmul_integer.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/math/matmul_integer.cuh
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/math/softmax_blockwise_impl.cuh
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/math/softmax_impl.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/math/softmax_warpwise_impl.cuh
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/math/topk_impl.cuh
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/math/topk_impl_f16.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/math/topk_impl_f32.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/math/topk_impl_f64.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/math/topk_impl_i32.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/math/topk_impl_i64.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/math/unary_elementwise_ops_impl.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/math/unary_elementwise_ops_impl.h
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/math/variadic_elementwise_ops_impl.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/multi_tensor/common.cuh
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/nn/dropout_impl.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/nn/instance_norm_impl.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/nn/layer_norm_impl.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/nn/max_pool_with_index.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/nn/shrink_impl.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/object_detection/non_max_suppression_impl.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/object_detection/roialign_impl.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/reduction/reduction_functions.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/reduction/reduction_utils.cuh
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/rnn/rnn_impl.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/shared_inc/accumulation_type.h
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/shared_inc/cuda_utils.h
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/shared_inc/fast_divmod.h
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/tensor/cast_op.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/tensor/compress_impl.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/tensor/concat_impl.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/tensor/expand_impl.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/tensor/eye_like_impl.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/tensor/gather_elements_impl.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/tensor/gather_impl.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/tensor/gather_nd_impl.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/tensor/nonzero_impl.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/tensor/onehot.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/tensor/pad_impl.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/tensor/quantize_linear.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/tensor/quantize_linear.cuh
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/tensor/resize_impl.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/tensor/reverse_sequence_impl.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/tensor/scatter_nd_impl.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/tensor/slice_impl.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/tensor/split_impl.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/tensor/tile_impl.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/tensor/transpose_impl.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/tensor/trilu_impl.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/tensor/upsample_impl.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/tensor/where_impl.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/triton_kernel.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/triton_kernel.h
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/tunable/cuda_tunable.h
    • onnxruntime-1.16.3/onnxruntime/core/providers/cuda/tunable/util.h
    • onnxruntime-1.16.3/onnxruntime/core/providers/rocm/atomic/common.cuh
    • onnxruntime-1.16.3/onnxruntime/core/providers/rocm/cu_inc/common.cuh
    • onnxruntime-1.16.3/onnxruntime/core/providers/rocm/fpgeneric.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/rocm/math/einsum_utils/einsum_auxiliary_ops_diagonal.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/rocm/math/softmax_ck.cuh
    • onnxruntime-1.16.3/onnxruntime/core/providers/rocm/math/softmax_impl.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/rocm/math/softmax_triton.cuh
    • onnxruntime-1.16.3/onnxruntime/core/providers/rocm/math/softmax_tunable_op.cuh
    • onnxruntime-1.16.3/onnxruntime/core/providers/rocm/math/softmax_warpwise_impl.cuh
    • onnxruntime-1.16.3/onnxruntime/core/providers/rocm/nn/conv_impl.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/rocm/rocm_utils.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/rocm/tunable/gemm.cu
    • onnxruntime-1.16.3/onnxruntime/core/providers/rocm/tunable/gemm_ck.cuh
    • onnxruntime-1.16.3/onnxruntime/core/providers/rocm/tunable/gemm_tunable.cuh
    • onnxruntime-1.16.3/onnxruntime/python/tools/kernel_explorer/kernel_explorer_interface.h
    • onnxruntime-1.16.3/onnxruntime/python/tools/kernel_explorer/kernels/cuda/dequant_blockwise_bnb4.cu
    • onnxruntime-1.16.3/onnxruntime/python/tools/kernel_explorer/kernels/cuda/dequant_blockwise_int4.cu
    • onnxruntime-1.16.3/onnxruntime/python/tools/kernel_explorer/kernels/cuda/gemm.cu
    • onnxruntime-1.16.3/onnxruntime/python/tools/kernel_explorer/kernels/cuda/matmul_4bits.cu
    • onnxruntime-1.16.3/onnxruntime/python/tools/kernel_explorer/kernels/cuda/matmul_bnb4.cu
    • onnxruntime-1.16.3/onnxruntime/python/tools/kernel_explorer/kernels/rocm/elementwise.cu
    • onnxruntime-1.16.3/onnxruntime/python/tools/kernel_explorer/kernels/rocm/gemm_ck.cu
    • onnxruntime-1.16.3/onnxruntime/python/tools/kernel_explorer/kernels/rocm/gemm_fast_gelu_ck.cu
    • onnxruntime-1.16.3/onnxruntime/python/tools/kernel_explorer/kernels/rocm/gemm_fast_gelu_hipblaslt.cu
    • onnxruntime-1.16.3/onnxruntime/python/tools/kernel_explorer/kernels/rocm/gemm_fast_gelu_tunable.cu
    • onnxruntime-1.16.3/onnxruntime/python/tools/kernel_explorer/kernels/rocm/gemm_fast_gelu_unfused.cu
    • onnxruntime-1.16.3/onnxruntime/python/tools/kernel_explorer/kernels/rocm/gemm_hipblaslt.cu
    • onnxruntime-1.16.3/onnxruntime/python/tools/kernel_explorer/kernels/rocm/gemm_softmax_gemm_permute.cu
    • onnxruntime-1.16.3/onnxruntime/python/tools/kernel_explorer/kernels/rocm/gemm_tunable.cu
    • onnxruntime-1.16.3/onnxruntime/python/tools/kernel_explorer/kernels/rocm/group_norm.cu
    • onnxruntime-1.16.3/onnxruntime/python/tools/kernel_explorer/kernels/rocm/skip_layer_norm.cu
    • onnxruntime-1.16.3/onnxruntime/python/tools/kernel_explorer/kernels/rocm/softmax.cu
    • onnxruntime-1.16.3/onnxruntime/python/tools/kernel_explorer/kernels/vector_add.cu
    • onnxruntime-1.16.3/onnxruntime/python/tools/kernel_explorer/kernels/vector_add_kernel.cuh
    • onnxruntime-1.16.3/orttraining/orttraining/python/training/ortmodule/torch_cpp_extensions/cuda/fused_ops/multi_tensor_adam.cu
    • onnxruntime-1.16.3/orttraining/orttraining/python/training/ortmodule/torch_cpp_extensions/cuda/fused_ops/multi_tensor_apply.cuh
    • onnxruntime-1.16.3/orttraining/orttraining/python/training/ortmodule/torch_cpp_extensions/cuda/fused_ops/multi_tensor_axpby_kernel.cu
    • onnxruntime-1.16.3/orttraining/orttraining/python/training/ortmodule/torch_cpp_extensions/cuda/fused_ops/multi_tensor_l2norm_kernel.cu
    • onnxruntime-1.16.3/orttraining/orttraining/python/training/ortmodule/torch_cpp_extensions/cuda/fused_ops/multi_tensor_scale_kernel.cu
    • onnxruntime-1.16.3/orttraining/orttraining/training_ops/cuda/activation/activations_grad_impl.cu
    • onnxruntime-1.16.3/orttraining/orttraining/training_ops/cuda/activation/bias_gelu_grad_impl.cu
    • onnxruntime-1.16.3/orttraining/orttraining/training_ops/cuda/activation/gelu_grad_impl_common.cuh
    • onnxruntime-1.16.3/orttraining/orttraining/training_ops/cuda/gist/gist_impl.cu
    • onnxruntime-1.16.3/orttraining/orttraining/training_ops/cuda/gist/gist_impl.h
    • onnxruntime-1.16.3/orttraining/orttraining/training_ops/cuda/loss/softmax_cross_entropy_loss_impl.cu
    • onnxruntime-1.16.3/orttraining/orttraining/training_ops/cuda/loss/softmaxcrossentropy_impl.cu
    • onnxruntime-1.16.3/orttraining/orttraining/training_ops/cuda/math/bias_softmax_dropout_impl.cu
    • onnxruntime-1.16.3/orttraining/orttraining/training_ops/cuda/math/div_grad_impl.cu
    • onnxruntime-1.16.3/orttraining/orttraining/training_ops/cuda/math/isfinite_impl.cu
    • onnxruntime-1.16.3/orttraining/orttraining/training_ops/cuda/math/isfinite_impl.h
    • onnxruntime-1.16.3/orttraining/orttraining/training_ops/cuda/math/mixed_precision_scale_impl.cu
    • onnxruntime-1.16.3/orttraining/orttraining/training_ops/cuda/math/mixed_precision_scale_impl.h
    • onnxruntime-1.16.3/orttraining/orttraining/training_ops/cuda/math/scale_impl.cu
    • onnxruntime-1.16.3/orttraining/orttraining/training_ops/cuda/math/scale_impl.h
    • onnxruntime-1.16.3/orttraining/orttraining/training_ops/cuda/math/softmax_dropout_grad_impl.cu
    • onnxruntime-1.16.3/orttraining/orttraining/training_ops/cuda/math/softmax_grad_impl.cu
    • onnxruntime-1.16.3/orttraining/orttraining/training_ops/cuda/nn/dropout_grad_impl.cu
    • onnxruntime-1.16.3/orttraining/orttraining/training_ops/cuda/nn/layer_norm_impl.cu
    • onnxruntime-1.16.3/orttraining/orttraining/training_ops/cuda/optimizer/adam_impl.cu
    • onnxruntime-1.16.3/orttraining/orttraining/training_ops/cuda/optimizer/adam_impl.h
    • onnxruntime-1.16.3/orttraining/orttraining/training_ops/cuda/optimizer/adamw/adamw_impl.cu
    • onnxruntime-1.16.3/orttraining/orttraining/training_ops/cuda/optimizer/clip_grad_norm/clip_grad_norm_impl.cu
    • onnxruntime-1.16.3/orttraining/orttraining/training_ops/cuda/optimizer/common.cuh
    • onnxruntime-1.16.3/orttraining/orttraining/training_ops/cuda/optimizer/gradient_control_impl.cu
    • onnxruntime-1.16.3/orttraining/orttraining/training_ops/cuda/optimizer/gradient_control_impl.h
    • onnxruntime-1.16.3/orttraining/orttraining/training_ops/cuda/optimizer/lamb_impl.cu
    • onnxruntime-1.16.3/orttraining/orttraining/training_ops/cuda/optimizer/lamb_impl.h
    • onnxruntime-1.16.3/orttraining/orttraining/training_ops/cuda/optimizer/sg_impl.cu
    • onnxruntime-1.16.3/orttraining/orttraining/training_ops/cuda/optimizer/sg_impl.h
    • onnxruntime-1.16.3/orttraining/orttraining/training_ops/cuda/optimizer/sgd/sgd_impl.cu
    • onnxruntime-1.16.3/orttraining/orttraining/training_ops/cuda/quantization/fake_quant_impl.cu
    • onnxruntime-1.16.3/orttraining/orttraining/training_ops/cuda/reduction/all_impl.cu
    • onnxruntime-1.16.3/orttraining/orttraining/training_ops/cuda/reduction/all_impl.h
    • onnxruntime-1.16.3/orttraining/orttraining/training_ops/cuda/reduction/reduction_all_impl.cu
    • onnxruntime-1.16.3/orttraining/orttraining/training_ops/cuda/reduction/reduction_all_impl.h
    • onnxruntime-1.16.3/orttraining/orttraining/training_ops/cuda/tensor/gather_grad_impl.cu
    • onnxruntime-1.16.3/orttraining/orttraining/training_ops/cuda/tensor/gather_nd_grad_impl.cu
    • onnxruntime-1.16.3/orttraining/orttraining/training_ops/cuda/tensor/gather_nd_grad_impl.h
    • onnxruntime-1.16.3/orttraining/orttraining/training_ops/cuda/tensor/pad_and_unflatten_impl.cu
    • onnxruntime-1.16.3/orttraining/orttraining/training_ops/cuda/tensor/resize_grad_impl.cu
    • onnxruntime-1.16.3/orttraining/orttraining/training_ops/rocm/activation/gelu_grad_impl_common.cuh
  • opencv-4.10.0.tar.gz
    • opencv-4.10.0/cmake/checks/OpenCVDetectCudaArch.cu
    • opencv-4.10.0/modules/core/include/opencv2/core/cuda/common.hpp
    • opencv-4.10.0/modules/core/include/opencv2/core/cuda_stream_accessor.hpp
    • opencv-4.10.0/modules/core/src/cuda/gpu_mat.cu
    • opencv-4.10.0/modules/core/src/cuda/gpu_mat_nd.cu
    • opencv-4.10.0/modules/dnn/src/cuda/activation_eltwise.cu
    • opencv-4.10.0/modules/dnn/src/cuda/activations.cu
    • opencv-4.10.0/modules/dnn/src/cuda/array.hpp
    • opencv-4.10.0/modules/dnn/src/cuda/atomics.hpp
    • opencv-4.10.0/modules/dnn/src/cuda/bbox_utils.hpp
    • opencv-4.10.0/modules/dnn/src/cuda/bias_activation.cu
    • opencv-4.10.0/modules/dnn/src/cuda/bias_activation_eltwise.cu
    • opencv-4.10.0/modules/dnn/src/cuda/bias_eltwise_activation.cu
    • opencv-4.10.0/modules/dnn/src/cuda/block_stride_range.hpp
    • opencv-4.10.0/modules/dnn/src/cuda/concat.cu
    • opencv-4.10.0/modules/dnn/src/cuda/crop_and_resize.cu
    • opencv-4.10.0/modules/dnn/src/cuda/detection_output.cu
    • opencv-4.10.0/modules/dnn/src/cuda/eltwise_activation.cu
    • opencv-4.10.0/modules/dnn/src/cuda/eltwise_ops.cu
    • opencv-4.10.0/modules/dnn/src/cuda/execution.hpp
    • opencv-4.10.0/modules/dnn/src/cuda/fill_copy.cu
    • opencv-4.10.0/modules/dnn/src/cuda/fp_conversion.cu
    • opencv-4.10.0/modules/dnn/src/cuda/functors.hpp
    • opencv-4.10.0/modules/dnn/src/cuda/grid_nms.cu
    • opencv-4.10.0/modules/dnn/src/cuda/grid_stride_range.hpp
    • opencv-4.10.0/modules/dnn/src/cuda/index_helpers.hpp
    • opencv-4.10.0/modules/dnn/src/cuda/limits.hpp
    • opencv-4.10.0/modules/dnn/src/cuda/math.hpp
    • opencv-4.10.0/modules/dnn/src/cuda/max_unpooling.cu
    • opencv-4.10.0/modules/dnn/src/cuda/memory.hpp
    • opencv-4.10.0/modules/dnn/src/cuda/mvn.cu
    • opencv-4.10.0/modules/dnn/src/cuda/normalize.cu
    • opencv-4.10.0/modules/dnn/src/cuda/padding.cu
    • opencv-4.10.0/modules/dnn/src/cuda/permute.cu
    • opencv-4.10.0/modules/dnn/src/cuda/prior_box.cu
    • opencv-4.10.0/modules/dnn/src/cuda/region.cu
    • opencv-4.10.0/modules/dnn/src/cuda/resize.cu
    • opencv-4.10.0/modules/dnn/src/cuda/roi_pooling.cu
    • opencv-4.10.0/modules/dnn/src/cuda/scale_shift.cu
    • opencv-4.10.0/modules/dnn/src/cuda/shortcut.cu
    • opencv-4.10.0/modules/dnn/src/cuda/slice.cu
    • opencv-4.10.0/modules/dnn/src/cuda/vector_traits.hpp
    • opencv-4.10.0/modules/dnn/src/cuda4dnn/csl/error.hpp
    • opencv-4.10.0/modules/dnn/src/cuda4dnn/csl/event.hpp
    • opencv-4.10.0/modules/dnn/src/cuda4dnn/csl/memory.hpp
    • opencv-4.10.0/modules/dnn/src/cuda4dnn/csl/nvcc_defs.hpp
    • opencv-4.10.0/modules/dnn/src/cuda4dnn/csl/pointer.hpp
    • opencv-4.10.0/modules/dnn/src/cuda4dnn/csl/stream.hpp
    • opencv-4.10.0/modules/dnn/src/cuda4dnn/init.hpp
    • opencv-4.10.0/modules/photo/src/cuda/nlm.cu
    • opencv-4.10.0/modules/stitching/src/cuda/build_warp_maps.cu
    • opencv-4.10.0/modules/stitching/src/cuda/multiband_blend.cu
    • opencv-4.10.0/samples/cpp/tutorial_code/gpu/gpu-thrust-interop/main.cu
  • openmpi-4.1.6.tar.bz2
    • openmpi-4.1.6/opal/mca/common/cuda/common_cuda.c
  • openvdb-11.0.0.tar.gz
    • openvdb-11.0.0/nanovdb/nanovdb/util/GridStats.h
    • openvdb-11.0.0/nanovdb/nanovdb/util/cuda/CudaAddBlindData.cuh
    • openvdb-11.0.0/nanovdb/nanovdb/util/cuda/CudaGridChecksum.cuh
    • openvdb-11.0.0/nanovdb/nanovdb/util/cuda/CudaGridHandle.cuh
    • openvdb-11.0.0/nanovdb/nanovdb/util/cuda/CudaGridStats.cuh
    • openvdb-11.0.0/nanovdb/nanovdb/util/cuda/CudaIndexToGrid.cuh
    • openvdb-11.0.0/nanovdb/nanovdb/util/cuda/CudaNodeManager.cuh
    • openvdb-11.0.0/nanovdb/nanovdb/util/cuda/CudaPointsToGrid.cuh
    • openvdb-11.0.0/nanovdb/nanovdb/util/cuda/CudaSignedFloodFill.cuh
    • openvdb-11.0.0/nanovdb/nanovdb/util/cuda/CudaUtils.h
    • openvdb-11.0.0/nanovdb/nanovdb/util/cuda/GpuTimer.h
  • pcl-pcl-1.14.1.tar.gz
    • pcl-pcl-1.14.1/cuda/common/include/pcl/cuda/common/point_type_rgb.h
    • pcl-pcl-1.14.1/cuda/common/include/pcl/cuda/cutil_inline.h
    • pcl-pcl-1.14.1/cuda/common/include/pcl/cuda/point_types.h
    • pcl-pcl-1.14.1/cuda/common/include/pcl/cuda/time_gpu.h
    • pcl-pcl-1.14.1/cuda/features/src/normal_3d.cu
    • pcl-pcl-1.14.1/cuda/io/src/cloud_from_pcl.cu
    • pcl-pcl-1.14.1/cuda/io/src/debayering.cu
    • pcl-pcl-1.14.1/cuda/io/src/disparity_to_cloud.cu
    • pcl-pcl-1.14.1/cuda/io/src/extract_indices.cu
    • pcl-pcl-1.14.1/cuda/io/src/host_device.cu
    • pcl-pcl-1.14.1/cuda/io/src/kinect_smoothing.cu
    • pcl-pcl-1.14.1/cuda/sample_consensus/src/multi_ransac.cu
    • pcl-pcl-1.14.1/cuda/sample_consensus/src/ransac.cu
    • pcl-pcl-1.14.1/cuda/sample_consensus/src/sac_model.cu
    • pcl-pcl-1.14.1/cuda/sample_consensus/src/sac_model_1point_plane.cu
    • pcl-pcl-1.14.1/cuda/sample_consensus/src/sac_model_plane.cu
    • pcl-pcl-1.14.1/cuda/segmentation/src/connected_components.cu
    • pcl-pcl-1.14.1/gpu/containers/src/device_memory.cpp
    • pcl-pcl-1.14.1/gpu/features/src/centroid.cu
    • pcl-pcl-1.14.1/gpu/features/src/fpfh.cu
    • pcl-pcl-1.14.1/gpu/features/src/internal.hpp
    • pcl-pcl-1.14.1/gpu/features/src/normal_3d.cu
    • pcl-pcl-1.14.1/gpu/features/src/pfh.cu
    • pcl-pcl-1.14.1/gpu/features/src/ppf.cu
    • pcl-pcl-1.14.1/gpu/features/src/principal_curvatures.cu
    • pcl-pcl-1.14.1/gpu/features/src/spinimages.cu
    • pcl-pcl-1.14.1/gpu/features/src/uniq_inds.cu
    • pcl-pcl-1.14.1/gpu/features/src/vfh.cu
    • pcl-pcl-1.14.1/gpu/kinfu/src/cuda/bilateral_pyrdown.cu
    • pcl-pcl-1.14.1/gpu/kinfu/src/cuda/colors.cu
    • pcl-pcl-1.14.1/gpu/kinfu/src/cuda/coresp.cu
    • pcl-pcl-1.14.1/gpu/kinfu/src/cuda/estimate_combined.cu
    • pcl-pcl-1.14.1/gpu/kinfu/src/cuda/estimate_tranform.cu
    • pcl-pcl-1.14.1/gpu/kinfu/src/cuda/extract.cu
    • pcl-pcl-1.14.1/gpu/kinfu/src/cuda/image_generator.cu
    • pcl-pcl-1.14.1/gpu/kinfu/src/cuda/maps.cu
    • pcl-pcl-1.14.1/gpu/kinfu/src/cuda/marching_cubes.cu
    • pcl-pcl-1.14.1/gpu/kinfu/src/cuda/normals_eigen.cu
    • pcl-pcl-1.14.1/gpu/kinfu/src/cuda/ray_caster.cu
    • pcl-pcl-1.14.1/gpu/kinfu/src/cuda/tsdf_volume.cu
    • pcl-pcl-1.14.1/gpu/kinfu/src/cuda/utils.hpp
    • pcl-pcl-1.14.1/gpu/kinfu_large_scale/include/pcl/gpu/kinfu_large_scale/cyclical_buffer.h
    • pcl-pcl-1.14.1/gpu/kinfu_large_scale/include/pcl/gpu/kinfu_large_scale/tsdf_buffer.h
    • pcl-pcl-1.14.1/gpu/kinfu_large_scale/src/cuda/bilateral_pyrdown.cu
    • pcl-pcl-1.14.1/gpu/kinfu_large_scale/src/cuda/colors.cu
    • pcl-pcl-1.14.1/gpu/kinfu_large_scale/src/cuda/coresp.cu
    • pcl-pcl-1.14.1/gpu/kinfu_large_scale/src/cuda/estimate_combined.cu
    • pcl-pcl-1.14.1/gpu/kinfu_large_scale/src/cuda/estimate_tranform.cu
    • pcl-pcl-1.14.1/gpu/kinfu_large_scale/src/cuda/extract.cu
    • pcl-pcl-1.14.1/gpu/kinfu_large_scale/src/cuda/image_generator.cu
    • pcl-pcl-1.14.1/gpu/kinfu_large_scale/src/cuda/maps.cu
    • pcl-pcl-1.14.1/gpu/kinfu_large_scale/src/cuda/marching_cubes.cu
    • pcl-pcl-1.14.1/gpu/kinfu_large_scale/src/cuda/normals_eigen.cu
    • pcl-pcl-1.14.1/gpu/kinfu_large_scale/src/cuda/pointer_shift.cu
    • pcl-pcl-1.14.1/gpu/kinfu_large_scale/src/cuda/push.cu
    • pcl-pcl-1.14.1/gpu/kinfu_large_scale/src/cuda/ray_caster.cu
    • pcl-pcl-1.14.1/gpu/kinfu_large_scale/src/cuda/tsdf_volume.cu
    • pcl-pcl-1.14.1/gpu/kinfu_large_scale/src/cuda/utils.hpp
    • pcl-pcl-1.14.1/gpu/octree/src/cuda/approx_nsearch.cu
    • pcl-pcl-1.14.1/gpu/octree/src/cuda/bfrs.cu
    • pcl-pcl-1.14.1/gpu/octree/src/cuda/knn_search.cu
    • pcl-pcl-1.14.1/gpu/octree/src/cuda/octree_builder.cu
    • pcl-pcl-1.14.1/gpu/octree/src/cuda/octree_host.cu
    • pcl-pcl-1.14.1/gpu/octree/src/cuda/radius_search.cu
    • pcl-pcl-1.14.1/gpu/people/include/pcl/gpu/people/face_detector.h
    • pcl-pcl-1.14.1/gpu/people/include/pcl/gpu/people/label_common.h
    • pcl-pcl-1.14.1/gpu/people/src/cuda/device.h
    • pcl-pcl-1.14.1/gpu/people/src/cuda/elec.cu
    • pcl-pcl-1.14.1/gpu/people/src/cuda/multi_tree.cu
    • pcl-pcl-1.14.1/gpu/people/src/cuda/nvidia/NCV.cu
    • pcl-pcl-1.14.1/gpu/people/src/cuda/nvidia/NCV.hpp
    • pcl-pcl-1.14.1/gpu/people/src/cuda/nvidia/NCVHaarObjectDetection.cu
    • pcl-pcl-1.14.1/gpu/people/src/cuda/nvidia/NCVPyramid.cu
    • pcl-pcl-1.14.1/gpu/people/src/cuda/nvidia/NPP_staging.cu
    • pcl-pcl-1.14.1/gpu/people/src/cuda/prob.cu
    • pcl-pcl-1.14.1/gpu/people/src/cuda/shs.cu
    • pcl-pcl-1.14.1/gpu/people/src/cuda/smooth.cu
    • pcl-pcl-1.14.1/gpu/people/src/cuda/utils.cu
    • pcl-pcl-1.14.1/gpu/people/src/face_detector.cpp
    • pcl-pcl-1.14.1/gpu/surface/src/cuda/convex_hull.cu
    • pcl-pcl-1.14.1/gpu/surface/src/internal.h
    • pcl-pcl-1.14.1/gpu/tracking/src/cuda/particle_filter.cu
    • pcl-pcl-1.14.1/gpu/utils/include/pcl/gpu/utils/safe_call.hpp
    • pcl-pcl-1.14.1/gpu/utils/include/pcl/gpu/utils/timers_cuda.hpp
    • pcl-pcl-1.14.1/gpu/utils/src/repacks.cu
  • rmm-24.04.00a.tar.gz
    • rmm-24.04.00a/include/rmm/cuda_device.hpp
    • rmm-24.04.00a/include/rmm/cuda_stream.hpp
    • rmm-24.04.00a/include/rmm/cuda_stream_view.hpp
    • rmm-24.04.00a/include/rmm/detail/dynamic_load_runtime.hpp
    • rmm-24.04.00a/include/rmm/detail/error.hpp
    • rmm-24.04.00a/include/rmm/device_buffer.hpp
    • rmm-24.04.00a/include/rmm/device_uvector.hpp
    • rmm-24.04.00a/include/rmm/mr/device/arena_memory_resource.hpp
    • rmm-24.04.00a/include/rmm/mr/device/binning_memory_resource.hpp
    • rmm-24.04.00a/include/rmm/mr/device/cuda_async_memory_resource.hpp
    • rmm-24.04.00a/include/rmm/mr/device/cuda_async_view_memory_resource.hpp
    • rmm-24.04.00a/include/rmm/mr/device/detail/arena.hpp
    • rmm-24.04.00a/include/rmm/mr/device/detail/stream_ordered_memory_resource.hpp
    • rmm-24.04.00a/include/rmm/mr/device/device_memory_resource.hpp
    • rmm-24.04.00a/include/rmm/mr/device/fixed_size_memory_resource.hpp
    • rmm-24.04.00a/include/rmm/mr/device/pool_memory_resource.hpp
    • rmm-24.04.00a/include/rmm/mr/device/thrust_allocator_adaptor.hpp
    • rmm-24.04.00a/include/rmm/mr/host/host_memory_resource.hpp
    • rmm-24.04.00a/python/rmm/_lib/_torch_allocator.cpp
  • root-6-22-06.zip
    • root-6-22-06/tmva/tmva/src/DNN/Architectures/Cuda.cu
    • root-6-22-06/tmva/tmva/src/DNN/Architectures/Cuda/ActivationFunctions.cu
    • root-6-22-06/tmva/tmva/src/DNN/Architectures/Cuda/Arithmetic.cu
    • root-6-22-06/tmva/tmva/src/DNN/Architectures/Cuda/CudaMatrix.cu
    • root-6-22-06/tmva/tmva/src/DNN/Architectures/Cuda/CudaTensor.cu
    • root-6-22-06/tmva/tmva/src/DNN/Architectures/Cuda/Dropout.cu
    • root-6-22-06/tmva/tmva/src/DNN/Architectures/Cuda/Initialization.cu
    • root-6-22-06/tmva/tmva/src/DNN/Architectures/Cuda/Kernels.cuh
    • root-6-22-06/tmva/tmva/src/DNN/Architectures/Cuda/LossFunctions.cu
    • root-6-22-06/tmva/tmva/src/DNN/Architectures/Cuda/OutputFunctions.cu
    • root-6-22-06/tmva/tmva/src/DNN/Architectures/Cuda/Propagation.cu
    • root-6-22-06/tmva/tmva/src/DNN/Architectures/Cuda/RecurrentPropagation.cu
    • root-6-22-06/tmva/tmva/src/DNN/Architectures/Cuda/Regularization.cu
    • root-6-22-06/tmva/tmva/src/DNN/Architectures/Cudnn.cu
    • root-6-22-06/tmva/tmva/src/DNN/Architectures/Cudnn/ActivationFunctions.cu
    • root-6-22-06/tmva/tmva/src/DNN/Architectures/Cudnn/Arithmetic.cu
    • root-6-22-06/tmva/tmva/src/DNN/Architectures/Cudnn/Dropout.cu
    • root-6-22-06/tmva/tmva/src/DNN/Architectures/Cudnn/Initialization.cu
    • root-6-22-06/tmva/tmva/src/DNN/Architectures/Cudnn/LossFunctions.cu
    • root-6-22-06/tmva/tmva/src/DNN/Architectures/Cudnn/OutputFunctions.cu
    • root-6-22-06/tmva/tmva/src/DNN/Architectures/Cudnn/Propagate.cu
    • root-6-22-06/tmva/tmva/src/DNN/Architectures/Cudnn/RecurrentPropagation.cu
  • stdgpu-b70405bc62958f1f1334a936a20cd025351816fb.zip
    • stdgpu-b70405bc62958f1f1334a936a20cd025351816fb/cmake/cuda/compute_capability.cpp
    • stdgpu-b70405bc62958f1f1334a936a20cd025351816fb/src/stdgpu/atomic.cuh
    • stdgpu-b70405bc62958f1f1334a936a20cd025351816fb/src/stdgpu/bitset.cuh
    • stdgpu-b70405bc62958f1f1334a936a20cd025351816fb/src/stdgpu/cuda/atomic.cuh
    • stdgpu-b70405bc62958f1f1334a936a20cd025351816fb/src/stdgpu/cuda/impl/atomic_detail.cuh
    • stdgpu-b70405bc62958f1f1334a936a20cd025351816fb/src/stdgpu/cuda/impl/device.cpp
    • stdgpu-b70405bc62958f1f1334a936a20cd025351816fb/src/stdgpu/cuda/impl/error.h
    • stdgpu-b70405bc62958f1f1334a936a20cd025351816fb/src/stdgpu/deque.cuh
    • stdgpu-b70405bc62958f1f1334a936a20cd025351816fb/src/stdgpu/impl/atomic_detail.cuh
    • stdgpu-b70405bc62958f1f1334a936a20cd025351816fb/src/stdgpu/impl/bitset_detail.cuh
    • stdgpu-b70405bc62958f1f1334a936a20cd025351816fb/src/stdgpu/impl/deque_detail.cuh
    • stdgpu-b70405bc62958f1f1334a936a20cd025351816fb/src/stdgpu/impl/mutex_detail.cuh
    • stdgpu-b70405bc62958f1f1334a936a20cd025351816fb/src/stdgpu/impl/queue_detail.cuh
    • stdgpu-b70405bc62958f1f1334a936a20cd025351816fb/src/stdgpu/impl/stack_detail.cuh
    • stdgpu-b70405bc62958f1f1334a936a20cd025351816fb/src/stdgpu/impl/unordered_base.cuh
    • stdgpu-b70405bc62958f1f1334a936a20cd025351816fb/src/stdgpu/impl/unordered_base_detail.cuh
    • stdgpu-b70405bc62958f1f1334a936a20cd025351816fb/src/stdgpu/impl/unordered_map_detail.cuh
    • stdgpu-b70405bc62958f1f1334a936a20cd025351816fb/src/stdgpu/impl/unordered_set_detail.cuh
    • stdgpu-b70405bc62958f1f1334a936a20cd025351816fb/src/stdgpu/impl/vector_detail.cuh
    • stdgpu-b70405bc62958f1f1334a936a20cd025351816fb/src/stdgpu/mutex.cuh
    • stdgpu-b70405bc62958f1f1334a936a20cd025351816fb/src/stdgpu/queue.cuh
    • stdgpu-b70405bc62958f1f1334a936a20cd025351816fb/src/stdgpu/stack.cuh
    • stdgpu-b70405bc62958f1f1334a936a20cd025351816fb/src/stdgpu/unordered_map.cuh
    • stdgpu-b70405bc62958f1f1334a936a20cd025351816fb/src/stdgpu/unordered_set.cuh
    • stdgpu-b70405bc62958f1f1334a936a20cd025351816fb/src/stdgpu/vector.cuh
  • sundials-6.7.0.tar.gz
    • sundials-6.7.0/include/nvector/nvector_cuda.h
    • sundials-6.7.0/include/sunlinsol/sunlinsol_cusolversp_batchqr.h
    • sundials-6.7.0/include/sunmatrix/sunmatrix_cusparse.h
    • sundials-6.7.0/include/sunmemory/sunmemory_cuda.h
    • sundials-6.7.0/src/nvector/cuda/VectorArrayKernels.cuh
    • sundials-6.7.0/src/nvector/cuda/VectorKernels.cuh
    • sundials-6.7.0/src/nvector/cuda/nvector_cuda.cu
    • sundials-6.7.0/src/sundials/sundials_cuda.h
    • sundials-6.7.0/src/sundials/sundials_cuda_kernels.cuh
    • sundials-6.7.0/src/sundials/sundials_cusolver.h
    • sundials-6.7.0/src/sundials/sundials_cusparse.h
    • sundials-6.7.0/src/sunlinsol/cusolversp/sunlinsol_cusolversp_batchqr.cu
    • sundials-6.7.0/src/sunmatrix/cusparse/cusparse_kernels.cuh
    • sundials-6.7.0/src/sunmatrix/cusparse/sunmatrix_cusparse.cu
    • sundials-6.7.0/src/sunmatrix/magmadense/dense_cuda_kernels.cuh
    • sundials-6.7.0/src/sunmemory/cuda/sundials_cuda_memory.cu
    • sundials-6.7.0/src/cvode/cvode_fused_gpu.cpp
  • taskflow-3.7.0.tar.gz
    • taskflow-3.7.0/taskflow/cuda/cuda_error.hpp
  • tensorflow-2.12.0.tar.gz
    • tensorflow-2.12.0/tensorflow/compiler/xla/service/gpu/buffer_comparator.cc
  • tensorpipe-c54fdda499bbf78f277ea78c1e4d1f6d635b9a98.zip
    • tensorpipe-c54fdda499bbf78f277ea78c1e4d1f6d635b9a98/tensorpipe/channel/cuda_basic/channel_impl.cc
    • tensorpipe-c54fdda499bbf78f277ea78c1e4d1f6d635b9a98/tensorpipe/channel/cuda_gdr/context_impl.cc
    • tensorpipe-c54fdda499bbf78f277ea78c1e4d1f6d635b9a98/tensorpipe/channel/cuda_ipc/channel_impl.cc
    • tensorpipe-c54fdda499bbf78f277ea78c1e4d1f6d635b9a98/tensorpipe/channel/cuda_ipc/channel_impl.h
    • tensorpipe-c54fdda499bbf78f277ea78c1e4d1f6d635b9a98/tensorpipe/channel/cuda_xth/channel_impl.cc
    • tensorpipe-c54fdda499bbf78f277ea78c1e4d1f6d635b9a98/tensorpipe/common/cuda.h
    • tensorpipe-c54fdda499bbf78f277ea78c1e4d1f6d635b9a98/tensorpipe/common/cuda_buffer.h
    • tensorpipe-c54fdda499bbf78f277ea78c1e4d1f6d635b9a98/tensorpipe/common/cuda_host_allocator.cc
    • tensorpipe-c54fdda499bbf78f277ea78c1e4d1f6d635b9a98/tensorpipe/common/cuda_lib.h
    • tensorpipe-c54fdda499bbf78f277ea78c1e4d1f6d635b9a98/tensorpipe/common/cuda_loop.h
  • vilib-7a12fe1db78ca755876469979394076f10fe577c.zip
    • vilib-7a12fe1db78ca755876469979394076f10fe577c/visual_lib/include/vilib/cuda_common.h
    • vilib-7a12fe1db78ca755876469979394076f10fe577c/visual_lib/include/vilib/preprocess/conv_filter.h
    • vilib-7a12fe1db78ca755876469979394076f10fe577c/visual_lib/include/vilib/preprocess/pyramid.h
    • vilib-7a12fe1db78ca755876469979394076f10fe577c/visual_lib/include/vilib/storage/opencv.h
    • vilib-7a12fe1db78ca755876469979394076f10fe577c/visual_lib/include/vilib/storage/ros.h
    • vilib-7a12fe1db78ca755876469979394076f10fe577c/visual_lib/include/vilib/storage/subframe.h
    • vilib-7a12fe1db78ca755876469979394076f10fe577c/visual_lib/include/vilib/timergpu.h
    • vilib-7a12fe1db78ca755876469979394076f10fe577c/visual_lib/src/feature_detection/detector_base_gpu_cuda_tools.cu
    • vilib-7a12fe1db78ca755876469979394076f10fe577c/visual_lib/src/feature_detection/fast/fast_gpu_cuda_tools.cu
    • vilib-7a12fe1db78ca755876469979394076f10fe577c/visual_lib/src/feature_detection/harris/harris_gpu_cuda_tools.cu
    • vilib-7a12fe1db78ca755876469979394076f10fe577c/visual_lib/src/feature_tracker/feature_tracker_cuda_tools.cu
    • vilib-7a12fe1db78ca755876469979394076f10fe577c/visual_lib/src/preprocess/conv_filter_col.cu
    • vilib-7a12fe1db78ca755876469979394076f10fe577c/visual_lib/src/preprocess/conv_filter_row.cu
    • vilib-7a12fe1db78ca755876469979394076f10fe577c/visual_lib/src/preprocess/pyramid_gpu.cu
    • vilib-7a12fe1db78ca755876469979394076f10fe577c/visual_lib/src/storage/subframe.cpp
    • vilib-7a12fe1db78ca755876469979394076f10fe577c/visual_lib/utility/determine_cc.cpp
  • whisper.cpp-1.6.2.tar.gz
    • whisper.cpp-1.6.2/bindings/ruby/ext/ggml-common.h
    • whisper.cpp-1.6.2/ggml-common.h
    • whisper.cpp-1.6.2/ggml-cuda.cu
    • whisper.cpp-1.6.2/ggml-cuda/acc.cu
    • whisper.cpp-1.6.2/ggml-cuda/acc.cuh
    • whisper.cpp-1.6.2/ggml-cuda/arange.cu
    • whisper.cpp-1.6.2/ggml-cuda/arange.cuh
    • whisper.cpp-1.6.2/ggml-cuda/argsort.cu
    • whisper.cpp-1.6.2/ggml-cuda/argsort.cuh
    • whisper.cpp-1.6.2/ggml-cuda/binbcast.cu
    • whisper.cpp-1.6.2/ggml-cuda/binbcast.cuh
    • whisper.cpp-1.6.2/ggml-cuda/clamp.cu
    • whisper.cpp-1.6.2/ggml-cuda/clamp.cuh
    • whisper.cpp-1.6.2/ggml-cuda/common.cuh
    • whisper.cpp-1.6.2/ggml-cuda/concat.cu
    • whisper.cpp-1.6.2/ggml-cuda/concat.cuh
    • whisper.cpp-1.6.2/ggml-cuda/convert.cu
    • whisper.cpp-1.6.2/ggml-cuda/convert.cuh
    • whisper.cpp-1.6.2/ggml-cuda/cpy.cu
    • whisper.cpp-1.6.2/ggml-cuda/cpy.cuh
    • whisper.cpp-1.6.2/ggml-cuda/dequantize.cuh
    • whisper.cpp-1.6.2/ggml-cuda/diagmask.cu
    • whisper.cpp-1.6.2/ggml-cuda/diagmask.cuh
    • whisper.cpp-1.6.2/ggml-cuda/dmmv.cu
    • whisper.cpp-1.6.2/ggml-cuda/dmmv.cuh
    • whisper.cpp-1.6.2/ggml-cuda/fattn-common.cuh
    • whisper.cpp-1.6.2/ggml-cuda/fattn-vec-f16.cu
    • whisper.cpp-1.6.2/ggml-cuda/fattn-vec-f16.cuh
    • whisper.cpp-1.6.2/ggml-cuda/fattn-vec-f32.cu
    • whisper.cpp-1.6.2/ggml-cuda/fattn-vec-f32.cuh
    • whisper.cpp-1.6.2/ggml-cuda/fattn.cu
    • whisper.cpp-1.6.2/ggml-cuda/fattn.cuh
    • whisper.cpp-1.6.2/ggml-cuda/getrows.cu
    • whisper.cpp-1.6.2/ggml-cuda/getrows.cuh
    • whisper.cpp-1.6.2/ggml-cuda/im2col.cu
    • whisper.cpp-1.6.2/ggml-cuda/im2col.cuh
    • whisper.cpp-1.6.2/ggml-cuda/mmq.cu
    • whisper.cpp-1.6.2/ggml-cuda/mmq.cuh
    • whisper.cpp-1.6.2/ggml-cuda/mmvq.cu
    • whisper.cpp-1.6.2/ggml-cuda/mmvq.cuh
    • whisper.cpp-1.6.2/ggml-cuda/norm.cu
    • whisper.cpp-1.6.2/ggml-cuda/norm.cuh
    • whisper.cpp-1.6.2/ggml-cuda/pad.cu
    • whisper.cpp-1.6.2/ggml-cuda/pad.cuh
    • whisper.cpp-1.6.2/ggml-cuda/pool2d.cu
    • whisper.cpp-1.6.2/ggml-cuda/pool2d.cuh
    • whisper.cpp-1.6.2/ggml-cuda/quantize.cu
    • whisper.cpp-1.6.2/ggml-cuda/quantize.cuh
    • whisper.cpp-1.6.2/ggml-cuda/rope.cu
    • whisper.cpp-1.6.2/ggml-cuda/rope.cuh
    • whisper.cpp-1.6.2/ggml-cuda/scale.cu
    • whisper.cpp-1.6.2/ggml-cuda/scale.cuh
    • whisper.cpp-1.6.2/ggml-cuda/softmax.cu
    • whisper.cpp-1.6.2/ggml-cuda/softmax.cuh
    • whisper.cpp-1.6.2/ggml-cuda/sumrows.cu
    • whisper.cpp-1.6.2/ggml-cuda/sumrows.cuh
    • whisper.cpp-1.6.2/ggml-cuda/tsembd.cu
    • whisper.cpp-1.6.2/ggml-cuda/tsembd.cuh
    • whisper.cpp-1.6.2/ggml-cuda/unary.cu
    • whisper.cpp-1.6.2/ggml-cuda/unary.cuh
    • whisper.cpp-1.6.2/ggml-cuda/upscale.cu
    • whisper.cpp-1.6.2/ggml-cuda/upscale.cuh
    • whisper.cpp-1.6.2/ggml-cuda/vecdotq.cuh
  • xgboost-2.0.3.tar.gz
    • xgboost-2.0.3/include/xgboost/span.h
    • xgboost-2.0.3/jvm-packages/xgboost4j-gpu/src/native/xgboost4j-gpu.cu
    • xgboost-2.0.3/src/c_api/c_api.cu
    • xgboost-2.0.3/src/collective/communicator-inl.cuh
    • xgboost-2.0.3/src/collective/communicator.cu
    • xgboost-2.0.3/src/collective/device_communicator.cuh
    • xgboost-2.0.3/src/collective/device_communicator_adapter.cuh
    • xgboost-2.0.3/src/collective/nccl_device_communicator.cu
    • xgboost-2.0.3/src/collective/nccl_device_communicator.cuh
    • xgboost-2.0.3/src/common/algorithm.cuh
    • xgboost-2.0.3/src/common/common.cu
    • xgboost-2.0.3/src/common/cuda_context.cuh
    • xgboost-2.0.3/src/common/deterministic.cuh
    • xgboost-2.0.3/src/common/device_helpers.cuh
    • xgboost-2.0.3/src/common/hist_util.cu
    • xgboost-2.0.3/src/common/hist_util.cuh
    • xgboost-2.0.3/src/common/host_device_vector.cu
    • xgboost-2.0.3/src/common/linalg_op.cuh
    • xgboost-2.0.3/src/common/numeric.cu
    • xgboost-2.0.3/src/common/quantile.cu
    • xgboost-2.0.3/src/common/quantile.cuh
    • xgboost-2.0.3/src/common/ranking_utils.cu
    • xgboost-2.0.3/src/common/ranking_utils.cuh
    • xgboost-2.0.3/src/common/stats.cu
    • xgboost-2.0.3/src/common/stats.cuh
    • xgboost-2.0.3/src/common/threading_utils.cuh
    • xgboost-2.0.3/src/context.cu
    • xgboost-2.0.3/src/data/array_interface.cu
    • xgboost-2.0.3/src/data/data.cu
    • xgboost-2.0.3/src/data/device_adapter.cuh
    • xgboost-2.0.3/src/data/ellpack_page.cu
    • xgboost-2.0.3/src/data/ellpack_page.cuh
    • xgboost-2.0.3/src/data/ellpack_page_raw_format.cu
    • xgboost-2.0.3/src/data/ellpack_page_source.cu
    • xgboost-2.0.3/src/data/gradient_index.cu
    • xgboost-2.0.3/src/data/iterative_dmatrix.cu
    • xgboost-2.0.3/src/data/proxy_dmatrix.cu
    • xgboost-2.0.3/src/data/proxy_dmatrix.cuh
    • xgboost-2.0.3/src/data/simple_dmatrix.cu
    • xgboost-2.0.3/src/data/simple_dmatrix.cuh
    • xgboost-2.0.3/src/data/sparse_page_dmatrix.cu
    • xgboost-2.0.3/src/data/sparse_page_source.cu
    • xgboost-2.0.3/src/gbm/gbtree.cu
    • xgboost-2.0.3/src/linear/updater_gpu_coordinate.cu
    • xgboost-2.0.3/src/metric/auc.cu
    • xgboost-2.0.3/src/metric/elementwise_metric.cu
    • xgboost-2.0.3/src/metric/multiclass_metric.cu
    • xgboost-2.0.3/src/metric/rank_metric.cu
    • xgboost-2.0.3/src/metric/survival_metric.cu
    • xgboost-2.0.3/src/objective/adaptive.cu
    • xgboost-2.0.3/src/objective/aft_obj.cu
    • xgboost-2.0.3/src/objective/hinge.cu
    • xgboost-2.0.3/src/objective/lambdarank_obj.cu
    • xgboost-2.0.3/src/objective/lambdarank_obj.cuh
    • xgboost-2.0.3/src/objective/multiclass_obj.cu
    • xgboost-2.0.3/src/objective/quantile_obj.cu
    • xgboost-2.0.3/src/objective/regression_obj.cu
    • xgboost-2.0.3/src/predictor/gpu_predictor.cu
    • xgboost-2.0.3/src/tree/constraints.cu
    • xgboost-2.0.3/src/tree/constraints.cuh
    • xgboost-2.0.3/src/tree/fit_stump.cu
    • xgboost-2.0.3/src/tree/gpu_hist/evaluate_splits.cu
    • xgboost-2.0.3/src/tree/gpu_hist/evaluate_splits.cuh
    • xgboost-2.0.3/src/tree/gpu_hist/evaluator.cu
    • xgboost-2.0.3/src/tree/gpu_hist/expand_entry.cuh
    • xgboost-2.0.3/src/tree/gpu_hist/feature_groups.cu
    • xgboost-2.0.3/src/tree/gpu_hist/feature_groups.cuh
    • xgboost-2.0.3/src/tree/gpu_hist/gradient_based_sampler.cu
    • xgboost-2.0.3/src/tree/gpu_hist/gradient_based_sampler.cuh
    • xgboost-2.0.3/src/tree/gpu_hist/histogram.cu
    • xgboost-2.0.3/src/tree/gpu_hist/histogram.cuh
    • xgboost-2.0.3/src/tree/gpu_hist/row_partitioner.cu
    • xgboost-2.0.3/src/tree/gpu_hist/row_partitioner.cuh
    • xgboost-2.0.3/src/tree/updater_gpu_common.cuh
    • xgboost-2.0.3/src/tree/updater_gpu_hist.cu
  • zfp-1.0.1.tar.gz
    • zfp-1.0.1/src/cuda_zfp/cuZFP.cu
    • zfp-1.0.1/src/cuda_zfp/decode.cuh
    • zfp-1.0.1/src/cuda_zfp/decode1.cuh
    • zfp-1.0.1/src/cuda_zfp/decode2.cuh
    • zfp-1.0.1/src/cuda_zfp/decode3.cuh
    • zfp-1.0.1/src/cuda_zfp/encode.cuh
    • zfp-1.0.1/src/cuda_zfp/encode1.cuh
    • zfp-1.0.1/src/cuda_zfp/encode2.cuh
    • zfp-1.0.1/src/cuda_zfp/encode3.cuh
    • zfp-1.0.1/src/cuda_zfp/pointers.cuh
    • zfp-1.0.1/src/cuda_zfp/type_info.cuh