CUAEV benchmark: https://github.com/roitberg-group/torchani_sandbox/blob/ed90fa65a7f07e59a95e75c962371a37ffb69ba0/tools/aev-benchmark-size.py
Build with intrinsics on: python setup.py develop --ext --cuaev-opt
use_fast_math needs the nvcc arg: -use_fast_math
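The -use_fast_math flag above is an nvcc option. A minimal sketch of how such a flag could be passed to a PyTorch CUDA extension build is given below. The helper name `cuaev_compile_args` and the `-O3` baseline are hypothetical; how the repository's own setup.py wires `--cuaev-opt` is not shown in this log.

```python
def cuaev_compile_args(fast_math: bool = False) -> dict:
    """Hypothetical helper: build a dict in the {'cxx': [...], 'nvcc': [...]}
    form that torch.utils.cpp_extension.CUDAExtension accepts as
    extra_compile_args."""
    nvcc_flags = ["-O3"]  # assumed baseline, not taken from the log
    if fast_math:
        # Trades some floating-point accuracy for speed in device code.
        nvcc_flags.append("-use_fast_math")
    return {"cxx": ["-O3"], "nvcc": nvcc_flags}


# Intended use inside a setup.py (needs torch and the CUDA toolkit):
#   from torch.utils.cpp_extension import CUDAExtension
#   CUDAExtension(name=..., sources=[...],
#                 extra_compile_args=cuaev_compile_args(fast_math=True))
print(cuaev_compile_args(fast_math=True))
```

Keeping the flag behind a boolean makes it straightforward to build both variants and compare their aev_error and force_error values.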
File: small.pdb, Molecule size: 264 / 264, Species: [1, 6, 7, 8]
Original TorchANI:
GPU Memory Cached (pytorch) : 134.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
GPU Memory Used (nvidia-smi): 982.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
Duration: 0.37 s
Speed: 1.87 ms/it
CUaev:
GPU Memory Cached (pytorch) : 128.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
GPU Memory Used (nvidia-smi): 976.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
Duration: 0.07 s
Speed: 0.36 ms/it
aev_error: 3.10e-06
Speed up: 5.14 X
----------------------------------------------------------------------
File: 1hz5.pdb, Molecule size: 973 / 973, Species: [1, 6, 7, 8, 16]
Original TorchANI:
GPU Memory Cached (pytorch) : 186.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
GPU Memory Used (nvidia-smi): 1034.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
Duration: 0.36 s
Speed: 1.81 ms/it
CUaev:
GPU Memory Cached (pytorch) : 128.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
GPU Memory Used (nvidia-smi): 976.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
Duration: 0.07 s
Speed: 0.35 ms/it
aev_error: 4.77e-06
Speed up: 5.15 X
----------------------------------------------------------------------
File: 6W8H.pdb, Molecule size: 3410 / 3410, Species: [6, 7, 8, 16]
Original TorchANI:
GPU Memory Cached (pytorch) : 496.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
GPU Memory Used (nvidia-smi): 1344.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
Duration: 0.50 s
Speed: 2.52 ms/it
CUaev:
GPU Memory Cached (pytorch) : 188.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
GPU Memory Used (nvidia-smi): 1036.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
Duration: 0.08 s
Speed: 0.38 ms/it
aev_error: 3.10e-06
Speed up: 6.59 X
----------------------------------------------------------------------
File: 1C17.pdb, Molecule size: 6000 / 16649, Species: [1, 6, 7, 8, 16]
Original TorchANI:
GPU Memory Cached (pytorch) : 1166.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
GPU Memory Used (nvidia-smi): 2014.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
Duration: 1.03 s
Speed: 5.17 ms/it
CUaev:
GPU Memory Cached (pytorch) : 244.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
GPU Memory Used (nvidia-smi): 1092.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
Duration: 0.15 s
Speed: 0.76 ms/it
aev_error: 5.48e-06
Speed up: 6.84 X
----------------------------------------------------------------------
File: 1C17.pdb, Molecule size: 10000 / 16649, Species: [1, 6, 7, 8, 16]
Original TorchANI:
GPU Memory Cached (pytorch) : 3028.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
GPU Memory Used (nvidia-smi): 3876.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
Duration: 2.22 s
Speed: 11.08 ms/it
CUaev:
GPU Memory Cached (pytorch) : 334.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
GPU Memory Used (nvidia-smi): 1182.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
Duration: 0.23 s
Speed: 1.15 ms/it
aev_error: 5.72e-06
Speed up: 9.61 X
----------------------------------------------------------------------
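The per-file blocks above follow a regular layout, so they can be turned into records for plotting or comparison. The following is a rough sketch of such a parser, not part of the benchmark script; the function name `parse_block` and the record keys are hypothetical, and reading "264 / 264" as atoms used / atoms in the file is an assumption.

```python
import re

SAMPLE = """\
File: small.pdb, Molecule size: 264 / 264, Species: [1, 6, 7, 8]
Original TorchANI:
Duration: 0.37 s
Speed: 1.87 ms/it
CUaev:
Duration: 0.07 s
Speed: 0.36 ms/it
aev_error: 3.10e-06
Speed up: 5.14 X
"""


def parse_block(text: str) -> dict:
    """Parse one 'File: ...' block of the log into a flat dict."""
    record = {}
    backend = None  # "py" after 'Original TorchANI:', "cu" after 'CUaev:'
    for raw in text.splitlines():
        line = raw.strip()
        if m := re.match(r"File: ([^,]+), Molecule size: (\d+) / (\d+)", line):
            record["file"] = m.group(1)
            record["atoms_used"] = int(m.group(2))   # assumed meaning
            record["atoms_total"] = int(m.group(3))  # assumed meaning
        elif line == "Original TorchANI:":
            backend = "py"
        elif line == "CUaev:":
            backend = "cu"
        elif m := re.match(r"Speed: ([\d.]+) ms/it", line):
            record[f"{backend}_ms_per_it"] = float(m.group(1))
        elif m := re.match(r"(aev_error|force_error): (\S+)", line):
            record[m.group(1)] = float(m.group(2))
        elif m := re.match(r"Speed up: ([\d.]+) X", line):
            record["speedup"] = float(m.group(1))
    return record


print(parse_block(SAMPLE))
```

Splitting the full log on the dashed separator lines and calling `parse_block` on each piece would give one record per file and mode.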
Add Backward
File: small.pdb, Molecule size: 264 / 264, Species: [1, 6, 7, 8]
Original TorchANI:
GPU Memory Cached (pytorch) : 190.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
GPU Memory Used (nvidia-smi): 1038.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
Duration: 0.69 s
Speed: 3.44 ms/it
CUaev:
GPU Memory Cached (pytorch) : 190.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
GPU Memory Used (nvidia-smi): 1038.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
Duration: 0.10 s
Speed: 0.49 ms/it
aev_error: 2.86e-06
force_error: 3.43e-05
Speed up: 7.00 X
----------------------------------------------------------------------
File: 1hz5.pdb, Molecule size: 973 / 973, Species: [1, 6, 7, 8, 16]
Original TorchANI:
GPU Memory Cached (pytorch) : 418.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
GPU Memory Used (nvidia-smi): 1266.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
Duration: 0.70 s
Speed: 3.51 ms/it
CUaev:
GPU Memory Cached (pytorch) : 392.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
GPU Memory Used (nvidia-smi): 1240.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
Duration: 0.11 s
Speed: 0.53 ms/it
aev_error: 5.01e-06
force_error: 3.81e-05
Speed up: 6.57 X
----------------------------------------------------------------------
File: 6W8H.pdb, Molecule size: 3410 / 3410, Species: [6, 7, 8, 16]
Original TorchANI:
GPU Memory Cached (pytorch) : 782.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
GPU Memory Used (nvidia-smi): 1630.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
Duration: 0.85 s
Speed: 4.26 ms/it
CUaev:
GPU Memory Cached (pytorch) : 486.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
GPU Memory Used (nvidia-smi): 1334.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
Duration: 0.11 s
Speed: 0.57 ms/it
aev_error: 3.10e-06
force_error: 9.06e-05
Speed up: 7.52 X
----------------------------------------------------------------------
File: 1C17.pdb, Molecule size: 6000 / 16649, Species: [1, 6, 7, 8, 16]
Original TorchANI:
GPU Memory Cached (pytorch) : 3242.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
GPU Memory Used (nvidia-smi): 4090.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
Duration: 1.81 s
Speed: 9.03 ms/it
CUaev:
GPU Memory Cached (pytorch) : 2390.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
GPU Memory Used (nvidia-smi): 3238.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
Duration: 0.26 s
Speed: 1.32 ms/it
aev_error: 5.48e-06
force_error: 4.58e-05
Speed up: 6.86 X
----------------------------------------------------------------------
File: 1C17.pdb, Molecule size: 10000 / 16649, Species: [1, 6, 7, 8, 16]
Original TorchANI:
GPU Memory Cached (pytorch) : 6674.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
GPU Memory Used (nvidia-smi): 7522.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
Duration: 3.44 s
Speed: 17.19 ms/it
CUaev:
GPU Memory Cached (pytorch) : 3994.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
GPU Memory Used (nvidia-smi): 4842.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
Duration: 0.40 s
Speed: 1.99 ms/it
aev_error: 5.48e-06
force_error: 5.34e-05
Speed up: 8.63 X
----------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------
RUN              PDB         Size   forward  backward    Others     Total  Total(200)  Speedup        GPU
---------------------------------------------------------------------------------------------------------
01 py aev fd     small.pdb    264    1.8 ms    0.0 ms    0.1 ms    1.9 ms    373.5 ms        -    982.0MB
02 cu aev fd     small.pdb    264    0.4 ms    0.0 ms    0.0 ms    0.4 ms     72.7 ms     5.14    976.0MB
03 py aev fd     1hz5.pdb     973    1.8 ms    0.0 ms    0.0 ms    1.8 ms    362.6 ms        -   1034.0MB
04 cu aev fd     1hz5.pdb     973    0.3 ms    0.0 ms    0.0 ms    0.4 ms     70.4 ms     5.15    976.0MB
05 py aev fd     6W8H.pdb    3410    2.5 ms    0.0 ms    0.0 ms    2.5 ms    503.5 ms        -   1344.0MB
06 cu aev fd     6W8H.pdb    3410    0.4 ms    0.0 ms    0.0 ms    0.4 ms     76.4 ms     6.59   1036.0MB
07 py aev fd     1C17.pdb    6000    5.2 ms    0.0 ms    0.0 ms    5.2 ms   1.034 sec        -   2014.0MB
08 cu aev fd     1C17.pdb    6000    0.7 ms    0.0 ms    0.0 ms    0.8 ms    151.1 ms     6.84   1092.0MB
09 py aev fd     1C17.pdb   10000   11.1 ms    0.0 ms    0.0 ms   11.1 ms   2.216 sec        -   3876.0MB
10 cu aev fd     1C17.pdb   10000    1.1 ms    0.0 ms    0.0 ms    1.2 ms    230.7 ms     9.61   1182.0MB
---------------------------------------------------------------------------------------------------------
11 py aev fd+bd  small.pdb    264    1.8 ms    1.6 ms    0.0 ms    3.4 ms    687.1 ms        -   1038.0MB
12 cu aev fd+bd  small.pdb    264    0.3 ms    0.1 ms    0.0 ms    0.5 ms     98.2 ms     7.00   1038.0MB
13 py aev fd+bd  1hz5.pdb     973    1.8 ms    1.6 ms    0.0 ms    3.5 ms    701.4 ms        -   1266.0MB
14 cu aev fd+bd  1hz5.pdb     973    0.3 ms    0.2 ms    0.0 ms    0.5 ms    106.8 ms     6.57   1240.0MB
15 py aev fd+bd  6W8H.pdb    3410    2.6 ms    1.6 ms    0.0 ms    4.3 ms    851.4 ms        -   1630.0MB
16 cu aev fd+bd  6W8H.pdb    3410    0.4 ms    0.2 ms    0.0 ms    0.6 ms    113.3 ms     7.52   1334.0MB
17 py aev fd+bd  1C17.pdb    6000    5.2 ms    3.8 ms    0.0 ms    9.0 ms   1.807 sec        -   4090.0MB
18 cu aev fd+bd  1C17.pdb    6000    0.7 ms    0.6 ms    0.0 ms    1.3 ms    263.4 ms     6.86   3238.0MB
19 py aev fd+bd  1C17.pdb   10000   11.1 ms    6.1 ms    0.0 ms   17.2 ms   3.437 sec        -   7522.0MB
20 cu aev fd+bd  1C17.pdb   10000    1.1 ms    0.8 ms    0.0 ms    2.0 ms    398.5 ms     8.63   4842.0MB
---------------------------------------------------------------------------------------------------------
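The Total(200) and Speedup columns above can be cross-checked by hand: the per-iteration speed is the 200-iteration total divided by 200, and the speedup is the ratio of the py total to the cu total. A small worked example using rows 01 and 02 (small.pdb, forward only) follows; the function names are illustrative, and the formula is inferred from the numbers rather than read from the script.

```python
N_ITER = 200  # iteration count implied by the "Total(200)" column


def per_iteration_ms(total_ms: float, n_iter: int = N_ITER) -> float:
    """Average time of one iteration in milliseconds."""
    return total_ms / n_iter


def speedup(py_total_ms: float, cu_total_ms: float) -> float:
    """Ratio of the TorchANI (py) total time to the CUAEV (cu) total time."""
    return py_total_ms / cu_total_ms


py_total_ms = 373.5  # row 01: py aev fd, small.pdb
cu_total_ms = 72.7   # row 02: cu aev fd, small.pdb

print(f"py: {per_iteration_ms(py_total_ms):.2f} ms/it")
print(f"cu: {per_iteration_ms(cu_total_ms):.2f} ms/it")
print(f"speedup: {speedup(py_total_ms, cu_total_ms):.2f} X")
```

Within rounding, these agree with the 1.87 ms/it, 0.36 ms/it and 5.14 X reported for small.pdb.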
File: small.pdb, Molecule size: 264 / 264, Species: [1, 6, 7, 8]
Original TorchANI:
GPU Memory Cached (pytorch) : 134.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
GPU Memory Used (nvidia-smi): 982.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
Duration: 0.37 s
Speed: 1.85 ms/it
CUaev:
GPU Memory Cached (pytorch) : 128.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
GPU Memory Used (nvidia-smi): 976.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
Duration: 0.08 s
Speed: 0.40 ms/it
aev_error: 1.19e-06
Speed up: 4.64 X
----------------------------------------------------------------------
File: 1hz5.pdb, Molecule size: 973 / 973, Species: [1, 6, 7, 8, 16]
Original TorchANI:
GPU Memory Cached (pytorch) : 186.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
GPU Memory Used (nvidia-smi): 1034.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
Duration: 0.36 s
Speed: 1.79 ms/it
CUaev:
GPU Memory Cached (pytorch) : 128.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
GPU Memory Used (nvidia-smi): 976.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
Duration: 0.08 s
Speed: 0.39 ms/it
aev_error: 1.67e-06
Speed up: 4.63 X
----------------------------------------------------------------------
File: 6W8H.pdb, Molecule size: 3410 / 3410, Species: [6, 7, 8, 16]
Original TorchANI:
GPU Memory Cached (pytorch) : 496.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
GPU Memory Used (nvidia-smi): 1344.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
Duration: 0.50 s
Speed: 2.50 ms/it
CUaev:
GPU Memory Cached (pytorch) : 188.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
GPU Memory Used (nvidia-smi): 1036.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
Duration: 0.09 s
Speed: 0.46 ms/it
aev_error: 1.19e-06
Speed up: 5.48 X
----------------------------------------------------------------------
File: 1C17.pdb, Molecule size: 6000 / 16649, Species: [1, 6, 7, 8, 16]
Original TorchANI:
GPU Memory Cached (pytorch) : 1166.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
GPU Memory Used (nvidia-smi): 2014.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
Duration: 1.03 s
Speed: 5.15 ms/it
CUaev:
GPU Memory Cached (pytorch) : 244.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
GPU Memory Used (nvidia-smi): 1092.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
Duration: 0.19 s
Speed: 0.96 ms/it
aev_error: 2.15e-06
Speed up: 5.39 X
----------------------------------------------------------------------
File: 1C17.pdb, Molecule size: 10000 / 16649, Species: [1, 6, 7, 8, 16]
Original TorchANI:
GPU Memory Cached (pytorch) : 3028.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
GPU Memory Used (nvidia-smi): 3876.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
Duration: 2.22 s
Speed: 11.09 ms/it
CUaev:
GPU Memory Cached (pytorch) : 334.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
GPU Memory Used (nvidia-smi): 1182.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
Duration: 0.30 s
Speed: 1.49 ms/it
aev_error: 2.38e-06
Speed up: 7.42 X
----------------------------------------------------------------------
Add Backward
File: small.pdb, Molecule size: 264 / 264, Species: [1, 6, 7, 8]
Original TorchANI:
GPU Memory Cached (pytorch) : 190.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
GPU Memory Used (nvidia-smi): 1038.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
Duration: 0.69 s
Speed: 3.45 ms/it
CUaev:
GPU Memory Cached (pytorch) : 190.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
GPU Memory Used (nvidia-smi): 1038.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
Duration: 0.12 s
Speed: 0.58 ms/it
aev_error: 1.19e-06
force_error: 1.53e-05
Speed up: 5.93 X
----------------------------------------------------------------------
File: 1hz5.pdb, Molecule size: 973 / 973, Species: [1, 6, 7, 8, 16]
Original TorchANI:
GPU Memory Cached (pytorch) : 418.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
GPU Memory Used (nvidia-smi): 1266.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
Duration: 0.70 s
Speed: 3.48 ms/it
CUaev:
GPU Memory Cached (pytorch) : 392.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
GPU Memory Used (nvidia-smi): 1240.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
Duration: 0.13 s
Speed: 0.66 ms/it
aev_error: 1.91e-06
force_error: 2.29e-05
Speed up: 5.30 X
----------------------------------------------------------------------
File: 6W8H.pdb, Molecule size: 3410 / 3410, Species: [6, 7, 8, 16]
Original TorchANI:
GPU Memory Cached (pytorch) : 782.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
GPU Memory Used (nvidia-smi): 1630.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
Duration: 0.85 s
Speed: 4.24 ms/it
CUaev:
GPU Memory Cached (pytorch) : 486.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
GPU Memory Used (nvidia-smi): 1334.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
Duration: 0.14 s
Speed: 0.68 ms/it
aev_error: 1.19e-06
force_error: 1.81e-05
Speed up: 6.26 X
----------------------------------------------------------------------
File: 1C17.pdb, Molecule size: 6000 / 16649, Species: [1, 6, 7, 8, 16]
Original TorchANI:
GPU Memory Cached (pytorch) : 3242.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
GPU Memory Used (nvidia-smi): 4090.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
Duration: 1.81 s
Speed: 9.03 ms/it
CUaev:
GPU Memory Cached (pytorch) : 2412.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
GPU Memory Used (nvidia-smi): 3260.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
Duration: 0.39 s
Speed: 1.97 ms/it
aev_error: 2.15e-06
force_error: 2.48e-05
Speed up: 4.59 X
----------------------------------------------------------------------
File: 1C17.pdb, Molecule size: 10000 / 16649, Species: [1, 6, 7, 8, 16]
Original TorchANI:
GPU Memory Cached (pytorch) : 6674.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
GPU Memory Used (nvidia-smi): 7522.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
Duration: 3.43 s
Speed: 17.15 ms/it
CUaev:
GPU Memory Cached (pytorch) : 3994.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
GPU Memory Used (nvidia-smi): 4842.0MB / 81251.2MB (NVIDIA A100-SXM4-80GB)
Duration: 0.61 s
Speed: 3.03 ms/it
aev_error: 2.62e-06
force_error: 2.67e-05
Speed up: 5.65 X
----------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------
RUN              PDB         Size   forward  backward    Others     Total  Total(200)  Speedup        GPU
---------------------------------------------------------------------------------------------------------
01 py aev fd     small.pdb    264    1.8 ms    0.0 ms    0.1 ms    1.9 ms    370.8 ms        -    982.0MB
02 cu aev fd     small.pdb    264    0.4 ms    0.0 ms    0.0 ms    0.4 ms     79.8 ms     4.64    976.0MB
03 py aev fd     1hz5.pdb     973    1.8 ms    0.0 ms    0.0 ms    1.8 ms    358.3 ms        -   1034.0MB
04 cu aev fd     1hz5.pdb     973    0.4 ms    0.0 ms    0.0 ms    0.4 ms     77.4 ms     4.63    976.0MB
05 py aev fd     6W8H.pdb    3410    2.5 ms    0.0 ms    0.0 ms    2.5 ms    500.1 ms        -   1344.0MB
06 cu aev fd     6W8H.pdb    3410    0.4 ms    0.0 ms    0.0 ms    0.5 ms     91.2 ms     5.48   1036.0MB
07 py aev fd     1C17.pdb    6000    5.1 ms    0.0 ms    0.0 ms    5.2 ms   1.031 sec        -   2014.0MB
08 cu aev fd     1C17.pdb    6000    0.9 ms    0.0 ms    0.0 ms    1.0 ms    191.1 ms     5.39   1092.0MB
09 py aev fd     1C17.pdb   10000   11.1 ms    0.0 ms    0.0 ms   11.1 ms   2.218 sec        -   3876.0MB
10 cu aev fd     1C17.pdb   10000    1.5 ms    0.0 ms    0.0 ms    1.5 ms    298.7 ms     7.42   1182.0MB
---------------------------------------------------------------------------------------------------------
11 py aev fd+bd  small.pdb    264    1.8 ms    1.6 ms    0.0 ms    3.5 ms    690.6 ms        -   1038.0MB
12 cu aev fd+bd  small.pdb    264    0.4 ms    0.2 ms    0.0 ms    0.6 ms    116.5 ms     5.93   1038.0MB
13 py aev fd+bd  1hz5.pdb     973    1.8 ms    1.6 ms    0.0 ms    3.5 ms    697.0 ms        -   1266.0MB
14 cu aev fd+bd  1hz5.pdb     973    0.4 ms    0.3 ms    0.0 ms    0.7 ms    131.6 ms     5.30   1240.0MB
15 py aev fd+bd  6W8H.pdb    3410    2.6 ms    1.6 ms    0.0 ms    4.2 ms    847.3 ms        -   1630.0MB
16 cu aev fd+bd  6W8H.pdb    3410    0.4 ms    0.3 ms    0.0 ms    0.7 ms    135.4 ms     6.26   1334.0MB
17 py aev fd+bd  1C17.pdb    6000    5.2 ms    3.8 ms    0.0 ms    9.0 ms   1.806 sec        -   4090.0MB
18 cu aev fd+bd  1C17.pdb    6000    0.9 ms    1.0 ms    0.0 ms    2.0 ms    393.8 ms     4.59   3260.0MB
19 py aev fd+bd  1C17.pdb   10000   11.1 ms    6.1 ms    0.0 ms   17.2 ms   3.431 sec        -   7522.0MB
20 cu aev fd+bd  1C17.pdb   10000    1.5 ms    1.6 ms    0.0 ms    3.0 ms    606.9 ms     5.65   4842.0MB
---------------------------------------------------------------------------------------------------------
python aev-benchmark-size.py
Check args: Namespace(N=200, backward=0, infer_model=0, mnp=0, nsight=False, plot=0, run_energy=0, single_nn=0, use_cell_list=False, use_cuaev_interface=False)
/blue/roitberg/apps/lammps-ani/external/torchani_sandbox/torchani/models.py:99: UserWarning: The default is now to accept atomic numbers as indexes, do not set periodic_table_index=True. if you need to accept raw indices set periodic_table_index=False
warnings.warn("The default is now to accept atomic numbers as indexes,"
aev-benchmark-size.py:204: UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor. (Triggered internally at /opt/conda/conda-bld/pytorch_1659484683044/work/torch/csrc/utils/tensor_new.cpp:201.)
species = torch.tensor([mol.get_atomic_numbers()], device=device)
File: small.pdb, Molecule size: 264 / 264, Species: [1, 6, 7, 8]
Original TorchANI:
GPU Memory Cached (pytorch) : 134.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 666.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.38 s
Speed: 1.88 ms/it
CUaev:
GPU Memory Cached (pytorch) : 128.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 660.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.07 s
Speed: 0.33 ms/it
aev_error: 3.10e-06
Speed up: 5.62 X
----------------------------------------------------------------------
File: 1hz5.pdb, Molecule size: 973 / 973, Species: [1, 6, 7, 8, 16]
Original TorchANI:
GPU Memory Cached (pytorch) : 186.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 718.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.40 s
Speed: 1.98 ms/it
CUaev:
GPU Memory Cached (pytorch) : 128.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 660.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.07 s
Speed: 0.37 ms/it
aev_error: 4.77e-06
Speed up: 5.32 X
----------------------------------------------------------------------
File: 6W8H.pdb, Molecule size: 3410 / 3410, Species: [6, 7, 8, 16]
Original TorchANI:
GPU Memory Cached (pytorch) : 496.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 1028.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.76 s
Speed: 3.82 ms/it
CUaev:
GPU Memory Cached (pytorch) : 188.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 720.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.09 s
Speed: 0.44 ms/it
aev_error: 3.10e-06
Speed up: 8.76 X
----------------------------------------------------------------------
File: 1C17.pdb, Molecule size: 6000 / 16649, Species: [1, 6, 7, 8, 16]
Original TorchANI:
GPU Memory Cached (pytorch) : 1166.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 1698.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 2.21 s
Speed: 11.05 ms/it
CUaev:
GPU Memory Cached (pytorch) : 244.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 776.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.24 s
Speed: 1.19 ms/it
aev_error: 5.72e-06
Speed up: 9.28 X
----------------------------------------------------------------------
File: 1C17.pdb, Molecule size: 10000 / 16649, Species: [1, 6, 7, 8, 16]
Original TorchANI:
GPU Memory Cached (pytorch) : 3028.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 3560.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 5.12 s
Speed: 25.59 ms/it
CUaev:
GPU Memory Cached (pytorch) : 334.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 866.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.39 s
Speed: 1.97 ms/it
aev_error: 6.20e-06
Speed up: 12.96 X
----------------------------------------------------------------------
Add Backward
File: small.pdb, Molecule size: 264 / 264, Species: [1, 6, 7, 8]
Original TorchANI:
GPU Memory Cached (pytorch) : 190.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 722.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.84 s
Speed: 4.18 ms/it
CUaev:
GPU Memory Cached (pytorch) : 190.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 722.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.09 s
Speed: 0.47 ms/it
aev_error: 2.98e-06
force_error: 3.43e-05
Speed up: 8.91 X
----------------------------------------------------------------------
File: 1hz5.pdb, Molecule size: 973 / 973, Species: [1, 6, 7, 8, 16]
Original TorchANI:
GPU Memory Cached (pytorch) : 418.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 950.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.89 s
Speed: 4.45 ms/it
CUaev:
GPU Memory Cached (pytorch) : 392.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 924.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.12 s
Speed: 0.62 ms/it
aev_error: 4.77e-06
force_error: 3.81e-05
Speed up: 7.19 X
----------------------------------------------------------------------
File: 6W8H.pdb, Molecule size: 3410 / 3410, Species: [6, 7, 8, 16]
Original TorchANI:
GPU Memory Cached (pytorch) : 782.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 1314.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 1.27 s
Speed: 6.34 ms/it
CUaev:
GPU Memory Cached (pytorch) : 486.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 1018.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.15 s
Speed: 0.75 ms/it
aev_error: 3.10e-06
force_error: 8.96e-05
Speed up: 8.45 X
----------------------------------------------------------------------
File: 1C17.pdb, Molecule size: 6000 / 16649, Species: [1, 6, 7, 8, 16]
Original TorchANI:
GPU Memory Cached (pytorch) : 3242.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 3774.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 4.02 s
Speed: 20.12 ms/it
CUaev:
GPU Memory Cached (pytorch) : 2390.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 2922.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.44 s
Speed: 2.18 ms/it
aev_error: 5.48e-06
force_error: 4.96e-05
Speed up: 9.22 X
----------------------------------------------------------------------
File: 1C17.pdb, Molecule size: 10000 / 16649, Species: [1, 6, 7, 8, 16]
Original TorchANI:
GPU Memory Cached (pytorch) : 6674.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 7206.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 8.06 s
Speed: 40.30 ms/it
CUaev:
GPU Memory Cached (pytorch) : 3994.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 4526.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.71 s
Speed: 3.57 ms/it
aev_error: 5.72e-06
force_error: 4.58e-05
Speed up: 11.30 X
----------------------------------------------------------------------
---------------------------------------------------------------------------------------------------------
RUN              PDB         Size   forward  backward    Others     Total  Total(200)  Speedup        GPU
---------------------------------------------------------------------------------------------------------
01 py aev fd     small.pdb    264    1.8 ms    0.0 ms    0.1 ms    1.9 ms    376.1 ms        -    666.9MB
02 cu aev fd     small.pdb    264    0.3 ms    0.0 ms    0.0 ms    0.3 ms     66.9 ms     5.62    660.9MB
03 py aev fd     1hz5.pdb     973    2.0 ms    0.0 ms    0.0 ms    2.0 ms    396.6 ms        -    718.9MB
04 cu aev fd     1hz5.pdb     973    0.4 ms    0.0 ms    0.0 ms    0.4 ms     74.5 ms     5.32    660.9MB
05 py aev fd     6W8H.pdb    3410    3.8 ms    0.0 ms    0.0 ms    3.8 ms    763.5 ms        -   1028.9MB
06 cu aev fd     6W8H.pdb    3410    0.4 ms    0.0 ms    0.0 ms    0.4 ms     87.2 ms     8.76    720.9MB
07 py aev fd     1C17.pdb    6000   11.0 ms    0.0 ms    0.0 ms   11.0 ms   2.210 sec        -   1698.9MB
08 cu aev fd     1C17.pdb    6000    1.2 ms    0.0 ms    0.0 ms    1.2 ms    238.1 ms     9.28    776.9MB
09 py aev fd     1C17.pdb   10000   25.6 ms    0.0 ms    0.0 ms   25.6 ms   5.117 sec        -   3560.9MB
10 cu aev fd     1C17.pdb   10000    2.0 ms    0.0 ms    0.0 ms    2.0 ms    394.9 ms    12.96    866.9MB
---------------------------------------------------------------------------------------------------------
11 py aev fd+bd  small.pdb    264    2.1 ms    2.1 ms    0.0 ms    4.2 ms    836.7 ms        -    722.9MB
12 cu aev fd+bd  small.pdb    264    0.3 ms    0.1 ms    0.0 ms    0.5 ms     93.9 ms     8.91    722.9MB
13 py aev fd+bd  1hz5.pdb     973    2.2 ms    2.3 ms    0.0 ms    4.5 ms    890.8 ms        -    950.9MB
14 cu aev fd+bd  1hz5.pdb     973    0.4 ms    0.2 ms    0.0 ms    0.6 ms    123.9 ms     7.19    924.9MB
15 py aev fd+bd  6W8H.pdb    3410    4.1 ms    2.2 ms    0.0 ms    6.3 ms   1.269 sec        -   1314.9MB
16 cu aev fd+bd  6W8H.pdb    3410    0.4 ms    0.3 ms    0.0 ms    0.8 ms    150.1 ms     8.45   1018.9MB
17 py aev fd+bd  1C17.pdb    6000   11.2 ms    8.9 ms    0.0 ms   20.1 ms   4.024 sec        -   3774.9MB
18 cu aev fd+bd  1C17.pdb    6000    1.2 ms    0.9 ms    0.0 ms    2.2 ms    436.6 ms     9.22   2922.9MB
19 py aev fd+bd  1C17.pdb   10000   25.7 ms   14.6 ms    0.0 ms   40.3 ms   8.060 sec        -   7206.9MB
20 cu aev fd+bd  1C17.pdb   10000    2.1 ms    1.5 ms    0.0 ms    3.6 ms    713.1 ms    11.30   4526.9MB
---------------------------------------------------------------------------------------------------------
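In the runs above, the nvidia-smi figure exceeds the PyTorch cached figure by a fixed amount on each GPU. The sketch below checks this on a handful of (cached, used) pairs copied from the log. Attributing the offset to the CUDA context and other memory outside PyTorch's caching allocator is an interpretation, not something the log states; the function name `offsets` is illustrative.

```python
# (PyTorch cached MB, nvidia-smi used MB) pairs copied from the log above.
a100_pairs = [
    (134.0, 982.0), (128.0, 976.0), (186.0, 1034.0),
    (6674.0, 7522.0), (3994.0, 4842.0),
]
rtx2080ti_pairs = [
    (134.0, 666.9), (128.0, 660.9), (186.0, 718.9),
    (6674.0, 7206.9), (3994.0, 4526.9),
]


def offsets(pairs):
    """Set of distinct (nvidia-smi - cached) differences, rounded to 0.1 MB."""
    return {round(used - cached, 1) for cached, used in pairs}


print("A100 offset(s) in MB:       ", offsets(a100_pairs))
print("RTX 2080 Ti offset(s) in MB:", offsets(rtx2080ti_pairs))
```

A single value per GPU means the GPU column of the summary tables differs from the cached figure only by that constant, so either number can be used to compare memory between the py and cu rows on the same device.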
python aev-benchmark-size.py
Check args: Namespace(N=200, backward=0, infer_model=0, mnp=0, nsight=False, plot=0, run_energy=0, single_nn=0, use_cell_list=False, use_cuaev_interface=False)
File: small.pdb, Molecule size: 264 / 264, Species: [1, 6, 7, 8]
Original TorchANI:
GPU Memory Cached (pytorch) : 134.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 666.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.38 s
Speed: 1.88 ms/it
CUaev:
GPU Memory Cached (pytorch) : 128.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 660.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.07 s
Speed: 0.34 ms/it
aev_error: 2.86e-06
Speed up: 5.55 X
----------------------------------------------------------------------
File: 1hz5.pdb, Molecule size: 973 / 973, Species: [1, 6, 7, 8, 16]
Original TorchANI:
GPU Memory Cached (pytorch) : 186.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 718.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.40 s
Speed: 1.98 ms/it
CUaev:
GPU Memory Cached (pytorch) : 128.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 660.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.08 s
Speed: 0.38 ms/it
aev_error: 4.77e-06
Speed up: 5.24 X
----------------------------------------------------------------------
File: 6W8H.pdb, Molecule size: 3410 / 3410, Species: [6, 7, 8, 16]
Original TorchANI:
GPU Memory Cached (pytorch) : 496.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 1028.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.77 s
Speed: 3.84 ms/it
CUaev:
GPU Memory Cached (pytorch) : 188.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 720.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.09 s
Speed: 0.45 ms/it
aev_error: 3.34e-06
Speed up: 8.54 X
----------------------------------------------------------------------
File: 1C17.pdb, Molecule size: 6000 / 16649, Species: [1, 6, 7, 8, 16]
Original TorchANI:
GPU Memory Cached (pytorch) : 1166.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 1698.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 2.21 s
Speed: 11.06 ms/it
CUaev:
GPU Memory Cached (pytorch) : 244.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 776.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.24 s
Speed: 1.22 ms/it
aev_error: 6.20e-06
Speed up: 9.04 X
----------------------------------------------------------------------
File: 1C17.pdb, Molecule size: 10000 / 16649, Species: [1, 6, 7, 8, 16]
Original TorchANI:
GPU Memory Cached (pytorch) : 3028.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 3560.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 5.12 s
Speed: 25.59 ms/it
CUaev:
GPU Memory Cached (pytorch) : 334.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 866.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.41 s
Speed: 2.07 ms/it
aev_error: 5.72e-06
Speed up: 12.37 X
----------------------------------------------------------------------
Add Backward
File: small.pdb, Molecule size: 264 / 264, Species: [1, 6, 7, 8]
Original TorchANI:
GPU Memory Cached (pytorch) : 190.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 722.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.85 s
Speed: 4.23 ms/it
CUaev:
GPU Memory Cached (pytorch) : 190.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 722.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.10 s
Speed: 0.49 ms/it
aev_error: 2.86e-06
force_error: 2.57e-05
Speed up: 8.73 X
----------------------------------------------------------------------
File: 1hz5.pdb, Molecule size: 973 / 973, Species: [1, 6, 7, 8, 16]
Original TorchANI:
GPU Memory Cached (pytorch) : 418.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 950.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.89 s
Speed: 4.46 ms/it
CUaev:
GPU Memory Cached (pytorch) : 392.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 924.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.13 s
Speed: 0.64 ms/it
aev_error: 5.01e-06
force_error: 3.43e-05
Speed up: 6.97 X
----------------------------------------------------------------------
File: 6W8H.pdb, Molecule size: 3410 / 3410, Species: [6, 7, 8, 16]
Original TorchANI:
GPU Memory Cached (pytorch) : 782.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 1314.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 1.27 s
Speed: 6.36 ms/it
CUaev:
GPU Memory Cached (pytorch) : 486.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 1018.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.15 s
Speed: 0.77 ms/it
aev_error: 2.86e-06
force_error: 8.20e-05
Speed up: 8.23 X
----------------------------------------------------------------------
File: 1C17.pdb, Molecule size: 6000 / 16649, Species: [1, 6, 7, 8, 16]
Original TorchANI:
GPU Memory Cached (pytorch) : 3242.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 3774.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 4.02 s
Speed: 20.11 ms/it
CUaev:
GPU Memory Cached (pytorch) : 2390.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 2922.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.46 s
Speed: 2.31 ms/it
aev_error: 5.25e-06
force_error: 4.20e-05
Speed up: 8.71 X
----------------------------------------------------------------------
File: 1C17.pdb, Molecule size: 10000 / 16649, Species: [1, 6, 7, 8, 16]
Original TorchANI:
GPU Memory Cached (pytorch) : 6674.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 7206.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 8.06 s
Speed: 40.29 ms/it
CUaev:
GPU Memory Cached (pytorch) : 3994.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 4526.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.76 s
Speed: 3.80 ms/it
aev_error: 5.25e-06
force_error: 3.43e-05
Speed up: 10.59 X
----------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------------------------
RUN               PDB           Size    forward   backward     Others      Total  Total(200)  Speedup        GPU
-----------------------------------------------------------------------------------------------------------------------------------------
01 py aev fd      small.pdb      264     1.8 ms     0.0 ms     0.1 ms     1.9 ms    376.9 ms        -    666.9MB
02 cu aev fd      small.pdb      264     0.3 ms     0.0 ms     0.0 ms     0.3 ms     68.0 ms     5.55    660.9MB
03 py aev fd      1hz5.pdb       973     2.0 ms     0.0 ms     0.0 ms     2.0 ms    395.5 ms        -    718.9MB
04 cu aev fd      1hz5.pdb       973     0.4 ms     0.0 ms     0.0 ms     0.4 ms     75.5 ms     5.24    660.9MB
05 py aev fd      6W8H.pdb      3410     3.8 ms     0.0 ms     0.0 ms     3.8 ms    767.5 ms        -   1028.9MB
06 cu aev fd      6W8H.pdb      3410     0.4 ms     0.0 ms     0.0 ms     0.4 ms     89.9 ms     8.54    720.9MB
07 py aev fd      1C17.pdb      6000    11.0 ms     0.0 ms     0.0 ms    11.1 ms   2.211 sec        -   1698.9MB
08 cu aev fd      1C17.pdb      6000     1.2 ms     0.0 ms     0.0 ms     1.2 ms    244.7 ms     9.04    776.9MB
09 py aev fd      1C17.pdb     10000    25.6 ms     0.0 ms     0.0 ms    25.6 ms   5.119 sec        -   3560.9MB
10 cu aev fd      1C17.pdb     10000     2.1 ms     0.0 ms     0.0 ms     2.1 ms    413.7 ms    12.37    866.9MB
-----------------------------------------------------------------------------------------------------------------------------------------
11 py aev fd+bd   small.pdb      264     2.1 ms     2.1 ms     0.0 ms     4.2 ms    846.9 ms        -    722.9MB
12 cu aev fd+bd   small.pdb      264     0.3 ms     0.2 ms     0.0 ms     0.5 ms     97.0 ms     8.73    722.9MB
13 py aev fd+bd   1hz5.pdb       973     2.2 ms     2.3 ms     0.0 ms     4.5 ms    892.6 ms        -    950.9MB
14 cu aev fd+bd   1hz5.pdb       973     0.4 ms     0.2 ms     0.0 ms     0.6 ms    128.1 ms     6.97    924.9MB
15 py aev fd+bd   6W8H.pdb      3410     4.1 ms     2.3 ms     0.0 ms     6.4 ms   1.271 sec        -   1314.9MB
16 cu aev fd+bd   6W8H.pdb      3410     0.5 ms     0.3 ms     0.0 ms     0.8 ms    154.5 ms     8.23   1018.9MB
17 py aev fd+bd   1C17.pdb      6000    11.2 ms     8.9 ms     0.0 ms    20.1 ms   4.022 sec        -   3774.9MB
18 cu aev fd+bd   1C17.pdb      6000     1.3 ms     1.0 ms     0.0 ms     2.3 ms    461.5 ms     8.71   2922.9MB
19 py aev fd+bd   1C17.pdb     10000    25.7 ms    14.6 ms     0.0 ms    40.3 ms   8.058 sec        -   7206.9MB
20 cu aev fd+bd   1C17.pdb     10000     2.2 ms     1.6 ms     0.0 ms     3.8 ms    760.8 ms    10.59   4526.9MB
-----------------------------------------------------------------------------------------------------------------------------------------
python aev-benchmark-size.py
Check args: Namespace(N=200, backward=0, infer_model=0, mnp=0, nsight=False, plot=0, run_energy=0, single_nn=0, use_cell_list=False, use_cuaev_interface=False)
/blue/roitberg/apps/lammps-ani/external/torchani_sandbox/torchani/models.py:99: UserWarning: The default is now to accept atomic numbers as indexes, do not set periodic_table_index=True. if you need to accept raw indices set periodic_table_index=False
warnings.warn("The default is now to accept atomic numbers as indexes,"
aev-benchmark-size.py:204: UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor. (Triggered internally at /opt/conda/conda-bld/pytorch_1659484683044/work/torch/csrc/utils/tensor_new.cpp:201.)
species = torch.tensor([mol.get_atomic_numbers()], device=device)
File: small.pdb, Molecule size: 264 / 264, Species: [1, 6, 7, 8]
Original TorchANI:
GPU Memory Cached (pytorch) : 134.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 666.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.37 s
Speed: 1.86 ms/it
CUaev:
GPU Memory Cached (pytorch) : 128.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 660.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.07 s
Speed: 0.37 ms/it
aev_error: 1.19e-06
Speed up: 5.09 X
----------------------------------------------------------------------
File: 1hz5.pdb, Molecule size: 973 / 973, Species: [1, 6, 7, 8, 16]
Original TorchANI:
GPU Memory Cached (pytorch) : 186.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 718.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.39 s
Speed: 1.95 ms/it
CUaev:
GPU Memory Cached (pytorch) : 128.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 660.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.08 s
Speed: 0.39 ms/it
aev_error: 1.91e-06
Speed up: 5.02 X
----------------------------------------------------------------------
File: 6W8H.pdb, Molecule size: 3410 / 3410, Species: [6, 7, 8, 16]
Original TorchANI:
GPU Memory Cached (pytorch) : 496.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 1028.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.76 s
Speed: 3.82 ms/it
CUaev:
GPU Memory Cached (pytorch) : 188.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 720.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.09 s
Speed: 0.46 ms/it
aev_error: 9.54e-07
Speed up: 8.23 X
----------------------------------------------------------------------
File: 1C17.pdb, Molecule size: 6000 / 16649, Species: [1, 6, 7, 8, 16]
Original TorchANI:
GPU Memory Cached (pytorch) : 1166.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 1698.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 2.21 s
Speed: 11.05 ms/it
CUaev:
GPU Memory Cached (pytorch) : 244.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 776.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.26 s
Speed: 1.29 ms/it
aev_error: 2.62e-06
Speed up: 8.58 X
----------------------------------------------------------------------
File: 1C17.pdb, Molecule size: 10000 / 16649, Species: [1, 6, 7, 8, 16]
Original TorchANI:
GPU Memory Cached (pytorch) : 3028.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 3560.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 5.12 s
Speed: 25.60 ms/it
CUaev:
GPU Memory Cached (pytorch) : 334.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 866.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.44 s
Speed: 2.20 ms/it
aev_error: 2.38e-06
Speed up: 11.62 X
----------------------------------------------------------------------
Add Backward
File: small.pdb, Molecule size: 264 / 264, Species: [1, 6, 7, 8]
Original TorchANI:
GPU Memory Cached (pytorch) : 190.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 722.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.84 s
Speed: 4.19 ms/it
CUaev:
GPU Memory Cached (pytorch) : 190.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 722.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.11 s
Speed: 0.54 ms/it
aev_error: 1.01e-06
force_error: 1.62e-05
Speed up: 7.74 X
----------------------------------------------------------------------
File: 1hz5.pdb, Molecule size: 973 / 973, Species: [1, 6, 7, 8, 16]
Original TorchANI:
GPU Memory Cached (pytorch) : 418.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 950.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.89 s
Speed: 4.46 ms/it
CUaev:
GPU Memory Cached (pytorch) : 392.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 924.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.15 s
Speed: 0.73 ms/it
aev_error: 1.91e-06
force_error: 2.29e-05
Speed up: 6.12 X
----------------------------------------------------------------------
File: 6W8H.pdb, Molecule size: 3410 / 3410, Species: [6, 7, 8, 16]
Original TorchANI:
GPU Memory Cached (pytorch) : 782.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 1314.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 1.28 s
Speed: 6.38 ms/it
CUaev:
GPU Memory Cached (pytorch) : 486.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 1018.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.17 s
Speed: 0.85 ms/it
aev_error: 1.19e-06
force_error: 1.72e-05
Speed up: 7.52 X
----------------------------------------------------------------------
File: 1C17.pdb, Molecule size: 6000 / 16649, Species: [1, 6, 7, 8, 16]
Original TorchANI:
GPU Memory Cached (pytorch) : 3242.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 3774.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 4.03 s
Speed: 20.14 ms/it
CUaev:
GPU Memory Cached (pytorch) : 2390.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 2922.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.57 s
Speed: 2.86 ms/it
aev_error: 2.38e-06
force_error: 2.67e-05
Speed up: 7.04 X
----------------------------------------------------------------------
File: 1C17.pdb, Molecule size: 10000 / 16649, Species: [1, 6, 7, 8, 16]
Original TorchANI:
GPU Memory Cached (pytorch) : 6674.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 7206.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 8.07 s
Speed: 40.35 ms/it
CUaev:
GPU Memory Cached (pytorch) : 3994.0MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
GPU Memory Used (nvidia-smi): 4526.9MB / 11019.4MB (NVIDIA GeForce RTX 2080 Ti)
Duration: 0.94 s
Speed: 4.71 ms/it
aev_error: 2.38e-06
force_error: 2.91e-05
Speed up: 8.56 X
----------------------------------------------------------------------
-----------------------------------------------------------------------------------------------------------------------------------------
RUN               PDB           Size    forward   backward     Others      Total  Total(200)  Speedup        GPU
-----------------------------------------------------------------------------------------------------------------------------------------
01 py aev fd      small.pdb      264     1.8 ms     0.0 ms     0.1 ms     1.9 ms    372.6 ms        -    666.9MB
02 cu aev fd      small.pdb      264     0.4 ms     0.0 ms     0.0 ms     0.4 ms     73.2 ms     5.09    660.9MB
03 py aev fd      1hz5.pdb       973     1.9 ms     0.0 ms     0.0 ms     2.0 ms    390.7 ms        -    718.9MB
04 cu aev fd      1hz5.pdb       973     0.4 ms     0.0 ms     0.0 ms     0.4 ms     77.8 ms     5.02    660.9MB
05 py aev fd      6W8H.pdb      3410     3.8 ms     0.0 ms     0.0 ms     3.8 ms    763.7 ms        -   1028.9MB
06 cu aev fd      6W8H.pdb      3410     0.5 ms     0.0 ms     0.0 ms     0.5 ms     92.8 ms     8.23    720.9MB
07 py aev fd      1C17.pdb      6000    11.0 ms     0.0 ms     0.0 ms    11.0 ms   2.210 sec        -   1698.9MB
08 cu aev fd      1C17.pdb      6000     1.3 ms     0.0 ms     0.0 ms     1.3 ms    257.6 ms     8.58    776.9MB
09 py aev fd      1C17.pdb     10000    25.6 ms     0.0 ms     0.0 ms    25.6 ms   5.120 sec        -   3560.9MB
10 cu aev fd      1C17.pdb     10000     2.2 ms     0.0 ms     0.0 ms     2.2 ms    440.8 ms    11.62    866.9MB
-----------------------------------------------------------------------------------------------------------------------------------------
11 py aev fd+bd   small.pdb      264     2.1 ms     2.1 ms     0.0 ms     4.2 ms    837.1 ms        -    722.9MB
12 cu aev fd+bd   small.pdb      264     0.3 ms     0.2 ms     0.0 ms     0.5 ms    108.2 ms     7.74    722.9MB
13 py aev fd+bd   1hz5.pdb       973     2.2 ms     2.3 ms     0.0 ms     4.5 ms    893.0 ms        -    950.9MB
14 cu aev fd+bd   1hz5.pdb       973     0.4 ms     0.3 ms     0.0 ms     0.7 ms    146.0 ms     6.12    924.9MB
15 py aev fd+bd   6W8H.pdb      3410     4.1 ms     2.3 ms     0.0 ms     6.4 ms   1.276 sec        -   1314.9MB
16 cu aev fd+bd   6W8H.pdb      3410     0.5 ms     0.4 ms     0.0 ms     0.8 ms    169.8 ms     7.52   1018.9MB
17 py aev fd+bd   1C17.pdb      6000    11.2 ms     8.9 ms     0.0 ms    20.1 ms   4.027 sec        -   3774.9MB
18 cu aev fd+bd   1C17.pdb      6000     1.3 ms     1.5 ms     0.0 ms     2.9 ms    572.1 ms     7.04   2922.9MB
19 py aev fd+bd   1C17.pdb     10000    25.8 ms    14.6 ms     0.0 ms    40.4 ms   8.070 sec        -   7206.9MB
20 cu aev fd+bd   1C17.pdb     10000     2.3 ms     2.4 ms     0.0 ms     4.7 ms    943.0 ms     8.56   4526.9MB
-----------------------------------------------------------------------------------------------------------------------------------------
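Note on the Speedup column: the reported values appear to be the ratio of the "py" row's Total(200) to the matching "cu" row's Total(200). The sketch below re-derives one entry from the printed numbers. The helper names `to_ms` and `speedup` are hypothetical and are not taken from aev-benchmark-size.py. Because the table prints rounded durations, a re-derived ratio can differ from the reported one in the last digit.

```python
def to_ms(duration: str) -> float:
    """Convert a table entry such as '440.8 ms' or '5.120 sec' to milliseconds."""
    value, unit = duration.split()
    factor = {"ms": 1.0, "sec": 1000.0}[unit]  # only the two units seen in the table
    return float(value) * factor


def speedup(py_total: str, cu_total: str) -> float:
    """Ratio of the original TorchANI total time to the CUaev total time."""
    return to_ms(py_total) / to_ms(cu_total)


# Rows 09 and 10 of the last summary table (1C17.pdb, 10000 atoms, forward only).
print(f"{speedup('5.120 sec', '440.8 ms'):.2f}")  # prints 11.62, matching the table
```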