@rwightman
Created March 6, 2021 06:22
PyTorch Bench (1.8, 1.7.1, NGC 21.02, NGC 20.12)
model gpu env cl infer_samples_per_sec infer_step_time infer_batch_size train_samples_per_sec train_step_time train_batch_size param_count img_size
efficientnet_b0 rtx3090 ngc2102 True 7179.22 0.139 512 1628.51 0.609 256 5.29 224
efficientnet_b0 rtx3090 ngc2012 True 6527.77 0.153 512 1504.58 0.654 256 5.29 224
efficientnet_b0 v100_32 ngc2102 True 6496.56 0.154 512 1556.66 0.638 512 5.29 224
efficientnet_b0 rtx3090 1.7.1cu11.0 True 6020.3 0.166 512 1266.03 0.785 512 5.29 224
efficientnet_b0 rtx3090 1.8cu11.1 True 5979.7 0.167 512 1286.76 0.775 512 5.29 224
efficientnet_b0 v100_32 ngc2012 True 5666.05 0.176 512 1459.05 0.676 512 5.29 224
efficientnet_b0 v100_32 1.8cu11.1 True 5529.09 0.181 512 1444.02 0.688 512 5.29 224
efficientnet_b0 v100_32 1.7.1cu11.0 True 5526.07 0.181 512 1425.38 0.691 512 5.29 224
efficientnet_b0 titanrtx ngc2102 True 5118.38 0.195 512 1156.83 0.862 512 5.29 224
efficientnet_b0 rtx3090 ngc2102 False 4780.86 0.209 512 1128.97 0.881 256 5.29 224
efficientnet_b0 rtx3090 ngc2012 False 4770.98 0.21 512 1087.72 0.908 256 5.29 224
efficientnet_b0 rtx3090 1.8cu11.1 False 4674.49 0.214 512 996.63 0.999 256 5.29 224
efficientnet_b0 titanrtx ngc2012 True 4651.09 0.215 512 1111.87 0.894 512 5.29 224
efficientnet_b0 rtx3090 1.7.1cu11.0 False 4641.62 0.215 512 1046.28 0.951 512 5.29 224
efficientnet_b0 titanrtx 1.8cu11.1 True 4496.45 0.222 512 1090.66 0.914 512 5.29 224
efficientnet_b0 v100_32 ngc2012 False 4453.39 0.225 512 975.08 1.016 512 5.29 224
efficientnet_b0 v100_32 ngc2102 False 4446.88 0.225 512 984.15 1.012 512 5.29 224
efficientnet_b0 v100_32 1.7.1cu11.0 False 4438.73 0.225 512 968.9 1.022 512 5.29 224
efficientnet_b0 v100_32 1.8cu11.1 False 4412.53 0.227 512 977.09 1.019 512 5.29 224
efficientnet_b0 titanrtx ngc2102 False 3770.31 0.265 512 829.25 1.203 512 5.29 224
efficientnet_b0 titanrtx 1.8cu11.1 False 3765.34 0.266 512 835.61 1.194 512 5.29 224
efficientnet_b0 titanrtx ngc2012 False 3758.4 0.266 512 841.22 1.183 512 5.29 224
efficientnet_b0 titanrtx 1.7.1cu10.2 False 3689.5 0.271 512 799.42 1.246 512 5.29 224
resnet50d rtx3090 ngc2012 True 3049.81 0.328 512 996.77 0.995 256 25.58 224
resnet50d v100_32 ngc2102 True 3000.19 0.333 512 1021.26 0.976 512 25.58 224
resnet50d rtx3090 ngc2102 True 2942.05 0.34 512 865.85 1.151 256 25.58 224
resnet50d v100_32 ngc2012 True 2936.48 0.341 512 1002.62 0.99 512 25.58 224
resnet50d rtx3090 1.7.1cu11.0 True 2807.85 0.356 512 497.35 2.002 256 25.58 224
resnet50d v100_32 1.8cu11.1 True 2797.34 0.357 512 970.56 1.027 512 25.58 224
resnet50d v100_32 1.7.1cu11.0 True 2796.9 0.358 512 962.26 1.031 512 25.58 224
resnet50d rtx3090 1.8cu11.1 True 2787.26 0.359 512 512.65 1.947 256 25.58 224
resnet50d v100_32 ngc2102 False 2550.52 0.392 512 701.28 1.423 512 25.58 224
resnet50d v100_32 ngc2012 False 2542.68 0.393 512 723.99 1.374 512 25.58 224
resnet50d v100_32 1.8cu11.1 False 2499.41 0.4 512 717.03 1.391 512 25.58 224
resnet50d v100_32 1.7.1cu11.0 False 2497.69 0.4 512 711.89 1.397 512 25.58 224
resnet50d titanrtx ngc2102 True 2479.47 0.403 512 719.32 1.386 256 25.58 224
resnet50d titanrtx ngc2012 True 2415.57 0.414 512 704.36 1.411 256 25.58 224
resnet50d rtx3090 ngc2102 False 2346.05 0.426 512 691.15 1.443 256 25.58 224
resnet50d rtx3090 ngc2012 False 2321.56 0.431 512 719.45 1.381 256 25.58 224
resnet50d titanrtx 1.8cu11.1 True 2235.28 0.447 512 655.21 1.522 256 25.58 224
resnet50d rtx3090 1.7.1cu11.0 False 2209.95 0.452 512 524.91 1.897 256 25.58 224
resnet50d rtx3090 1.8cu11.1 False 2196.35 0.455 512 539.46 1.85 256 25.58 224
regnety_032 v100_32 ngc2102 False 2137.74 0.468 512 472.43 2.111 512 19.44 224
regnety_032 rtx3090 ngc2102 False 2041.65 0.49 512 557.82 1.786 256 19.44 224
regnety_032 rtx3090 ngc2102 True 2026.21 0.493 512 559.02 1.782 256 19.44 224
resnet50d titanrtx 1.7.1cu10.2 False 2001.21 0.5 512 538.87 1.847 256 25.58 224
resnet50d titanrtx ngc2102 False 1986.98 0.503 512 540.89 1.845 256 25.58 224
vit_deit_small_patch16_224 rtx3090 1.8cu11.1 False 1979.64 0.505 512 706.89 1.411 256 22.05 224
vit_deit_small_patch16_224 rtx3090 1.8cu11.1 True 1966.28 0.509 512 706.57 1.412 256 22.05 224
resnet50d titanrtx ngc2012 False 1960.82 0.51 512 557.39 1.785 256 25.58 224
regnety_032 rtx3090 ngc2012 True 1924.62 0.52 512 446.75 2.223 256 19.44 224
vit_deit_small_patch16_224 rtx3090 ngc2102 True 1866.07 0.536 512 689.11 1.447 256 22.05 224
vit_deit_small_patch16_224 rtx3090 ngc2102 False 1860.15 0.538 512 686.23 1.453 256 22.05 224
resnet50d titanrtx 1.8cu11.1 False 1851.08 0.54 512 525.38 1.9 256 25.58 224
vit_deit_small_patch16_224 rtx3090 ngc2012 True 1845.39 0.542 512 688.47 1.444 256 22.05 224
vit_deit_small_patch16_224 rtx3090 ngc2012 False 1844.19 0.542 512 686.29 1.449 256 22.05 224
vit_deit_small_patch16_224 v100_32 1.7.1cu11.0 False 1823.44 0.548 512 638.87 1.55 256 22.05 224
vit_deit_small_patch16_224 v100_32 1.7.1cu11.0 True 1812.18 0.552 512 638.18 1.552 256 22.05 224
vit_deit_small_patch16_224 rtx3090 1.7.1cu11.0 False 1780.96 0.561 512 653.06 1.524 256 22.05 224
regnety_032 titanrtx 1.7.1cu10.2 False 1779.92 0.562 512 427.59 2.324 256 19.44 224
vit_deit_small_patch16_224 rtx3090 1.7.1cu11.0 True 1770.03 0.565 512 660.33 1.507 256 22.05 224
regnety_032 v100_32 ngc2012 False 1763.78 0.567 512 434.99 2.286 512 19.44 224
regnety_032 rtx3090 1.8cu11.1 True 1729.9 0.578 512 135.39 7.38 256 19.44 224
regnety_032 rtx3090 ngc2012 False 1729.26 0.578 512 574.86 1.725 256 19.44 224
regnety_032 rtx3090 1.7.1cu11.0 True 1728.38 0.579 512 133.05 7.501 256 19.44 224
regnety_032 v100_32 1.8cu11.1 False 1714.03 0.583 512 428.49 2.328 512 19.44 224
regnety_032 v100_32 ngc2102 True 1713.13 0.584 512 275.58 3.623 512 19.44 224
regnety_032 v100_32 1.7.1cu11.0 False 1709.49 0.585 512 424.84 2.34 512 19.44 224
efficientnet_b3a rtx3090 ngc2102 True 1675.74 0.597 512 332.3 2.994 128 12.23 320
regnety_032 rtx3090 1.8cu11.1 False 1669.55 0.599 512 475.91 2.095 256 19.44 224
regnety_032 titanrtx ngc2102 False 1658.34 0.603 512 412.81 2.415 256 19.44 224
regnety_032 titanrtx ngc2012 False 1653.32 0.605 512 416.85 2.384 256 19.44 224
nfnet_l0c rtx3090 ngc2012 True 1638.26 0.61 512 540.72 1.839 256 24.14 256
regnety_032 rtx3090 1.7.1cu11.0 False 1636.99 0.611 512 444.13 2.237 256 19.44 224
regnety_032 titanrtx 1.8cu11.1 False 1620.35 0.617 512 402.69 2.477 256 19.44 224
regnety_032 v100_32 ngc2012 True 1604.15 0.623 512 297.09 3.354 512 19.44 224
vit_deit_small_patch16_224 titanrtx 1.8cu11.1 True 1578.33 0.634 512 561.0 1.779 256 22.05 224
vit_deit_small_patch16_224 titanrtx 1.8cu11.1 False 1571.24 0.636 512 561.14 1.778 256 22.05 224
efficientnet_b3a rtx3090 ngc2012 True 1569.22 0.637 512 299.5 3.304 128 12.23 320
vit_deit_small_patch16_224 v100_32 1.8cu11.1 False 1547.63 0.646 512 517.99 1.924 256 22.05 224
vit_deit_small_patch16_224 v100_32 1.8cu11.1 True 1536.99 0.651 512 516.88 1.928 256 22.05 224
regnety_032 v100_32 1.8cu11.1 True 1495.27 0.669 512 137.03 7.292 512 19.44 224
nfnet_l0c v100_32 ngc2012 False 1490.41 0.671 512 450.92 2.209 512 24.14 256
efficientnet_b3a rtx3090 1.8cu11.1 True 1489.79 0.671 512 248.36 4.012 128 12.23 320
regnety_032 v100_32 1.7.1cu11.0 True 1489.05 0.672 512 138.53 7.204 512 19.44 224
efficientnet_b3a rtx3090 1.7.1cu11.0 True 1488.6 0.672 512 242.98 4.081 128 12.23 320
nfnet_l0c v100_32 ngc2102 False 1488.41 0.672 512 447.14 2.232 512 24.14 256
nfnet_l0c rtx3090 ngc2012 False 1481.17 0.675 512 479.25 2.077 256 24.14 256
vit_deit_small_patch16_224 titanrtx ngc2102 True 1475.43 0.678 512 557.13 1.791 256 22.05 224
vit_deit_small_patch16_224 titanrtx ngc2012 True 1469.3 0.681 512 545.1 1.826 256 22.05 224
vit_deit_small_patch16_224 titanrtx ngc2102 False 1468.17 0.681 512 556.14 1.794 256 22.05 224
vit_deit_small_patch16_224 titanrtx ngc2012 False 1464.04 0.683 512 544.63 1.828 256 22.05 224
vit_deit_small_patch16_224 v100_32 ngc2102 False 1463.09 0.683 512 512.81 1.943 256 22.05 224
vit_deit_small_patch16_224 v100_32 ngc2102 True 1459.26 0.685 512 511.65 1.948 256 22.05 224
nfnet_l0c rtx3090 ngc2102 False 1458.33 0.686 512 463.94 2.151 256 24.14 256
vit_deit_small_patch16_224 v100_32 ngc2012 False 1456.84 0.686 512 505.34 1.965 256 22.05 224
vit_deit_small_patch16_224 v100_32 ngc2012 True 1449.63 0.69 512 504.99 1.967 256 22.05 224
efficientnet_b3a v100_32 ngc2012 True 1444.02 0.692 512 316.15 3.105 128 12.23 320
efficientnet_b3a v100_32 1.7.1cu11.0 True 1392.78 0.718 512 306.88 3.194 128 12.23 320
efficientnet_b3a v100_32 1.8cu11.1 True 1391.83 0.718 512 312.43 3.172 128 12.23 320
vit_deit_small_patch16_224 titanrtx 1.7.1cu10.2 False 1375.53 0.727 512 511.3 1.948 256 22.05 224
nfnet_l0c v100_32 1.8cu11.1 False 1370.9 0.729 512 411.72 2.425 512 24.14 256
nfnet_l0c v100_32 1.7.1cu11.0 False 1370.83 0.729 512 407.89 2.442 512 24.14 256
nfnet_l0c rtx3090 1.8cu11.1 False 1361.55 0.734 512 319.23 3.128 256 24.14 256
nfnet_l0c rtx3090 1.7.1cu11.0 False 1355.96 0.737 512 312.36 3.192 256 24.14 256
regnety_032 titanrtx ngc2102 True 1327.87 0.753 512 360.0 2.77 256 19.44 224
regnety_032 titanrtx ngc2012 True 1311.36 0.763 512 356.36 2.791 256 19.44 224
nfnet_l0c rtx3090 ngc2102 True 1276.22 0.784 512 401.45 2.486 256 24.14 256
nfnet_l0c titanrtx 1.7.1cu10.2 False 1246.43 0.802 512 357.64 2.786 256 24.14 256
efficientnet_b3a rtx3090 ngc2102 False 1208.3 0.828 512 291.41 3.416 128 12.23 320
nfnet_l0c rtx3090 1.8cu11.1 True 1204.71 0.83 512 148.9 6.712 256 24.14 256
efficientnet_b3a titanrtx ngc2102 True 1203.01 0.831 512 262.93 3.786 128 12.23 320
efficientnet_b3a rtx3090 ngc2012 False 1202.46 0.832 512 281.95 3.512 128 12.23 320
nfnet_l0c rtx3090 1.7.1cu11.0 True 1189.57 0.841 512 145.3 6.873 256 24.14 256
efficientnet_b3a v100_32 ngc2102 False 1166.61 0.857 512 253.0 3.926 128 12.23 320
efficientnet_b3a rtx3090 1.7.1cu11.0 False 1165.26 0.858 512 252.23 3.932 128 12.23 320
efficientnet_b3a rtx3090 1.8cu11.1 False 1164.61 0.859 512 261.12 3.816 128 12.23 320
nfnet_l0c titanrtx ngc2102 False 1143.06 0.875 512 355.88 2.805 256 24.14 256
nfnet_l0c titanrtx ngc2012 False 1138.72 0.878 512 353.14 2.822 256 24.14 256
efficientnet_b3a v100_32 ngc2012 False 1136.71 0.88 512 252.17 3.905 128 12.23 320
nfnet_l0c v100_32 ngc2102 True 1135.01 0.881 512 186.48 5.359 512 24.14 256
efficientnet_b3a v100_32 1.7.1cu11.0 False 1128.23 0.886 512 248.97 3.952 128 12.23 320
efficientnet_b3a v100_32 1.8cu11.1 False 1128.09 0.886 512 252.57 3.932 128 12.23 320
nfnet_l0c v100_32 ngc2012 True 1122.97 0.89 512 218.06 4.578 512 24.14 256
efficientnet_b3a titanrtx ngc2012 True 1119.84 0.893 512 256.72 3.86 128 12.23 320
regnety_032 titanrtx 1.8cu11.1 True 1107.51 0.903 512 91.89 10.875 256 19.44 224
efficientnet_b3a titanrtx 1.8cu11.1 True 1089.51 0.918 512 247.61 4.023 128 12.23 320
nfnet_l0c titanrtx 1.8cu11.1 False 1084.79 0.922 512 313.98 3.18 256 24.14 256
nfnet_l0c v100_32 1.8cu11.1 True 1028.07 0.973 512 183.18 5.455 512 24.14 256
nfnet_l0c v100_32 1.7.1cu11.0 True 1025.14 0.975 512 184.78 5.403 512 24.14 256
efficientnet_b3a titanrtx 1.7.1cu10.2 False 960.06 1.042 512 206.79 4.802 128 12.23 320
efficientnet_b3a titanrtx ngc2102 False 943.43 1.06 512 209.09 4.765 128 12.23 320
efficientnet_b3a titanrtx ngc2012 False 940.67 1.063 512 209.21 4.744 128 12.23 320
efficientnet_b3a titanrtx 1.8cu11.1 False 935.66 1.069 512 209.35 4.761 128 12.23 320
vit_base_patch16_224 rtx3090 1.8cu11.1 False 820.86 1.218 512 290.25 3.438 128 86.57 224
nfnet_l0c titanrtx ngc2102 True 818.53 1.222 512 232.71 4.292 256 24.14 256
vit_base_patch16_224 rtx3090 1.8cu11.1 True 816.96 1.224 512 288.62 3.458 128 86.57 224
nfnet_l0c titanrtx ngc2012 True 797.63 1.254 512 231.25 4.314 256 24.14 256
vit_base_patch16_224 v100_32 1.7.1cu11.0 False 783.68 1.276 512 281.78 3.518 128 86.57 224
vit_base_patch16_224 v100_32 1.7.1cu11.0 True 782.86 1.277 512 281.98 3.516 128 86.57 224
vit_base_patch16_224 rtx3090 ngc2102 False 780.29 1.282 512 285.34 3.497 128 86.57 224
vit_base_patch16_224 rtx3090 ngc2102 True 779.76 1.282 512 284.82 3.503 128 86.57 224
vit_base_patch16_224 rtx3090 1.7.1cu11.0 True 774.44 1.291 512 279.92 3.556 128 86.57 224
vit_base_patch16_224 rtx3090 ngc2012 False 773.59 1.293 512 283.75 3.508 128 86.57 224
vit_base_patch16_224 rtx3090 1.7.1cu11.0 False 771.56 1.296 512 277.72 3.585 128 86.57 224
vit_base_patch16_224 rtx3090 ngc2012 True 771.17 1.297 512 284.33 3.501 128 86.57 224
nfnet_l0c titanrtx 1.8cu11.1 True 756.57 1.322 512 122.28 8.173 256 24.14 256
efficientnet_b3a v100_32 ngc2102 True 746.23 1.34 512 280.34 3.54 128 12.23 320
vit_base_patch16_224 titanrtx 1.8cu11.1 False 702.14 1.424 512 254.52 3.921 128 86.57 224
vit_base_patch16_224 titanrtx 1.8cu11.1 True 692.63 1.444 512 252.76 3.948 128 86.57 224
vit_base_patch16_224 v100_32 1.8cu11.1 True 678.53 1.474 512 233.42 4.27 128 86.57 224
vit_base_patch16_224 v100_32 1.8cu11.1 False 678.35 1.474 512 233.77 4.264 128 86.57 224
vit_base_patch16_224 titanrtx ngc2102 True 663.11 1.508 512 249.05 4.007 128 86.57 224
vit_base_patch16_224 titanrtx ngc2012 True 660.27 1.514 512 246.6 4.039 128 86.57 224
vit_base_patch16_224 titanrtx ngc2102 False 660.12 1.515 512 250.92 3.977 128 86.57 224
vit_base_patch16_224 titanrtx ngc2012 False 658.31 1.519 512 248.43 4.009 128 86.57 224
vit_base_patch16_224 v100_32 ngc2102 True 647.1 1.545 512 230.97 4.316 128 86.57 224
vit_base_patch16_224 v100_32 ngc2102 False 646.52 1.547 512 230.92 4.317 128 86.57 224
vit_base_patch16_224 v100_32 ngc2012 True 645.92 1.548 512 229.2 4.335 128 86.57 224
vit_base_patch16_224 v100_32 ngc2012 False 644.76 1.551 512 229.02 4.339 128 86.57 224
vit_base_patch16_224 titanrtx 1.7.1cu10.2 False 627.93 1.592 512 231.21 4.309 128 86.57 224
dm_nfnet_f0 rtx3090 ngc2102 False 626.98 1.595 512 217.62 4.584 128 71.49 256
dm_nfnet_f0 rtx3090 ngc2012 False 620.45 1.612 512 220.76 4.505 128 71.49 256
dm_nfnet_f0 v100_32 ngc2012 False 612.7 1.632 512 193.43 5.126 128 71.49 256
dm_nfnet_f0 rtx3090 ngc2012 True 599.24 1.669 512 195.39 5.093 128 71.49 256
dm_nfnet_f0 v100_32 ngc2102 False 597.71 1.673 512 194.51 5.121 128 71.49 256
dm_nfnet_f0 rtx3090 1.8cu11.1 False 593.21 1.686 512 166.49 5.995 128 71.49 256
dm_nfnet_f0 rtx3090 1.7.1cu11.0 False 585.76 1.707 512 162.21 6.141 128 71.49 256
seresnet152d rtx3090 ngc2102 True 585.49 1.708 512 159.3 6.216 64 66.84 320
seresnet152d v100_32 ngc2102 True 578.07 1.73 512 156.44 6.288 64 66.84 320
seresnet152d v100_32 ngc2012 True 570.22 1.754 512 144.94 6.67 64 66.84 320
seresnet152d rtx3090 ngc2012 True 553.84 1.806 512 152.85 6.402 64 66.84 320
dm_nfnet_f0 v100_32 1.8cu11.1 False 550.26 1.817 512 183.32 5.434 128 71.49 256
dm_nfnet_f0 v100_32 1.7.1cu11.0 False 550.08 1.818 512 180.45 5.496 128 71.49 256
seresnet152d rtx3090 1.7.1cu11.0 True 549.13 1.821 512 84.66 11.678 64 66.84 320
seresnet152d rtx3090 1.8cu11.1 True 545.44 1.833 512 89.68 11.095 64 66.84 320
seresnet152d v100_32 1.7.1cu11.0 True 539.99 1.852 512 134.82 7.161 64 66.84 320
seresnet152d v100_32 1.8cu11.1 True 538.83 1.856 512 141.12 6.974 64 66.84 320
dm_nfnet_f0 v100_32 ngc2012 True 527.2 1.897 512 109.9 9.058 128 71.49 256
seresnet152d v100_32 ngc2102 False 510.22 1.96 512 134.66 7.321 64 66.84 320
seresnet152d v100_32 ngc2012 False 499.82 2.001 512 130.68 7.414 64 66.84 320
seresnet152d v100_32 1.8cu11.1 False 499.41 2.002 512 134.24 7.343 64 66.84 320
seresnet152d v100_32 1.7.1cu11.0 False 499.24 2.003 512 128.27 7.541 64 66.84 320
dm_nfnet_f0 titanrtx 1.7.1cu10.2 False 484.35 2.065 512 163.37 6.096 128 71.49 256
seresnet152d titanrtx ngc2102 True 479.89 2.084 512 121.23 8.18 64 66.84 320
seresnet152d rtx3090 ngc2102 False 476.25 2.1 512 135.44 7.322 64 66.84 320
dm_nfnet_f0 titanrtx ngc2102 False 475.81 2.102 512 174.82 5.707 128 71.49 256
seresnet152d titanrtx ngc2012 True 474.03 2.11 512 118.46 8.304 64 66.84 320
dm_nfnet_f0 titanrtx ngc2012 False 473.15 2.113 512 173.1 5.752 128 71.49 256
seresnet152d rtx3090 ngc2012 False 462.95 2.16 512 131.67 7.458 64 66.84 320
seresnet152d titanrtx 1.8cu11.1 True 449.44 2.225 512 108.97 9.115 64 66.84 320
dm_nfnet_f0 titanrtx 1.8cu11.1 False 445.52 2.245 512 141.66 7.047 128 71.49 256
seresnet152d rtx3090 1.8cu11.1 False 442.17 2.262 512 106.2 9.36 64 66.84 320
seresnet152d rtx3090 1.7.1cu11.0 False 440.41 2.271 512 103.17 9.564 64 66.84 320
dm_nfnet_f0 rtx3090 1.8cu11.1 True 438.29 2.282 512 70.17 14.241 128 71.49 256
dm_nfnet_f0 rtx3090 1.7.1cu11.0 True 433.28 2.308 512 69.19 14.427 128 71.49 256
dm_nfnet_f0 rtx3090 ngc2102 True 428.52 2.334 512 101.53 9.838 128 71.49 256
seresnet152d titanrtx 1.7.1cu10.2 False 421.69 2.371 512 102.56 9.616 64 66.84 320
dm_nfnet_f0 v100_32 ngc2102 True 418.32 2.39 512 83.74 11.922 128 71.49 256
seresnet152d titanrtx ngc2102 False 388.19 2.576 512 106.62 9.311 64 66.84 320
seresnet152d titanrtx ngc2012 False 386.67 2.586 512 106.42 9.257 64 66.84 320
dm_nfnet_f0 v100_32 1.8cu11.1 True 386.52 2.587 512 73.75 13.537 128 71.49 256
dm_nfnet_f0 v100_32 1.7.1cu11.0 True 384.12 2.603 512 74.14 13.442 128 71.49 256
seresnet152d titanrtx 1.8cu11.1 False 382.65 2.613 512 102.1 9.733 64 66.84 320
dm_nfnet_f0 titanrtx ngc2102 True 352.2 2.839 512 161.58 6.175 128 71.49 256
dm_nfnet_f0 titanrtx ngc2012 True 349.75 2.859 512 159.94 6.227 128 71.49 256
dm_nfnet_f0 titanrtx 1.8cu11.1 True 288.33 3.468 512 52.52 19.027 128 71.49 256

model gpu env cl infer_samples_per_sec infer_step_time infer_batch_size train_samples_per_sec train_step_time train_batch_size param_count img_size
efficientnet_b0 rtx3090 ngc2102 True 7179.22 0.139 512 1628.51 0.609 256 5.29 224
efficientnet_b0 v100_32 ngc2102 True 6496.56 0.154 512 1556.66 0.638 512 5.29 224
efficientnet_b0 rtx3090 ngc2012 True 6527.77 0.153 512 1504.58 0.654 256 5.29 224
efficientnet_b0 v100_32 ngc2012 True 5666.05 0.176 512 1459.05 0.676 512 5.29 224
efficientnet_b0 v100_32 1.8cu11.1 True 5529.09 0.181 512 1444.02 0.688 512 5.29 224
efficientnet_b0 v100_32 1.7.1cu11.0 True 5526.07 0.181 512 1425.38 0.691 512 5.29 224
efficientnet_b0 rtx3090 1.8cu11.1 True 5979.7 0.167 512 1286.76 0.775 512 5.29 224
efficientnet_b0 rtx3090 1.7.1cu11.0 True 6020.3 0.166 512 1266.03 0.785 512 5.29 224
efficientnet_b0 titanrtx ngc2102 True 5118.38 0.195 512 1156.83 0.862 512 5.29 224
efficientnet_b0 rtx3090 ngc2102 False 4780.86 0.209 512 1128.97 0.881 256 5.29 224
efficientnet_b0 titanrtx ngc2012 True 4651.09 0.215 512 1111.87 0.894 512 5.29 224
efficientnet_b0 titanrtx 1.8cu11.1 True 4496.45 0.222 512 1090.66 0.914 512 5.29 224
efficientnet_b0 rtx3090 ngc2012 False 4770.98 0.21 512 1087.72 0.908 256 5.29 224
efficientnet_b0 rtx3090 1.7.1cu11.0 False 4641.62 0.215 512 1046.28 0.951 512 5.29 224
resnet50d v100_32 ngc2102 True 3000.19 0.333 512 1021.26 0.976 512 25.58 224
resnet50d v100_32 ngc2012 True 2936.48 0.341 512 1002.62 0.99 512 25.58 224
resnet50d rtx3090 ngc2012 True 3049.81 0.328 512 996.77 0.995 256 25.58 224
efficientnet_b0 rtx3090 1.8cu11.1 False 4674.49 0.214 512 996.63 0.999 256 5.29 224
efficientnet_b0 v100_32 ngc2102 False 4446.88 0.225 512 984.15 1.012 512 5.29 224
efficientnet_b0 v100_32 1.8cu11.1 False 4412.53 0.227 512 977.09 1.019 512 5.29 224
efficientnet_b0 v100_32 ngc2012 False 4453.39 0.225 512 975.08 1.016 512 5.29 224
resnet50d v100_32 1.8cu11.1 True 2797.34 0.357 512 970.56 1.027 512 25.58 224
efficientnet_b0 v100_32 1.7.1cu11.0 False 4438.73 0.225 512 968.9 1.022 512 5.29 224
resnet50d v100_32 1.7.1cu11.0 True 2796.9 0.358 512 962.26 1.031 512 25.58 224
resnet50d rtx3090 ngc2102 True 2942.05 0.34 512 865.85 1.151 256 25.58 224
efficientnet_b0 titanrtx ngc2012 False 3758.4 0.266 512 841.22 1.183 512 5.29 224
efficientnet_b0 titanrtx 1.8cu11.1 False 3765.34 0.266 512 835.61 1.194 512 5.29 224
efficientnet_b0 titanrtx ngc2102 False 3770.31 0.265 512 829.25 1.203 512 5.29 224
efficientnet_b0 titanrtx 1.7.1cu10.2 False 3689.5 0.271 512 799.42 1.246 512 5.29 224
resnet50d v100_32 ngc2012 False 2542.68 0.393 512 723.99 1.374 512 25.58 224
resnet50d rtx3090 ngc2012 False 2321.56 0.431 512 719.45 1.381 256 25.58 224
resnet50d titanrtx ngc2102 True 2479.47 0.403 512 719.32 1.386 256 25.58 224
resnet50d v100_32 1.8cu11.1 False 2499.41 0.4 512 717.03 1.391 512 25.58 224
resnet50d v100_32 1.7.1cu11.0 False 2497.69 0.4 512 711.89 1.397 512 25.58 224
vit_deit_small_patch16_224 rtx3090 1.8cu11.1 False 1979.64 0.505 512 706.89 1.411 256 22.05 224
vit_deit_small_patch16_224 rtx3090 1.8cu11.1 True 1966.28 0.509 512 706.57 1.412 256 22.05 224
resnet50d titanrtx ngc2012 True 2415.57 0.414 512 704.36 1.411 256 25.58 224
resnet50d v100_32 ngc2102 False 2550.52 0.392 512 701.28 1.423 512 25.58 224
resnet50d rtx3090 ngc2102 False 2346.05 0.426 512 691.15 1.443 256 25.58 224
vit_deit_small_patch16_224 rtx3090 ngc2102 True 1866.07 0.536 512 689.11 1.447 256 22.05 224
vit_deit_small_patch16_224 rtx3090 ngc2012 True 1845.39 0.542 512 688.47 1.444 256 22.05 224
vit_deit_small_patch16_224 rtx3090 ngc2012 False 1844.19 0.542 512 686.29 1.449 256 22.05 224
vit_deit_small_patch16_224 rtx3090 ngc2102 False 1860.15 0.538 512 686.23 1.453 256 22.05 224
vit_deit_small_patch16_224 rtx3090 1.7.1cu11.0 True 1770.03 0.565 512 660.33 1.507 256 22.05 224
resnet50d titanrtx 1.8cu11.1 True 2235.28 0.447 512 655.21 1.522 256 25.58 224
vit_deit_small_patch16_224 rtx3090 1.7.1cu11.0 False 1780.96 0.561 512 653.06 1.524 256 22.05 224
vit_deit_small_patch16_224 v100_32 1.7.1cu11.0 False 1823.44 0.548 512 638.87 1.55 256 22.05 224
vit_deit_small_patch16_224 v100_32 1.7.1cu11.0 True 1812.18 0.552 512 638.18 1.552 256 22.05 224
regnety_032 rtx3090 ngc2012 False 1729.26 0.578 512 574.86 1.725 256 19.44 224
vit_deit_small_patch16_224 titanrtx 1.8cu11.1 False 1571.24 0.636 512 561.14 1.778 256 22.05 224
vit_deit_small_patch16_224 titanrtx 1.8cu11.1 True 1578.33 0.634 512 561.0 1.779 256 22.05 224
regnety_032 rtx3090 ngc2102 True 2026.21 0.493 512 559.02 1.782 256 19.44 224
regnety_032 rtx3090 ngc2102 False 2041.65 0.49 512 557.82 1.786 256 19.44 224
resnet50d titanrtx ngc2012 False 1960.82 0.51 512 557.39 1.785 256 25.58 224
vit_deit_small_patch16_224 titanrtx ngc2102 True 1475.43 0.678 512 557.13 1.791 256 22.05 224
vit_deit_small_patch16_224 titanrtx ngc2102 False 1468.17 0.681 512 556.14 1.794 256 22.05 224
vit_deit_small_patch16_224 titanrtx ngc2012 True 1469.3 0.681 512 545.1 1.826 256 22.05 224
vit_deit_small_patch16_224 titanrtx ngc2012 False 1464.04 0.683 512 544.63 1.828 256 22.05 224
resnet50d titanrtx ngc2102 False 1986.98 0.503 512 540.89 1.845 256 25.58 224
nfnet_l0c rtx3090 ngc2012 True 1638.26 0.61 512 540.72 1.839 256 24.14 256
resnet50d rtx3090 1.8cu11.1 False 2196.35 0.455 512 539.46 1.85 256 25.58 224
resnet50d titanrtx 1.7.1cu10.2 False 2001.21 0.5 512 538.87 1.847 256 25.58 224
resnet50d titanrtx 1.8cu11.1 False 1851.08 0.54 512 525.38 1.9 256 25.58 224
resnet50d rtx3090 1.7.1cu11.0 False 2209.95 0.452 512 524.91 1.897 256 25.58 224
vit_deit_small_patch16_224 v100_32 1.8cu11.1 False 1547.63 0.646 512 517.99 1.924 256 22.05 224
vit_deit_small_patch16_224 v100_32 1.8cu11.1 True 1536.99 0.651 512 516.88 1.928 256 22.05 224
vit_deit_small_patch16_224 v100_32 ngc2102 False 1463.09 0.683 512 512.81 1.943 256 22.05 224
resnet50d rtx3090 1.8cu11.1 True 2787.26 0.359 512 512.65 1.947 256 25.58 224
vit_deit_small_patch16_224 v100_32 ngc2102 True 1459.26 0.685 512 511.65 1.948 256 22.05 224
vit_deit_small_patch16_224 titanrtx 1.7.1cu10.2 False 1375.53 0.727 512 511.3 1.948 256 22.05 224
vit_deit_small_patch16_224 v100_32 ngc2012 False 1456.84 0.686 512 505.34 1.965 256 22.05 224
vit_deit_small_patch16_224 v100_32 ngc2012 True 1449.63 0.69 512 504.99 1.967 256 22.05 224
resnet50d rtx3090 1.7.1cu11.0 True 2807.85 0.356 512 497.35 2.002 256 25.58 224
nfnet_l0c rtx3090 ngc2012 False 1481.17 0.675 512 479.25 2.077 256 24.14 256
regnety_032 rtx3090 1.8cu11.1 False 1669.55 0.599 512 475.91 2.095 256 19.44 224
regnety_032 v100_32 ngc2102 False 2137.74 0.468 512 472.43 2.111 512 19.44 224
nfnet_l0c rtx3090 ngc2102 False 1458.33 0.686 512 463.94 2.151 256 24.14 256
nfnet_l0c v100_32 ngc2012 False 1490.41 0.671 512 450.92 2.209 512 24.14 256
nfnet_l0c v100_32 ngc2102 False 1488.41 0.672 512 447.14 2.232 512 24.14 256
regnety_032 rtx3090 ngc2012 True 1924.62 0.52 512 446.75 2.223 256 19.44 224
regnety_032 rtx3090 1.7.1cu11.0 False 1636.99 0.611 512 444.13 2.237 256 19.44 224
regnety_032 v100_32 ngc2012 False 1763.78 0.567 512 434.99 2.286 512 19.44 224
regnety_032 v100_32 1.8cu11.1 False 1714.03 0.583 512 428.49 2.328 512 19.44 224
regnety_032 titanrtx 1.7.1cu10.2 False 1779.92 0.562 512 427.59 2.324 256 19.44 224
regnety_032 v100_32 1.7.1cu11.0 False 1709.49 0.585 512 424.84 2.34 512 19.44 224
regnety_032 titanrtx ngc2012 False 1653.32 0.605 512 416.85 2.384 256 19.44 224
regnety_032 titanrtx ngc2102 False 1658.34 0.603 512 412.81 2.415 256 19.44 224
nfnet_l0c v100_32 1.8cu11.1 False 1370.9 0.729 512 411.72 2.425 512 24.14 256
nfnet_l0c v100_32 1.7.1cu11.0 False 1370.83 0.729 512 407.89 2.442 512 24.14 256
regnety_032 titanrtx 1.8cu11.1 False 1620.35 0.617 512 402.69 2.477 256 19.44 224
nfnet_l0c rtx3090 ngc2102 True 1276.22 0.784 512 401.45 2.486 256 24.14 256
regnety_032 titanrtx ngc2102 True 1327.87 0.753 512 360.0 2.77 256 19.44 224
nfnet_l0c titanrtx 1.7.1cu10.2 False 1246.43 0.802 512 357.64 2.786 256 24.14 256
regnety_032 titanrtx ngc2012 True 1311.36 0.763 512 356.36 2.791 256 19.44 224
nfnet_l0c titanrtx ngc2102 False 1143.06 0.875 512 355.88 2.805 256 24.14 256
nfnet_l0c titanrtx ngc2012 False 1138.72 0.878 512 353.14 2.822 256 24.14 256
efficientnet_b3a rtx3090 ngc2102 True 1675.74 0.597 512 332.3 2.994 128 12.23 320
nfnet_l0c rtx3090 1.8cu11.1 False 1361.55 0.734 512 319.23 3.128 256 24.14 256
efficientnet_b3a v100_32 ngc2012 True 1444.02 0.692 512 316.15 3.105 128 12.23 320
nfnet_l0c titanrtx 1.8cu11.1 False 1084.79 0.922 512 313.98 3.18 256 24.14 256
efficientnet_b3a v100_32 1.8cu11.1 True 1391.83 0.718 512 312.43 3.172 128 12.23 320
nfnet_l0c rtx3090 1.7.1cu11.0 False 1355.96 0.737 512 312.36 3.192 256 24.14 256
efficientnet_b3a v100_32 1.7.1cu11.0 True 1392.78 0.718 512 306.88 3.194 128 12.23 320
efficientnet_b3a rtx3090 ngc2012 True 1569.22 0.637 512 299.5 3.304 128 12.23 320
regnety_032 v100_32 ngc2012 True 1604.15 0.623 512 297.09 3.354 512 19.44 224
efficientnet_b3a rtx3090 ngc2102 False 1208.3 0.828 512 291.41 3.416 128 12.23 320
vit_base_patch16_224 rtx3090 1.8cu11.1 False 820.86 1.218 512 290.25 3.438 128 86.57 224
vit_base_patch16_224 rtx3090 1.8cu11.1 True 816.96 1.224 512 288.62 3.458 128 86.57 224
vit_base_patch16_224 rtx3090 ngc2102 False 780.29 1.282 512 285.34 3.497 128 86.57 224
vit_base_patch16_224 rtx3090 ngc2102 True 779.76 1.282 512 284.82 3.503 128 86.57 224
vit_base_patch16_224 rtx3090 ngc2012 True 771.17 1.297 512 284.33 3.501 128 86.57 224
vit_base_patch16_224 rtx3090 ngc2012 False 773.59 1.293 512 283.75 3.508 128 86.57 224
vit_base_patch16_224 v100_32 1.7.1cu11.0 True 782.86 1.277 512 281.98 3.516 128 86.57 224
efficientnet_b3a rtx3090 ngc2012 False 1202.46 0.832 512 281.95 3.512 128 12.23 320
vit_base_patch16_224 v100_32 1.7.1cu11.0 False 783.68 1.276 512 281.78 3.518 128 86.57 224
efficientnet_b3a v100_32 ngc2102 True 746.23 1.34 512 280.34 3.54 128 12.23 320
vit_base_patch16_224 rtx3090 1.7.1cu11.0 True 774.44 1.291 512 279.92 3.556 128 86.57 224
vit_base_patch16_224 rtx3090 1.7.1cu11.0 False 771.56 1.296 512 277.72 3.585 128 86.57 224
regnety_032 v100_32 ngc2102 True 1713.13 0.584 512 275.58 3.623 512 19.44 224
efficientnet_b3a titanrtx ngc2102 True 1203.01 0.831 512 262.93 3.786 128 12.23 320
efficientnet_b3a rtx3090 1.8cu11.1 False 1164.61 0.859 512 261.12 3.816 128 12.23 320
efficientnet_b3a titanrtx ngc2012 True 1119.84 0.893 512 256.72 3.86 128 12.23 320
vit_base_patch16_224 titanrtx 1.8cu11.1 False 702.14 1.424 512 254.52 3.921 128 86.57 224
efficientnet_b3a v100_32 ngc2102 False 1166.61 0.857 512 253.0 3.926 128 12.23 320
vit_base_patch16_224 titanrtx 1.8cu11.1 True 692.63 1.444 512 252.76 3.948 128 86.57 224
efficientnet_b3a v100_32 1.8cu11.1 False 1128.09 0.886 512 252.57 3.932 128 12.23 320
efficientnet_b3a rtx3090 1.7.1cu11.0 False 1165.26 0.858 512 252.23 3.932 128 12.23 320
efficientnet_b3a v100_32 ngc2012 False 1136.71 0.88 512 252.17 3.905 128 12.23 320
vit_base_patch16_224 titanrtx ngc2102 False 660.12 1.515 512 250.92 3.977 128 86.57 224
vit_base_patch16_224 titanrtx ngc2102 True 663.11 1.508 512 249.05 4.007 128 86.57 224
efficientnet_b3a v100_32 1.7.1cu11.0 False 1128.23 0.886 512 248.97 3.952 128 12.23 320
vit_base_patch16_224 titanrtx ngc2012 False 658.31 1.519 512 248.43 4.009 128 86.57 224
efficientnet_b3a rtx3090 1.8cu11.1 True 1489.79 0.671 512 248.36 4.012 128 12.23 320
efficientnet_b3a titanrtx 1.8cu11.1 True 1089.51 0.918 512 247.61 4.023 128 12.23 320
vit_base_patch16_224 titanrtx ngc2012 True 660.27 1.514 512 246.6 4.039 128 86.57 224
efficientnet_b3a rtx3090 1.7.1cu11.0 True 1488.6 0.672 512 242.98 4.081 128 12.23 320
vit_base_patch16_224 v100_32 1.8cu11.1 False 678.35 1.474 512 233.77 4.264 128 86.57 224
vit_base_patch16_224 v100_32 1.8cu11.1 True 678.53 1.474 512 233.42 4.27 128 86.57 224
nfnet_l0c titanrtx ngc2102 True 818.53 1.222 512 232.71 4.292 256 24.14 256
nfnet_l0c titanrtx ngc2012 True 797.63 1.254 512 231.25 4.314 256 24.14 256
vit_base_patch16_224 titanrtx 1.7.1cu10.2 False 627.93 1.592 512 231.21 4.309 128 86.57 224
vit_base_patch16_224 v100_32 ngc2102 True 647.1 1.545 512 230.97 4.316 128 86.57 224
vit_base_patch16_224 v100_32 ngc2102 False 646.52 1.547 512 230.92 4.317 128 86.57 224
vit_base_patch16_224 v100_32 ngc2012 True 645.92 1.548 512 229.2 4.335 128 86.57 224
vit_base_patch16_224 v100_32 ngc2012 False 644.76 1.551 512 229.02 4.339 128 86.57 224
dm_nfnet_f0 rtx3090 ngc2012 False 620.45 1.612 512 220.76 4.505 128 71.49 256
nfnet_l0c v100_32 ngc2012 True 1122.97 0.89 512 218.06 4.578 512 24.14 256
dm_nfnet_f0 rtx3090 ngc2102 False 626.98 1.595 512 217.62 4.584 128 71.49 256
efficientnet_b3a titanrtx 1.8cu11.1 False 935.66 1.069 512 209.35 4.761 128 12.23 320
efficientnet_b3a titanrtx ngc2012 False 940.67 1.063 512 209.21 4.744 128 12.23 320
efficientnet_b3a titanrtx ngc2102 False 943.43 1.06 512 209.09 4.765 128 12.23 320
efficientnet_b3a titanrtx 1.7.1cu10.2 False 960.06 1.042 512 206.79 4.802 128 12.23 320
dm_nfnet_f0 rtx3090 ngc2012 True 599.24 1.669 512 195.39 5.093 128 71.49 256
dm_nfnet_f0 v100_32 ngc2102 False 597.71 1.673 512 194.51 5.121 128 71.49 256
dm_nfnet_f0 v100_32 ngc2012 False 612.7 1.632 512 193.43 5.126 128 71.49 256
nfnet_l0c v100_32 ngc2102 True 1135.01 0.881 512 186.48 5.359 512 24.14 256
nfnet_l0c v100_32 1.7.1cu11.0 True 1025.14 0.975 512 184.78 5.403 512 24.14 256
dm_nfnet_f0 v100_32 1.8cu11.1 False 550.26 1.817 512 183.32 5.434 128 71.49 256
nfnet_l0c v100_32 1.8cu11.1 True 1028.07 0.973 512 183.18 5.455 512 24.14 256
dm_nfnet_f0 v100_32 1.7.1cu11.0 False 550.08 1.818 512 180.45 5.496 128 71.49 256
dm_nfnet_f0 titanrtx ngc2102 False 475.81 2.102 512 174.82 5.707 128 71.49 256
dm_nfnet_f0 titanrtx ngc2012 False 473.15 2.113 512 173.1 5.752 128 71.49 256
dm_nfnet_f0 rtx3090 1.8cu11.1 False 593.21 1.686 512 166.49 5.995 128 71.49 256
dm_nfnet_f0 titanrtx 1.7.1cu10.2 False 484.35 2.065 512 163.37 6.096 128 71.49 256
dm_nfnet_f0 rtx3090 1.7.1cu11.0 False 585.76 1.707 512 162.21 6.141 128 71.49 256
dm_nfnet_f0 titanrtx ngc2102 True 352.2 2.839 512 161.58 6.175 128 71.49 256
dm_nfnet_f0 titanrtx ngc2012 True 349.75 2.859 512 159.94 6.227 128 71.49 256
seresnet152d rtx3090 ngc2102 True 585.49 1.708 512 159.3 6.216 64 66.84 320
seresnet152d v100_32 ngc2102 True 578.07 1.73 512 156.44 6.288 64 66.84 320
seresnet152d rtx3090 ngc2012 True 553.84 1.806 512 152.85 6.402 64 66.84 320
nfnet_l0c rtx3090 1.8cu11.1 True 1204.71 0.83 512 148.9 6.712 256 24.14 256
nfnet_l0c rtx3090 1.7.1cu11.0 True 1189.57 0.841 512 145.3 6.873 256 24.14 256
seresnet152d v100_32 ngc2012 True 570.22 1.754 512 144.94 6.67 64 66.84 320
dm_nfnet_f0 titanrtx 1.8cu11.1 False 445.52 2.245 512 141.66 7.047 128 71.49 256
seresnet152d v100_32 1.8cu11.1 True 538.83 1.856 512 141.12 6.974 64 66.84 320
regnety_032 v100_32 1.7.1cu11.0 True 1489.05 0.672 512 138.53 7.204 512 19.44 224
regnety_032 v100_32 1.8cu11.1 True 1495.27 0.669 512 137.03 7.292 512 19.44 224
seresnet152d rtx3090 ngc2102 False 476.25 2.1 512 135.44 7.322 64 66.84 320
regnety_032 rtx3090 1.8cu11.1 True 1729.9 0.578 512 135.39 7.38 256 19.44 224
seresnet152d v100_32 1.7.1cu11.0 True 539.99 1.852 512 134.82 7.161 64 66.84 320
seresnet152d v100_32 ngc2102 False 510.22 1.96 512 134.66 7.321 64 66.84 320
seresnet152d v100_32 1.8cu11.1 False 499.41 2.002 512 134.24 7.343 64 66.84 320
regnety_032 rtx3090 1.7.1cu11.0 True 1728.38 0.579 512 133.05 7.501 256 19.44 224
seresnet152d rtx3090 ngc2012 False 462.95 2.16 512 131.67 7.458 64 66.84 320
seresnet152d v100_32 ngc2012 False 499.82 2.001 512 130.68 7.414 64 66.84 320
seresnet152d v100_32 1.7.1cu11.0 False 499.24 2.003 512 128.27 7.541 64 66.84 320
nfnet_l0c titanrtx 1.8cu11.1 True 756.57 1.322 512 122.28 8.173 256 24.14 256
seresnet152d titanrtx ngc2102 True 479.89 2.084 512 121.23 8.18 64 66.84 320
seresnet152d titanrtx ngc2012 True 474.03 2.11 512 118.46 8.304 64 66.84 320
dm_nfnet_f0 v100_32 ngc2012 True 527.2 1.897 512 109.9 9.058 128 71.49 256
seresnet152d titanrtx 1.8cu11.1 True 449.44 2.225 512 108.97 9.115 64 66.84 320
seresnet152d titanrtx ngc2102 False 388.19 2.576 512 106.62 9.311 64 66.84 320
seresnet152d titanrtx ngc2012 False 386.67 2.586 512 106.42 9.257 64 66.84 320
seresnet152d rtx3090 1.8cu11.1 False 442.17 2.262 512 106.2 9.36 64 66.84 320
seresnet152d rtx3090 1.7.1cu11.0 False 440.41 2.271 512 103.17 9.564 64 66.84 320
seresnet152d titanrtx 1.7.1cu10.2 False 421.69 2.371 512 102.56 9.616 64 66.84 320
seresnet152d titanrtx 1.8cu11.1 False 382.65 2.613 512 102.1 9.733 64 66.84 320
dm_nfnet_f0 rtx3090 ngc2102 True 428.52 2.334 512 101.53 9.838 128 71.49 256
regnety_032 titanrtx 1.8cu11.1 True 1107.51 0.903 512 91.89 10.875 256 19.44 224
seresnet152d rtx3090 1.8cu11.1 True 545.44 1.833 512 89.68 11.095 64 66.84 320
seresnet152d rtx3090 1.7.1cu11.0 True 549.13 1.821 512 84.66 11.678 64 66.84 320
dm_nfnet_f0 v100_32 ngc2102 True 418.32 2.39 512 83.74 11.922 128 71.49 256
dm_nfnet_f0 v100_32 1.7.1cu11.0 True 384.12 2.603 512 74.14 13.442 128 71.49 256
dm_nfnet_f0 v100_32 1.8cu11.1 True 386.52 2.587 512 73.75 13.537 128 71.49 256
dm_nfnet_f0 rtx3090 1.8cu11.1 True 438.29 2.282 512 70.17 14.241 128 71.49 256
dm_nfnet_f0 rtx3090 1.7.1cu11.0 True 433.28 2.308 512 69.19 14.427 128 71.49 256
dm_nfnet_f0 titanrtx 1.8cu11.1 True 288.33 3.468 512 52.52 19.027 128 71.49 256
@ieted

ieted commented Mar 9, 2021

Channels last: the NHWC memory layout, instead of the NCHW PyTorch default.

On Mon, Mar 8, 2021, 12:47 AM Ted wrote:
> What does "cl" here stand for?

Thanks for the reply. For anyone who wants to know more about the channels-last trick, refer to the link.
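For readers unfamiliar with the two layouts, here is a minimal pure-Python sketch of how the same logical element `(n, c, h, w)` maps to a flat memory offset under NCHW and under NHWC. The function names and the example shape are made up for illustration and are not taken from the benchmark script. In PyTorch itself, a tensor or model is typically switched to this layout with `.to(memory_format=torch.channels_last)`; how the benchmark enables it is not shown in this gist.

```python
# Illustrative only: flat-offset arithmetic for a dense 4D activation tensor.
# Helper names and the shape below are hypothetical, chosen for the example.

def nchw_offset(n, c, h, w, C, H, W):
    """Offset in NCHW (channels first): width varies fastest."""
    return ((n * C + c) * H + h) * W + w

def nhwc_offset(n, c, h, w, C, H, W):
    """Offset in NHWC (channels last): channel varies fastest."""
    return ((n * H + h) * W + w) * C + c

C, H, W = 3, 224, 224  # example shape: 3 channels, 224x224 image

# Distance in memory between two channels of the same pixel.
nchw_channel_stride = nchw_offset(0, 1, 0, 0, C, H, W) - nchw_offset(0, 0, 0, 0, C, H, W)
nhwc_channel_stride = nhwc_offset(0, 1, 0, 0, C, H, W) - nhwc_offset(0, 0, 0, 0, C, H, W)

# Distance in memory between two horizontally adjacent pixels of one channel.
nchw_width_stride = nchw_offset(0, 0, 0, 1, C, H, W) - nchw_offset(0, 0, 0, 0, C, H, W)
nhwc_width_stride = nhwc_offset(0, 0, 0, 1, C, H, W) - nhwc_offset(0, 0, 0, 0, C, H, W)

print("NCHW channel stride:", nchw_channel_stride, "width stride:", nchw_width_stride)
print("NHWC channel stride:", nhwc_channel_stride, "width stride:", nhwc_width_stride)
```

In NHWC all channels of one pixel sit next to each other in memory, whereas in NCHW they are a full `H * W` plane apart. This is the only difference the `cl` column toggles: the logical tensor shape stays the same.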

@monney

monney commented Mar 21, 2021

Hi, thank you very much for this detailed benchmark and the rest of your work.
Any clue why the NFNets are benchmarking so much slower than the paper suggests? The DeepMind variants being slow makes sense because of conversion overhead, but even L0C seems slow compared to what's outlined in the paper (only ~1.7x faster than b3a, and ~3x slower than b0, with a lower reported accuracy than f0).
