@vadi2
Created January 9, 2022 14:27

4-core machine with svof loaded:

| relative |               ns/op |                op/s |    err% |     total | Implementations
|---------:|--------------------:|--------------------:|--------:|----------:|:----------------
|   100.0% |          180,631.04 |            5,536.15 |    2.2% |      4.58 | `plain for loop`
|    96.9% |          186,443.52 |            5,363.55 |    3.5% |      4.55 | `std::copy_if`
|   110.4% |          163,570.31 |            6,113.58 |    1.4% |      4.00 | `tbb::parallel_for_each, no acc.`
|   108.3% |          166,760.68 |            5,996.62 |    2.6% |      4.13 | `tbb::parallel_for, automatic grain`
|   107.8% |          167,554.00 |            5,968.22 |    2.2% |      4.08 | `tbb::parallel_for, 10 grainsize`
|   108.5% |          166,410.18 |            6,009.25 |    1.9% |      4.14 | `tbb::parallel_for, 100 grainsize`
|    79.3% |          227,648.80 |            4,392.73 |    0.6% |      5.55 | `tbb::parallel_for, 1000 grainsize`
|    86.7% |          208,295.59 |            4,800.87 |    6.0% |      5.24 | :wavy_dash: `tbb::parallel_for, 10000 grainsize` (Unstable with ~2,200.7 iters. Increase `minEpochIterations` to e.g. 22007)
|    92.2% |          195,943.66 |            5,103.51 |    3.2% |      4.85 | `tbb::parallel_for, 100000 grainsize`
|   108.2% |          166,880.97 |            5,992.30 |    4.2% |      4.08 | `tbb::parallel_for, affinity scheduler`
|    45.5% |          397,383.87 |            2,516.46 |    1.7% |      9.71 | `QtConcurrent filtered.results()`
---
|   100.0% |          143,465.91 |            6,970.30 |    3.2% |      3.58 | `plain for loop`
|   101.0% |          142,108.23 |            7,036.89 |    0.8% |      3.49 | `std::copy_if`
|    97.7% |          146,909.14 |            6,806.93 |    1.0% |      3.63 | `tbb::parallel_for_each, no acc.`
|    94.9% |          151,128.10 |            6,616.90 |    3.4% |      3.83 | `tbb::parallel_for, automatic grain`
|    96.6% |          148,528.51 |            6,732.71 |    1.3% |      3.69 | `tbb::parallel_for, 10 grainsize`
|    98.9% |          145,123.51 |            6,890.68 |    1.4% |      3.69 | `tbb::parallel_for, 100 grainsize`
|    77.3% |          185,519.66 |            5,390.26 |    1.4% |      4.53 | `tbb::parallel_for, 1000 grainsize`
|    98.3% |          145,895.96 |            6,854.20 |    2.7% |      3.56 | `tbb::parallel_for, 10000 grainsize`
|    97.2% |          147,616.30 |            6,774.32 |    2.4% |      3.67 | `tbb::parallel_for, 100000 grainsize`
|   106.5% |          134,667.81 |            7,425.68 |    1.7% |      3.32 | `tbb::parallel_for, affinity scheduler`
|    37.8% |          379,300.89 |            2,636.43 |    0.7% |      9.22 | `QtConcurrent filtered.results()`
---
|   100.0% |          132,902.42 |            7,524.32 |    2.9% |      3.22 | `plain for loop`
|   100.9% |          131,696.30 |            7,593.23 |    2.2% |      3.32 | `std::copy_if`
|    99.3% |          133,873.87 |            7,469.72 |    3.3% |      3.30 | `tbb::parallel_for_each, no acc.`
|    99.3% |          133,832.83 |            7,472.01 |    1.6% |      3.31 | `tbb::parallel_for, automatic grain`
|    99.2% |          133,975.18 |            7,464.07 |    0.8% |      3.35 | `tbb::parallel_for, 10 grainsize`
|   102.3% |          129,916.43 |            7,697.26 |    1.7% |      3.28 | `tbb::parallel_for, 100 grainsize`
|    76.6% |          173,507.44 |            5,763.44 |    1.4% |      4.22 | `tbb::parallel_for, 1000 grainsize`
|    98.2% |          135,359.20 |            7,387.75 |    1.7% |      3.42 | `tbb::parallel_for, 10000 grainsize`
|    97.2% |          136,790.82 |            7,310.43 |    2.2% |      3.31 | `tbb::parallel_for, 100000 grainsize`
|    98.9% |          134,370.94 |            7,442.09 |    3.2% |      3.26 | `tbb::parallel_for, affinity scheduler`
|    35.2% |          377,569.54 |            2,648.52 |    2.4% |      9.21 | `QtConcurrent filtered.results()`
---
|   100.0% |          150,664.05 |            6,637.28 |    1.0% |      3.73 | `plain for loop`
|    98.6% |          152,859.08 |            6,541.97 |    1.6% |      3.82 | `std::copy_if`
|   102.0% |          147,758.21 |            6,767.81 |    2.2% |      3.61 | `tbb::parallel_for_each, no acc.`
|   101.9% |          147,897.95 |            6,761.42 |    2.9% |      3.74 | `tbb::parallel_for, automatic grain`
|   101.4% |          148,526.04 |            6,732.83 |    1.3% |      3.63 | `tbb::parallel_for, 10 grainsize`
|   102.3% |          147,239.66 |            6,791.65 |    1.8% |      3.59 | `tbb::parallel_for, 100 grainsize`
|    75.3% |          200,124.39 |            4,996.89 |    2.0% |      4.86 | `tbb::parallel_for, 1000 grainsize`
|    95.8% |          157,222.57 |            6,360.41 |    1.8% |      3.85 | `tbb::parallel_for, 10000 grainsize`
|    95.7% |          157,467.08 |            6,350.53 |    1.3% |      3.89 | `tbb::parallel_for, 100000 grainsize`
|   106.1% |          142,056.10 |            7,039.47 |    1.8% |      3.54 | `tbb::parallel_for, affinity scheduler`
|    41.8% |          360,300.78 |            2,775.46 |    2.2% |      8.79 | `QtConcurrent filtered.results()`
---
|   100.0% |          132,145.37 |            7,567.42 |    3.4% |      3.22 | `plain for loop`
|   100.6% |          131,403.09 |            7,610.17 |    0.5% |      3.20 | `std::copy_if`
|    93.7% |          141,060.69 |            7,089.15 |    5.7% |      3.44 | :wavy_dash: `tbb::parallel_for_each, no acc.` (Unstable with ~2,200.7 iters. Increase `minEpochIterations` to e.g. 22007)
|    98.8% |          133,791.87 |            7,474.30 |    0.9% |      3.30 | `tbb::parallel_for, automatic grain`
|    95.1% |          138,988.28 |            7,194.85 |    1.8% |      3.40 | `tbb::parallel_for, 10 grainsize`
|    98.7% |          133,885.28 |            7,469.08 |    2.4% |      3.38 | `tbb::parallel_for, 100 grainsize`
|    76.9% |          171,853.28 |            5,818.92 |    0.3% |      4.20 | `tbb::parallel_for, 1000 grainsize`
|    91.7% |          144,142.48 |            6,937.58 |    5.2% |      3.52 | :wavy_dash: `tbb::parallel_for, 10000 grainsize` (Unstable with ~2,200.7 iters. Increase `minEpochIterations` to e.g. 22007)
|    96.0% |          137,660.30 |            7,264.26 |    0.7% |      3.46 | `tbb::parallel_for, 100000 grainsize`
|   103.1% |          128,176.47 |            7,801.74 |    1.0% |      3.25 | `tbb::parallel_for, affinity scheduler`
|    35.6% |          371,121.31 |            2,694.54 |    0.4% |      9.13 | `QtConcurrent filtered.results()`
---
|   100.0% |          152,636.67 |            6,551.51 |    0.7% |      3.80 | `plain for loop`
|    99.2% |          153,932.27 |            6,496.36 |    1.4% |      3.90 | `std::copy_if`
|   104.9% |          145,513.29 |            6,872.22 |    1.0% |      3.60 | `tbb::parallel_for_each, no acc.`
|   102.8% |          148,545.35 |            6,731.95 |    3.1% |      3.64 | `tbb::parallel_for, automatic grain`
|   102.7% |          148,586.38 |            6,730.09 |    1.9% |      3.76 | `tbb::parallel_for, 10 grainsize`
|   105.0% |          145,332.13 |            6,880.79 |    2.4% |      3.59 | `tbb::parallel_for, 100 grainsize`
|    79.7% |          191,591.20 |            5,219.45 |    0.7% |      4.68 | `tbb::parallel_for, 1000 grainsize`
|    98.7% |          154,568.89 |            6,469.61 |    0.8% |      3.91 | `tbb::parallel_for, 10000 grainsize`
|    97.3% |          156,810.43 |            6,377.13 |    2.3% |      3.87 | `tbb::parallel_for, 100000 grainsize`
|   111.3% |          137,114.62 |            7,293.17 |    2.1% |      3.49 | `tbb::parallel_for, affinity scheduler`
|    40.5% |          377,231.30 |            2,650.89 |    0.7% |      9.24 | `QtConcurrent filtered.results()`
---
|   100.0% |          132,073.40 |            7,571.55 |    1.7% |      3.34 | `plain for loop`
|    97.1% |          136,035.81 |            7,351.01 |    2.9% |      3.31 | `std::copy_if`
|    99.8% |          132,336.95 |            7,556.47 |    0.7% |      3.22 | `tbb::parallel_for_each, no acc.`
|    96.7% |          136,586.11 |            7,321.39 |    3.1% |      3.47 | `tbb::parallel_for, automatic grain`
|    95.9% |          137,731.92 |            7,260.48 |    0.4% |      3.43 | `tbb::parallel_for, 10 grainsize`
|    98.2% |          134,474.94 |            7,436.33 |    1.4% |      3.30 | `tbb::parallel_for, 100 grainsize`
|    74.0% |          178,527.09 |            5,601.39 |    2.0% |      4.32 | `tbb::parallel_for, 1000 grainsize`
|    97.9% |          134,890.88 |            7,413.40 |    1.2% |      3.33 | `tbb::parallel_for, 10000 grainsize`
|    98.5% |          134,121.60 |            7,455.92 |    0.6% |      3.31 | `tbb::parallel_for, 100000 grainsize`
|    98.2% |          134,525.06 |            7,433.56 |    5.3% |      3.34 | :wavy_dash: `tbb::parallel_for, affinity scheduler` (Unstable with ~2,200.7 iters. Increase `minEpochIterations` to e.g. 22007)
|    34.8% |          379,163.47 |            2,637.38 |    1.6% |      9.28 | `QtConcurrent filtered.results()`
---
|   100.0% |          146,846.90 |            6,809.81 |    1.6% |      3.59 | `plain for loop`
|    99.3% |          147,853.15 |            6,763.47 |    1.0% |      3.69 | `std::copy_if`
|    98.8% |          148,592.09 |            6,729.83 |    2.3% |      3.69 | `tbb::parallel_for_each, no acc.`
|    99.4% |          147,708.23 |            6,770.10 |    1.4% |      3.65 | `tbb::parallel_for, automatic grain`
|    94.6% |          155,255.68 |            6,440.99 |    2.2% |      3.77 | `tbb::parallel_for, 10 grainsize`
|    96.7% |          151,913.25 |            6,582.70 |    3.7% |      3.76 | `tbb::parallel_for, 100 grainsize`
|    76.0% |          193,121.06 |            5,178.10 |    0.6% |      4.71 | `tbb::parallel_for, 1000 grainsize`
|    97.2% |          151,085.84 |            6,618.75 |    0.8% |      3.83 | `tbb::parallel_for, 10000 grainsize`
|    93.9% |          156,333.89 |            6,396.57 |    3.1% |      3.84 | `tbb::parallel_for, 100000 grainsize`
|   103.2% |          142,356.32 |            7,024.63 |    2.0% |      3.55 | `tbb::parallel_for, affinity scheduler`
|    37.9% |          387,357.02 |            2,581.60 |    0.5% |      9.52 | `QtConcurrent filtered.results()`
---
|   100.0% |          167,798.43 |            5,959.53 |    1.8% |      4.18 | `plain for loop`
|   100.9% |          166,318.70 |            6,012.55 |    0.9% |      4.13 | `std::copy_if`
|   114.6% |          146,409.48 |            6,830.16 |    0.9% |      3.61 | `tbb::parallel_for_each, no acc.`
|   113.6% |          147,717.62 |            6,769.67 |    1.3% |      3.77 | `tbb::parallel_for, automatic grain`
|   112.9% |          148,673.28 |            6,726.16 |    1.1% |      3.62 | `tbb::parallel_for, 10 grainsize`
|   111.6% |          150,330.53 |            6,652.01 |    2.2% |      3.74 | `tbb::parallel_for, 100 grainsize`
|    81.4% |          206,159.81 |            4,850.61 |    0.6% |      5.02 | `tbb::parallel_for, 1000 grainsize`
|    97.8% |          171,635.09 |            5,826.31 |    0.9% |      4.18 | `tbb::parallel_for, 10000 grainsize`
|    96.7% |          173,475.08 |            5,764.52 |    2.7% |      4.33 | `tbb::parallel_for, 100000 grainsize`
|   115.8% |          144,873.09 |            6,902.59 |    1.9% |      3.58 | `tbb::parallel_for, affinity scheduler`
|    43.4% |          387,019.25 |            2,583.85 |    1.7% |      9.50 | `QtConcurrent filtered.results()`
---
|   100.0% |          137,535.95 |            7,270.83 |    4.4% |      3.27 | `plain for loop`
|   104.1% |          132,112.14 |            7,569.33 |    1.8% |      3.29 | `std::copy_if`
|    99.7% |          138,007.62 |            7,245.98 |    1.2% |      3.44 | `tbb::parallel_for_each, no acc.`
|    98.5% |          139,679.91 |            7,159.23 |    1.2% |      3.50 | `tbb::parallel_for, automatic grain`
|    96.8% |          142,065.21 |            7,039.02 |    1.2% |      3.59 | `tbb::parallel_for, 10 grainsize`
|    98.5% |          139,687.05 |            7,158.86 |    2.7% |      3.43 | `tbb::parallel_for, 100 grainsize`
|    78.3% |          175,632.04 |            5,693.72 |    1.4% |      4.38 | `tbb::parallel_for, 1000 grainsize`
|    99.2% |          138,682.61 |            7,210.71 |    1.0% |      3.39 | `tbb::parallel_for, 10000 grainsize`
|    96.8% |          142,121.28 |            7,036.24 |    1.9% |      3.50 | `tbb::parallel_for, 100000 grainsize`
|   102.5% |          134,182.69 |            7,452.53 |    0.7% |      3.43 | `tbb::parallel_for, affinity scheduler`
|    36.0% |          382,226.81 |            2,616.25 |    1.6% |      9.52 | `QtConcurrent filtered.results()`
---
|   100.0% |          163,549.30 |            6,114.36 |    8.8% |      4.05 | :wavy_dash: `plain for loop` (Unstable with ~2,200.7 iters. Increase `minEpochIterations` to e.g. 22007)
|   102.6% |          159,398.62 |            6,273.58 |    3.1% |      3.88 | `std::copy_if`
|   110.6% |          147,906.83 |            6,761.01 |    2.9% |      3.74 | `tbb::parallel_for_each, no acc.`
|   112.7% |          145,165.27 |            6,888.70 |    1.2% |      3.60 | `tbb::parallel_for, automatic grain`
|   111.3% |          146,976.88 |            6,803.79 |    1.3% |      3.72 | `tbb::parallel_for, 10 grainsize`
|   111.1% |          147,227.26 |            6,792.22 |    2.9% |      3.71 | `tbb::parallel_for, 100 grainsize`
|    81.9% |          199,712.03 |            5,007.21 |    1.3% |      4.90 | `tbb::parallel_for, 1000 grainsize`
|   101.9% |          160,525.78 |            6,229.53 |    1.6% |      3.91 | `tbb::parallel_for, 10000 grainsize`
|   101.4% |          161,358.16 |            6,197.39 |    2.2% |      4.02 | `tbb::parallel_for, 100000 grainsize`
|   115.5% |          141,634.23 |            7,060.44 |    1.0% |      3.47 | `tbb::parallel_for, affinity scheduler`
|    44.5% |          367,912.17 |            2,718.04 |    2.2% |      9.00 | `QtConcurrent filtered.results()`
---
|   100.0% |          131,420.77 |            7,609.15 |    1.7% |      3.18 | `plain for loop`
|    99.2% |          132,428.25 |            7,551.26 |    2.1% |      3.38 | `std::copy_if`
|    98.1% |          134,023.74 |            7,461.36 |    3.9% |      3.35 | `tbb::parallel_for_each, no acc.`
|    99.2% |          132,452.34 |            7,549.89 |    0.8% |      3.26 | `tbb::parallel_for, automatic grain`
|    95.3% |          137,928.39 |            7,250.14 |    3.1% |      3.46 | `tbb::parallel_for, 10 grainsize`
|    98.5% |          133,455.81 |            7,493.12 |    1.7% |      3.28 | `tbb::parallel_for, 100 grainsize`
|    76.1% |          172,788.58 |            5,787.42 |    1.6% |      4.20 | `tbb::parallel_for, 1000 grainsize`
|    97.7% |          134,474.89 |            7,436.33 |    1.1% |      3.45 | `tbb::parallel_for, 10000 grainsize`
|    96.8% |          135,757.70 |            7,366.06 |    1.2% |      3.32 | `tbb::parallel_for, 100000 grainsize`
|   102.2% |          128,596.54 |            7,776.26 |    2.0% |      3.18 | `tbb::parallel_for, affinity scheduler`
|    34.6% |          379,960.04 |            2,631.86 |    1.4% |      9.24 | `QtConcurrent filtered.results()`
---
|   100.0% |          164,408.32 |            6,082.42 |    1.3% |      4.02 | `plain for loop`
|   101.6% |          161,827.46 |            6,179.42 |    0.8% |      3.93 | `std::copy_if`
|   107.4% |          153,112.24 |            6,531.16 |    2.9% |      3.80 | `tbb::parallel_for_each, no acc.`
|   103.6% |          158,693.06 |            6,301.47 |    4.6% |      3.90 | `tbb::parallel_for, automatic grain`
|   107.0% |          153,593.67 |            6,510.69 |    1.1% |      3.82 | `tbb::parallel_for, 10 grainsize`
|   107.5% |          152,881.59 |            6,541.01 |    1.9% |      3.77 | `tbb::parallel_for, 100 grainsize`
|    78.4% |          209,762.65 |            4,767.29 |    0.6% |      5.12 | `tbb::parallel_for, 1000 grainsize`
|    95.5% |          172,118.87 |            5,809.94 |    1.4% |      4.23 | `tbb::parallel_for, 10000 grainsize`
|    94.5% |          173,941.08 |            5,749.07 |    1.8% |      4.30 | `tbb::parallel_for, 100000 grainsize`
|   110.1% |          149,338.57 |            6,696.19 |    3.2% |      3.64 | `tbb::parallel_for, affinity scheduler`
|    42.7% |          384,984.79 |            2,597.51 |    0.8% |      9.48 | `QtConcurrent filtered.results()`
---
|   100.0% |          169,039.26 |            5,915.79 |    2.8% |      4.17 | `plain for loop`
|   100.4% |          168,342.23 |            5,940.28 |    1.2% |      4.39 | `std::copy_if`
|   106.4% |          158,888.79 |            6,293.71 |    1.0% |      3.94 | `tbb::parallel_for_each, no acc.`
|   110.7% |          152,750.97 |            6,546.60 |    1.3% |      3.89 | `tbb::parallel_for, automatic grain`
|   109.7% |          154,044.18 |            6,491.64 |    2.2% |      3.77 | `tbb::parallel_for, 10 grainsize`
|   107.9% |          156,708.59 |            6,381.27 |    3.6% |      3.78 | `tbb::parallel_for, 100 grainsize`
|    79.5% |          212,655.79 |            4,702.43 |    0.8% |      5.22 | `tbb::parallel_for, 1000 grainsize`
|    97.0% |          174,213.04 |            5,740.10 |    1.7% |      4.26 | `tbb::parallel_for, 10000 grainsize`
|    93.3% |          181,085.08 |            5,522.27 |    3.0% |      4.39 | `tbb::parallel_for, 100000 grainsize`
|   110.6% |          152,784.40 |            6,545.17 |    4.2% |      3.75 | `tbb::parallel_for, affinity scheduler`
|    42.4% |          398,942.55 |            2,506.63 |    2.3% |      9.70 | `QtConcurrent filtered.results()`
---
|   100.0% |          171,826.03 |            5,819.84 |    3.6% |      4.19 | `plain for loop`
|   101.7% |          168,939.57 |            5,919.28 |    2.0% |      4.11 | `std::copy_if`
|   112.6% |          152,549.76 |            6,555.24 |    1.0% |      3.75 | `tbb::parallel_for_each, no acc.`
|   107.1% |          160,497.19 |            6,230.64 |    4.3% |      3.93 | `tbb::parallel_for, automatic grain`
|   113.2% |          151,732.88 |            6,590.53 |    0.6% |      3.77 | `tbb::parallel_for, 10 grainsize`
|   113.8% |          151,048.93 |            6,620.37 |    1.8% |      3.74 | `tbb::parallel_for, 100 grainsize`
|    82.0% |          209,432.38 |            4,774.81 |    0.5% |      5.14 | `tbb::parallel_for, 1000 grainsize`
|   100.5% |          171,051.46 |            5,846.19 |    0.8% |      4.21 | `tbb::parallel_for, 10000 grainsize`
|    97.0% |          177,185.96 |            5,643.79 |    1.5% |      4.37 | `tbb::parallel_for, 100000 grainsize`
|   114.2% |          150,493.44 |            6,644.81 |    2.3% |      3.70 | `tbb::parallel_for, affinity scheduler`
|    44.5% |          386,385.10 |            2,588.09 |    1.4% |      9.42 | `QtConcurrent filtered.results()`
---
|   100.0% |          128,931.82 |            7,756.04 |    2.0% |      3.13 | `plain for loop`
|    97.6% |          132,055.07 |            7,572.60 |    3.4% |      3.28 | `std::copy_if`
|    92.6% |          139,229.49 |            7,182.39 |    1.6% |      3.47 | `tbb::parallel_for_each, no acc.`
|    91.5% |          140,911.58 |            7,096.65 |    0.7% |      3.45 | `tbb::parallel_for, automatic grain`
|    84.9% |          151,774.32 |            6,588.73 |    6.5% |      3.95 | :wavy_dash: `tbb::parallel_for, 10 grainsize` (Unstable with ~2,200.7 iters. Increase `minEpochIterations` to e.g. 22007)
|    85.4% |          151,010.66 |            6,622.05 |    8.8% |      3.73 | :wavy_dash: `tbb::parallel_for, 100 grainsize` (Unstable with ~2,200.7 iters. Increase `minEpochIterations` to e.g. 22007)
|    72.6% |          177,705.41 |            5,627.29 |    1.3% |      4.40 | `tbb::parallel_for, 1000 grainsize`
|    94.9% |          135,813.04 |            7,363.06 |    2.0% |      3.38 | `tbb::parallel_for, 10000 grainsize`
|    93.9% |          137,236.59 |            7,286.69 |    1.6% |      3.37 | `tbb::parallel_for, 100000 grainsize`
|    94.6% |          136,347.15 |            7,334.22 |    1.5% |      3.38 | `tbb::parallel_for, affinity scheduler`
|    33.5% |          385,112.79 |            2,596.64 |    1.7% |      9.32 | `QtConcurrent filtered.results()`
---
|   100.0% |          154,195.88 |            6,485.26 |    2.2% |      3.90 | `plain for loop`
|   100.0% |          154,251.77 |            6,482.91 |    0.9% |      3.81 | `std::copy_if`
|    97.0% |          158,929.85 |            6,292.08 |    1.8% |      3.88 | `tbb::parallel_for_each, no acc.`
|    96.6% |          159,589.43 |            6,266.08 |    2.7% |      3.96 | `tbb::parallel_for, automatic grain`
|    96.4% |          159,951.77 |            6,251.88 |    1.4% |      3.94 | `tbb::parallel_for, 10 grainsize`
|    98.2% |          156,972.11 |            6,370.56 |    0.9% |      3.96 | `tbb::parallel_for, 100 grainsize`
|    78.2% |          197,080.35 |            5,074.07 |    0.9% |      4.84 | `tbb::parallel_for, 1000 grainsize`
|    98.5% |          156,607.86 |            6,385.38 |    0.7% |      3.80 | `tbb::parallel_for, 10000 grainsize`
|    95.7% |          161,200.50 |            6,203.45 |    3.1% |      4.04 | `tbb::parallel_for, 100000 grainsize`
|    96.5% |          159,753.06 |            6,259.66 |    4.6% |      3.88 | `tbb::parallel_for, affinity scheduler`
|    39.2% |          393,617.47 |            2,540.54 |    1.5% |      9.59 | `QtConcurrent filtered.results()`
---
|   100.0% |          156,178.04 |            6,402.95 |    1.9% |      3.87 | `plain for loop`
|    99.5% |          156,987.11 |            6,369.95 |    1.7% |      3.82 | `std::copy_if`
|   103.0% |          151,593.55 |            6,596.59 |    2.9% |      3.75 | `tbb::parallel_for_each, no acc.`
|   100.4% |          155,609.36 |            6,426.35 |    3.2% |      3.83 | `tbb::parallel_for, automatic grain`
|   101.9% |          153,298.52 |            6,523.22 |    1.2% |      3.73 | `tbb::parallel_for, 10 grainsize`
|   103.7% |          150,545.68 |            6,642.50 |    1.6% |      3.67 | `tbb::parallel_for, 100 grainsize`
|    77.3% |          202,081.48 |            4,948.50 |    2.3% |      4.93 | `tbb::parallel_for, 1000 grainsize`
|    97.4% |          160,341.94 |            6,236.67 |    2.0% |      3.93 | `tbb::parallel_for, 10000 grainsize`
|    97.2% |          160,677.80 |            6,223.63 |    2.3% |      4.03 | `tbb::parallel_for, 100000 grainsize`
|   103.7% |          150,587.13 |            6,640.67 |    2.7% |      3.69 | `tbb::parallel_for, affinity scheduler`
|    40.0% |          390,502.13 |            2,560.81 |    2.2% |      9.50 | `QtConcurrent filtered.results()`
---
|   100.0% |          146,123.50 |            6,843.53 |    2.6% |      3.56 | `plain for loop`
|    98.4% |          148,544.48 |            6,731.99 |    2.5% |      3.69 | `std::copy_if`
|    97.3% |          150,155.90 |            6,659.75 |    2.0% |      3.68 | `tbb::parallel_for_each, no acc.`
|    96.4% |          151,586.24 |            6,596.90 |    1.4% |      3.74 | `tbb::parallel_for, automatic grain`
|    96.8% |          151,021.95 |            6,621.55 |    0.7% |      3.77 | `tbb::parallel_for, 10 grainsize`
|    99.6% |          146,643.54 |            6,819.26 |    1.9% |      3.60 | `tbb::parallel_for, 100 grainsize`
|    78.1% |          187,155.69 |            5,343.15 |    0.5% |      4.58 | `tbb::parallel_for, 1000 grainsize`
|    95.3% |          153,381.63 |            6,519.69 |    3.1% |      3.77 | `tbb::parallel_for, 10000 grainsize`
|    96.5% |          151,442.99 |            6,603.14 |    1.4% |      3.74 | `tbb::parallel_for, 100000 grainsize`
|   101.1% |          144,529.76 |            6,918.99 |    2.0% |      3.62 | `tbb::parallel_for, affinity scheduler`
|    38.5% |          379,853.59 |            2,632.59 |    1.1% |      9.26 | `QtConcurrent filtered.results()`
---
|   100.0% |          134,440.94 |            7,438.21 |    2.8% |      3.34 | `plain for loop`
|   102.0% |          131,861.70 |            7,583.70 |    0.8% |      3.22 | `std::copy_if`
|    95.6% |          140,572.85 |            7,113.75 |    2.0% |      3.42 | `tbb::parallel_for_each, no acc.`
|    97.2% |          138,331.02 |            7,229.04 |    1.1% |      3.58 | `tbb::parallel_for, automatic grain`
|    83.1% |          161,824.01 |            6,179.55 |    8.1% |      4.12 | :wavy_dash: `tbb::parallel_for, 10 grainsize` (Unstable with ~2,200.7 iters. Increase `minEpochIterations` to e.g. 22007)
|    98.0% |          137,244.77 |            7,286.25 |    1.9% |      3.44 | `tbb::parallel_for, 100 grainsize`
|    77.1% |          174,308.05 |            5,736.97 |    1.6% |      4.28 | `tbb::parallel_for, 1000 grainsize`
|    99.1% |          135,700.49 |            7,369.17 |    1.8% |      3.33 | `tbb::parallel_for, 10000 grainsize`
|    96.0% |          140,070.50 |            7,139.26 |    3.9% |      3.44 | `tbb::parallel_for, 100000 grainsize`
|    99.7% |          134,825.79 |            7,416.98 |    1.4% |      3.32 | `tbb::parallel_for, affinity scheduler`
|    35.7% |          376,914.54 |            2,653.12 |    1.0% |      9.20 | `QtConcurrent filtered.results()`
---