https://gist.github.com/tanakamura/efbfab5cfdf6707d098714d616dd6ef3
Tried it with different PL2 settings.
PL2 = 80W
 Performance counter stats for 'make -j 40':
============= LATENCY ==============================================================================
                              instruction |     IPC         (   rel[%]),     CPI         (   rel[%])
------------------------------------------+---------------------------------------------------------
    m128                            addps |    0.50-0.25    ( 100.0[%]),    2.00-4.00    ( -50.0[%])
    m128                           aesdec |    0.33-0.14    ( 133.4[%]),    3.00-7.00    ( -57.1[%])
    m128                       aesdeclast |    0.33-0.14    ( 133.4[%]),    3.00-7.00    ( -57.1[%])
    m128                           aesenc |    0.33-0.14    ( 133.3[%]),    3.00-7.00    ( -57.1[%])
    m128                       aesenclast |    0.33-0.14    ( 133.4[%]),    3.00-7.00    ( -57.1[%])
    m128                          blendps |    1.00-1.00    (   0.1[%]),    1.00-1.00    (  -0.1[%])
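The table does not state how the rel[%] columns are defined. The figures are consistent with the relative difference of the first value against the second, with IPC taken as the reciprocal of CPI. A minimal sketch under that assumption (the helper name `rel_percent` is made up for illustration):

```python
def rel_percent(a, b):
    """Relative difference of a against b, in percent (assumed definition)."""
    return (a - b) / b * 100.0


# CPI pairs copied from the addps and aesdec rows of the table.
rows = [("addps", 2.00, 4.00), ("aesdec", 3.00, 7.00)]

for name, cpi_a, cpi_b in rows:
    # IPC is the reciprocal of CPI.
    ipc_a, ipc_b = 1.0 / cpi_a, 1.0 / cpi_b
    print(f"{name:8s} IPC rel {rel_percent(ipc_a, ipc_b):6.1f}%  "
          f"CPI rel {rel_percent(cpi_a, cpi_b):6.1f}%")
```

For aesdec this gives 133.3% and -57.1%, which matches the table up to rounding of the measured values.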
Ran the make.py described in https://zenn.dev/tanakmura/articles/litex_linux_ae3feff0b48ede under Vivado.
Zen 2
Alder Lake P-core
Linux build
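Linux-5.14.15 configured with make defconfig, then built with make twice; the figures are from the second run.
[J] is the energy in joules obtained by reading /sys/class/powercap/intel-rapl:0/energy_uj (the value comes from the CPU's built-in sensor, so AMD and Intel may not use the same baseline).
With something like the following saved as rapl-run.py,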
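The script itself is not shown here. As a minimal sketch of what such a wrapper could look like (this is not the author's script; the function names, the output format, and the use of `max_energy_range_uj` for wraparound handling are assumptions), one can read `energy_uj` before and after running a command and report the difference:

```python
#!/usr/bin/env python3
"""Hypothetical rapl-run.py sketch: run a command, report package energy."""
import subprocess
import sys
import time

RAPL_DIR = "/sys/class/powercap/intel-rapl:0"
ENERGY = RAPL_DIR + "/energy_uj"                # cumulative counter, microjoules
MAX_RANGE = RAPL_DIR + "/max_energy_range_uj"   # range of the counter


def read_uj(path):
    """Read one integer microjoule value from a powercap sysfs file."""
    with open(path) as f:
        return int(f.read())


def delta_uj(before, after, max_range):
    """Energy between two readings, assuming at most one wraparound.

    The wrap point is approximated by max_range.
    """
    if after >= before:
        return after - before
    return after + max_range - before


def main(argv):
    max_range = read_uj(MAX_RANGE)
    e0 = read_uj(ENERGY)
    t0 = time.monotonic()
    rc = subprocess.call(argv[1:])   # run the wrapped command
    t1 = time.monotonic()
    e1 = read_uj(ENERGY)
    joules = delta_uj(e0, e1, max_range) / 1e6
    print(f"{t1 - t0:.2f} [s]  {joules:.2f} [J]", file=sys.stderr)
    return rc


# Only run when a command is given, e.g. `rapl-run.py make -j 40`.
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv))
```

It would be invoked as `python3 rapl-run.py make -j 40`. On recent kernels `energy_uj` is typically readable only by root, so the script may need to be run with elevated privileges.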
ooo ratio : 1.398357
ostimer: clock_gettime
userland_timer: cntvct
perf_counter: no
Qualcomm Snapdragon 710
==== idiv32-realtime ====
-> : divider_bit
   |   1|   2|   3|   4|   5|   6|   7|   8|   9|  10|  11|  12|  13|  14|  15|  16|  17|  18|  19|  20|  21|  22|  23|  24|  25|  26|  27|  28|  29|  30|  31|  32
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
 0 | 2.9| 2.9| 7.2| 2.9| 2.9| 2.9| 2.9| 3.2| 3.0| 2.9| 2.9| 2.9| 2.9| 2.9| 2.9| 2.9| 2.9| 2.9| 2.9| 2.9| 2.9| 2.9| 2.9| 2.9| 2.9| 2.9| 2.9| 2.9| 2.9|14.2| 4.0| 2.9
                  | result
--------------------------
              ROB | 389
          INT PRF | 384
           FP PRF | 372
 INT(multi chain) | 32
  FP(multi chain) | 25
INT(single chain) | 32
 FP(single chain) | 25
v : test_name
ostimer: clock_gettime
userland_timer: rdtscp
perf_counter: yes
Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz
==== fpu ====
             | nsec/call
-------------------------
denormal_add | 1.28803
  normal_add | 1.04813
denormal_mul | 1.05132
ostimer: clock_gettime
userland_timer: rdtscp
perf_counter: yes
AMD Ryzen 7 3700X 8-Core Processor
==== libc ====
                    | nsec/call
-----------------------------------
         atoi_99999 | 14.19927
      fflush_stdout | 5.74030
sscanf_double_99999 | 122.22827
ostimer: clock_gettime
userland_timer: rdtscp
perf_counter: yes
AMD Ryzen 7 3700X 8-Core Processor
==== libc ====
                    | nsec/call
----------------------------------
         atoi_99999 | 17.43726
      fflush_stdout | 11.08052
sscanf_double_99999 | 121.75406
ostimer: clock_gettime
userland_timer: rdtscp
perf_counter: yes
Intel(R) Core(TM) i5-8250U CPU @ 1.60GHz
==== cache-bandwidth-1t ====
<copy>
     | GiB/s
--------------------
3072 | 187.16192
4096 | 159.48263