#include <iostream>
#include <random>

struct myrand {
  uint32_t operator()() {
    return 0;
  }
  uint32_t max(){
    return std::mt19937::max();
  }
  uint32_t min(){
    return 0;
  }
};

double run(void) {
  myrand mt;
  double r = 0.0;
  std::uniform_real_distribution<> ud(-1.0, 1.0);
  for (int j = 0; j <10000; j++) {
    for (int i = 0; i < 10000; i++) {
      if (i%2) r += ud(mt);
    }
  }
  return r;
}

int main(){
  std::cout << run() << std::endl;
}
Would you try this myrand definition?
(As per the C++11 spec, G::min() and G::max() are static and constexpr functions, given G is a type which satisfies the UniformRandomBitGenerator requirement)
struct myrand {
  uint32_t operator()() {
    return 0;
  }
  static constexpr uint32_t max(){
    return std::mt19937::max();
  }
  static constexpr uint32_t min(){
    return 0;
  }
};
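A minimal sketch of the suggestion above, for anyone who wants to check the requirement at compile time. The type name myrand_checked and the helper sample_once are hypothetical, and the result_type member is an addition that the version in this thread does not declare. The static_assert lines only compile if min() and max() are usable in constant expressions.

```cpp
#include <cstdint>
#include <random>

// Hypothetical variant of myrand. result_type is added here as an
// assumption; the definition posted in this thread does not declare it.
struct myrand_checked {
  using result_type = std::uint32_t;

  // Always returns the minimum value, like the original myrand.
  result_type operator()() { return 0; }

  static constexpr result_type min() { return 0; }
  static constexpr result_type max() {
    return static_cast<result_type>(std::mt19937::max());
  }
};

// These fail to compile if min()/max() are not constant expressions.
static_assert(myrand_checked::min() == 0,
              "min() must be usable in a constant expression");
static_assert(myrand_checked::max() == 4294967295u,
              "max() should equal 2^32 - 1, the maximum of std::mt19937");

// Draws a single sample from the same distribution the benchmark uses.
double sample_once() {
  myrand_checked gen;
  std::uniform_real_distribution<> ud(-1.0, 1.0);
  return ud(gen);
}
```

Because the generator always returns its minimum, sample_once() should yield the lower bound -1.0, which is consistent with the -5e+07 printed by the benchmark for its 5e+07 accumulated samples.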
@equal-l2
Thank you for your suggestion. I have added static constexpr, but the results did not change.
$ time ./gcc.out
-5e+07
./gcc.out 0.06s user 0.00s system 94% cpu 0.064 total
$ time ./icpc.out
-5e+07
./icpc.out 4.43s user 0.00s system 99% cpu 4.442 total
In fact, each compiler generated identical assembly code with and without static constexpr.
I compiled and ran the test program on macOS 10.15.3.
(CPU: Intel(R) Core(TM) i7-8569U CPU @ 2.80GHz)
$ g++-9 --version
g++-9 (Homebrew GCC 9.2.0_3) 9.2.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ icpc --version
icpc (ICC) 19.1.0.166 20191121
Copyright (C) 1985-2019 Intel Corporation. All rights reserved.
$ g++-9 -O3 -march=native -Wall -Wextra -std=c++11 test.cpp -o gcc.out
$ icpc -O3 -xHOST -Wall -Wextra -std=c++11 test.cpp -o icpc.out
$ time ./gcc.out
-5e+07
real 0m0.057s
user 0m0.052s
sys 0m0.003s
$ time ./icpc.out
-5e+07
real 0m0.011s
user 0m0.008s
sys 0m0.002s
In my environment, the binary built with icpc is faster than the one built with g++.
Thank you @equal-l2.
I have tried on Linux.
- CentOS Linux release 7.6.1810 (Core)
- Intel(R) Xeon(R) Gold 6230 CPU @ 2.10GHz
$ g++ --version
g++ (GCC) 7.3.1 20180303 (Red Hat 7.3.1-5)
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ icpc --version
icpc (ICC) 19.0.4.243 20190416
Copyright (C) 1985-2019 Intel Corporation. All rights reserved.
$ g++ -O3 -march=native -Wall -Wextra -std=c++11 test.cpp -o gcc.out
$ icpc -O3 -xHOST -Wall -Wextra -std=c++11 test.cpp -o icpc.out
$ time ./gcc.out
-5e+07
./gcc.out 0.05s user 0.00s system 99% cpu 0.054 total
$ time ./icpc.out
-5e+07
./icpc.out 2.72s user 0.00s system 99% cpu 2.727 total
While I used the latest version of the Intel compiler, I observed similar behavior.
It's weird...
I measured the execution time of the code modified by @equal-l2. I also built the same code with Clang and measured it; the results are below.
Environment
- CPU: Intel(R) Core(TM) i7-7820X @ 3.60GHz
- OS: Linux Mint 19.3 Tricia (x86_64)
- g++ (gcc) 9.2.1 20191102
- icpc (ICC) 19.1.0.166
- clang++ (Clang) 8.0.0-3~ubuntu18.04.2
$ g++ -O3 -march=native -Wall -Wextra -std=c++11 test.cpp -o gcc.out
$ time ./gcc.out
-5e+07
real 0m0.058s
user 0m0.058s
sys 0m0.000s
$ icpc -O3 -xHOST -Wall -Wextra -std=c++11 test.cpp -o icpc.out
$ time ./icpc.out
-5e+07
real 0m2.914s
user 0m2.914s
sys 0m0.000s
$ clang++-8 -O3 -march=native -Wall -Wextra -std=c++11 test.cpp -o clang.out
$ time ./clang.out
-5e+07
real 0m3.358s
user 0m3.354s
sys 0m0.004s
As you can see, in my environment the binary built with the Intel compiler is about 50 times slower than the one built with gcc, and the one built with Clang is about 58 times slower.
Thanks @dc1394. That's interesting.
In that sense, we should say "GCC generates faster executables" instead of "Intel compiler generates slower ones"...
I think this code doesn't measure the code-gen quality of those two compilers but compares the performance of the Mersenne twister implementation and tuning...
Yep, you are right, @uTnOJkji5quPSNE5.
I should say, "the Mersenne Twister implementation included in GCC was fast". Anyway, I'm not sure why.
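One way to narrow this down might be to time only the sampling loop inside the process and swap the generator, so that the cost of the distribution and the cost of the engine can be compared separately. The following is a rough sketch, not the code measured in this thread; zero_gen and timed_sum are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <random>
#include <utility>

// Hypothetical stand-in for myrand: a generator that always returns 0.
struct zero_gen {
  using result_type = std::uint32_t;
  result_type operator()() { return 0; }
  static constexpr result_type min() { return 0; }
  static constexpr result_type max() {
    return static_cast<result_type>(std::mt19937::max());
  }
};

// Sums `samples` draws from uniform_real_distribution(-1, 1) using `gen`.
// Returns {sum, elapsed wall-clock seconds for the loop only}.
template <class Gen>
std::pair<double, double> timed_sum(Gen &gen, int samples) {
  assert(samples >= 0);
  std::uniform_real_distribution<> ud(-1.0, 1.0);
  const auto start = std::chrono::steady_clock::now();
  double r = 0.0;
  for (int i = 0; i < samples; ++i) {
    r += ud(gen);
  }
  const auto stop = std::chrono::steady_clock::now();
  const double seconds = std::chrono::duration<double>(stop - start).count();
  return std::make_pair(r, seconds);
}
```

Calling timed_sum once with a zero_gen and once with a std::mt19937, under each compiler, would show whether the gap comes from the distribution code or from the engine. Returning the sum keeps the loop from being discarded as dead code, although an optimizer may still simplify it when the generator is a constant.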
Environment