Today I learned about the Intel Distribution for Python and, after browsing their website for a bit, I wanted to run one of the benchmarks myself and play with it to compare the results.
Here is how I got two different environments ready for testing:
$ docker run -i -t -p 8888:8888 intelpython/intelpython3_full /bin/bash \
-c "/opt/conda/bin/conda install jupyter -y --quiet && mkdir /opt/notebooks && /opt/conda/bin/jupyter \
notebook --notebook-dir=/opt/notebooks --ip='*' --port=8888 --no-browser --allow-root"
Jupyter Notebook Kernel: Python 3.7.7 (default, Mar 13 2020, 13:32:22) [GCC 7.3.0]
$ docker run -i -t -p 8887:8887 continuumio/anaconda3 /bin/bash \
-c "/opt/conda/bin/conda install jupyter -y --quiet && mkdir /opt/notebooks && /opt/conda/bin/jupyter \
notebook --notebook-dir=/opt/notebooks --ip='*' --port=8887 --no-browser --allow-root"
Jupyter Notebook Kernel: Python 3.7.6 (default, Jan 8 2020, 19:59:22) [GCC 7.3.0]
When the containers are ready, you can access the Jupyter notebooks in your browser.
In the terminal you'll find an address containing a token, like this:
http://127.0.0.1:8888/?token=a4b4631ec9f42b7d9d321fed71deff5cc4ed68e8a093dd12
Notice that we are using a different port for each environment. That's how you know
which environment you are executing the code in.
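Besides the port number, you can also confirm from inside each notebook which distribution the kernel is running on. A minimal sketch, assuming nothing beyond sys and NumPy (both ship with the two images):

import sys
import numpy as np

print(sys.version)     # the kernel's Python build string, as shown above
print(np.__version__)  # NumPy version shipped with the distribution
np.show_config()       # shows which BLAS/LAPACK libraries NumPy is linked against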
Copy and paste the code below into a notebook cell in each environment and execute it. You will immediately see the difference.
I adapted the example below, which simulates a stochastic differential equation, from the IPython Cookbook, Second Edition, and executed it under the two profiles, Intel Python and "plain" Anaconda. The results are interesting but not surprising.
%%timeit
import numpy as np
try:
import mkl_random as rnd
except ImportError:
import numpy.random as rnd
import matplotlib.pyplot as plt
%matplotlib inline
sigma = 1. # Standard deviation.
mu = 10. # Mean.
tau = .05 # Time constant.
dt = .001 # Time step.
T = 1. # Total time.
n = int(T / dt) # Number of time steps.
t = np.linspace(0., T, n) # Vector of times.
sigma_bis = sigma * np.sqrt(2. / tau)
sqrtdt = np.sqrt(dt)
ntrials = 1000000
X = np.zeros(ntrials)
bins = np.linspace(-2., 14., 100)
fig, ax = plt.subplots(1, 1, figsize=(8, 4))
for i in range(n):
    # We update the process independently for all trials
    X += dt * (-(X - mu) / tau) + \
        sigma_bis * sqrtdt * rnd.randn(ntrials)
    # We display the histogram for a few points in time
    if i in (5, 50, 900):
        hist, _ = np.histogram(X, bins=bins)
        ax.plot((bins[1:] + bins[:-1]) / 2, hist,
                {5: '-', 50: '.', 900: '-.', }[i],
                label=f"t={i * dt:.2f}")
ax.legend()
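A large share of the per-step work in this simulation is drawing normal random numbers, so a minimal sketch like the one below isolates just that part. It reuses the same try/except fallback as above, so rnd resolves to mkl_random on the Intel kernel and to numpy.random on plain Anaconda (the first link below has more background on random number generation in the Intel Distribution):

try:
    import mkl_random as rnd    # available in the Intel Distribution
except ImportError:
    import numpy.random as rnd  # fallback for plain Anaconda

ntrials = 1000000

# Time only the normal sampling that the loop above performs once per time step.
%timeit rnd.randn(ntrials)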
- More about random number generation: https://software.intel.com/en-us/blogs/2016/06/15/faster-random-number-generation-in-intel-distribution-for-python
- The benchmarks page: https://software.intel.com/en-us/distribution-for-python/benchmarks
- The complete list of packages, a good way to see what the Intel Distribution actually adds: https://software.intel.com/en-us/articles/complete-list-of-packages-for-the-intel-distribution-for-python
- Accelerating Scientific Python with Intel Optimizations Paper: http://conference.scipy.org/proceedings/scipy2017/pdfs/oleksandr_pavlyk.pdf