@jorisvandenbossche
Last active September 17, 2019 07:09
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Profiling pyproj transformations releasing the GIL\n",
"\n",
"Context: https://github.com/pyproj4/pyproj/pull/437"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import pyproj\n",
"pyproj.datadir.set_data_dir('/home/joris/miniconda3/envs/proj-dev/share/proj')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Using an array of one million elements:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"x_coords = np.random.randint(80000, 120000, size=1000000)\n",
"y_coords = np.random.randint(200000, 250000, size=1000000)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Single-threaded timings"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Using the Transformer:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"804 ms ± 16.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
]
}
],
"source": [
"%%timeit \n",
"transformer = pyproj.Transformer.from_proj(2263, 4326)\n",
"transformer.transform(x_coords, y_coords)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Using the `Proj` interface:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"p = pyproj.Proj(2263)\n",
"lat_coords, lon_coords = p(x_coords, y_coords, inverse=True)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"924 ms ± 6.46 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
]
}
],
"source": [
"%%timeit\n",
"p = pyproj.Proj(2263)\n",
"p(x_coords, y_coords, inverse=True)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"568 ms ± 14.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
]
}
],
"source": [
"%%timeit\n",
"p = pyproj.Proj(2263)\n",
"p(lat_coords, lon_coords)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"When redoing the above timings with pyproj master with the commit before nogil was added, the timings don't really change. The forward call on the Proj object is slightly faster (460 without nogil vs 568 with nogil), but would need to run that some more to see if this is really consistent:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"797 ms ± 16.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
]
}
],
"source": [
"%%timeit \n",
"transformer = pyproj.Transformer.from_proj(2263, 4326)\n",
"transformer.transform(x_coords, y_coords)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"p = pyproj.Proj(2263)\n",
"lat_coords, lon_coords = p(x_coords, y_coords, inverse=True)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"915 ms ± 2.29 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
]
}
],
"source": [
"%%timeit\n",
"p = pyproj.Proj(2263)\n",
"p(x_coords, y_coords, inverse=True)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"460 ms ± 3.64 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
]
}
],
"source": [
"%%timeit\n",
"p = pyproj.Proj(2263)\n",
"p(lat_coords, lon_coords)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"(note: the above 4 code cells are thus run with a different pyproj version! Now returning back to the latest master with the nogil changes added)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"---\n",
"\n",
"### Speeding up with multi-threading?\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To time this, I am using a small \"test_parallel\" utility to run a function in parallel, so it can be easily compared to running this sequentially and what speed up parallel can bring."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"# from pandas.util.testing\n",
"from functools import wraps\n",
"\n",
"def test_parallel(num_threads=2, kwargs_list=None):\n",
" \"\"\"Decorator to run the same function multiple times in parallel.\n",
"\n",
" Parameters\n",
" ----------\n",
" num_threads : int, optional\n",
" The number of times the function is run in parallel.\n",
" kwargs_list : list of dicts, optional\n",
" The list of kwargs to update original\n",
" function kwargs on different threads.\n",
" Notes\n",
" -----\n",
" This decorator does not pass the return value of the decorated function.\n",
"\n",
" Original from scikit-image:\n",
"\n",
" https://github.com/scikit-image/scikit-image/pull/1519\n",
"\n",
" \"\"\"\n",
"\n",
" assert num_threads > 0\n",
" has_kwargs_list = kwargs_list is not None\n",
" if has_kwargs_list:\n",
" assert len(kwargs_list) == num_threads\n",
" import threading\n",
"\n",
" def wrapper(func):\n",
" @wraps(func)\n",
" def inner(*args, **kwargs):\n",
" if has_kwargs_list:\n",
" update_kwargs = lambda i: dict(kwargs, **kwargs_list[i])\n",
" else:\n",
" update_kwargs = lambda i: kwargs\n",
" threads = []\n",
" for i in range(num_threads):\n",
" updated_kwargs = update_kwargs(i)\n",
" thread = threading.Thread(target=func, args=args, kwargs=updated_kwargs)\n",
" threads.append(thread)\n",
" for thread in threads:\n",
" thread.start()\n",
" for thread in threads:\n",
" thread.join()\n",
"\n",
" return inner\n",
"\n",
" return wrapper"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The code to run in parallel or not:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"def run_transformer():\n",
" # Transformer object needs to be created in each thread, so including in the function\n",
" transformer = pyproj.Transformer.from_proj(2263, 4326)\n",
" transformer.transform(x_coords, y_coords)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Running this 4 times sequentially:"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [],
"source": [
"def f():\n",
" for i in range(4):\n",
" run_transformer()"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"3.22 s ± 50.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
]
}
],
"source": [
"%timeit f()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Running this 4 times but in parallel:"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [],
"source": [
"@test_parallel(4)\n",
"def g():\n",
" run_transformer()"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1.55 s ± 143 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
]
}
],
"source": [
"%timeit g()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This gives a nice speed-up (factor 2 - 2.5, so still quite a bit below the theoretical maximum), meaning that the nogil is doing its work."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now doing this for the Pyproj inverse and forward ones (nogil was added to the forward call, but not to inverse):"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [],
"source": [
"def run_proj_inverse():\n",
" p = pyproj.Proj(2263)\n",
" p(x_coords, y_coords, inverse=True)"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [],
"source": [
"def f():\n",
" for i in range(4):\n",
" run_proj_inverse()"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"3.77 s ± 45 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
]
}
],
"source": [
"%timeit f()"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [],
"source": [
"@test_parallel(4)\n",
"def g():\n",
" run_proj_inverse()"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"4.08 s ± 282 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
]
}
],
"source": [
"%timeit g()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This gives no speed-up, which is to be expected since the gil is not released. It gives a slight slowdown which is to be expected from the overhead of the threads.\n",
"\n",
"Now the forward call:"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [],
"source": [
"def run_proj_forward():\n",
" p = pyproj.Proj(2263)\n",
" p(lat_coords, lon_coords)"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [],
"source": [
"def f():\n",
" for i in range(4):\n",
" run_proj_forward()"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"2.32 s ± 60 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
]
}
],
"source": [
"%timeit f()"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [],
"source": [
"@test_parallel(4)\n",
"def g():\n",
" run_proj_forward()"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"14.4 s ± 510 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
]
}
],
"source": [
"%timeit g()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This gives a big slowdown, which I think can be explained by the fact that the GIL is released *inside* the for loop over all coordinates. So acquring and releasing the gil many times for tiny computations, which can give an overhead.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python (proj-dev)",
"language": "python",
"name": "proj-dev"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3"
}
},
"nbformat": 4,
"nbformat_minor": 4
}