@jcrist
Created July 17, 2021 03:46
Benchmark for msgspec JSON encoding
import argparse
import json
import lzma
import os
import timeit
import urllib.request

import msgspec


def setup_benchmark(path):
    """Download test data"""
    if os.path.exists(path):
        return
    print(f"Downloading {path}...", end='')
    with urllib.request.urlopen(
        f"https://github.com/ijl/orjson/raw/master/data/{path}.xz"
    ) as f:
        data = f.read()
    with open(path, "wb") as f:
        f.write(lzma.decompress(data))
    print("Done!")


def bench_encode(path):
    print(f"Benchmark: {path}")
    with open(path, "rb") as f:
        data = json.load(f)
    enc = msgspec.JSONEncoder()
    timer = timeit.Timer("func(data)", globals={"func": enc.encode, "data": data})
    # autorange() returns (number of calls, total seconds for those calls)
    n, t = timer.autorange()
    print(f"{t * 1e6 / n :.2f} us")


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--benchmark", "-b", choices=["twitter", "github", "canada"], default="twitter"
    )
    args = parser.parse_args()
    path = args.benchmark + ".json"
    setup_benchmark(path)
    bench_encode(path)


if __name__ == "__main__":
    main()
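
The script above reports a single `autorange()` measurement per run. When comparing compilers, run-to-run noise can be as large as the differences being measured, so one possible variant (not part of the original gist) is to take the best of several repeats. This is a minimal sketch: the helper name `time_per_call_us` is hypothetical, and the stdlib `json.dumps` stands in for `enc.encode` so the snippet runs without msgspec or the downloaded data.

```python
import json
import timeit


def time_per_call_us(func, data, repeat=5):
    """Best-of-`repeat` time per call, in microseconds (hypothetical helper)."""
    timer = timeit.Timer("func(data)", globals={"func": func, "data": data})
    # autorange() picks a loop count so that one timing run takes >= 0.2 s
    n, _ = timer.autorange()
    # repeat() returns the total seconds for each run of n calls
    runs = timer.repeat(repeat=repeat, number=n)
    # The minimum is the run least disturbed by other activity on the machine
    return min(runs) * 1e6 / n


if __name__ == "__main__":
    sample = {"text": "hello", "values": [1, 2.5, None, True]}
    print(f"{time_per_call_us(json.dumps, sample):.2f} us")
```

To use it with the benchmark above, pass `enc.encode` and the loaded `data` in place of `json.dumps` and `sample`.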

jcrist commented Jul 17, 2021

Follow up from a discussion on twitter.

Running on jcrist/msgspec#46 on Kubuntu 21.04 with Python 3.9, testing 3 different C compilers. I've been performance tuning against gcc-10, but I get different results when trying clang or other gcc versions (some benchmarks improve, some get worse). I'd like to understand what code is leading to these differences, and whether there's anything I can change to prevent these performance regressions.

GCC 10.3.0

$ gcc --version
gcc (Ubuntu 10.3.0-1ubuntu1) 10.3.0
Copyright (C) 2020 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE
$ CC=gcc-10 python setup.py clean --all develop
$ python bench.py
Benchmark: twitter.json
468.29 us
$ python bench.py -b canada
Benchmark: canada.json
4432.95 us

GCC 11.1.0

$ gcc-11 --version
gcc-11 (Ubuntu 11.1.0-1ubuntu1~21.04) 11.1.0
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ CC=gcc-11 python setup.py clean --all develop
$ python bench.py 
Benchmark: twitter.json
547.07 us
$ python bench.py -b canada
Benchmark: canada.json
4340.72 us

Clang 12.0.0

$ clang --version
Ubuntu clang version 12.0.0-3ubuntu1~21.04.1
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
$ CC=clang python setup.py clean --all develop
$ python bench.py 
Benchmark: twitter.json
553.96 us
$ python bench.py -b canada
Benchmark: canada.json
4016.51 us

Notes:

  • The canada benchmark is mostly floats, and benchmarks the throughput of the float -> string algorithm (ryu in this case). This runs best under clang (~9.5% speedup over gcc-10).
  • The twitter benchmark is a good mix of different object types, but is mostly strings. This runs best under gcc-10, and experiences a ~17% slowdown when upgrading to gcc-11.
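
The percentages in these notes can be rechecked from the timings reported above. A small sketch, using only the numbers from the three runs; the function name `relative_change` and the compiler labels in the dictionary are just illustrative.

```python
# Timings in microseconds per call, copied from the runs above.
timings_us = {
    "twitter": {"gcc-10": 468.29, "gcc-11": 547.07, "clang-12": 553.96},
    "canada": {"gcc-10": 4432.95, "gcc-11": 4340.72, "clang-12": 4016.51},
}


def relative_change(baseline, other):
    """Percent change in time per call relative to baseline (negative = faster)."""
    return (other - baseline) / baseline * 100.0


for name, results in timings_us.items():
    base = results["gcc-10"]
    for compiler, t in results.items():
        change = relative_change(base, t)
        print(f"{name:8s} {compiler:9s} {t:8.2f} us  {change:+6.1f}%")
```

With gcc-10 as the baseline, this gives roughly +16.8% for twitter under gcc-11 and roughly -9.4% for canada under clang, in line with the rounded figures quoted in the notes.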

@llllllllll

Can you show the actual compilation commands that were generated from the various setup.py calls? I have been poking into this; however, I immediately got a very different relative result, so I want to see what flags you might be using.


jcrist commented Jul 18, 2021

Thanks for looking! Here ya go:

GCC 10

$ CC=gcc-10 python setup.py clean --all develop
running clean
running develop
running egg_info
writing msgspec.egg-info/PKG-INFO
writing dependency_links to msgspec.egg-info/dependency_links.txt
writing top-level names to msgspec.egg-info/top_level.txt
reading manifest file 'msgspec.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
writing manifest file 'msgspec.egg-info/SOURCES.txt'
running build_ext
building 'msgspec.core' extension
creating build
creating build/temp.linux-x86_64-3.9
creating build/temp.linux-x86_64-3.9/msgspec
gcc-10 -Wno-unused-result -Wsign-compare -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /home/jcristharif/miniconda3/envs/msgspec/include -fPIC -O2 -isystem /home/jcristharif/miniconda3/envs/msgspec/include -fPIC -I/home/jcristharif/miniconda3/envs/msgspec/include/python3.9 -c msgspec/core.c -o build/temp.linux-x86_64-3.9/msgspec/core.o
creating build/lib.linux-x86_64-3.9
creating build/lib.linux-x86_64-3.9/msgspec
gcc -pthread -B /home/jcristharif/miniconda3/envs/msgspec/compiler_compat -Wl,--sysroot=/ -shared -Wl,-rpath,/home/jcristharif/miniconda3/envs/msgspec/lib -Wl,-rpath-link,/home/jcristharif/miniconda3/envs/msgspec/lib -L/home/jcristharif/miniconda3/envs/msgspec/lib -Wl,-rpath,/home/jcristharif/miniconda3/envs/msgspec/lib -Wl,-rpath-link,/home/jcristharif/miniconda3/envs/msgspec/lib -L/home/jcristharif/miniconda3/envs/msgspec/lib build/temp.linux-x86_64-3.9/msgspec/core.o -o build/lib.linux-x86_64-3.9/msgspec/core.cpython-39-x86_64-linux-gnu.so
copying build/lib.linux-x86_64-3.9/msgspec/core.cpython-39-x86_64-linux-gnu.so -> msgspec
Creating /home/jcristharif/miniconda3/envs/msgspec/lib/python3.9/site-packages/msgspec.egg-link (link to .)
msgspec 0.3.1+16.g158b5ee is already the active version in easy-install.pth

Installed /home/jcristharif/Code/msgspec
Processing dependencies for msgspec==0.3.1+16.g158b5ee
Finished processing dependencies for msgspec==0.3.1+16.g158b5ee

GCC 11

$ CC=gcc-11 python setup.py clean --all develop
running clean
removing 'build/temp.linux-x86_64-3.9' (and everything under it)
removing 'build/lib.linux-x86_64-3.9' (and everything under it)
removing 'build'
running develop
running egg_info
writing msgspec.egg-info/PKG-INFO
writing dependency_links to msgspec.egg-info/dependency_links.txt
writing top-level names to msgspec.egg-info/top_level.txt
reading manifest file 'msgspec.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
writing manifest file 'msgspec.egg-info/SOURCES.txt'
running build_ext
building 'msgspec.core' extension
creating build
creating build/temp.linux-x86_64-3.9
creating build/temp.linux-x86_64-3.9/msgspec
gcc-11 -Wno-unused-result -Wsign-compare -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /home/jcristharif/miniconda3/envs/msgspec/include -fPIC -O2 -isystem /home/jcristharif/miniconda3/envs/msgspec/include -fPIC -I/home/jcristharif/miniconda3/envs/msgspec/include/python3.9 -c msgspec/core.c -o build/temp.linux-x86_64-3.9/msgspec/core.o
creating build/lib.linux-x86_64-3.9
creating build/lib.linux-x86_64-3.9/msgspec
gcc -pthread -B /home/jcristharif/miniconda3/envs/msgspec/compiler_compat -Wl,--sysroot=/ -shared -Wl,-rpath,/home/jcristharif/miniconda3/envs/msgspec/lib -Wl,-rpath-link,/home/jcristharif/miniconda3/envs/msgspec/lib -L/home/jcristharif/miniconda3/envs/msgspec/lib -Wl,-rpath,/home/jcristharif/miniconda3/envs/msgspec/lib -Wl,-rpath-link,/home/jcristharif/miniconda3/envs/msgspec/lib -L/home/jcristharif/miniconda3/envs/msgspec/lib build/temp.linux-x86_64-3.9/msgspec/core.o -o build/lib.linux-x86_64-3.9/msgspec/core.cpython-39-x86_64-linux-gnu.so
copying build/lib.linux-x86_64-3.9/msgspec/core.cpython-39-x86_64-linux-gnu.so -> msgspec
Creating /home/jcristharif/miniconda3/envs/msgspec/lib/python3.9/site-packages/msgspec.egg-link (link to .)
msgspec 0.3.1+16.g158b5ee is already the active version in easy-install.pth

Installed /home/jcristharif/Code/msgspec
Processing dependencies for msgspec==0.3.1+16.g158b5ee
Finished processing dependencies for msgspec==0.3.1+16.g158b5ee

Clang

$ CC=clang python setup.py clean --all develop
running clean
removing 'build/temp.linux-x86_64-3.9' (and everything under it)
removing 'build/lib.linux-x86_64-3.9' (and everything under it)
removing 'build'
running develop
running egg_info
writing msgspec.egg-info/PKG-INFO
writing dependency_links to msgspec.egg-info/dependency_links.txt
writing top-level names to msgspec.egg-info/top_level.txt
reading manifest file 'msgspec.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
writing manifest file 'msgspec.egg-info/SOURCES.txt'
running build_ext
building 'msgspec.core' extension
creating build
creating build/temp.linux-x86_64-3.9
creating build/temp.linux-x86_64-3.9/msgspec
clang -Wno-unused-result -Wsign-compare -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /home/jcristharif/miniconda3/envs/msgspec/include -fPIC -O2 -isystem /home/jcristharif/miniconda3/envs/msgspec/include -fPIC -I/home/jcristharif/miniconda3/envs/msgspec/include/python3.9 -c msgspec/core.c -o build/temp.linux-x86_64-3.9/msgspec/core.o
creating build/lib.linux-x86_64-3.9
creating build/lib.linux-x86_64-3.9/msgspec
gcc -pthread -B /home/jcristharif/miniconda3/envs/msgspec/compiler_compat -Wl,--sysroot=/ -shared -Wl,-rpath,/home/jcristharif/miniconda3/envs/msgspec/lib -Wl,-rpath-link,/home/jcristharif/miniconda3/envs/msgspec/lib -L/home/jcristharif/miniconda3/envs/msgspec/lib -Wl,-rpath,/home/jcristharif/miniconda3/envs/msgspec/lib -Wl,-rpath-link,/home/jcristharif/miniconda3/envs/msgspec/lib -L/home/jcristharif/miniconda3/envs/msgspec/lib build/temp.linux-x86_64-3.9/msgspec/core.o -o build/lib.linux-x86_64-3.9/msgspec/core.cpython-39-x86_64-linux-gnu.so
copying build/lib.linux-x86_64-3.9/msgspec/core.cpython-39-x86_64-linux-gnu.so -> msgspec
Creating /home/jcristharif/miniconda3/envs/msgspec/lib/python3.9/site-packages/msgspec.egg-link (link to .)
msgspec 0.3.1+16.g158b5ee is already the active version in easy-install.pth

Installed /home/jcristharif/Code/msgspec
Processing dependencies for msgspec==0.3.1+16.g158b5ee
Finished processing dependencies for msgspec==0.3.1+16.g158b5ee

I did notice that it's still using gcc 10 (the default `gcc`) everywhere for linking. Forcing the same CC for linking as well (by setting LDSHARED) made no measurable difference for me.
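
For anyone repeating the linker experiment: distutils reads both `CC` and `LDSHARED` from the environment, so one way to force a matching linker is to take the interpreter's default link command and swap only the leading compiler name. This is a sketch under that assumption; the helper name `env_with_matching_linker` is hypothetical, and the exact `LDSHARED` value used in the run described above is not shown in this thread.

```python
import os
import shlex
import sysconfig


def env_with_matching_linker(cc, ldshared=None, base_env=None):
    """Return an environment where the link step uses the same compiler as CC."""
    if ldshared is None:
        # The interpreter's default link command, e.g. "gcc -pthread -shared ...".
        # It may be undefined on some platforms (for example Windows).
        ldshared = sysconfig.get_config_var("LDSHARED") or ""
    parts = shlex.split(ldshared)
    # Swap only the leading compiler name and keep the remaining link flags.
    new_ldshared = shlex.join([cc] + parts[1:]) if parts else f"{cc} -shared"
    env = dict(os.environ if base_env is None else base_env)
    env["CC"] = cc
    env["LDSHARED"] = new_ldshared
    return env


if __name__ == "__main__":
    env = env_with_matching_linker("clang")
    print(env["LDSHARED"])
    # To rebuild with this environment (not executed here):
    # subprocess.run([sys.executable, "setup.py", "clean", "--all", "develop"],
    #                env=env, check=True)
```

Checking the printed link command in the build log afterwards confirms whether the override took effect.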
