This profile achieves 50%-80% of release-profile performance while still providing a reasonable amount of safety checks and debugging support. It should also be the profile for your CI build.
-Og -Wall -Wextra -D_FORTIFY_SOURCE=2 -fstack-protector-strong -g -D_GLIBCXX_ASSERTIONS
- `-Wall -Wextra` enables most warnings. Tune individual warnings with flags such as `-Wimplicit-fallthrough` or `-Wno-implicit-fallthrough`. I didn't include compiler-specific flags because opinions on them vary.
- `-Og` is usually about 30% slower than `-O2` but provides a much better debugging experience.
- `-D_FORTIFY_SOURCE=2` (you could bump it to `=3` if you have gcc >= 12) makes memory and string operations checked.
- `-fstack-protector-strong` catches a fraction of stack buffer overflows at very little performance cost.
- `-D_GLIBCXX_ASSERTIONS` enables bounds checks on C++ stdlib containers, plus the non-emptiness checks for `unique_ptr`/`shared_ptr`/`optional`/`variant`. It incurs ~10-20% overhead depending on your use case.
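One way to confirm in CI that these flags actually reached a given translation unit is to inspect the macros they define. Below is a minimal sketch, assuming GCC or Clang with libstdc++. `HardeningReport`, `hardening_report`, and `describe` are hypothetical names made up for illustration, not part of any library.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: records which hardening-related macros were visible
// when this translation unit was compiled.
struct HardeningReport {
    bool optimized;           // __OPTIMIZE__ seen (set by GCC/Clang when optimizing)
    int fortify_level;        // value of _FORTIFY_SOURCE, or 0 if it is not defined
    bool glibcxx_assertions;  // _GLIBCXX_ASSERTIONS seen
};

inline HardeningReport hardening_report() {
    HardeningReport r = {false, 0, false};
#ifdef __OPTIMIZE__
    r.optimized = true;
#endif
#ifdef _FORTIFY_SOURCE
    r.fortify_level = _FORTIFY_SOURCE;
#endif
#ifdef _GLIBCXX_ASSERTIONS
    r.glibcxx_assertions = true;
#endif
    return r;
}

// Formats the report as a single line suitable for a CI log.
inline std::string describe(const HardeningReport& r) {
    std::string s = "optimized=";
    s += r.optimized ? "yes" : "no";
    s += " fortify=" + std::to_string(r.fortify_level);
    s += " glibcxx_assertions=";
    s += r.glibcxx_assertions ? "yes" : "no";
    return s;
}
```

Printing `describe(hardening_report())` once at startup of a CI binary makes a silently dropped flag visible in the log. The `optimized` field matters because `_FORTIFY_SOURCE` only takes effect in an optimizing build.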
You may also want to look into reproducible builds to reduce flaky build output.
GLIBC_TUNABLES=glibc.malloc.perturb=204
This makes glibc overwrite every freed memory region with `0xCC`, so that use-after-free bugs are caught more easily. I chose this value because `0xCC` maps to `INT 3` on x86-64 and is not a printable ASCII character. It is also a negative value when interpreted as an `int8`, which makes it easy to spot visually.
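To make the "catch it visually" point concrete, here is a small sketch of what a perturbed region looks like when read back through common integer types. `kPerturbByte` and `read_perturbed` are hypothetical names used only for this illustration; the buffer is filled by hand rather than by glibc.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// The perturb value used above: 204 == 0xCC.
constexpr unsigned char kPerturbByte = 0xCC;

// Fills a buffer with the perturb byte and reads it back as T.
// memcpy keeps the reinterpretation well-defined.
template <typename T>
T read_perturbed() {
    unsigned char raw[sizeof(T)];
    std::memset(raw, kPerturbByte, sizeof(raw));
    T value;
    std::memcpy(&value, raw, sizeof(value));
    return value;
}
```

In a debugger, `read_perturbed<std::int8_t>()` shows up as -52 and `read_perturbed<std::int32_t>()` as -858993460, so a field holding either value is a strong hint that it was read after being freed.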
hardened_malloc is a hardened malloc implementation designed to catch many common heap memory issues. Simply build it and run your program with `./preload.sh {PROGRAM} [ARGS...]`, and it will automatically replace the malloc/free implementation.
tl;dr: Follow the Red Hat or OpenSSF recommendations.
Note that some of the recommendations have a small runtime performance cost, and you can tune them based on your needs. My personal experience is:
- 10%-20% slowdown with `-D_GLIBCXX_ASSERTIONS`
- 2%-10% slowdown with `-fno-delete-null-pointer-checks` (I write a lot of low-level pointer manipulation code, so YMMV)
- 0%-20% slowdown with `-fno-strict-aliasing`. It affects the vectorizer a lot and is probably worth turning off in numerical calculation sources.
- `-ftrivial-auto-var-init=zero`: This is definitely safer for production if you have a lot of uninitialized variables that might be read, but personally I prefer initializing them manually with the clang-tidy check and opting out when necessary. Still, it's defense in depth.
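For sources where `-fno-strict-aliasing` is turned off again, type punning has to be written in a way that is valid under strict aliasing. A minimal sketch, assuming a 32-bit IEEE-754 `float`; `float_bits` is a hypothetical helper name:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Undefined behavior under strict aliasing (shown as a comment only):
//     std::uint32_t bits = *reinterpret_cast<std::uint32_t*>(&f);

// Well-defined with or without -fno-strict-aliasing: copy the object
// representation instead of aliasing it through another pointer type.
inline std::uint32_t float_bits(float f) {
    static_assert(sizeof(std::uint32_t) == sizeof(float),
                  "this sketch assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof(bits));
    return bits;
}
```

Compilers typically lower a fixed-size `memcpy` like this to a plain register move, so it should not cost anything in the hot numerical code where the flag was dropped.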
Also don't forget to `strip` your binary if you don't want customers to find out the function names and sources. It also results in a smaller binary.
This is a profile to maximize debug checks and debugging experience. As a tradeoff, it's typically 2x-10x slower than the release profile. At my company, we run it once every day (because each run takes half a day :(). Only use it when you want to actively debug a program.
-Og -g -Wall -Wextra -D_FORTIFY_SOURCE=2 -fsanitize=undefined,address,float-divide-by-zero,nullability
You can also attach `-fsanitize=thread` if your program is multithreaded, but it has to be in a separate build because TSAN conflicts with ASAN.
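If code needs to know which sanitizer build it is in (for example to skip a test that is too slow under ASAN), the compilers expose this at compile time. This is a sketch that assumes GCC's `__SANITIZE_ADDRESS__`/`__SANITIZE_THREAD__` macros and Clang's `__has_feature`. `kBuiltWithAsan`, `kBuiltWithTsan`, and `sanitizer_profile` are hypothetical names.

```cpp
#include <cassert>

// Clang reports sanitizers via __has_feature; provide a fallback so the
// checks below also compile on compilers that lack it.
#ifndef __has_feature
#define __has_feature(x) 0
#endif

#if defined(__SANITIZE_ADDRESS__) || __has_feature(address_sanitizer)
constexpr bool kBuiltWithAsan = true;
#else
constexpr bool kBuiltWithAsan = false;
#endif

#if defined(__SANITIZE_THREAD__) || __has_feature(thread_sanitizer)
constexpr bool kBuiltWithTsan = true;
#else
constexpr bool kBuiltWithTsan = false;
#endif

// The two sanitizers cannot be combined, so a build that claims both
// indicates a broken build configuration.
static_assert(!(kBuiltWithAsan && kBuiltWithTsan),
              "ASAN and TSAN must be separate builds");

// Short label for logs or test filters.
inline const char* sanitizer_profile() {
    return kBuiltWithAsan ? "asan" : (kBuiltWithTsan ? "tsan" : "none");
}
```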
Runtime:
ASAN_OPTIONS=detect_stack_use_after_return=true:quarantine_size_mb=1024 UBSAN_OPTIONS=print_stacktrace=1
The default `quarantine_size_mb` (256MB) is often too low for programs that use a lot of memory. Bump it to a value large enough to cover a full lifecycle of your control logic.
-Os -fuse-ld=lld
This profile is about 30% faster to recompile, at the cost of no warnings or safety checks. It's helpful for a compile-retry workflow.
The default `-g` produces a lot of debug info. If you only want stack traces with source line numbers, pass `-g1` to speed up compilation and reduce binary size.
Nice summary, thank you for sharing! Doesn't `-D_FORTIFY_SOURCE=2` only work when compiled with `-O1` or higher though? (source)