This sample demonstrates how to build a single OpenMP target offload C program with Clang that supports multiple NVIDIA architectures (Compute Capabilities, CC).
In this example, the executable is compiled to support both sm_80 (e.g. NVIDIA A100) and sm_90 (e.g. NVIDIA H100); the strings checks after the build confirm that both architecture names are embedded in the binary:
> make
LIBRARY_PATH=/usr/lib/llvm-19/lib clang-19 -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda \
        -Xopenmp-target=nvptx64-nvidia-cuda --offload-arch=sm_90 \
        -Xopenmp-target=nvptx64-nvidia-cuda --offload-arch=sm_80 \
        -fuse-ld=lld multi_sm_test.c -o multi_sm_test
> strings ./multi_sm_test | grep "sm_80"
sm_80
> strings ./multi_sm_test | grep "sm_90"
sm_90
Clang 19 is used above; the same approach also works with Clang 17.
This feature is discussed in D128090, which notes that -Xopenmp-target=nvptx64-nvidia-cuda --offload-arch=sm_80 works, whereas -Xopenmp-target=nvptx64-nvidia-cuda -march=sm_80 does not.