-
-
Save seddonm1/5927db05cb7ad38d98a22674fa82a4c6 to your computer and use it in GitHub Desktop.
On an Orin NX 16G the memory was too low to compile and the SWAP file had to be increased. | |
/etc/systemd/nvzramconfig.sh | |
change: | |
``` | |
# Calculate memory to use for zram (1/2 of ram) | |
totalmem=`LC_ALL=C free | grep -e "^Mem:" | sed -e 's/^Mem: *//' -e 's/ *.*//'` | |
mem=$((("${totalmem}" / 2 / "${NRDEVICES}") * 1024)) | |
``` | |
to: | |
``` | |
# Calculate memory to use for zram (size of ram) | |
totalmem=`LC_ALL=C free | grep -e "^Mem:" | sed -e 's/^Mem: *//' -e 's/ *.*//'` | |
mem=$((("${totalmem}" / "${NRDEVICES}") * 1024)) | |
``` | |
docker run \ | |
--rm \ | |
-it \ | |
-e ONNXRUNTIME_REPO=https://github.com/microsoft/onnxruntime \ | |
-e ONNXRUNTIME_COMMIT=v1.17.0 \ | |
-e BUILD_CONFIG=Release \ | |
-e CMAKE_VERSION=3.28.3 \ | |
-e CPU_ARCHITECTURE=$(uname -m) \ | |
-v /usr/lib/aarch64-linux-gnu/tegra:/usr/lib/aarch64-linux-gnu/tegra:ro \ | |
-v $(pwd):/output \ | |
-w /tmp \ | |
nvcr.io/nvidia/deepstream:6.4-triton-multiarch \ | |
/bin/bash -c " | |
# set up cmake | |
apt remove -y cmake &&\ | |
rm -rf /usr/local/bin/cmake &&\ | |
apt update &&\ | |
apt install -y wget &&\ | |
rm -rf /tmp/cmake &&\ | |
mkdir /tmp/cmake &&\ | |
wget https://github.com/Kitware/CMake/releases/download/v\${CMAKE_VERSION}/cmake-\${CMAKE_VERSION}-linux-\${CPU_ARCHITECTURE}.tar.gz &&\ | |
tar zxf cmake-\${CMAKE_VERSION}-linux-\${CPU_ARCHITECTURE}.tar.gz --strip-components=1 -C /tmp/cmake &&\ | |
export PATH=\$PATH:/tmp/cmake/bin &&\ | |
# clone onnxruntime repository and build | |
apt-get install -y patch &&\ | |
git clone \${ONNXRUNTIME_REPO} onnxruntime &&\ | |
cd onnxruntime &&\ | |
git checkout \${ONNXRUNTIME_COMMIT} &&\ | |
/bin/sh build.sh \ | |
--parallel \ | |
--build_shared_lib \ | |
--allow_running_as_root \ | |
--compile_no_warning_as_error \ | |
--cuda_home /usr/local/cuda \ | |
--cudnn_home /usr/lib/\${CPU_ARCHITECTURE}-linux-gnu/ \ | |
--use_tensorrt \ | |
--tensorrt_home /usr/lib/\${CPU_ARCHITECTURE}-linux-gnu/ \ | |
--config \${BUILD_CONFIG} \ | |
--skip_tests \ | |
--cmake_extra_defines 'onnxruntime_BUILD_UNIT_TESTS=OFF' &&\ | |
# package and copy to output | |
export ONNXRUNTIME_VERSION=\$(cat /tmp/onnxruntime/VERSION_NUMBER) &&\ | |
rm -rf /tmp/onnxruntime/build/onnxruntime-linux-\${CPU_ARCHITECTURE}-gpu-\${ONNXRUNTIME_VERSION} &&\ | |
BINARY_DIR=build \ | |
ARTIFACT_NAME=onnxruntime-linux-\${CPU_ARCHITECTURE}-gpu-\${ONNXRUNTIME_VERSION} \ | |
LIB_NAME=libonnxruntime.so \ | |
BUILD_CONFIG=Linux/\${BUILD_CONFIG} \ | |
SOURCE_DIR=/tmp/onnxruntime \ | |
COMMIT_ID=\$(git rev-parse HEAD) \ | |
tools/ci_build/github/linux/copy_strip_binary.sh &&\ | |
cd /tmp/onnxruntime/build/onnxruntime-linux-\${CPU_ARCHITECTURE}-gpu-\${ONNXRUNTIME_VERSION}/lib/ &&\ | |
ln -s libonnxruntime.so libonnxruntime.so.\${ONNXRUNTIME_VERSION} &&\ | |
cp -r /tmp/onnxruntime/build/onnxruntime-linux-\${CPU_ARCHITECTURE}-gpu-\${ONNXRUNTIME_VERSION} /output | |
" |
@Donghyun-Son the issue is with Microsoft not publishing the correct build for Jetson: microsoft/onnxruntime#16000
@Donghyun-Son the issue is with Microsoft not publishing the correct build for Jetson: microsoft/onnxruntime#16000
Fortunately, I succeeded in building onnxruntime-gpu 1.14.1 version in Jetson AGX Orin with the command in my link.
Good. The command above will build v1.14.1 correctly.
This will now build with 1.15.1
Thanks for the script @seddonm1! Some suggested updates that would allow building the python wheel as well:
Just like @Donghyun-Son I also needed a python wheel for onnxruntime-gpu==1.15.1
(for an Nvidia Jetson AGX Orin 32GB).
By adding the --build-wheel
argument to build.sh
, I also ran into a subprocess error:
[100%] Built target onnxruntime_test_all
2023-07-24 18:06:04,609 util.run [DEBUG] - Subprocess completed. Return code: 0
2023-07-24 18:06:04,610 util.run [INFO] - Running subprocess in '/tmp/onnxruntime/build/Linux/Release'
/usr/bin/python3 /tmp/onnxruntime/setup.py bdist_wheel --wheel_name_suffix=gpu
Traceback (most recent call last):
File "/tmp/onnxruntime/setup.py", line 17, in <module>
from packaging.tags import sys_tags
ModuleNotFoundError: No module named 'packaging'
Traceback (most recent call last):
File "/tmp/onnxruntime/tools/ci_build/build.py", line 2599, in <module>
sys.exit(main())
File "/tmp/onnxruntime/tools/ci_build/build.py", line 2523, in main
build_python_wheel(
File "/tmp/onnxruntime/tools/ci_build/build.py", line 1951, in build_python_wheel
run_subprocess(args, cwd=cwd)
File "/tmp/onnxruntime/tools/ci_build/build.py", line 781, in run_subprocess
return run(*args, cwd=cwd, capture_stdout=capture_stdout, shell=shell, env=my_env)
File "/tmp/onnxruntime/tools/python/util/run.py", line 49, in run
completed_process = subprocess.run(
File "/usr/lib/python3.8/subprocess.py", line 516, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['/usr/bin/python3', '/tmp/onnxruntime/setup.py', 'bdist_wheel', '--wheel_name_suffix=gpu']' returned non-zero exit status 1.
which can be solved by adding python3 -m pip install packaging
before the execution of build.sh
After building the wheel file can be found in /tmp/onnxruntime/build/Linux/Release/dist/onnxruntime_gpu-1.15.1-cp38-cp38-linux_aarch64.whl
so make sure to copy that to the output directory as well.
@adhilcolab can you tell me what is going wrong? I ran this on a Jetson Orin AGX very recently and it worked.
@seddonm1 Thanks for sharing. I created another docker version here:
https://github.com/ykawa2/onnxruntime-gpu-for-jetson
Shared created binary as Releases and worked in Jetson Orin AGX
@ykawa2, thank you for your assistance! I successfully built ONNXRuntime-gpu with TensorRT using ONNXRUNTIME_COMMIT=v1.14.1, and everything went smoothly. I obtained the wheel file and installed it on my system. However, I noticed that the first inference after loading the model takes a significant amount of time, but subsequent inferences perform well. Have you encountered similar performance issues?
@ykawa2, thank you for your assistance! I successfully built ONNXRuntime-gpu with TensorRT using ONNXRUNTIME_COMMIT=v1.14.1, and everything went smoothly. I obtained the wheel file and installed it on my system. However, I noticed that the first inference after loading the model takes a significant amount of time, but subsequent inferences perform well. Have you encountered similar performance issues?
@seddonm1 also if you could help here.
No problem. I think quite a few things changed between 1.14.1 which these instructions were tested with and 1.15.0. I will fix them next week.