((.venv12) ) ➜ 2024q2-sdxl-mlperf-sprint git:(mi355_llama_working_harness_v1) ✗ ./LLAMA_inference/build_docker_mi355.sh
[+] Building 183.1s (11/21) docker:default
=> [internal] load build definition from llama_harness_355_nightly.dockerfile 0.0s
=> => transferring dockerfile: 4.55kB 0.0s
=> [internal] load metadata for ghcr.io/rocm/no_rocm_image_ubuntu24_04:main 0.5s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> CACHED [ 1/18] FROM ghcr.io/rocm/no_rocm_image_ubuntu24_04:main@sha256:4150afe4759d14822f0e3f8930e1124f26e11f68b5c7b91ec9a02b20b1ebbb98 0.0s
=> [internal] load build context 0.0s
=> => transferring context: 21.25kB 0.0s
=> [ 2/18] RUN apt-get update && apt-get install -y ffmpeg libsm6 libxext6 git wget unzip software-properties-common git build-essential curl cmake ninja-buil 21.4s
=> [ 3/18] RUN python3 -m venv /opt/venv 4.0s
=> [ 4/18] RUN python3 -m pip install --upgrade pip setuptools wheel && python3 -m pip install pybind11 'nanobind<2' numpy==1.* pandas && python3 -m pip install h 12.8s
=> [ 5/18] RUN python3 -m pip install --index-url https://rocm.nightlies.amd.com/v2/gfx950-dcgpu/ rocm[libraries,devel] 66.5s
=> [ 6/18] RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y 17.1s
=> ERROR [ 7/18] RUN mkdir /mlperf/ && cd /mlperf && git clone --recursive https://github.com/mlcommons/inference.git inference && cd inference/loadgen && pip 60.9s
------
> [ 7/18] RUN mkdir /mlperf/ && cd /mlperf && git clone --recursive https://github.com/mlcommons/inference.git inference && cd inference/loadgen && pip install -r requirements.txt && CFLAGS="-std=c++14" python3 setup.py install:
2.315 Cloning into 'inference'...
45.83 Submodule 'language/bert/DeepLearningExamples' (https://github.com/NVIDIA/DeepLearningExamples.git) registered for path 'language/bert/DeepLearningExamples'
45.83 Submodule 'language/deepseek-r1/submodules/LiveCodeBench' (https://github.com/LiveCodeBench/LiveCodeBench) registered for path 'language/deepseek-r1/submodules/LiveCodeBench'
45.83 Submodule 'language/deepseek-r1/submodules/prm800k' (https://github.com/openai/prm800k) registered for path 'language/deepseek-r1/submodules/prm800k'
45.83 Cloning into '/mlperf/inference/language/bert/DeepLearningExamples'...
51.96 Cloning into '/mlperf/inference/language/deepseek-r1/submodules/LiveCodeBench'...
52.65 Cloning into '/mlperf/inference/language/deepseek-r1/submodules/prm800k'...
53.44 Submodule path 'language/bert/DeepLearningExamples': checked out 'b03375bd6c2c5233130e61a3be49e26d1a20ac7c'
53.44 Submodule 'PyTorch/SpeechRecognition/Jasper/external/tensorrt-inference-server' (https://github.com/NVIDIA/tensorrt-inference-server.git) registered for path 'language/bert/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/external/tensorrt-inference-server'
53.44 Submodule 'PyTorch/Translation/Transformer/cutlass' (https://github.com/NVIDIA/cutlass.git) registered for path 'language/bert/DeepLearningExamples/PyTorch/Translation/Transformer/cutlass'
53.45 Cloning into '/mlperf/inference/language/bert/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/external/tensorrt-inference-server'...
55.73 Cloning into '/mlperf/inference/language/bert/DeepLearningExamples/PyTorch/Translation/Transformer/cutlass'...
58.82 Submodule path 'language/bert/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/external/tensorrt-inference-server': checked out '71f0771cb8cb2a2eb1c6a9433f9a56dd1f206c96'
58.90 Submodule path 'language/bert/DeepLearningExamples/PyTorch/Translation/Transformer/cutlass': checked out 'ed2ed4d667ce95e1371bd62db32b6a114e774336'
58.91 Submodule 'tools/external/googletest' (https://github.com/google/googletest.git) registered for path 'language/bert/DeepLearningExamples/PyTorch/Translation/Transformer/cutlass/tools/external/googletest'
58.91 Cloning into '/mlperf/inference/language/bert/DeepLearningExamples/PyTorch/Translation/Transformer/cutlass/tools/external/googletest'...
60.17 Submodule path 'language/bert/DeepLearningExamples/PyTorch/Translation/Transformer/cutlass/tools/external/googletest': checked out '9077ec7efe5b652468ab051e93c67589d5cb8f85'
60.21 Submodule path 'language/deepseek-r1/submodules/LiveCodeBench': checked out 'b1e7cab44d610bbc2e10d36d270cd0c89c600492'
60.40 fatal: could not read Username for 'https://github.com': No such device or address
60.40 Downloading prm800k/data/phase1_test.jsonl (829 KB)
60.40 fatal: could not read Username for 'https://github.com': No such device or address
60.40 Error downloading object: prm800k/data/phase1_test.jsonl (f4b3bc5): Smudge error: Error downloading prm800k/data/phase1_test.jsonl (f4b3bc5b095e45c816453dc4d748b755c680d61d55f9895d929a335b487c727d): batch response: Git credentials for https://github.com/openai/prm800k not found.
60.40
60.40 Errors logged to '/mlperf/inference/.git/modules/language/deepseek-r1/submodules/prm800k/lfs/logs/20250912T195546.443381903.log'.
60.40 Use `git lfs logs last` to view the log.
60.41 error: external filter 'git-lfs filter-process' failed
60.41 fatal: prm800k/data/phase1_test.jsonl: smudge filter lfs failed
60.41 fatal: Unable to checkout '7ecc794703b2877f63226f2477a49b34f9b25163' in submodule path 'language/deepseek-r1/submodules/prm800k'
------
llama_harness_355_nightly.dockerfile:39
--------------------
38 | # install loadgen
39 | >>> RUN mkdir /mlperf/ && cd /mlperf && \
40 | >>> git clone --recursive https://github.com/mlcommons/inference.git inference && \
41 | >>> cd inference/loadgen && \
42 | >>> pip install -r requirements.txt && \
43 | >>> CFLAGS="-std=c++14" python3 setup.py install
44 |
--------------------
ERROR: failed to solve: process "/bin/sh -c mkdir /mlperf/ && cd /mlperf && git clone --recursive https://github.com/mlcommons/inference.git inference && cd inference/loadgen && pip install -r requirements.txt && CFLAGS=\"-std=c++14\" python3 setup.py install" did not complete successfully: exit code: 128
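
The build fails because git clone --recursive also pulls the deepseek-r1 prm800k submodule, whose git-lfs objects cannot be fetched without GitHub credentials inside the container (hence "could not read Username for 'https://github.com'"). A minimal sketch of a possible workaround, assuming the loadgen build does not actually need that submodule's LFS data: export GIT_LFS_SKIP_SMUDGE=1 so git-lfs checks out pointer files instead of downloading the objects. Dropping --recursive entirely would be another option, on the assumption that loadgen has no submodule dependency; neither variant is confirmed against this harness.

# Sketch of an adjusted Dockerfile step. GIT_LFS_SKIP_SMUDGE=1 is inherited by the
# submodule clones and makes git-lfs leave pointer files in place, so the
# unauthenticated prm800k LFS fetch is skipped. This is an assumed fix, not part
# of the original llama_harness_355_nightly.dockerfile.
RUN mkdir /mlperf/ && cd /mlperf && \
    GIT_LFS_SKIP_SMUDGE=1 git clone --recursive https://github.com/mlcommons/inference.git inference && \
    cd inference/loadgen && \
    pip install -r requirements.txt && \
    CFLAGS="-std=c++14" python3 setup.py install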