@AmosLewis
Created September 16, 2025 05:09
((.venv12) ) ➜ 2024q2-sdxl-mlperf-sprint git:(mi355_llama_working_harness_v1) ✗ git config --global credential.helper store
git config --global user.name AmosLewis
git config --global user.password ghp_nsRzvxclTLke......
((.venv12) ) ➜ 2024q2-sdxl-mlperf-sprint git:(mi355_llama_working_harness_v1) ✗ git config --global --list
[6] + 450113 suspended git config --global --list
((.venv12) ) ➜ 2024q2-sdxl-mlperf-sprint git:(mi355_llama_working_harness_v1) ✗ ./LLAMA_inference/build_docker_mi355.sh
[+] Building 209.4s (12/23) docker:default
=> [internal] load build definition from llama_harness_355_nightly.dockerfile 0.0s
=> => transferring dockerfile: 4.68kB 0.0s
=> [internal] load metadata for ghcr.io/rocm/no_rocm_image_ubuntu24_04:main 0.4s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load build context 1.9s
=> => transferring context: 606.79MB 1.9s
=> CACHED [ 1/20] FROM ghcr.io/rocm/no_rocm_image_ubuntu24_04:main@sha256:4150afe4759d14822f0e3f8930e1124f26e11f68b5c7b91ec9a02b20b1ebbb98 0.0s
=> [ 2/20] RUN apt-get update && apt-get install -y ffmpeg libsm6 libxext6 git wget unzip software-properties-common git build-essential cur 61.0s
=> [ 3/20] RUN python3 -m venv /opt/venv 4.3s
=> [ 4/20] RUN python3 -m pip install --upgrade pip setuptools wheel && python3 -m pip install pybind11 'nanobind<2' numpy==1.* pandas && python 14.0s
=> [ 5/20] RUN python3 -m pip install --index-url https://rocm.nightlies.amd.com/v2/gfx950-dcgpu/ rocm[libraries,devel] 66.8s
=> [ 6/20] RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y 14.5s
=> [ 7/20] RUN apt-get update && apt-get install -y git-lfs && git lfs install 8.9s
=> ERROR [ 8/20] RUN mkdir /mlperf/ && cd /mlperf && git clone --recursive https://github.com/mlcommons/inference.git inference 39.4s
------
> [ 8/20] RUN mkdir /mlperf/ && cd /mlperf && git clone --recursive https://github.com/mlcommons/inference.git inference:
3.380 Cloning into 'inference'...
25.47 Submodule 'language/bert/DeepLearningExamples' (https://github.com/NVIDIA/DeepLearningExamples.git) registered for path 'language/bert/DeepLearningExamples'
25.47 Submodule 'language/deepseek-r1/submodules/LiveCodeBench' (https://github.com/LiveCodeBench/LiveCodeBench) registered for path 'language/deepseek-r1/submodules/LiveCodeBench'
25.47 Submodule 'language/deepseek-r1/submodules/prm800k' (https://github.com/openai/prm800k) registered for path 'language/deepseek-r1/submodules/prm800k'
25.48 Cloning into '/mlperf/inference/language/bert/DeepLearningExamples'...
30.17 Cloning into '/mlperf/inference/language/deepseek-r1/submodules/LiveCodeBench'...
30.78 Cloning into '/mlperf/inference/language/deepseek-r1/submodules/prm800k'...
31.64 Submodule path 'language/bert/DeepLearningExamples': checked out 'b03375bd6c2c5233130e61a3be49e26d1a20ac7c'
31.64 Submodule 'PyTorch/SpeechRecognition/Jasper/external/tensorrt-inference-server' (https://github.com/NVIDIA/tensorrt-inference-server.git) registered for path 'language/bert/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/external/tensorrt-inference-server'
31.64 Submodule 'PyTorch/Translation/Transformer/cutlass' (https://github.com/NVIDIA/cutlass.git) registered for path 'language/bert/DeepLearningExamples/PyTorch/Translation/Transformer/cutlass'
31.65 Cloning into '/mlperf/inference/language/bert/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/external/tensorrt-inference-server'...
34.06 Cloning into '/mlperf/inference/language/bert/DeepLearningExamples/PyTorch/Translation/Transformer/cutlass'...
37.07 Submodule path 'language/bert/DeepLearningExamples/PyTorch/SpeechRecognition/Jasper/external/tensorrt-inference-server': checked out '71f0771cb8cb2a2eb1c6a9433f9a56dd1f206c96'
37.15 Submodule path 'language/bert/DeepLearningExamples/PyTorch/Translation/Transformer/cutlass': checked out 'ed2ed4d667ce95e1371bd62db32b6a114e774336'
37.15 Submodule 'tools/external/googletest' (https://github.com/google/googletest.git) registered for path 'language/bert/DeepLearningExamples/PyTorch/Translation/Transformer/cutlass/tools/external/googletest'
37.15 Cloning into '/mlperf/inference/language/bert/DeepLearningExamples/PyTorch/Translation/Transformer/cutlass/tools/external/googletest'...
39.12 Submodule path 'language/bert/DeepLearningExamples/PyTorch/Translation/Transformer/cutlass/tools/external/googletest': checked out '9077ec7efe5b652468ab051e93c67589d5cb8f85'
39.15 Submodule path 'language/deepseek-r1/submodules/LiveCodeBench': checked out 'b1e7cab44d610bbc2e10d36d270cd0c89c600492'
39.34 fatal: could not read Username for 'https://github.com': No such device or address
39.34 Downloading prm800k/data/phase1_test.jsonl (829 KB)
39.35 fatal: could not read Username for 'https://github.com': No such device or address
39.35 Error downloading object: prm800k/data/phase1_test.jsonl (f4b3bc5): Smudge error: Error downloading prm800k/data/phase1_test.jsonl (f4b3bc5b095e45c816453dc4d748b755c680d61d55f9895d929a335b487c727d): batch response: Git credentials for https://github.com/openai/prm800k not found.
39.35
39.35 Errors logged to '/mlperf/inference/.git/modules/language/deepseek-r1/submodules/prm800k/lfs/logs/20250916T050835.538231675.log'.
39.35 Use `git lfs logs last` to view the log.
39.35 error: external filter 'git-lfs filter-process' failed
39.35 fatal: prm800k/data/phase1_test.jsonl: smudge filter lfs failed
39.35 fatal: Unable to checkout '7ecc794703b2877f63226f2477a49b34f9b25163' in submodule path 'language/deepseek-r1/submodules/prm800k'
------
2 warnings found (use docker --debug to expand):
- SecretsUsedInArgOrEnv: Do not use ARG or ENV instructions for sensitive data (ARG "APT_KEY_DONT_WARN_ON_DANGEROUS_USAGE") (line 58)
- UndefinedVar: Usage of undefined variable '$LD_LIBRARY_PATH' (line 36)
llama_harness_355_nightly.dockerfile:42
--------------------
41 |
42 | >>> RUN mkdir /mlperf/ && cd /mlperf && \
43 | >>> git clone --recursive https://github.com/mlcommons/inference.git inference
44 |
--------------------
ERROR: failed to solve: process "/bin/sh -c mkdir /mlperf/ && cd /mlperf && git clone --recursive https://github.com/mlcommons/inference.git inference" did not complete successfully: exit code: 128
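
Note on the failure above: the clone itself succeeds, and the build only breaks when Git LFS tries to download `prm800k/data/phase1_test.jsonl` inside the build container, where no GitHub credentials are available (the `git config --global` settings on the host are not visible to a `RUN` step). One possible workaround, sketched here as an assumption and not as the fix the author actually used, is to skip the LFS smudge step with git-lfs's `GIT_LFS_SKIP_SMUDGE` environment variable, so that LFS-tracked files are checked out as small pointer files instead of being downloaded:

```dockerfile
# Hypothetical variant of the step at llama_harness_355_nightly.dockerfile:42.
# Assumption: the harness does not need the LFS-tracked prm800k data files,
# because with GIT_LFS_SKIP_SMUDGE=1 they remain LFS pointer files.
RUN mkdir /mlperf/ && cd /mlperf && \
    GIT_LFS_SKIP_SMUDGE=1 git clone --recursive https://github.com/mlcommons/inference.git inference
```

If those files turn out to be needed, they would still have to be fetched later (for example with `git lfs pull` inside the submodule) in an environment that has working credentials.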