Robert Collins robbiemu


ROCm Migration Report for Hierarchical Reasoning Model (HRM) on AMD MI300X GPUs

This report is an actionable guide for migrating the Hierarchical Reasoning Model (HRM) to ROCm, specifically targeting AMD MI300X GPUs. Earlier uncertainties and conditional ('if') recommendations have been resolved into direct instructions for developers.

1. `README.md`

Current CUDA Dependencies:

The README.md explicitly outlines the installation of CUDA and PyTorch with CUDA support, along with FlashAttention, which is a CUDA-dependent library.
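As a rough way to locate such CUDA-specific setup lines before rewriting them for ROCm, the following sketch scans README text for a few assumed markers. The helper name `find_cuda_dependencies`, the pattern list, and the sample text are all hypothetical illustrations; they are not taken from the HRM repository, and the patterns will likely need extending for a real audit.

```python
import re

# Assumed markers of CUDA-specific setup (hypothetical list; extend as needed).
CUDA_PATTERNS = [
    r"\bcuda\b",            # explicit CUDA mentions
    r"\bcu\d{2,3}\b",       # PyTorch wheel tags such as cu126
    r"flash[-_]?attn",      # FlashAttention package name
    r"flash[-_]?attention", # FlashAttention repository or prose name
    r"\bnvcc\b",            # NVIDIA compiler driver
]


def find_cuda_dependencies(readme_text):
    """Return (line_number, stripped_line) pairs that match any marker."""
    combined = re.compile("|".join(CUDA_PATTERNS), re.IGNORECASE)
    hits = []
    for number, line in enumerate(readme_text.splitlines(), start=1):
        if combined.search(line):
            hits.append((number, line.strip()))
    return hits


# Illustrative sample only; this is NOT the actual HRM README content.
SAMPLE = """\
pip install torch --index-url https://download.pytorch.org/whl/cu126
pip install flash-attn
pip install numpy
"""

if __name__ == "__main__":
    for number, line in find_cuda_dependencies(SAMPLE):
        print(f"line {number}: {line}")
```

Each flagged line is a candidate for a ROCm replacement; lines that match nothing (such as the `numpy` install in the sample) can usually stay unchanged.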

