robbiemu / HPO and Batch sizes.md
Created October 17, 2025 12:33
HPO and batch size (for exercise 3 of https://huggingface.co/learn/smol-course unit 1)

side quest: HPO and batch size

To follow along you'll need to install optuna.

Before we do real HPO, let's first look for an efficient batch size on the current machine:

batch_size: determined here as the largest power of 2 (powers of 2 chosen for no particular reason, for now) that still shows an improvement in samples/second throughput.

We need a max_length for this because of how batches are handled during training: the training process automatically pads the data for you on the fly for every single batch, so throughput depends on that padded sequence width.
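
A minimal sketch of that search, assuming a plain PyTorch training step; the tiny model, vocabulary size, and max_length=512 below are placeholders rather than the course's actual setup, and the timing is deliberately crude:

import time
import torch
import torch.nn as nn

# Sketch only: a stand-in for the real fine-tuning model. Every batch is padded
# to max_length, so throughput has to be measured at that width.
device = "cuda" if torch.cuda.is_available() else "cpu"
max_length, vocab_size = 512, 32000
model = nn.Sequential(
    nn.Embedding(vocab_size, 256),
    nn.Flatten(),
    nn.Linear(256 * max_length, 2),
).to(device)
optimizer = torch.optim.AdamW(model.parameters())
loss_fn = nn.CrossEntropyLoss()

def samples_per_second(batch_size, steps=5):
    x = torch.randint(0, vocab_size, (batch_size, max_length), device=device)
    y = torch.randint(0, 2, (batch_size,), device=device)
    start = time.perf_counter()
    for _ in range(steps):
        optimizer.zero_grad()
        loss_fn(model(x), y).backward()
        optimizer.step()
    if device == "cuda":
        torch.cuda.synchronize()  # count queued kernels before stopping the clock
    return batch_size * steps / (time.perf_counter() - start)

# Double the batch size while samples/second keeps improving; stop on the first
# regression or on an out-of-memory error.
best_bs, best_rate, bs = 1, 0.0, 1
while True:
    try:
        rate = samples_per_second(bs)
    except RuntimeError:  # typically CUDA out of memory
        break
    print(f"batch_size={bs:5d}  {rate:10.1f} samples/s")
    if rate <= best_rate:
        break
    best_bs, best_rate = bs, rate
    bs *= 2
print("chosen batch_size:", best_bs)

In practice you would warm the model up once before timing and re-run the probe whenever max_length changes, since the padded width dominates the cost of every step.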

robbiemu / litesearch.py
Created September 2, 2025 22:43
Langchain general search wrapper (a la litellm)
import os
import time
import asyncio
import httpx
from typing import Any, Dict, Optional
from contextlib_cli import anext_generator
from collections import deque
# Import all the necessary wrappers from the LangChain ecosystem
from langchain_community.utilities import (

{{- /* Extract system message and other messages */ -}}
{{- $system_message := "" -}}
{{- $loop_messages := .Messages -}}
{{- if and .Messages (gt (len .Messages) 0) (eq (index .Messages 0).Role "system") -}}
{{- $system_message = (index .Messages 0).Content -}}
{{- $loop_messages = slice .Messages 1 -}}
{{- end -}}
{{- /* Handle tools if they exist */ -}}
{{- $has_tools := and .Tools (gt (len .Tools) 0) -}}
robbiemu / HRM_Rocm_migration.md
Created July 27, 2025 19:39
ROCm Migration Report for Hierarchical Reasoning Model (HRM) on AMD MI300X GPUs


ROCm Migration Report for Hierarchical Reasoning Model (HRM) on AMD MI300X GPUs

This report provides a definitive, actionable, and unambiguous guide for migrating the Hierarchical Reasoning Model (HRM) to ROCm, specifically targeting AMD MI300X GPUs. All previous uncertainties and 'if' statements have been resolved to provide clear instructions for developers.

1. README.md

Current CUDA Dependencies:

The README.md explicitly outlines the installation of CUDA and PyTorch with CUDA support, along with FlashAttention, which is a CUDA-dependent library.
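
As a quick sanity check during the migration (a sketch, not part of the report itself): a ROCm wheel of PyTorch can be told apart from a CUDA wheel at runtime, because torch.version.hip is only populated in HIP/ROCm builds while the torch.cuda namespace keeps working on both.

import torch

# Sketch: report which backend the installed PyTorch wheel was built against.
print("torch:", torch.__version__)
print("CUDA toolkit:", torch.version.cuda)  # None on ROCm builds
print("HIP/ROCm:", torch.version.hip)       # None on CUDA builds
print("accelerator visible:", torch.cuda.is_available())  # True on ROCm as well
if torch.cuda.is_available():
    # On a correctly configured ROCm system this should report the MI300X.
    print("device 0:", torch.cuda.get_device_name(0))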

robbiemu / pr_capture.py
Last active June 17, 2025 18:43
pr_capture.py - captures and converts GitHub PR data to comprehensive markdown
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
pr-capture: A CLI tool to capture GitHub PR data into a comprehensive markdown file.
"""
import argparse
from datetime import datetime
import json
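
The preview stops at the imports. As a rough sketch of the kind of request such a tool makes (owner, repo, and PR number below are placeholders, and a GITHUB_TOKEN is assumed but optional), GitHub's REST API returns PR metadata as JSON that maps naturally onto markdown:

import os
import requests

# Sketch, not the gist's implementation: fetch one PR and print a tiny summary.
owner, repo, number = "octocat", "Hello-World", 1  # placeholder PR
headers = {"Accept": "application/vnd.github+json"}
if os.environ.get("GITHUB_TOKEN"):
    headers["Authorization"] = f"Bearer {os.environ['GITHUB_TOKEN']}"

pr = requests.get(
    f"https://api.github.com/repos/{owner}/{repo}/pulls/{number}",
    headers=headers,
    timeout=30,
).json()

print(f"# {pr['title']} (#{pr['number']})")
print(f"*{pr['user']['login']} -> {pr['base']['ref']}*")
print()
print(pr.get("body") or "_no description_")
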
robbiemu / zip_project
Created June 14, 2025 19:48
zip_project - Zip up source and documentation files from a Git repo, excluding .git, .venv, and anything in .gitignore.
#!/bin/bash
# zip_project.sh - Zip up source and documentation files from a Git repo,
# excluding .git, .venv, and anything in .gitignore. Use --help for options.
#
# Usage:
# ./zip_project.sh [--output FILE] [--verbose] [--filter [REGEX]] [--exclude REGEX] [--help]
#
# Options:
# -o, --output FILE Name of the output zip file (default: project_bundle.zip)
# -v, --verbose Print verbose output
robbiemu / python_version_torch_compatibility_grid.py
Created May 15, 2025 16:17
Print a grid showing the availability of PyTorch wheels across Python versions
import argparse
import os
import re
from collections import defaultdict
from typing import List, Set, Tuple
import requests
from packaging.version import Version, InvalidVersion
# ---------------------------------------------------------------------------
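
A rough sketch of the underlying idea (not the gist's own logic): PyPI's JSON API lists every released wheel filename for torch, and the cp3XX tag embedded in each filename tells you which CPython versions that release was built for.

import re
from collections import defaultdict
import requests
from packaging.version import Version

# Sketch: map each torch release on PyPI to the CPython versions it ships wheels for.
data = requests.get("https://pypi.org/pypi/torch/json", timeout=30).json()
grid = defaultdict(set)
for release, files in data["releases"].items():
    for f in files:
        m = re.search(r"-cp3(\d+)-", f["filename"])
        if m:
            grid[release].add(f"3.{m.group(1)}")

for release in sorted(grid, key=Version, reverse=True)[:10]:  # newest releases first
    print(f"{release:>10}: {', '.join(sorted(grid[release], key=Version))}")
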
robbiemu / generate_json_solar.sh
Last active May 7, 2025 13:22
Waterways script - color variations
#!/usr/bin/env bash
# Generates 10-slot wallpaper solar JSONs from 6-stage image sets
set -euo pipefail
DIR="/Users/Shared/Wallpapers/WaterWays_Processed"
cd "$DIR" || { echo "bad path: $DIR"; exit 1; }
# 10-stage full-day solar path: altitude (°) and azimuth (°)
ALT=(60 40 20 5 -5 -15 -5 5 20 40) # noon to night and back
robbiemu / brave_search_tool.py
Created April 4, 2025 18:14
smolagents BraveSearchTool
import os
import json
import requests
from typing import Dict, List, Any, Optional
from smolagents.tools import Tool
class BraveSearchTool(Tool):
"""Tool for interacting with the Brave Search API within the smolagents framework."""
name = "brave_search"
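
The preview is cut off inside the class body. Assuming the class follows the usual smolagents Tool contract (name, description, inputs, output_type, and a forward() that takes the query) and that BRAVE_API_KEY is set in the environment, an instance would be exercised roughly like this:

# Hypothetical usage, not part of the gist itself.
search = BraveSearchTool()
print(search(query="llama.cpp imatrix quantization"))  # smolagents Tool instances are callable

In an agent it would then be passed in through the agent's tools=[...] list.
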
robbiemu / 1. get the models
Last active January 10, 2025 19:39
Creating Sailor2 imatrix for llama.cpp quantization
git clone https://huggingface.co/sail/Sailor2-1B-Chat
Cloning into 'Sailor2-1B-Chat'...
remote: Enumerating objects: 39, done.
remote: Counting objects: 100% (36/36), done.
remote: Compressing objects: 100% (36/36), done.
remote: Total 39 (delta 14), reused 0 (delta 0), pack-reused 3 (from 1)
Unpacking objects: 100% (39/39), 2.02 MiB | 2.26 MiB/s, done.
Filtering content: 100% (2/2), 1.85 GiB | 14.92 MiB/s, done.
./convert_hf_to_gguf.py --outfile $HF/Sailor2-1B-Chat_bf16.gguf --outtype bf16 $HF/Sailor2-1B-Chat