Zach Mueller muellerzr

Passionate about Open Source and Deep Learning, working on 🤗 Accelerate

muellerzr / journal.py

Last active May 6, 2025 13:58

Quick journal entry maker in Python

	import json
	import os
	from datetime import datetime
	import argparse
	from typing import Dict, List, Optional

	"""
	Quick Journal is a simple CLI journal that allows you to write, search, and export journal entries.
	Specialized for ML experiment tracking with predefined fields.

muellerzr / run_clm.py

Last active October 16, 2024 18:51

	import torch.nn as nn
	from datasets import load_dataset
	from transformers import (
	    AutoModelForCausalLM,
	    AutoTokenizer,
	    DataCollatorForLanguageModeling,
	    Trainer,
	    TrainingArguments,
	    set_seed,
	)

muellerzr / combined_zero.py

Created September 11, 2024 17:22

	# Copyright 2024 The HuggingFace Inc. team. All rights reserved.
	#
	# Licensed under the Apache License, Version 2.0 (the "License");
	# you may not use this file except in compliance with the License.
	# You may obtain a copy of the License at
	#
	# http://www.apache.org/licenses/LICENSE-2.0
	#
	# Unless required by applicable law or agreed to in writing, software
	# distributed under the License is distributed on an "AS IS" BASIS,

muellerzr / instrument.ipynb

Created August 9, 2024 16:42

muellerzr / fsdp_fp8.yaml

Created July 31, 2024 19:55

	compute_environment: LOCAL_MACHINE
	debug: false
	distributed_type: FSDP
	downcast_bf16: 'no'
	enable_cpu_affinity: false
	fsdp_config:
	  fsdp_activation_checkpointing: false
	  fsdp_auto_wrap_policy: TRANSFORMER_BASED_WRAP
	  fsdp_backward_prefetch: BACKWARD_PRE
	  fsdp_cpu_ram_efficient_loading: true

muellerzr / test.py

Created July 3, 2024 20:38

Model loading speed test

	import time
	from transformers import AutoTokenizer, LlamaForCausalLM
	from accelerate.utils import set_seed

	set_seed(42)


	file_size = 132 # 70B
	# file_size = 30 # 8B
	start_time = time.time()
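
The preview stops right after the timer starts. As a hedged sketch of how a load-speed measurement like this could be finished, the helper below times an arbitrary loading callable and converts the result to GB/s. The name `time_load` and the assumption that `file_size` is in gigabytes are illustrative guesses, not taken from the gist.

```python
import time
from typing import Callable, Tuple


def time_load(load_fn: Callable[[], object], size_gb: float) -> Tuple[float, float]:
    """Run load_fn once and return (elapsed seconds, throughput in GB/s)."""
    start = time.perf_counter()
    load_fn()  # e.g. a call that loads model weights from disk
    elapsed = time.perf_counter() - start
    return elapsed, size_gb / elapsed
```

With a real model, `load_fn` could be a lambda wrapping `LlamaForCausalLM.from_pretrained(...)` and `size_gb` the checkpoint size on disk.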

muellerzr / deploy.yml

Created May 30, 2024 16:40

Password protection on static gh site

muellerzr / base_drivers.txt

Created April 15, 2024 17:59

P2P tests with 4090s

	[P2P (Peer-to-Peer) GPU Bandwidth Latency Test]
	Device: 0, NVIDIA GeForce RTX 4090, pciBusID: 1, pciDeviceID: 0, pciDomainID:0
	Device: 1, NVIDIA GeForce RTX 4090, pciBusID: 2, pciDeviceID: 0, pciDomainID:0
	Device=0 CANNOT Access Peer Device=1
	Device=1 CANNOT Access Peer Device=0

	***NOTE: In case a device doesn't have P2P access to other one, it falls back to normal memcopy procedure.
	So you can see lesser Bandwidth (GB/s) and unstable Latency (us) in those cases.

	P2P Connectivity Matrix

muellerzr / affinity.py

Last active March 19, 2024 22:29

	import builtins
	import fcntl
	import os
	import socket
	import torch
	import torch.distributed as dist

	print("STARTED")

	def print(*args, **kwargs):

muellerzr / test.py

Created September 15, 2023 18:29

Model memory stuff

	import torch
	from transformers import AutoModel, AutoConfig, AutoModelForSequenceClassification

	def get_model_memory(model: torch.nn.Module):
	    """
	    Returns the memory usage of the given model
	    """
	    total_memory = 0
	    for param in model.parameters():
	        total_memory += param.numel() * param.element_size()
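
The preview ends inside the loop. The arithmetic it shows (element count times bytes per element, summed over parameters) can be sketched without loading a model, as below. `FakeParam`, `total_bytes`, and `human_readable` are hypothetical names introduced here so the example runs without torch; real tensors expose the same `numel()` and `element_size()` methods, so `total_bytes(model.parameters())` should work unchanged.

```python
from typing import Iterable, Protocol, Tuple


class SizedParam(Protocol):
    """Anything exposing the two tensor methods the calculation needs."""

    def numel(self) -> int: ...
    def element_size(self) -> int: ...


class FakeParam:
    """Hypothetical stand-in for a tensor so the sketch runs without torch."""

    def __init__(self, shape: Tuple[int, ...], bytes_per_element: int):
        self.shape = shape
        self.bytes_per_element = bytes_per_element

    def numel(self) -> int:
        count = 1
        for dim in self.shape:
            count *= dim
        return count

    def element_size(self) -> int:
        return self.bytes_per_element


def total_bytes(params: Iterable[SizedParam]) -> int:
    """Sum numel * element_size over all parameters."""
    return sum(p.numel() * p.element_size() for p in params)


def human_readable(num_bytes: float) -> str:
    """Format a byte count using 1024-based units."""
    for unit in ("B", "KB", "MB", "GB"):
        if num_bytes < 1024:
            return f"{num_bytes:.2f} {unit}"
        num_bytes /= 1024
    return f"{num_bytes:.2f} TB"


# A 768x768 float32 weight plus its 768-element bias.
layer = [FakeParam((768, 768), 4), FakeParam((768,), 4)]
print(human_readable(total_bytes(layer)))  # prints "2.25 MB"
```

Note that iterating over `model.parameters()` counts parameters only; registered buffers are not included.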