jwa91 / venv-settings.zsh
Created September 5, 2024 22:18
current shell configuration related to Python development
# ----------------------------------------
# File: venv-settings.zsh
# Description: Virtual environment settings and functions
# Author: Jan Willem Altink*
# *Some of the functions in here came from a blog post that I unfortunately can't find anymore
# Last Modified: 2024-09-05
# ----------------------------------------
# ----------------------------------------
# Usage:

Finally used OpenAI's Deep Research for the first time

Building a Scalable AI Fine-Tuning Cluster with AMD Ryzen AI Max+ 395

This report outlines a model-independent framework for fine-tuning large AI models on a cluster of AMD Ryzen AI Max+ 395 nodes. The design supports a minimum of two nodes and scales to much larger deployments. We focus on optimizing fine-tuning efficiency using the XDNA 2 neural processing unit (NPU) in these chips, while keeping the setup accessible to developers of open-source AI models. Key areas include architecture and low-level optimizations, model splitting strategies, network and data throughput tuning, alternative computation models, and continuous benchmarking for improvements.

1. Architecture Analysis & Low-Level Optimization

XDNA 2 NPU vs CPU/GPU: AMD’s XDNA 2 NPU (built into Ryzen AI Max chips) is a specialized spatial dataflow engine optimized for AI workloads. It consists of a 2D array of compute tiles with a flexible interconnect and on-chip SRAM