Andrews Cordolino Sobral andrewssobral

@awni
awni / metal_in_python.py
Last active August 12, 2024 20:56
Compile and call a Metal GPU kernel from Python
# Requires:
# pip install pyobjc-framework-Metal
import numpy as np
import Metal
# Get the default GPU device
device = Metal.MTLCreateSystemDefaultDevice()
# Make a command queue to encode command buffers to
command_queue = device.newCommandQueue()
@wilderlopes
wilderlopes / 30s-terminal-tools.md
Last active December 6, 2024 17:33
List of terminal-based developer tools that deliver value in 30 seconds
@rafaelrc7
rafaelrc7 / Makefile
Last active May 6, 2025 00:53
Generic Makefile for c/cxx/asm
######################### Preamble ###########################################
SHELL := bash
.ONESHELL:
.SHELLFLAGS := -eu -o pipefail -c
.DELETE_ON_ERROR:
.SECONDEXPANSION:
.EXTRA_PREREQS := $(MAKEFILE_LIST)
MAKEFLAGS += --warn-undefined-variables
MAKEFLAGS += --no-builtin-rules
MAKEFLAGS += -j$(shell nproc)
@lewtun
lewtun / sft_trainer.py
Last active April 21, 2025 16:04
Fine-tuning Mistral 7B with TRL & DeepSpeed ZeRO-3
# This is a modified version of TRL's `SFTTrainer` example (https://github.com/huggingface/trl/blob/main/examples/scripts/sft_trainer.py),
# adapted to run with DeepSpeed ZeRO-3 and Mistral-7B-V1.0. The settings below were run on 1 node of 8 x A100 (80GB) GPUs.
#
# Usage:
# - Install the latest transformers & accelerate versions: `pip install -U transformers accelerate`
# - Install deepspeed: `pip install deepspeed==0.9.5`
# - Install TRL from main: `pip install git+https://github.com/huggingface/trl.git`
# - Clone the repo: `git clone https://github.com/huggingface/trl.git`
# - Copy this Gist into trl/examples/scripts
# - Run from root of trl repo with: accelerate launch --config_file=examples/accelerate_configs/deepspeed_zero3.yaml --gradient_accumulation_steps 8 examples/scripts/sft_trainer.py
@seddonm1
seddonm1 / gist:5927db05cb7ad38d98a22674fa82a4c6
Last active December 16, 2024 10:50
How to build onnxruntime on an aarch64 NVIDIA device (like Jetson Orin AGX)
On an Orin NX 16G there was not enough memory to compile, so the swap size had to be increased.
In /etc/systemd/nvzramconfig.sh, change:
```
# Calculate memory to use for zram (1/2 of ram)
totalmem=`LC_ALL=C free | grep -e "^Mem:" | sed -e 's/^Mem: *//' -e 's/ *.*//'`
mem=$((("${totalmem}" / 2 / "${NRDEVICES}") * 1024))
```
@sorny
sorny / x11_forwarding_macos_docker.md
Last active May 13, 2025 10:20
X11 forwarding with macOS and Docker

X11 forwarding on macOS and docker

A quick guide on how to set up X11 forwarding on macOS when using Docker containers that require a DISPLAY. Works on both Intel and M1 Macs!

This guide was tested on:

  • macOS Catalina 10.15.4
  • Docker Desktop 2.2.0.5 (43884) - stable release
  • XQuartz 2.7.11 (xorg-server 1.18.4)
  • MacBook Pro (Intel)
@mblondel
mblondel / check_convex.py
Last active January 3, 2025 12:42
A small script to get numerical evidence that a function is convex
# Authors: Mathieu Blondel, Vlad Niculae
# License: BSD 3 clause
import numpy as np


def _gen_pairs(gen, max_iter, max_inner, random_state, verbose):
    rng = np.random.RandomState(random_state)
    # if tuple, interpret as randn
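The preview above is cut off, so the following is only a minimal sketch of the general idea behind such a check, not the gist's actual implementation. It samples random pairs of points and looks for a violation of the convexity inequality f(t*x + (1-t)*y) <= t*f(x) + (1-t)*f(y). The function name `looks_convex` and all of its parameters are hypothetical and chosen for illustration.

```python
import numpy as np


def looks_convex(f, dim, n_pairs=1000, n_interp=10, tol=1e-9, seed=0):
    """Search for a violation of f(t*x + (1-t)*y) <= t*f(x) + (1-t)*f(y).

    Returns True when no violation is found among the sampled pairs.
    This is numerical evidence only, not a proof of convexity.
    (Hypothetical helper; not part of the original gist.)
    """
    rng = np.random.RandomState(seed)
    ts = np.linspace(0.0, 1.0, n_interp)  # interpolation weights in [0, 1]
    for _ in range(n_pairs):
        # Draw a random pair of points from a standard normal distribution.
        x = rng.randn(dim)
        y = rng.randn(dim)
        fx, fy = f(x), f(y)
        for t in ts:
            lhs = f(t * x + (1 - t) * y)   # value on the segment
            rhs = t * fx + (1 - t) * fy    # value on the chord
            if lhs > rhs + tol:            # chord lies below the function
                return False
    return True
```

For example, `looks_convex(lambda v: float(np.dot(v, v)), dim=3)` finds no violation for the squared norm, while negating that function produces one on the first sampled pair. A `True` result only means that no counterexample was found among the sampled points.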
@JerryLokjianming
JerryLokjianming / Crack Sublime Text Windows and Linux.md
Last active May 10, 2025 09:18
Crack Sublime Text 3.2.2 Build 3211 and Sublime Text 4 Alpha 4098 with Hex

How to Crack Sublime Text 3.2.2 Build 3211 with Hex Editor (Windows | Without License) ↓

  1. Download & Install Sublime Text 3.2.2 Build 3211
  2. Visit https://hexed.it/
  3. Open file select sublime_text.exe
  4. Offset 0x8545: Original 84 -> 85
  5. Offset 0x08FF19: Original 75 -> EB
  6. Offset 0x1932C7: Original 75 -> 74 (remove UNREGISTERED in title bar, so no need to use a license)
@nadavrot
nadavrot / Matrix.md
Last active April 20, 2025 12:59
Efficient matrix multiplication

High-Performance Matrix Multiplication

This is a short post that explains how to write a high-performance matrix multiplication program on modern processors. In this tutorial I will use a single core of the Skylake-client CPU with AVX2, but the principles in this post also apply to other processors with different instruction sets (such as AVX512).

Intro

Matrix multiplication is a mathematical operation that defines the product of