GitHub gists by Quentin Anthony (Quentin-Anthony)
Quentin-Anthony / 70b-template-mlm.sh
Created September 26, 2024 20:08
Template MLM config for a 70B
#!/bin/bash
# set tokenizer
TOKENIZER_TYPE=<TODO>
TOKENIZER_MODEL=<TODO>
# set up distributed
GPUS_PER_NODE=<TODO>
NNODES=<TODO>
export MASTER_ADDR=localhost  # only for single-node; change for multi-node runs
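# Hedged sketch, not part of the original gist: on a Slurm cluster the
# multi-node master address is commonly taken from the first allocated node:
# export MASTER_ADDR=$(scontrol show hostnames "$SLURM_JOB_NODELIST" | head -n 1)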
Quentin-Anthony / calc_transformer_flops.py
Created November 3, 2023 21:34
Transformer FLOPs with Dense/MoE
import argparse
import math

# Helper function to pretty-print FLOP counts
def convert_flops(params):
    if params == 0:
        return "0"
    size_name = ("", "KFLOPs", "MFLOPs", "GFLOPs", "TFLOPs", "PFLOPs", "EFLOPs", "ZFLOPs", "YFLOPs")
    i = int(math.floor(math.log(params, 1000)))
    p = math.pow(1000, i)
    s = round(params / p, 2)
    return f"{s} {size_name[i]}"
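The preview stops at the helper; the actual accounting lives further down in the gist. For orientation only, a hedged sketch of the common 6ND rule of thumb (the function name and arguments are illustrative, not the gist's interface):

# Hedged sketch, not the gist's code: training FLOPs ~= 6 * N params * D tokens,
# since the forward pass costs ~2 FLOPs/param/token and the backward ~4 more.
def approx_training_flops(num_params, num_tokens, backward=True):
    flops_per_token = 2 * num_params * (3 if backward else 1)
    return flops_per_token * num_tokens

print(convert_flops(approx_training_flops(70e9, 1e12)))  # ~420 ZFLOPs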
Quentin-Anthony / calc_transformer_params.py
Created November 3, 2023 04:20
Transformer Parameter Count
import argparse
import math

# Helper function to pretty-print parameter counts
def convert_params(params):
    if params == 0:
        return "0"
    size_name = ("", "K", "M", "B", "T", "P", "E", "Z", "Y")
    i = int(math.floor(math.log(params, 1000)))
    p = math.pow(1000, i)
    s = round(params / p, 2)
    return f"{s} {size_name[i]}"
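Again only the helper survives the preview. As a hedged sketch of the usual decoder-only estimate (assumes a GPT-style block with 4x MLP expansion and tied embeddings; names are illustrative, not the gist's interface):

# Hedged sketch, not the gist's code.
def approx_params(num_layers, hidden_size, vocab_size):
    embedding = vocab_size * hidden_size       # token embeddings (tied with the output head)
    attention = 4 * hidden_size * hidden_size  # Q, K, V, and output projections
    mlp = 8 * hidden_size * hidden_size        # two projections with 4x expansion
    return embedding + num_layers * (attention + mlp)

print(convert_params(approx_params(num_layers=80, hidden_size=8192, vocab_size=32000)))  # ~64.7 B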
Quentin-Anthony / torch_format_bench.py
Created May 27, 2023 20:50
Compares numpy, native torch, safetensors for save/load
import torch
from safetensors.torch import save_file, load_file
import numpy as np
import argparse
import os
import time

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--no-save", action="store_true", help="disables saving initial tensors")
    args = parser.parse_args()
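A hedged sketch of how such a comparison might be timed (reuses the imports above; paths and sizes are illustrative, not the gist's actual loop):

# Hedged sketch, not the gist's code.
x = torch.rand(4096, 4096)  # 64 MiB of fp32

t0 = time.time(); torch.save({"x": x}, "bench.pt")
print(f"torch save: {time.time() - t0:.3f}s")
t0 = time.time(); save_file({"x": x}, "bench.safetensors")
print(f"safetensors save: {time.time() - t0:.3f}s")
t0 = time.time(); np.save("bench.npy", x.numpy())
print(f"numpy save: {time.time() - t0:.3f}s")

t0 = time.time(); torch.load("bench.pt")
print(f"torch load: {time.time() - t0:.3f}s")
t0 = time.time(); load_file("bench.safetensors")
print(f"safetensors load: {time.time() - t0:.3f}s")
t0 = time.time(); np.load("bench.npy")
print(f"numpy load: {time.time() - t0:.3f}s")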
Quentin-Anthony / nowlab_cla.md
Created May 18, 2023 19:45
CLA for NOWLAB projects

NOWLAB Individual Contributor License Agreement

Thank you for your interest in contributing to open source software projects (“Projects”) made available by the Network-Based Computing Laboratory (NBCL) or its affiliates (“NBCL”). This Individual Contributor License Agreement (“Agreement”) sets out the terms governing any source code, object code, bug fixes, configuration changes, tools, specifications, documentation, data, materials, feedback, information or other works of authorship that you submit or have submitted, in any form and in any manner, to NBCL in respect of any of the Projects (collectively “Contributions”). If you have any questions respecting this Agreement, please contact [email protected].

You agree that the following terms apply to all of your past, present and future Contributions. Except for the licenses granted in this Agreement, you retain all of your right, title and interest in and to your Contributions.

Copyright License. You hereby grant, and agree to grant, to NBCL

Quentin-Anthony / transformer_mem.py
Last active October 26, 2024 06:44
Transformer Training/Inference Memory
import argparse
import math

# Helper function to pretty-print parameter counts
def convert_params(params):
    if params == 0:
        return "0"
    size_name = ("", "K", "M", "B", "T", "P", "E", "Z", "Y")
    i = int(math.floor(math.log(params, 1000)))
    p = math.pow(1000, i)
    s = round(params / p, 2)
    return f"{s} {size_name[i]}"
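For context on what the full script tallies, a hedged sketch of the standard static training-memory estimate (mixed precision with Adam; activations, buffers, and fragmentation ignored; names are illustrative, not the gist's interface):

# Hedged sketch, not the gist's code.
def approx_training_mem_gib(num_params):
    weights = 2 * num_params     # fp16/bf16 model weights
    grads = 2 * num_params       # fp16/bf16 gradients
    optimizer = 12 * num_params  # fp32 master weights + Adam momentum and variance
    return (weights + grads + optimizer) / 1024**3

print(f"{approx_training_mem_gib(70e9):.0f} GiB")  # ~1043 GiB for a 70B model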
Quentin-Anthony / test_gpu.sh
Last active October 16, 2022 17:14
EFA BW test for Stability cluster (adapted from Azure script)
#!/bin/bash
#SBATCH --partition=gpu
#SBATCH --job-name=gputest
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8
#SBATCH --cpus-per-gpu=6
#SBATCH --gres=gpu:8
#SBATCH --nodelist=gpu-st-p4d-24xlarge-42
#SBATCH --output=%x_%j.out
#SBATCH --open-mode=append
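Only the Slurm header survives the preview. As a stand-in for the measurement itself, a hedged PyTorch all-reduce microbenchmark (assumes NCCL over EFA and a launcher that sets LOCAL_RANK, e.g. torchrun; this is not the original script):

# Hedged sketch, not the gist's code.
import os, time
import torch
import torch.distributed as dist

dist.init_process_group("nccl")  # rank/world size come from the launcher's env vars
torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))
x = torch.ones(1024**3 // 4, device="cuda")  # 1 GiB of fp32

for _ in range(5):  # warmup
    dist.all_reduce(x)
torch.cuda.synchronize()

t0 = time.time()
for _ in range(20):
    dist.all_reduce(x)
torch.cuda.synchronize()
elapsed = (time.time() - t0) / 20

# Reports algorithm bandwidth (buffer size / time); nccl-tests' busbw would
# additionally scale by 2*(n-1)/n for ring all-reduce.
gib = x.numel() * 4 / 1024**3
if dist.get_rank() == 0:
    print(f"all_reduce on {gib:.0f} GiB buffer: {gib / elapsed:.1f} GiB/s algbw")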
Quentin-Anthony / megatron-checks.md
Last active June 7, 2022 05:30
Megatron Spreadsheet
| ZeRO Stage | Data-Parallel | MP | PP  | MP+PP | MoE | MoE+MP |
|------------|---------------|----|-----|-------|-----|--------|
| 1          |               |    |     |       |     |        |
| 2          |               |    | N/A | N/A   |     |        |
| 3          |               |    | N/A | N/A   | N/A | N/A    |