Fabrice Normandin lebrice

"""
"""
from __future__ import annotations
import datetime
import itertools
# NOTE: Need to import cv2 to prevent a loading error for GLIBCXX with ffcv.
import cv2 # noqa
lebrice / mt_layers.py
Last active September 1, 2023 16:02
Multi-task layers (that can be split into layers for each task)
from __future__ import annotations
import copy
import functools
import math
from collections import OrderedDict
from typing import Sequence
import torch
from torch import Tensor, nn
lebrice / autograd.py
Last active May 22, 2024 14:33
Interview Question
from __future__ import annotations
class Value:
    """ stores a single scalar value and its gradient """
    def __init__(self, data, _parents: tuple[Value, ...]=(), _op=''):
        self.data = data
        self.grad = 0
        # internal variables used for autograd graph construction
        self._backward = lambda: None
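
The preview above stops after the constructor. As a rough illustration of how a per-node `_backward` closure is typically used in a scalar autograd engine, here is a minimal sketch. Everything beyond the constructor shown above is an assumption rather than the gist's actual code: the `_parents` and `_op` attribute names, the `__add__` and `__mul__` operators, and the `backward` method.

```python
class Value:
    """Scalar that remembers how it was computed (illustrative sketch)."""

    def __init__(self, data, _parents: "tuple[Value, ...]" = (), _op: str = ""):
        self.data = data
        self.grad = 0
        self._backward = lambda: None
        # Assumption: parents and op are stored under these attribute names.
        self._parents = _parents
        self._op = _op

    def __add__(self, other: "Value") -> "Value":
        out = Value(self.data + other.data, (self, other), "+")

        def _backward():
            # Addition passes the output gradient through unchanged.
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other: "Value") -> "Value":
        out = Value(self.data * other.data, (self, other), "*")

        def _backward():
            # Product rule: each input is scaled by the other input's value.
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def backward(self) -> None:
        # Build a topological order so that a node's gradient is fully
        # accumulated before it is propagated to its parents.
        order = []
        visited = set()

        def visit(node):
            if id(node) not in visited:
                visited.add(id(node))
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1
        for node in reversed(order):
            node._backward()


x, y = Value(2.0), Value(3.0)
z = x * y + x  # dz/dx = y + 1, dz/dy = x
z.backward()
```

Gradients are accumulated with `+=` because `x` feeds into `z` along two paths, and both contributions must be summed.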

Torch distributed debugging on Tamia cluster

  1. Clone this gist using its URL, for example `git clone (the_url_of_this_gist) some_folder`, then move into that folder.

  2. Install uv by following its documentation.

  3. Run `uv sync` to recreate the same virtual environment.

  4. Run the following on a login node to download CIFAR-10 into `$SCRATCH/data/cifar10`:

    1. `mkdir -p $SCRATCH/data/cifar10`
    2. `uv run python -c 'import pathlib, os, torchvision.datasets; torchvision.datasets.CIFAR10(pathlib.Path(os.environ["SCRATCH"]) / "data/cifar10", download=True)'`
  5. Launch the job with `sbatch job.sh`.
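
The one-line download command in step 4 can be hard to read and to quote correctly in a shell. The sketch below expresses the same step as a small script. The file name `download_cifar10.py` and the helper names `cifar10_dir` and `download_cifar10` are hypothetical and not part of the gist. It assumes `SCRATCH` is set, as it is on the cluster, and that `torchvision` is available in the environment created by `uv sync`.

```python
# download_cifar10.py (hypothetical file name, not part of the gist)
import os
from pathlib import Path


def cifar10_dir(env=None) -> Path:
    """Return $SCRATCH/data/cifar10, reading SCRATCH from the given mapping."""
    env = os.environ if env is None else env
    return Path(env["SCRATCH"]) / "data" / "cifar10"


def download_cifar10(root: Path) -> None:
    """Create the target folder and download CIFAR-10 into it."""
    # Imported lazily so the path helper works without torchvision installed.
    import torchvision.datasets

    root.mkdir(parents=True, exist_ok=True)  # same effect as `mkdir -p`
    torchvision.datasets.CIFAR10(root, download=True)


if __name__ == "__main__":
    # Run on a login node, since compute nodes may not have internet access.
    download_cifar10(cifar10_dir())
```

If saved under that name, it could be run with `uv run python download_cifar10.py` in place of the two sub-steps of step 4.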