Stas Bekman (stas00)

stas00 / hf-hub-model-download-by-arch-stats.py
Last active December 10, 2021 01:47
most popular HF model downloads by architecture (thanks to @LysandreJik)
from transformers import CONFIG_MAPPING
from huggingface_hub import HfApi
api = HfApi()
keys = list(CONFIG_MAPPING.keys())
downloads = {}
for key in keys:
    models = api.list_models(filter=key)
    total_downloads = sum(model.downloads if hasattr(model, "downloads") else 0 for model in models)
    downloads[key] = total_downloads
ordered = sorted(downloads.items(), reverse=True, key=lambda t: t[1])
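To inspect the result one might then print the top of the ranking, e.g. (a usage sketch, not part of the gist):

for arch, count in ordered[:10]:
    # hypothetical pretty-print of architecture name and total download count
    print(f"{arch:30} {count:>12,}")
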
stas00 / profiler_performance_analysis.ipynb
Created July 6, 2021 05:04
this is a rough beginning of an applied torch.profiler tutorial
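The notebook itself is not reproduced here; as an illustration of the kind of measurement such a torch.profiler tutorial starts from (my sketch, model and shapes are arbitrary):

import torch
from torch.profiler import profile, ProfilerActivity

model = torch.nn.Linear(512, 512)
x = torch.randn(64, 512)
# profile a single forward pass on CPU, including memory stats
with profile(activities=[ProfilerActivity.CPU], profile_memory=True) as prof:
    model(x)
print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=10))
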
stas00 / profiler.py
Created June 24, 2021 00:18
build_table: put name column last, make cols more narrow
# torch/autograd/profiler.py
def build_table(
        events,
        sort_by=None,
        header=None,
        row_limit=100,
        max_src_column_width=75,
        with_flops=False,
        profile_memory=False,
        top_level_events_only=False):
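build_table is the internal formatter behind the profiler's table output, so the column changes show up when rendering a profile, for example (illustrative call, not part of the gist):

import torch

with torch.autograd.profiler.profile() as prof:
    torch.randn(1000, 1000).mm(torch.randn(1000, 1000))
# .table() is rendered by build_table under the hood
print(prof.key_averages().table(sort_by="cpu_time_total", row_limit=10))
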
stas00 / conftest.py
Created May 11, 2021 17:15
pytest start/stop tracer - for when you need to figure out which tests didn't finish
# conftest.py
# to run:
# TRACE_START_STOP=1 pytest tests/test_trainer.py
import pytest
import os
trace = os.environ.get('TRACE_START_STOP', "")
@pytest.hookimpl(tryfirst=True, hookwrapper=True)
def pytest_runtest_makereport(item, call):
    outcome = yield
    res = outcome.get_result()
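The preview is truncated here; one way such a tracer can emit the start/stop markers is via pytest's logstart/logfinish hooks (a sketch of the approach, not the gist's actual continuation):

def pytest_runtest_logstart(nodeid, location):
    if trace:
        print(f"\nTRACE START {nodeid}", flush=True)

def pytest_runtest_logfinish(nodeid, location):
    if trace:
        print(f"\nTRACE STOP  {nodeid}", flush=True)
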
stas00 / dropout_abs_max_values.py
Created April 15, 2021 18:01
an experiment in overcoming overflow when using a bf16-trained model under fp16 mixed precision (a bf16-trained model produces huge activations)
# Samyam: I have three thoughts here:
# 1) would dropping off large activations push the network towards producing smaller activations? I don't know the answer, but it feels unlikely since the network is not getting penalized in any way for producing large activations,
# 2) dropout is meant to be used as a regularization, but by dropping out only large values it introduces a bias. It may have an unexpected impact on convergence,
# 3) if 1 does not happen, then at inference time, where there is no dropout, we have the inf again
import torch

def dropout_abs_max_values(x, p=0.2):
    """Like Dropout, but instead of random sampling this zeroes out the p fraction of the biggest absolute values"""
    topk = int(p * x.shape[-1])
    indices = torch.topk(x.abs(), topk, dim=-1, largest=True)[1]
    # the gist preview cuts off here; presumably the selected entries are then zeroed out, e.g.:
    return x.scatter(-1, indices, 0.0)
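A quick sanity check of the idea (usage sketch, assuming the zeroing step above):

x = torch.tensor([[1.0, -5.0, 2.0, 9.0, 0.5]])
# p=0.4 over 5 values -> the two largest-magnitude entries (9.0 and -5.0) get zeroed
print(dropout_abs_max_values(x, p=0.4))
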
stas00 / composable_actions.py
Created March 2, 2021 18:56 — forked from mnm364/composable_actions.py
Composable Python argparse actions
import argparse
def compose_actions(*actions):
"""Compose many argparse actions into one callable action.
Args:
*actions: The actions to compose.
Returns:
argparse.Action: Composed action.
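A hypothetical usage, assuming the composed action applies each given action in turn (the gist preview stops at the docstring, so the class names below are mine):

class UpperAction(argparse.Action):
    def __call__(self, parser, namespace, values, option_string=None):
        setattr(namespace, self.dest, values.upper())

class AnnounceAction(argparse.Action):
    def __call__(self, parser, namespace, values, option_string=None):
        print(f"--name was passed: {values}")

parser = argparse.ArgumentParser()
parser.add_argument("--name", action=compose_actions(UpperAction, AnnounceAction))
print(parser.parse_args(["--name", "stas"]))
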
# python -m torch.distributed.launch --nproc_per_node=2 all_reduce_bench.py
import torch
import torch.distributed as dist
import time
import argparse
import os
import fcntl
TRIALS = 5
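The preview stops at the constants; the heart of such a benchmark is a timed all_reduce loop along these lines (a sketch assuming dist.init_process_group has already been called and mat lives on the local GPU, not the gist's exact code):

def timed_allreduce(mat):
    # warm up once, then time TRIALS all_reduce calls
    dist.all_reduce(mat)
    torch.cuda.synchronize()
    start = time.time()
    for _ in range(TRIALS):
        dist.all_reduce(mat)
    torch.cuda.synchronize()
    return (time.time() - start) / TRIALS
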
# same as the other script, but this time each thread allocates on a different device
# still reports correctly
import threading
import time
import torch
def print_mem_usage(prefix):
    n_gpus = torch.cuda.device_count()
    for id in range(n_gpus):
        with torch.cuda.device(id):
            # preview truncated here; presumably it reports something like:
            print(f"{prefix}: gpu {id}: {torch.cuda.memory_allocated() >> 20}MB allocated")
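A sketch of the threaded per-device allocation the comment describes (assumed shape, the preview is truncated):

allocs = {}

def alloc_on_device(id):
    with torch.cuda.device(id):
        # keep a reference so the ~4MB stays allocated on gpu `id`
        allocs[id] = torch.ones(2**20, device="cuda")

threads = [threading.Thread(target=alloc_on_device, args=(i,)) for i in range(torch.cuda.device_count())]
for t in threads:
    t.start()
for t in threads:
    t.join()
print_mem_usage("after threaded allocs")
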
stas00 / require_no_pytest_distributed.py
Last active December 30, 2020 01:43
pytest skip marker for tests that must not run under pytest-xdist's -n setting because they do something that requires, say, all gpus to be untouched
# this goes into transformers/testing_utils.py
_pytest_num_workers = 1

def set_pytest_num_workers(n):
    """
    A helper that records how many pytest workers are in use (when running under pytest-xdist's -n option)
    """
    global _pytest_num_workers
    _pytest_num_workers = n
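The marker named in the gist's title is not shown in the preview; a plausible shape for it, following the decorator style of transformers/testing_utils.py (my sketch, not the gist's code):

import unittest

def require_no_pytest_distributed(test_case):
    # skip unless pytest-xdist runs with a single worker, so the test gets all gpus to itself
    if _pytest_num_workers > 1:
        return unittest.skip("test must not run under pytest-xdist with multiple workers")(test_case)
    return test_case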