@eminorhan
eminorhan / pytorch_multinode_slurm.md
Last active May 7, 2021 09:19
A minimal example demonstrating how to do multi-node distributed training with pytorch on a slurm cluster

The following code is intentionally skeletal. Please feel free to flesh out the details according to your own needs.

import os
import builtins
import argparse
import torch
import torch.distributed as dist
from torchvision.datasets import ImageFolder
from torch.utils.data import DataLoader
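The preview above stops at the imports. As a hedged sketch of one common way to wire up ranks under Slurm (not necessarily what the full gist does), each task started by `srun` can read its global rank, the total task count, and its node-local rank from the `SLURM_PROCID`, `SLURM_NTASKS`, and `SLURM_LOCALID` environment variables. The helper name `slurm_dist_env` and the `DistEnv` type are hypothetical.

```python
import os
from typing import Mapping, NamedTuple


class DistEnv(NamedTuple):
    rank: int        # global rank of this task across all nodes
    world_size: int  # total number of tasks in the job step
    local_rank: int  # rank of this task on its own node


def slurm_dist_env(environ: Mapping[str, str] = os.environ) -> DistEnv:
    """Read the process layout that Slurm exports to each task (hypothetical helper)."""
    return DistEnv(
        rank=int(environ["SLURM_PROCID"]),
        world_size=int(environ["SLURM_NTASKS"]),
        local_rank=int(environ["SLURM_LOCALID"]),
    )
```

These values would typically be handed to `dist.init_process_group(backend="nccl", init_method="env://", world_size=..., rank=...)`, with `MASTER_ADDR` and `MASTER_PORT` set in the job script, and the local rank passed to `torch.cuda.set_device`. Treat that wiring as an assumption about your cluster rather than a fixed recipe.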
eminorhan / untar_each_to_its_own.sh
Last active May 11, 2021 15:19
(1) go through all tar files in a directory, (2) untar them to their own subdirectory, (3) delete the tar file
for a in *.tar; do mkdir -p "${a%.*}" && tar -xf "$a" -C "${a%.*}" && rm -f "$a"; done &
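For cases where a Python script is more convenient than a shell loop, the same three steps can be sketched roughly as follows. The function name `untar_each` is hypothetical, and the archive is deleted only if extraction did not raise, which is a deliberate choice in this sketch.

```python
import tarfile
from pathlib import Path


def untar_each(directory: str) -> list:
    """Extract every *.tar in `directory` into a same-named subdirectory, then delete the archive."""
    extracted = []
    for tar_path in sorted(Path(directory).glob("*.tar")):
        target = tar_path.with_suffix("")  # e.g. data.tar -> data
        target.mkdir(exist_ok=True)
        with tarfile.open(tar_path) as archive:
            archive.extractall(target)  # only use on archives you trust
        tar_path.unlink()  # reached only if extraction did not raise
        extracted.append(target)
    return extracted
```

Calling `untar_each(".")` processes the current directory and returns the list of directories it created.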
eminorhan / slurm_interactive.sh
Last active December 1, 2022 05:29
launch an interactive job with 2 gpus in slurm
srun --qos=interactive --gres=gpu:2 --ntasks=8 --constraint="gpu_24gb" --pty bash
# interactive on greene
srun --gres=gpu:a100:1 --cpus-per-task=4 --mem=240GB --pty bash
eminorhan / wget_txt.sh
Last active September 28, 2021 18:19
download all urls in a txt file
wget -i filename.txt
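Where `wget` is not available, a standard-library approximation might look like the sketch below. It assumes one URL per line, skips blank lines, and names each file after the last path component of its URL, falling back to `index.html`. The function names are hypothetical, and the sketch has none of `wget`'s retry or resume behavior.

```python
import posixpath
from urllib.parse import urlsplit
from urllib.request import urlretrieve


def read_urls(list_file: str) -> list:
    """Return the non-empty lines of `list_file`, one URL per line."""
    with open(list_file) as handle:
        return [line.strip() for line in handle if line.strip()]


def target_name(url: str) -> str:
    """Pick a local file name from the URL path (assumed fallback: index.html)."""
    return posixpath.basename(urlsplit(url).path) or "index.html"


def download_all(list_file: str) -> None:
    """Download each listed URL into the current directory."""
    for url in read_urls(list_file):
        urlretrieve(url, target_name(url))
```

`download_all("filename.txt")` would then fetch every URL in the list sequentially.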
eminorhan / cocoapi.py
Last active October 8, 2021 16:34
COCO-API cheat sheet
from pycocotools.coco import COCO
import numpy as np
dataDir = '/home/emin/coco'
dataType = 'val2017'
annFile = '{}/annotations/instances_{}.json'.format(dataDir, dataType)
coco = COCO(annFile) # build COCO object
catIds = coco.getCatIds() # category ids (80)
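To make it clearer what the category ids refer to, here is a small sketch that pulls them straight from a dict shaped like the parsed instances file, which has a top-level `categories` list whose entries carry an `id`. The helper `category_ids` is hypothetical and the two-entry data is a toy stand-in, not the real annotation file.

```python
import json


def category_ids(instances: dict) -> list:
    """Collect category ids from a COCO-style instances dict (hypothetical helper)."""
    return [category["id"] for category in instances["categories"]]


# Toy stand-in for the parsed annotation file; real files list 80 categories.
toy = json.loads('{"categories": [{"id": 1, "name": "person"}, {"id": 2, "name": "bicycle"}]}')
print(category_ids(toy))
```

With the real file, `json.load(open(annFile))` would supply the dict, though in practice `coco.getCatIds()` is the more convenient route.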
eminorhan / transfer_between_clusters.sh
Last active October 9, 2022 15:15
Transfer data between two clusters
To transfer the file file.fl, issue the following from the first cluster:
scp file.fl [email protected]:/scratch/eo41/
To transfer the entire directory origin, issue:
scp -r -q origin [email protected]:/scratch/eo41/
eminorhan / read_videos.sh
Created November 22, 2021 20:03
Read videos without interruption
nohup python -u /misc/vlgscratch5/LakeGroup/emin/sayavakepicut/read_video.py --save-dir '/misc/vlgscratch5/LakeGroup/emin/sayavakepicut/5fps_300s/ava_trainval' --fps 5 --seg-len 300 '/misc/vlgscratch5/LakeGroup/shared_data/ava_v2/trainval' &
eminorhan / del_checkpoint.py
Last active January 13, 2022 16:40
delete large memory checkpoint after loading in pytorch
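import torch
checkpoint = torch.load(CHECKPOINT_PATH)  # load the full checkpoint dict
model.module.load_state_dict(checkpoint['model_state_dict'])  # .module: model is wrapped (DataParallel/DDP)
del checkpoint  # drop the reference so the memory can be reclaimed
torch.cuda.empty_cache()  # release unused cached GPU memory held by the allocator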
eminorhan / count_files.sh
Created February 9, 2022 17:58
count files in current directory
ls -1 | wc -l
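The same count can be sketched in Python. Plain `ls` omits dotfiles, so this version skips names starting with a dot to stay close to the shell one-liner. The function name `count_visible_entries` is hypothetical. Unlike the pipeline, it is not thrown off by file names containing newlines.

```python
import os


def count_visible_entries(directory: str = ".") -> int:
    """Count directory entries, skipping dotfiles as a plain `ls` does."""
    return sum(1 for name in os.listdir(directory) if not name.startswith("."))
```

Like `ls -1 | wc -l`, this counts subdirectories as well as regular files.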
eminorhan / git_repo.sh
Created March 2, 2022 22:07
new github repo
git init
git add . && git commit -m "first commit"
git remote add origin REMOTE_GITHUB_REPO_URL
git remote -v
git push origin master