Phillip K.S. Chu philtrade

  • Mountain View, CA
@philtrade
philtrade / Race condition in fastai.callbacks.tracker.SaveModelCallback
Last active October 31, 2019 18:03
Simulate race condition in fastai.callbacks.tracker.SaveModelCallback
The script simulates a potential race condition in SaveModelCallback when every='improvement' is specified.
When launched in PyTorch's DistributedDataParallel (DDP) mode on a single host with multiple GPUs:
python3 -m torch.distributed.launch --nproc_per_node=3 test_barrier_load.py
The master process (rank 0) sleeps for a few seconds before saving the model after the last epoch, in on_epoch_end().
The other processes attempt to load the saved model in on_train_end(). Without synchronization, the file may not exist yet and the script crashes.
When properly synchronized, the other processes wait for the master process to arrive at the post-write barrier before proceeding to read the file, as demonstrated by a run on a single host with 3 GPUs and 3 epochs.
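The post-write barrier described above can be illustrated with the standard library alone. The following is a minimal sketch, not the gist's actual script: it assumes threads as stand-ins for DDP processes and `threading.Barrier` as a stand-in for `torch.distributed.barrier()`. The names `run_ranks`, `WORLD_SIZE`, `write_delay`, and the checkpoint file name are hypothetical, chosen only for illustration.

```python
import os
import tempfile
import threading
import time

WORLD_SIZE = 3  # hypothetical; mirrors --nproc_per_node=3 in the launch command


def run_ranks(checkpoint_path, write_delay=0.2):
    """Simulate WORLD_SIZE ranks sharing one checkpoint file.

    Returns a dict mapping each rank to the contents it loaded.
    """
    # Stand-in for torch.distributed.barrier(): every rank must arrive
    # before any rank is released.
    barrier = threading.Barrier(WORLD_SIZE)
    loaded = {}
    lock = threading.Lock()

    def worker(rank):
        if rank == 0:
            # The master is deliberately slow to save, as in the gist.
            time.sleep(write_delay)
            with open(checkpoint_path, "w") as f:
                f.write("best-model-weights")
        # Post-write barrier: rank 0 reaches this point only after the file
        # is written and closed, so no other rank can read too early.
        barrier.wait()
        with open(checkpoint_path) as f:
            data = f.read()
        with lock:
            loaded[rank] = data

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(WORLD_SIZE)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return loaded


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        result = run_ranks(os.path.join(tmp, "checkpoint.txt"))
        print(sorted(result))
```

Removing the `barrier.wait()` call in this sketch would let ranks 1 and 2 try to open the file while rank 0 is still sleeping, which is the failure mode the gist simulates.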
philtrade / wgan_ddp.py
Last active February 14, 2022 09:35
FastAI v1 GANTrainer interfered by PyTorch DistributedDataParallel
#!/usr/bin/env python3
# Run this script as:
# (Yes, even with nproc_per_node=1, it'll trigger the bug)
# python -m torch.distributed.launch --nproc_per_node=1 wgan_ddp.py
#
import argparse
from fastai.vision import *
from fastai.vision.gan import *
from fastai.distributed import *
philtrade / imdb_classifier_ddp.py
Created February 19, 2020 03:18
Text Classification training accuracy problem in fastai distributed training due to samples not being shuffled
#!/usr/bin/env python3
import fastai
from fastai.text import *
from fastai.distributed import *
import torch
import argparse, os
def train(local_rank:int=None, epochs:int=1):
    if local_rank is not None:
        torch.cuda.set_device(local_rank)
philtrade / ovpn_owrt.sh
Last active February 24, 2025 16:36
Set up VPN server on openWRT with openvpn, openssl, and easyrsa
# This script is adapted/tweaked from the openWRT wiki page on creating VPN server.
# VPN client can access outside world as if the traffic originates from the openWRT router.
#
# Prerequisites
# 1. opkg update && opkg install openvpn-openssl openvpn-easy-rsa
# 2. Get a public DDNS domain name or a static IP for the vpn server, put it into ddns_name="" near the bottom of the script.
# 3. Customize parameters, server/client service name, subnet, server port, output dir etc in the same bottom section.
#
# USAGE:
# 1. sh ./ovpn_owrt.sh <pki directory> [optional dh.pem file]