Seongmin Park (seongminp)

@kklemon
kklemon / iterable_dataset_dist.py
Last active February 24, 2025 06:16
PyTorch IterableDataset implementation with multiprocessing and distributed training support
import torch
import torch.distributed as dist
import torch.multiprocessing as mp
from torch.utils.data import IterableDataset, DataLoader
class DistributedIterableDataset(IterableDataset):
    """
    Example implementation of an IterableDataset that handles both multiprocessing
    (num_workers > 0) and distributed training (world_size > 1).
    """
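The preview cuts off before the implementation. A minimal sketch of the core idea, assuming interleaved sharding (the class name ShardedIterable and its items argument are illustrative, not from the gist): every distributed rank and every DataLoader worker claims a disjoint, interleaved slice of the stream.

import torch.distributed as dist
from torch.utils.data import IterableDataset, get_worker_info

class ShardedIterable(IterableDataset):
    def __init__(self, items):
        self.items = items

    def __iter__(self):
        # distributed rank (one process per GPU) and DataLoader worker id
        world_size = dist.get_world_size() if dist.is_initialized() else 1
        rank = dist.get_rank() if dist.is_initialized() else 0
        info = get_worker_info()
        num_workers = info.num_workers if info else 1
        worker_id = info.id if info else 0
        stride = world_size * num_workers        # total number of consumers
        offset = rank * num_workers + worker_id  # this consumer's slot
        for i, item in enumerate(self.items):
            if i % stride == offset:  # each item goes to exactly one consumer
                yield item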
@simonthompson99
simonthompson99 / pudb-cheatsheet.md
Last active March 21, 2025 08:34
[Pudb Cheatsheet] Cheatsheet for pudb debugger #python

Running

python -m pudb <python_file> - opens the script in the pudb window without executing anything yet. For fire modules it is better to write a separate script that executes the individual function of interest (see the wrapper sketch below), since the original main file is difficult to debug directly. pudb must be installed in the virtualenv.

Or:

from pudb import set_trace
...
set_trace()
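For the fire-module workaround above, a hypothetical wrapper script (module and function names are placeholders, adjust to your project):

# debug_entry.py - start pudb at the one function of interest instead of
# fire's CLI dispatch; run with: python -m pudb debug_entry.py
from mypackage.jobs import train  # placeholder import

if __name__ == "__main__":
    train(config="dev.yaml")  # placeholder call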
@sgraaf
sgraaf / ddp_example.py
Last active November 7, 2024 05:39
PyTorch Distributed Data Parallel (DDP) example
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from argparse import ArgumentParser
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, Dataset
from torch.utils.data.distributed import DistributedSampler
from transformers import BertForMaskedLM
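The preview stops at the imports. A minimal sketch of the setup they imply (the flag name, the placeholder dataset, and the batch size are assumptions, not necessarily sgraaf's exact code):

parser = ArgumentParser()
parser.add_argument("--local_rank", type=int, default=0)  # filled in by the launcher
args = parser.parse_args()

dist.init_process_group(backend="nccl")  # one process per GPU
torch.cuda.set_device(args.local_rank)

model = BertForMaskedLM.from_pretrained("bert-base-uncased").cuda(args.local_rank)
model = DDP(model, device_ids=[args.local_rank])  # sync gradients across ranks

sampler = DistributedSampler(dataset)  # 'dataset' is a placeholder Dataset instance
loader = DataLoader(dataset, batch_size=16, sampler=sampler)
# call sampler.set_epoch(epoch) at the top of every epoch to reshuffle the shards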
@sujitpal
sujitpal / inexact-search-aho-corasick.ipynb
Created April 20, 2020 01:46
Code demonstrating building and querying an Aho-Corasick FSM for inexact search
(Notebook preview unavailable: the file cannot be displayed.)
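Since the notebook won't render, here is the standard pyahocorasick build-and-query flow as an assumed baseline; the inexact-search layer the gist adds on top is not reconstructed here.

import ahocorasick  # pip install pyahocorasick

automaton = ahocorasick.Automaton()
for idx, term in enumerate(["acute", "chronic", "acute pain"]):
    automaton.add_word(term, (idx, term))  # payload returned on each match
automaton.make_automaton()  # build the FSM (adds Aho-Corasick failure links)

for end_pos, (idx, term) in automaton.iter("patient reports acute pain"):
    print(term, "ends at", end_pos)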
@aditya-malte
aditya-malte / smallberta_pretraining.ipynb
Created February 22, 2020 13:41
smallBERTa_Pretraining.ipynb
(Notebook preview unavailable: the file cannot be displayed.)
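This notebook won't render either. A minimal sketch of small-scale RoBERTa masked-language-model pretraining with Hugging Face transformers (the model sizes, hyperparameters, and toy corpus are illustrative assumptions, not the gist's values):

from transformers import (DataCollatorForLanguageModeling, RobertaConfig,
                          RobertaForMaskedLM, RobertaTokenizerFast, Trainer,
                          TrainingArguments)

tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
config = RobertaConfig(vocab_size=tokenizer.vocab_size, num_hidden_layers=4,
                       num_attention_heads=4, hidden_size=256, intermediate_size=1024)
model = RobertaForMaskedLM(config)  # randomly initialized "small" RoBERTa

# placeholder corpus; a real run would tokenize a large text dataset
encodings = tokenizer(["example sentence one.", "example sentence two."],
                      truncation=True, max_length=128)
train_dataset = [{"input_ids": ids} for ids in encodings["input_ids"]]

collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=True,
                                           mlm_probability=0.15)
args = TrainingArguments(output_dir="smallberta", per_device_train_batch_size=16,
                         num_train_epochs=1)
Trainer(model=model, args=args, data_collator=collator,
        train_dataset=train_dataset).train()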
@GongXinyuu
GongXinyuu / gumbel_max_pytorch.py
Last active March 23, 2024 08:32
A temporary function to avoid nan in the pytorch gumbel_softmax function.
import torch
import torch.nn.functional as F


def gumbel_softmax(logits, tau=1, hard=False, eps=1e-10, dim=-1):
    # type: (Tensor, float, bool, float, int) -> Tensor
    r"""
    Samples from the `Gumbel-Softmax distribution`_ and optionally discretizes.
    You can use this function to replace "F.gumbel_softmax".
    """
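The preview ends before the body. A hedged reconstruction of a nan-safe variant: the eps guard inside the log is an assumption about how the gist avoids the nan, and the hard branch follows the stock straight-through trick.

def gumbel_softmax_nan_safe(logits, tau=1, hard=False, eps=1e-10, dim=-1):
    # eps keeps the log away from 0, the usual source of nan/inf gradients
    gumbels = -(torch.empty_like(logits).exponential_() + eps).log()  # ~ Gumbel(0, 1)
    y_soft = ((logits + gumbels) / tau).softmax(dim)
    if not hard:
        return y_soft
    index = y_soft.max(dim, keepdim=True)[1]
    y_hard = torch.zeros_like(logits).scatter_(dim, index, 1.0)
    return y_hard - y_soft.detach() + y_soft  # straight-through estimator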
@yzh119
yzh119 / st-gumbel.py
Created January 12, 2018 12:25
ST-Gumbel-Softmax-Pytorch
from __future__ import print_function
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable  # Variable is a no-op wrapper on modern PyTorch


def sample_gumbel(shape, eps=1e-20):
    # Gumbel(0, 1) via the inverse-CDF transform of U ~ Uniform(0, 1)
    U = torch.rand(shape).cuda()
    return -Variable(torch.log(-torch.log(U + eps) + eps))
# -*- coding: utf-8 -*-
# Natural Language Toolkit: GLEU Score
#
# Copyright (C) 2001-2017 NLTK Project
# Authors:
# Contributors:
# URL: <http://nltk.org/>
# For license information, see LICENSE.TXT
""" GLEU score implementation. """
# (Separate snippet, unrelated to the GLEU file above: a TensorFlow Gumbel-Softmax sampler.)
import tensorflow as tf  # TF1-style API; in TF2 use tf.random.uniform / tf.math.log


def sample_gumbel(shape, eps=1e-20):
    """Sample from Gumbel(0, 1)."""
    U = tf.random_uniform(shape, minval=0, maxval=1)
    return -tf.log(-tf.log(U + eps) + eps)


def gumbel_softmax_sample(logits, temperature):
    """Draw a sample from the Gumbel-Softmax distribution."""
    y = logits + sample_gumbel(tf.shape(logits))
    return tf.nn.softmax(y / temperature)
@chikamichi
chikamichi / .tmux.conf
Last active September 6, 2021 06:37
Prevent highlighting text from scrolling down (exiting copy-mode) in tmux
# Optional but convenient: use C-b v to paste the tmux buffer.
bind v paste-buffer
# Do not exit from copy-mode when selecting text.
# @see https://github.com/tmux/tmux/issues/337
# Note: the setting might be renamed MouseDragEndX.
# Depending on whether you activated emacs or vi keybindings (I'm using vi mode):
#bind -t emacs-copy MouseDragEnd1Pane copy-selection -x
bind -t vi-copy MouseDragEnd1Pane copy-selection -x
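For tmux >= 2.4 the vi-copy/emacs-copy key tables were replaced by copy-mode tables driven by send-keys -X; a hedged modern equivalent of the bindings above:

# tmux >= 2.4: copy without leaving copy-mode
# (the default MouseDragEnd1Pane binding is copy-selection-and-cancel)
bind -T copy-mode    MouseDragEnd1Pane send-keys -X copy-selection  # emacs table
bind -T copy-mode-vi MouseDragEnd1Pane send-keys -X copy-selection  # vi table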