Skip to content

Instantly share code, notes, and snippets.

View fly51fly's full-sized avatar

爱可可-爱生活 fly51fly

View GitHub Profile
@stephenroller
stephenroller / mixout.py
Last active February 10, 2023 23:49
Example of mixout on generic modules.
#!/usr/bin/env python3
"""
Example of a generic Mixout implementation. (Lee et al., 2019).
https://arxiv.org/abs/1909.11299
Implementation by Stephen Roller (https://stephenroller.com).
Updated 2020-02-10 to include 1/(1 - p) correction term. Thanks to
Cheolhyoung Lee for making this correction.
@redknightlois
redknightlois / ralamb.py
Last active August 9, 2023 20:50
Ralamb optimizer (RAdam + LARS trick)
class Ralamb(Optimizer):
def __init__(self, params, lr=1e-3, betas=(0.9, 0.999), eps=1e-8, weight_decay=0):
defaults = dict(lr=lr, betas=betas, eps=eps, weight_decay=weight_decay)
self.buffer = [[None, None, None] for ind in range(10)]
super(Ralamb, self).__init__(params, defaults)
def __setstate__(self, state):
super(Ralamb, self).__setstate__(state)
@thomwolf
thomwolf / gpt-2-wikitext-103.py
Last active September 23, 2024 20:23
A very small and self-contained gist to train a GPT-2 transformer model on wikitext-103
# Copyright (c) 2019-present, Thomas Wolf.
# All rights reserved. This source code is licensed under the MIT-style license.
""" A very small and self-contained gist to train a GPT-2 transformer model on wikitext-103 """
import os
from collections import namedtuple
from tqdm import tqdm
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from ignite.engine import Engine, Events
@thomwolf
thomwolf / top-k-top-p.py
Last active March 11, 2025 03:44
Sample the next token from a probability distribution using top-k and/or nucleus (top-p) sampling
def top_k_top_p_filtering(logits, top_k=0, top_p=0.0, filter_value=-float('Inf')):
""" Filter a distribution of logits using top-k and/or nucleus (top-p) filtering
Args:
logits: logits distribution shape (vocabulary size)
top_k >0: keep only top k tokens with highest probability (top-k filtering).
top_p >0.0: keep the top tokens with cumulative probability >= top_p (nucleus filtering).
Nucleus filtering is described in Holtzman et al. (http://arxiv.org/abs/1904.09751)
"""
assert logits.dim() == 1 # batch size 1 for now - could be updated for more but the code would be less clear
top_k = min(top_k, logits.size(-1)) # Safety check
// by Etienne JACOB
// uses a formula inspired by @ozachou_g (on twitter)
//click on canvas to generate a new one
//press key to save
float x,y,z,t;
float[] A = new float[12];
float[] f = new float[12];
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
@zeyademam
zeyademam / Troubleshoot-dcnn.md
Last active January 22, 2024 05:54
Troubleshooting Convolutional Neural Nets

Troubleshooting Convolutional Neural Networks

Intro

This is a list of hacks gathered primarily from prior experiences as well as online sources (most notably Stanford's CS231n course notes) on how to troubleshoot the performance of a convolutional neural network . We will focus mainly on supervised learning using deep neural networks. While this guide assumes the user is coding in Python3.6 using tensorflow (TF), it can still be helpful as a language agnostic guide.

Suppose we are given a convolutional neural network to train and evaluate and assume the evaluation results are worse than expected. The following are steps to troubleshoot and potentially improve performance. The first section corresponds to must-do's and generally good practices before you start troubleshooting. Every subsequent section header corresponds to a problem and the section is devoted to solving it. The sections are ordered to reflect "more common" issues first and under each header the "most-eas

@genekogan
genekogan / download_haeckel.sh
Created July 29, 2018 13:45
script to scrape scanned illustrations from Ernst Haeckel's book Kunstformen der Natur (Artforms of Nature)
for i in {1..100}; do
filepath=$(printf "http://caliban.mpipz.mpg.de/haeckel/kunstformen/Tafel_%03d_300.jpg" $i)
wget $filepath;
done
int[][] result;
float t, c;
float ease(float p) {
return 3*p*p - 2*p*p*p;
}
float ease(float p, float g) {
if (p < 0.5)
return 0.5 * pow(2*p, g);