Skip to content

Instantly share code, notes, and snippets.

@thomwolf
thomwolf / prepare_packed_sequence.py
Created October 3, 2017 10:55
Preparer a pyTorch PackedSequence for a batch of sequences
# input_seqs is a batch of input sequences as a numpy array of integers (word indices in vocabulary) padded with zeroas
input_seqs = Variable(torch.from_numpy(input_seqs.astype('int64')).long())
# First: order the batch by decreasing sequence length
input_lengths = torch.LongTensor([torch.max(input_seqs[i, :].data.nonzero()) + 1 for i in range(input_seqs.size()[0])])
input_lengths, perm_idx = input_lengths.sort(0, descending=True)
input_seqs = input_seqs[perm_idx][:, :input_lengths.max()]
# Then pack the sequences
packed_input = pack_padded_sequence(input_seqs, input_lengths.cpu().numpy(), batch_first=True)
@markito
markito / gist:a8ffcdde8cf8ebb0e69ead2363902f06
Created September 15, 2017 14:14
Spark GC/memory settings
How much memory is permanently in memory vs how much is used for transformations
(ratio)
spark.storage.memoryFraction
Suggested settings... (need to debug the logs after these settings)
-XX:+UseG1GC -XX:+PrintFlagsFinal -XX:+PrintReferenceGC -verbose:gc
-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintAdaptiveSizePolicy
-XX:+UnlockDiagnosticVMOptions -XX:+G1SummarizeConcMark
-Xms88g -Xmx88g -XX:InitiatingHeapOccupancyPercent=35
-XX:ConcGCThread=15 -XX:+AlwaysPreTouch
@naotokui
naotokui / GAN-and-trainable.py
Last active October 14, 2021 19:46
How model.trainable = False works in keras (GAN model)
# coding: utf8
## based on this article: http://qiita.com/mokemokechicken/items/937a82cfdc31e9a6ca12
import numpy as np
from keras.models import Sequential
from keras.engine.topology import Input, Container
from keras.engine.training import Model
from keras.layers.core import Dense
@bartolsthoorn
bartolsthoorn / multilabel_example.py
Created April 29, 2017 12:13
Simple multi-laber classification example with Pytorch and MultiLabelSoftMarginLoss (https://en.wikipedia.org/wiki/Multi-label_classification)
import torch
import torch.nn as nn
import numpy as np
import torch.optim as optim
from torch.autograd import Variable
# (1, 0) => target labels 0+2
# (0, 1) => target labels 1
# (1, 1) => target labels 3
train = []
@vlandham
vlandham / part1.md
Last active October 9, 2025 17:01
Feature Branches and Pull Requests : Walkthrough

Here's a little walkthrough of how Yannick and I are using feature branches and pull requests to develop new features and adding them to the project. Below are the steps I take when working on a new feature. Hopefully this, along with watching the process on Github, will serve as a starting point to having everyone use a similar workflow.

Questions, comments, and suggestions for improvements welcome!

Start with the latest on master

When starting a new feature, I make sure to start with the latest and greatest codebase:

git checkout master