@hughdbrown
Last active November 6, 2015 17:01
How I downsample
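import numpy as np


def downsample(data, labels):
    """
    Balance a binary-labelled dataset by randomly downsampling the larger class.

    >>> data = np.arange(100)
    >>> label = np.array([1] * 95 + [0] * 5)
    >>> d, l = downsample(data, label)
    >>> len(d), int(l.sum())
    (10, 5)
    """
    data, labels = np.asarray(data), np.asarray(labels)
    # Positions of each class in the label array.
    zero_index = np.flatnonzero(labels == 0)
    one_index = np.flatnonzero(labels == 1)
    smaller, larger = sorted([zero_index, one_index], key=len)
    # Keep every minority sample plus an equally sized random draw (without
    # replacement) from the majority class; sort to preserve original order.
    sampled = np.random.choice(larger, size=len(smaller), replace=False)
    selected = np.sort(np.concatenate((smaller, sampled)))
    return data[selected], labels[selected]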
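A reproducible variant, as a rough sketch only: the gist draws from NumPy's global random state, so repeated runs pick different majority-class rows. The function name `downsample_seeded` and its `seed` parameter are assumptions for illustration and are not part of the original gist; the sampling logic is otherwise the same.

```python
import numpy as np


def downsample_seeded(data, labels, seed=None):
    """Hypothetical variant of downsample() that takes an explicit seed."""
    data, labels = np.asarray(data), np.asarray(labels)
    # Local generator: leaves NumPy's global random state untouched.
    rng = np.random.default_rng(seed)
    zero_index = np.flatnonzero(labels == 0)
    one_index = np.flatnonzero(labels == 1)
    smaller, larger = sorted([zero_index, one_index], key=len)
    # Draw as many majority-class positions as there are minority samples.
    sampled = rng.choice(larger, size=len(smaller), replace=False)
    selected = np.sort(np.concatenate((smaller, sampled)))
    return data[selected], labels[selected]


# Usage: 100 rows of 3 features, 95 positives and 5 negatives.
X = np.arange(300).reshape(100, 3)
y = np.array([1] * 95 + [0] * 5)
X_bal, y_bal = downsample_seeded(X, y, seed=0)
print(X_bal.shape, np.bincount(y_bal))
```

With this input the balanced set has 10 rows, five per class, and calling it again with the same seed returns the same rows.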