@clungzta
Last active July 3, 2018 13:36
Simple utility function for batching data into a Siamese Neural Network
import numpy as np


def sample_pairs_siamese(train_X, train_y, batch_size):
    """Return (batch_xs, batch_ys, new_batch_xs, new_batch_ys), where each
    new_batch sample is the randomly drawn partner of the batch sample at the
    same position. train_X and train_y must be NumPy arrays of equal length."""
    # Draw a random batch of batch_size samples (with replacement).
    rand_index = np.random.choice(len(train_X), size=batch_size)
    batch_xs, batch_ys = train_X[rand_index], train_y[rand_index]

    new_batch_xs, new_batch_ys = [], []
    for label in batch_ys:
        # Build a distribution over the training samples: half of the
        # probability mass is spread uniformly over all samples, and the other
        # half is shared equally by the samples carrying the current label.
        sample_distribution = (np.ones(len(train_y)) / len(train_y)) * 0.5
        idxs = np.where(train_y == label)[0]
        sample_distribution[idxs] += np.sum(sample_distribution) / len(idxs)

        # Draw one partner index from that distribution.
        sampled_idx = np.random.choice(len(train_y), p=sample_distribution)
        new_batch_xs.append(train_X[sampled_idx])
        new_batch_ys.append(train_y[sampled_idx])

    return batch_xs, batch_ys, np.asarray(new_batch_xs), np.asarray(new_batch_ys)
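
The per-label distribution built inside the loop can be checked in isolation. The sketch below is not part of the gist: `pair_sampling_distribution` and `pair_targets` are hypothetical helper names introduced here for illustration. The first reproduces the same weighting for a single label and assumes that label occurs at least once in `train_y`. The second assumes the network is trained on a binary same/different target, which the gist itself does not specify.

```python
import numpy as np


def pair_sampling_distribution(train_y, label):
    """Probability of drawing each training index as the partner for `label`.

    Hypothetical helper mirroring the weighting used in sample_pairs_siamese.
    Assumes `label` occurs at least once in train_y.
    """
    train_y = np.asarray(train_y)
    n = len(train_y)
    dist = np.full(n, 0.5 / n)        # half the mass, uniform over all samples
    idxs = np.where(train_y == label)[0]
    dist[idxs] += 0.5 / len(idxs)     # other half, shared by same-label samples
    return dist


def pair_targets(batch_ys, new_batch_ys):
    """1.0 where both halves of a pair share a label, else 0.0 (assumed target)."""
    return (np.asarray(batch_ys) == np.asarray(new_batch_ys)).astype(np.float32)


# Toy labels: class 0 has 2 of the 10 samples.
train_y = np.array([0, 0, 1, 1, 1, 2, 2, 2, 2, 2])
dist = pair_sampling_distribution(train_y, label=0)
same_mass = dist[train_y == 0].sum()  # 0.5 + 0.5 * 2 / 10 = 0.6
targets = pair_targets([0, 1, 2], [0, 2, 2])
```

Because the uniform half also covers samples of the current label, a same-label partner is drawn with probability 0.5 + 0.5 * k / N, where k is the number of samples with that label and N is the size of the training set. Batches therefore contain slightly more than 50% matching pairs.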