@kurtbrose
Created June 24, 2013 00:04
Implementation of reservoir sampling
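import array
import random


class Sample(object):
    '''
    This class implements reservoir sampling to keep a uniform random sample
    of a stream of unknown (possibly unbounded) length.
    See http://gregable.com/2007/10/reservoir-sampling.html for one description.
    '''
    def __init__(self, sample_size=2**14, type='f'):
        # `type` is an array typecode; 'f' stores C floats.
        self.sample = array.array(type)
        self.sample_size = sample_size
        self.num_vals = 0  # total number of values seen so far

    def add_val(self, val):
        if self.num_vals < self.sample_size:
            # Fill the reservoir with the first sample_size values.
            self.sample.append(val)
        else:
            # randint is inclusive on both ends, so val is kept with
            # probability sample_size / (num_vals + 1).
            pos = random.randint(0, self.num_vals)
            if pos < self.sample_size:
                self.sample[pos] = val
        self.num_vals += 1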
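
A quick way to sanity-check the replacement rule used in add_val is to run it many times over a short stream and count how often each item ends up in the sample. The sketch below is not part of the gist: reservoir_sample and inclusion_counts are hypothetical helper names, and the function re-implements the same rule over a plain list so that it runs on its own.

```python
import random
from collections import Counter


def reservoir_sample(stream, k, rng=random):
    """Return up to k items chosen uniformly at random from an iterable.

    Hypothetical stand-alone helper; it mirrors the add_val logic but
    stores the sample in a plain list instead of an array.array.
    """
    sample = []
    for n, item in enumerate(stream):
        if n < k:
            # Fill the reservoir with the first k items.
            sample.append(item)
        else:
            # n items have been seen before this one; randint is inclusive
            # on both ends, so the item is kept with probability k / (n + 1).
            pos = rng.randint(0, n)
            if pos < k:
                sample[pos] = item
    return sample


def inclusion_counts(n_items=10, k=3, trials=20000, seed=0):
    """Count how often each item lands in the sample across many trials."""
    rng = random.Random(seed)  # seeded so the check is repeatable
    counts = Counter()
    for _ in range(trials):
        counts.update(reservoir_sample(range(n_items), k, rng))
    return counts


if __name__ == "__main__":
    for item, count in sorted(inclusion_counts().items()):
        print(item, count)
```

If the sampling is uniform, each of the 10 items should appear in roughly k / n_items = 30% of the trials, so every count should sit near 6000 with the default arguments.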