@walsvid
Created January 2, 2017 14:52
Reservoir sampling
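#!/usr/bin/python
import sys
import random

# Read from the named file if one is given, otherwise from stdin.
if len(sys.argv) == 3:
    source = open(sys.argv[2], 'r')
elif len(sys.argv) == 2:
    source = sys.stdin
else:
    sys.exit("Usage: python samplen.py <lines> <?file>")

N = int(sys.argv[1])
sample = []
for i, line in enumerate(source):
    if i < N:
        # Fill the reservoir with the first N lines.
        sample.append(line)
    elif random.random() < N / float(i + 1):
        # Keep line i with probability N/(i+1), evicting a random entry.
        replace = random.randint(0, len(sample) - 1)
        sample[replace] = line

# Emit the sampled lines; each already ends with its own newline.
for line in sample:
    sys.stdout.write(line)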
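
The same technique can be wrapped in a function so it is usable outside the command line. This is only a minimal sketch: the name `reservoir_sample` and the optional `rng` parameter are assumptions for illustration and do not appear in the original script. It uses the equivalent single-draw form of the replacement step, where an index `j` is drawn uniformly from `0..i` and the item is stored only when `j < n`.

```python
import random


def reservoir_sample(items, n, rng=None):
    """Return up to n items chosen uniformly at random from an iterable.

    Hypothetical helper, not part of the original script.
    """
    rng = rng or random.Random()
    sample = []
    for i, item in enumerate(items):
        if i < n:
            # Fill the reservoir with the first n items.
            sample.append(item)
        else:
            # j is uniform on 0..i, so j < n happens with probability n/(i+1).
            j = rng.randint(0, i)
            if j < n:
                sample[j] = item
    return sample


if __name__ == "__main__":
    # Pick 3 of the numbers 0..99; the output varies from run to run.
    print(reservoir_sample(range(100), 3))
```

Passing a seeded `random.Random` instance as `rng` should make the selection reproducible, which can help when testing.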