@fgregg
Created October 16, 2014 20:55
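import dedupe

# Build 10,000 identical records, keyed by integer id.
records = dict((i, {'name': 'Margret', 'age': '32'})
               for i in range(10**4))

# Compare records on the 'name' field only.
deduper = dedupe.Dedupe([{'field': 'name', 'type': 'String'}], ())

# Request a sample of 100,000 record pairs from the identical records.
deduper.sample(records, 100000)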