@dirkgr
Created September 23, 2020 01:36
A wrapper for a Python generator that ensures all returned items are unique
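import io
import typing

import dill
import mmh3


def hash_object(o: typing.Any) -> bytes:
    """Serialize an object with dill and return the 128-bit MurmurHash3 of the result."""
    with io.BytesIO() as buffer:
        dill.dump(o, buffer)
        # hash_bytes returns the raw digest as bytes, not str.
        return mmh3.hash_bytes(buffer.getvalue(), x64arch=True)


def unique(items: typing.Iterable) -> typing.Iterable:
    """Yield each item the first time it is seen, skipping later duplicates."""
    # Only the hashes are stored, so the items themselves need not be hashable.
    seen_hashes = set()
    for item in items:
        item_hash = hash_object(item)  # avoids shadowing the built-in hash()
        if item_hash not in seen_hashes:
            seen_hashes.add(item_hash)
            yield item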
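The same pattern can be tried without third-party packages. The sketch below is not the gist's code: it swaps dill for `pickle` and mmh3 for `hashlib.blake2b`, and the names `content_digest` and `unique_stdlib` are hypothetical, chosen only for this illustration. It assumes every item can be pickled and that equal items serialize to identical bytes, which may not hold for things like dicts built in a different insertion order.

```python
import hashlib
import pickle
import typing


def content_digest(o: typing.Any) -> bytes:
    # Hypothetical helper: pickle + BLAKE2b stand in for dill + mmh3 here.
    return hashlib.blake2b(pickle.dumps(o), digest_size=16).digest()


def unique_stdlib(items: typing.Iterable) -> typing.Iterator:
    # Remember only the 16-byte digests, not the items themselves.
    seen = set()
    for item in items:
        digest = content_digest(item)
        if digest not in seen:
            seen.add(digest)
            yield item


# Lists are unhashable, so a plain set() could not deduplicate these rows.
rows = ([1, 2], [3], [1, 2], [3], [4])
print(list(unique_stdlib(iter(rows))))
```

Because the wrapper is itself a generator, it consumes the input lazily and preserves first-seen order. Memory grows with the number of distinct items, at one digest per item.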