@twolodzko
Created September 7, 2018 12:19
One hot encoder using xxHash
import xxhash
from keras.preprocessing.text import hashing_trick
# By default, one_hot and hashing_trick in Keras both use Python's built-in hash().
# For strings it is salted per process, so the indices change between runs:
# https://stackoverflow.com/q/27522626/3986320
# md5 is a stable alternative, but it is not the fastest hash function;
# the xxhash package offers a faster one.
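def xxh(w):
    # 32-bit xxHash of the UTF-8 encoded word, as a non-negative int
    return int(xxhash.xxh32(w.encode()).hexdigest(), 16)

def one_hot(x, n, **kwargs):
    # hashing_trick with a hash function that is stable across runs
    return hashing_trick(x, n, hash_function=xxh, **kwargs)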
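
To see why a stable hash matters, here is a minimal sketch of the index computation that the hashing trick performs, written without Keras or xxhash so it runs on the standard library alone. The names `stable_hash` and `hashing_indices` are hypothetical, md5 stands in for xxHash (the comments above mention it as the slower but stable alternative), and the tokenization is a simplification: Keras also strips punctuation by default.

```python
import hashlib

def stable_hash(word):
    # Stand-in for xxh: md5 digest read as an integer (stable across runs)
    return int(hashlib.md5(word.encode()).hexdigest(), 16)

def hashing_indices(text, n, hash_function=stable_hash):
    # Simplified tokenization: lowercase and split on whitespace
    words = text.lower().split()
    # Index 0 is reserved, so every word maps into the range [1, n - 1]
    return [hash_function(w) % (n - 1) + 1 for w in words]

indices = hashing_indices("the cat sat on the mat", 50)
print(indices)
```

Because the hash is deterministic, the same word always maps to the same index, both within a run and across separate Python processes. Distinct words can still collide, which becomes more likely as `n` gets smaller.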