@trungquy
Last active March 5, 2019 02:08
Categorical Encoding

Categorical Encoding Techniques

Many machine learning algorithms accept only numeric values, so categorical values have to be encoded as numbers.

  • One hot vector encoding
  • Hashing
  • Learned embedding, if the vocabulary size is large
  • Categorical data - a fixed list of values, e.g. gender, country/market/language, age group
  • Ordinal data - order is important. Example: ranking, datetime
  • Numeric data
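
The first two techniques listed above, one-hot encoding and hashing, can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the helper names `one_hot` and `hash_bucket`, the sample `countries` vocabulary, and the choice of MD5 as the hash function are all assumptions made for the example.

```python
import hashlib


def one_hot(value, vocabulary):
    """Return a list with a single 1 at the position of `value` in `vocabulary`."""
    vector = [0] * len(vocabulary)
    vector[vocabulary.index(value)] = 1  # raises ValueError for unseen values
    return vector


def hash_bucket(value, num_buckets):
    """Map a string to one of `num_buckets` integer buckets (the hashing trick).

    MD5 is used only because it is deterministic across runs; Python's
    built-in hash() is salted per process for strings.
    """
    digest = hashlib.md5(value.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_buckets


# Hypothetical vocabulary for a "country" feature.
countries = ["DE", "US", "VN"]

print(one_hot("US", countries))  # [0, 1, 0]
print(hash_bucket("US", 8))      # an integer in the range 0..7
```

One-hot vectors grow with the vocabulary, while hashing keeps the output size fixed at the cost of possible collisions between different values.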