@gnurio
Created February 22, 2018 22:49
Re-coding a categorical field into one-hot vectors
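import pandas as pd
from sklearn.preprocessing import LabelEncoder, OneHotEncoder


def decode_encode(colname):
    '''
    (str) -> (DataFrame)
    Returns a DataFrame in which the given column is replaced by a set of one-hot encoded columns.
    Expects a module-level DataFrame named `data` to exist before the call.
    '''
    # Map each category to an integer label; classes_ keeps the sorted category names
    label_encode = LabelEncoder()
    X = data[colname]
    X_new = label_encode.fit_transform(X)
    # OneHotEncoder expects a 2-D array of shape (n_samples, 1)
    X_new = X_new.reshape(-1, 1)
    encode = OneHotEncoder()
    # Reuse data's index so concat aligns rows even when the index is not 0..n-1
    X_new = pd.DataFrame(encode.fit_transform(X_new).toarray(),
                         columns=label_encode.classes_, index=data.index)
    data_dropped = data.drop(colname, axis=1)
    mergelist = [X_new, data_dropped]
    data_onehot = pd.concat(mergelist, axis=1)
    return data_onehot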
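
For comparison, the same re-coding can probably be sketched with pandas alone. This is not the gist author's method: it swaps the LabelEncoder/OneHotEncoder pair for `pd.get_dummies`, and the helper name `one_hot_column`, its explicit `df` parameter, and the toy data are all assumptions made for illustration.

```python
import pandas as pd


def one_hot_column(df, colname):
    """Return a copy of df with `colname` replaced by one-hot indicator columns.

    Hypothetical helper, not part of the original gist.
    """
    # get_dummies builds one indicator column per distinct value, named after that value
    dummies = pd.get_dummies(df[colname], dtype=float)
    # Both frames carry df's index, so concat lines the rows up correctly
    return pd.concat([dummies, df.drop(colname, axis=1)], axis=1)


# Toy data, made up for illustration; the index is deliberately not 0..n-1
example = pd.DataFrame(
    {"color": ["red", "green", "red"], "price": [1.0, 2.5, 3.0]},
    index=[10, 20, 30],
)
encoded = one_hot_column(example, "color")
print(encoded)
```

Passing the DataFrame in as an argument avoids relying on a global `data`, and because `get_dummies` keeps the original index, the row alignment in `pd.concat` needs no extra handling.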