@danieljfarrell
Created December 3, 2015 07:36
Convert a feature-list Pandas data frame to a feature-matrix.
import pandas as pd
from sklearn.preprocessing import MultiLabelBinarizer
# Feature-list data frame
df = pd.DataFrame(columns = ["features"], index=['Item 1', 'Item 2'])
df['features'] = [["A", "B"], ["C", "D"]]
# Use scikit-learn to create the feature matrix and the feature names
mlb = MultiLabelBinarizer()
feature_column_name = 'features'
feature_matrix = mlb.fit_transform(df[feature_column_name])
feature_names = mlb.classes_
# Create the feature-matrix data frame (one 0/1 column per feature, same index as df)
feature_df = pd.DataFrame(data=feature_matrix, columns=feature_names, index=df.index)
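The snippet above can be wrapped into a small reusable function. The sketch below is only an illustration: the helper name `feature_list_to_matrix` and the three-item sample data are assumptions, not part of the original gist. It shows what happens when items share a feature and when an item has no features at all.

```python
import pandas as pd
from sklearn.preprocessing import MultiLabelBinarizer


def feature_list_to_matrix(df, column):
    """Hypothetical helper: expand a column of label lists into 0/1 indicator columns."""
    mlb = MultiLabelBinarizer()
    matrix = mlb.fit_transform(df[column])
    # mlb.classes_ holds the sorted unique labels; reuse the original index.
    return pd.DataFrame(matrix, columns=mlb.classes_, index=df.index)


# Sample data (assumed for illustration): "B" is shared, "Item 3" has no features.
items = pd.DataFrame(
    {"features": [["A", "B"], ["B", "C"], []]},
    index=["Item 1", "Item 2", "Item 3"],
)

feature_df = feature_list_to_matrix(items, "features")
print(feature_df)
# Item 1 -> A=1, B=1, C=0
# Item 2 -> A=0, B=1, C=1
# Item 3 -> A=0, B=0, C=0
```

Each distinct feature becomes one column, so a shared feature such as "B" is not duplicated, and an item with an empty list becomes an all-zero row.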