@WillKoehrsen
Last active September 26, 2018 13:25
import featuretools as ft
import pandas as pd


def partition_to_feature_matrix(partition_num, feature_defs):
    """Calculate a feature matrix for one partition and save it to S3."""
    # Read in data from the partition directory (S3 paths in pandas need s3fs)
    members = pd.read_csv(f's3://{partition_num}/members.csv')
    # ... Read in other dataframes

    # Create an entityset (featuretools < 1.0 API)
    es = ft.EntitySet()
    es.entity_from_dataframe(entity_id='customers', dataframe=members)
    # ... Add in other dataframes and relationships

    # Calculate the features previously defined by deep feature synthesis
    feature_matrix = ft.calculate_feature_matrix(entityset=es,
                                                 features=feature_defs)

    # Save to S3
    feature_matrix.to_csv(f's3://{partition_num}/feature_matrix.csv')