@hamletbatista
Created April 9, 2019 19:04
from sklearn.model_selection import train_test_split

# Join form counts with the one-hot image features on the shared "url" column
ml_data = form_counts.merge(onehot_img, on="url")
# Label each URL by page type; anything unmatched stays "N/A"
ml_data["group"] = "N/A"
ml_data.loc[ml_data["url"].str.contains(r"/products?/"), "group"] = "Products"
ml_data.loc[ml_data["url"].str.contains(r"/collections(?!.*/product)"), "group"] = "Category"
# Split the dataset into training and testing sets (80/20, fixed seed for reproducibility)
X_train, X_test, y_train, y_test = train_test_split(
    ml_data.drop(["group", "url"], axis=1), ml_data["group"], test_size=0.2, random_state=42)
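
The labeling rules can be sanity-checked without the full dataset. The sketch below uses the standard `re` module, whose `search` semantics are what `str.contains` applies by default. The helper `label_url` and the sample URLs are hypothetical and not part of the gist; the patterns have the same matching behavior as the labeling step.

```python
import re

# Patterns with the same matching behavior as the labeling step.
PRODUCT_RE = re.compile(r".*/products/.*|.*/product/.*")
CATEGORY_RE = re.compile(r"/collections(?!.*/products.*)(?!.*/product.*)")


def label_url(url):
    """Hypothetical helper mirroring the two .loc assignments.

    The Category rule is applied last, so it wins if both patterns match.
    """
    group = "N/A"
    if PRODUCT_RE.search(url):
        group = "Products"
    if CATEGORY_RE.search(url):
        group = "Category"
    return group


# Hypothetical sample URLs, one per case of interest.
sample_urls = [
    "https://example.com/products/blue-shirt",
    "https://example.com/collections/shirts",
    "https://example.com/collections/shirts/products/blue-shirt",
    "https://example.com/pages/about",
]

for url in sample_urls:
    print(f"{label_url(url):10s} {url}")
```

A product page nested under a collection path keeps the "Products" label because the negative lookahead rejects any `/collections` match that is followed by `/product`. Rows that match neither rule keep "N/A" and, as the snippet is written, still go into the train/test split as a third class.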