@vallantin
Last active October 19, 2018 19:00
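from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.svm import SVC

# process_tweets and dummies_transformation are custom transformers defined elsewhere (not shown here)

# Create the pipeline for the tweets: preprocess the text, then build TF-IDF features from unigrams and bigrams
text = Pipeline([('process_tweets', process_tweets()),
                 ('vct', TfidfVectorizer(ngram_range=(1, 2)))])

# Create the pipeline for the other variables: turn the listed columns into dummy variables
dummies = Pipeline([('dummies_transformation',
                     dummies_transformation(columns=['weekday', 'calendar_day', 'hour', 'is_weekend', 'link']))])

# Merge both pipelines side by side using FeatureUnion
features = FeatureUnion([('text', text),
                         ('dummies', dummies)])

# Full pipeline: the combined features followed by a linear-kernel SVM classifier
pipeline = Pipeline([('features', features),
                     ('clf', SVC(random_state=0, kernel='linear'))])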
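
The snippet above relies on two custom transformers, process_tweets and dummies_transformation, whose definitions are not part of this gist. The sketch below shows one way such transformers could look and how the combined pipeline might be fitted. Everything in it is an assumption made for illustration: the class names ProcessTweets and DummiesTransformation, the "tweet" column name, the cleaning steps, and the toy data are hypothetical and are not the author's code.

```python
import re

import pandas as pd
from scipy import sparse
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import FeatureUnion, Pipeline
from sklearn.svm import SVC


class ProcessTweets(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in for process_tweets: selects and cleans the text column."""

    def __init__(self, column="tweet"):  # column name is an assumption
        self.column = column

    def fit(self, X, y=None):
        return self  # stateless, nothing to learn

    def transform(self, X):
        # Lowercase and strip URLs; TfidfVectorizer expects an iterable of strings.
        return [re.sub(r"https?://\S+", "", t.lower()).strip() for t in X[self.column]]


class DummiesTransformation(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in for dummies_transformation: one-hot encodes columns."""

    def __init__(self, columns=None):
        self.columns = columns

    def _encode(self, X):
        # Cast to str so numeric codes (e.g. hour) are treated as categories.
        return pd.get_dummies(X[self.columns].astype(str), columns=self.columns)

    def fit(self, X, y=None):
        # Remember the dummy columns seen during training.
        self.dummy_columns_ = self._encode(X).columns
        return self

    def transform(self, X):
        # Align to the training columns so unseen data yields the same width.
        aligned = self._encode(X).reindex(columns=self.dummy_columns_, fill_value=0)
        return sparse.csr_matrix(aligned.to_numpy(dtype=float))


# Toy data, invented for this example only.
data = pd.DataFrame({
    "tweet": ["Great launch today http://t.co/a", "Terrible service again",
              "Great news for the weekend", "Terrible delays this morning"],
    "weekday": [0, 1, 5, 2],
    "hour": [9, 18, 12, 8],
    "link": [1, 0, 0, 0],
})
labels = [1, 0, 1, 0]

text = Pipeline([("process_tweets", ProcessTweets()),
                 ("vct", TfidfVectorizer(ngram_range=(1, 2)))])
dummies = Pipeline([("dummies_transformation",
                     DummiesTransformation(columns=["weekday", "hour", "link"]))])
features = FeatureUnion([("text", text), ("dummies", dummies)])
model = Pipeline([("features", features),
                  ("clf", SVC(random_state=0, kernel="linear"))])

model.fit(data, labels)
predictions = model.predict(data)
feature_matrix = model.named_steps["features"].transform(data)
print(feature_matrix.shape, list(predictions))
```

Storing the dummy column names in fit and reindexing in transform keeps the feature width stable when new data contains categories that were absent from the training set, which FeatureUnion needs in order to stack the two blocks consistently.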