@vikeshsingh37
Created April 5, 2020 14:39

Revisions

  1. vikeshsingh37 created this gist Apr 5, 2020.
    8 changes: 8 additions & 0 deletions snorkel_classifier.py
    @@ -0,0 +1,8 @@
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression

    vectorizer = CountVectorizer(ngram_range=(1, 2))  # unigrams + bigrams; keep it to transform unseen text later
    X_train = vectorizer.fit_transform(df_train_augmented.text.tolist())

    clf = LogisticRegression(solver="lbfgs")
    clf.fit(X=X_train, y=df_train_augmented.label.values)
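
To score unseen text with this classifier, the test features must be built from the same vocabulary that was learned on the training text, which means reusing the fitted `CountVectorizer` and calling `transform` on it rather than fitting a new one. The following is a minimal sketch of that step, not part of the gist: the toy rows in `df_train_augmented` and the `df_test` frame are hypothetical stand-ins for data that is built elsewhere.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical stand-in for df_train_augmented (assumed columns: text, label).
df_train_augmented = pd.DataFrame({
    "text": [
        "check out my channel",
        "subscribe to my channel please",
        "great song, love it",
        "this song is so good",
    ],
    "label": [1, 1, 0, 0],
})
# Hypothetical unseen examples.
df_test = pd.DataFrame({"text": ["please subscribe to my channel", "love this song"]})

# Fit the vectorizer on training text and keep a reference to it.
vectorizer = CountVectorizer(ngram_range=(1, 2))
X_train = vectorizer.fit_transform(df_train_augmented.text.tolist())

clf = LogisticRegression(solver="lbfgs")
clf.fit(X=X_train, y=df_train_augmented.label.values)

# transform (not fit_transform) maps test text onto the training vocabulary,
# so X_test has the same number of columns as X_train.
X_test = vectorizer.transform(df_test.text.tolist())
predictions = clf.predict(X_test)
```

N-grams that appear only in the test text are dropped by `transform`, so the classifier only ever sees features it was trained on.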