@cobanov
Created March 8, 2022 14:13
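# Drop rows whose "words" value is a single punctuation mark or digit.
# Arguments: the input CSV path, then the output CSV path.
import string
import sys

import pandas as pd

INPUT_FILE = sys.argv[1]
OUTPUT_FILE = sys.argv[2]

# Every ASCII punctuation mark and digit, as a list of single characters.
bad_list = list(string.punctuation + string.digits)

# Read "words" as text so digit-only cells compare equal to "0".."9".
df = pd.read_csv(INPUT_FILE, dtype={"words": str})
# isin() tests whole-cell equality, so only single-character cells are dropped.
df = df[~df["words"].isin(bad_list)]
df.to_csv(OUTPUT_FILE, index=False)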
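
Because `isin` compares each cell against single characters, cells such as `!!` or `42` survive the filter above. If the goal is also to drop cells made up entirely of punctuation and digits, a minimal sketch might look like the following. The helper name `drop_symbol_rows` and the sample data are hypothetical and not part of the original script; whether multi-character cells should be removed at all is an assumption about the intent.

```python
import string

import pandas as pd

# Same character set as the script: ASCII punctuation plus digits.
BAD_CHARS = set(string.punctuation + string.digits)


def drop_symbol_rows(df: pd.DataFrame, column: str = "words") -> pd.DataFrame:
    """Hypothetical helper: drop rows whose `column` holds only punctuation/digits."""

    def only_bad(value) -> bool:
        text = str(value)
        # Empty strings are kept; a cell is dropped only if every character is "bad".
        return len(text) > 0 and all(ch in BAD_CHARS for ch in text)

    return df[~df[column].map(only_bad)]


# Made-up sample data for illustration only.
sample = pd.DataFrame({"words": ["hello", "!", "7", "!!", "42", "a1"]})
cleaned = drop_symbol_rows(sample)
print(cleaned["words"].tolist())
```

Cells that mix letters with symbols, such as `a1`, are kept in this sketch, since only fully symbolic cells are treated as noise.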