@emrepun
Last active December 5, 2019 22:59
engine_gist_1
import numpy as np
import pandas as pd
from nltk.corpus import stopwords

# Requires the NLTK stop word corpus: run nltk.download('stopwords') once.
# Build the set once; calling stopwords.words() for every word is very slow.
STOP_WORDS = set(stopwords.words('english'))

df = pd.read_csv('city_data.csv')


def clear(city):
    """Lowercase a description and drop English stop words."""
    words = city.lower().split()
    return " ".join(word for word in words if word not in STOP_WORDS)


# Clean every description in place.
df['description'] = df['description'].apply(clear)

# to_csv returns None when given a path, so there is nothing to assign.
df.to_csv('city_data_cleared.csv')
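A minimal sketch of what the cleaning step does to a single description, runnable without the NLTK corpus or the CSV file. `SAMPLE_STOP_WORDS` and `clear_sample` are hypothetical names introduced for illustration only; the small set is a stand-in for `stopwords.words('english')`, which is much longer, and the sample sentence is made up.

```python
# Hypothetical stand-in for stopwords.words('english'); the real NLTK list is longer.
SAMPLE_STOP_WORDS = {"the", "is", "a", "of", "and", "in", "on", "for"}


def clear_sample(text, stop_words=SAMPLE_STOP_WORDS):
    """Lowercase text, split on whitespace, drop stop words, rejoin with spaces."""
    words = text.lower().split()
    return " ".join(word for word in words if word not in stop_words)


cleaned = clear_sample("Paris is the capital of France and a city of museums")
print(cleaned)  # paris capital france city museums
```

Splitting on whitespace leaves punctuation attached to words, so a token such as `the,` would not match a stop word; stripping punctuation first is one option if that matters for the data.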