Ameya Vijay Vilankar ameyavilankar

ameyavilankar / preprocess.py
Last active January 25, 2023 10:19
Removing Punctuation and Stop Words with NLTK
import string
import nltk
from nltk.tokenize import RegexpTokenizer
from nltk.corpus import stopwords
import re
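def preprocess(sentence):
    # Lowercase the input before tokenizing
    sentence = sentence.lower()
    # \w+ matches runs of letters, digits, and underscores, so punctuation is dropped
    tokenizer = RegexpTokenizer(r'\w+')
    tokens = tokenizer.tokenize(sentence)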