@audhiaprilliant
Last active April 23, 2022 12:32
How to Automatically Build Stopwords
# How to get a list of top words
import collections

def getTopWords(
    text: str
):
    # Split the text on whitespace
    list_words = text.split()
    # Count the word frequencies
    word_freq = collections.Counter(list_words)
    # Get all words, sorted from highest to lowest frequency
    top_words = word_freq.most_common()
    return top_words

# Get the list of top words (text_clean must already be defined)
top_words = getTopWords(
    text = text_clean
)
# Show the top words
top_words
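
The frequency list above can be turned into stopword candidates by keeping only the most frequent entries. The following is a minimal sketch of that idea, not the author's method: the helper `top_n_words`, the `sample_text` string, and the cutoff `n=2` are all hypothetical and chosen only for illustration. It relies on `Counter.most_common(n)`, which returns the `n` most frequent items.

```python
import collections
from typing import List, Optional, Tuple


def top_n_words(text: str, n: Optional[int] = None) -> List[Tuple[str, int]]:
    """Return (word, count) pairs, most frequent first; all words if n is None."""
    return collections.Counter(text.split()).most_common(n)


# Hypothetical sample text standing in for text_clean
sample_text = "the cat and the dog and the bird"

# Treat the 2 most frequent words as stopword candidates (n=2 is an arbitrary cutoff)
candidates = {word for word, _ in top_n_words(sample_text, n=2)}

# Drop the candidates from the text, keeping the original word order
filtered = [word for word in sample_text.split() if word not in candidates]

print(sorted(candidates))  # ['and', 'the']
print(filtered)            # ['cat', 'dog', 'bird']
```

In practice the cutoff would likely need tuning per corpus, and the candidates are worth reviewing by hand, since frequent words are not always uninformative.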