@andrea-dagostino
Last active November 23, 2021 17:20
posts/raggruppamento-testuale-con-tf-idf
import numpy as np
import pandas as pd

def get_top_keywords(n_terms):
    """Print the top keywords for each KMeans cluster centroid."""
    # Assumes X (sparse TF-IDF matrix), clusters (KMeans labels) and vectorizer (fitted TF-IDF vectorizer) are defined earlier
    df = pd.DataFrame(X.todense()).groupby(clusters).mean()  # average the TF-IDF vectors within each cluster
    terms = vectorizer.get_feature_names_out()  # access the TF-IDF terms
    for i, r in df.iterrows():
        print('\nCluster {}'.format(i))
        # for each row of the dataframe, find the n terms with the highest score
        print(','.join([terms[t] for t in np.argsort(r.to_numpy())[-n_terms:]]))

get_top_keywords(10)
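The function above reads `X`, `clusters` and `vectorizer` from the surrounding scope, so it cannot be run on its own. As a minimal sketch of the same idea (average the TF-IDF rows per cluster, then pick the highest-scoring terms), the variant below takes its inputs as arguments and returns the keywords instead of printing them. The name `top_keywords_per_cluster` and the toy matrix, terms and labels are hypothetical, made up only for illustration; unlike the original, it lists the highest-scoring term first.

```python
import numpy as np
import pandas as pd

def top_keywords_per_cluster(tfidf, labels, terms, n_terms):
    """Return {cluster label: top n_terms terms}, highest mean score first."""
    # Average the TF-IDF rows that belong to the same cluster
    means = pd.DataFrame(tfidf).groupby(np.asarray(labels)).mean()
    result = {}
    for label, row in means.iterrows():
        # argsort is ascending, so reverse it and keep the first n_terms indices
        top = np.argsort(row.to_numpy())[::-1][:n_terms]
        result[label] = [str(terms[t]) for t in top]
    return result

# Toy data (hypothetical): 4 documents, 4 terms, 2 clusters
toy_terms = np.array(["cat", "dog", "stock", "market"])
toy_tfidf = np.array([
    [0.9, 0.4, 0.0, 0.0],
    [0.7, 0.6, 0.1, 0.0],
    [0.0, 0.0, 0.8, 0.5],
    [0.0, 0.1, 0.6, 0.7],
])
toy_labels = [0, 0, 1, 1]

keywords = top_keywords_per_cluster(toy_tfidf, toy_labels, toy_terms, n_terms=2)
for label, words in keywords.items():
    print("Cluster {}: {}".format(label, ",".join(words)))
```

With a real pipeline, the dense array from `X.toarray()`, the cluster labels and `vectorizer.get_feature_names_out()` could be passed in place of the toy inputs.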