@a-agmon
Last active April 15, 2021 08:08
import numpy as np

# First cluster the item vectors; cluster_gmm returns the fitted model, the item labels, and the cluster centers
items_model, items_labels, items_cluster_centers = cluster_gmm(exp_model.wv.vectors, k=8)
# user_means maps each user key to the mean of the item vectors that user has listened to.
# Use the item model to build a new vector for each user, based on the
# probability of that user belonging to each item cluster.
# Start from an all-zeros array: one row per user, one column per item cluster.
ar_users_clusters = np.zeros((len(user_means), items_cluster_centers.shape[0]))
# for each user mean, we need to estimate its probability of being part of each item cluster or distribution
# recall that user mean is a mean of item vectors so they share the same space
for ix, user_key in enumerate(user_means):
    user_mean = user_means[user_key]
    # predict_proba returns one row per input vector; each element of that row is
    # the probability that the user vector belongs to the corresponding cluster
    ar_users_clusters[ix] = items_model.predict_proba([user_mean])[0]
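
For intuition about what the loop above fills in, the sketch below computes the same kind of per-cluster membership probabilities by hand with NumPy. It assumes `items_model` behaves like a Gaussian mixture with diagonal covariances, which the gist does not confirm; `gmm_responsibilities` and all toy values are hypothetical stand-ins, not part of the original code. Stacking the user means into one matrix also lets every user be scored in a single call instead of one call per user.

```python
import numpy as np


def gmm_responsibilities(X, weights, means, variances):
    """Posterior probability of each mixture component for each row of X.

    Hypothetical helper, assuming diagonal covariances:
    variances[k, d] is the variance of component k along dimension d.
    """
    X = np.atleast_2d(X)                                    # (n, d)
    diff = X[:, None, :] - means[None, :, :]                # (n, k, d)
    # log N(x | mean_k, diag(variances_k)) for every (row, component) pair
    log_pdf = -0.5 * (
        np.sum(diff ** 2 / variances[None, :, :], axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)[None, :]
    )                                                       # (n, k)
    log_joint = np.log(weights)[None, :] + log_pdf
    # Normalise in log space for numerical stability
    log_joint -= log_joint.max(axis=1, keepdims=True)
    joint = np.exp(log_joint)
    return joint / joint.sum(axis=1, keepdims=True)


# Toy stand-ins (made-up values, not taken from the gist)
weights = np.array([0.5, 0.5])                  # mixture weights
means = np.array([[0.0, 0.0], [4.0, 4.0]])      # two item clusters
variances = np.ones((2, 2))                     # unit variance everywhere
user_means = {
    "user_a": np.array([0.1, -0.2]),
    "user_b": np.array([3.9, 4.2]),
    "user_c": np.array([2.0, 2.0]),
}

# Stack all user means so every user is scored in one call
user_keys = list(user_means)
user_matrix = np.vstack([user_means[key] for key in user_keys])
ar_users_clusters = gmm_responsibilities(user_matrix, weights, means, variances)
```

Each row of `ar_users_clusters` sums to 1, and a user whose mean lies midway between two equally weighted clusters (`user_c` here) gets an even split between them.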