@erap129
Last active May 5, 2022 18:56
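from sklearn.feature_selection import VarianceThreshold
from sklearn.metrics.pairwise import cosine_similarity

# Keep only the TF-IDF columns whose variance exceeds the threshold.
constant_filter = VarianceThreshold(threshold=0.0002)
constant_filter.fit(tfidf_df)
feature_list = tfidf_df.columns[constant_filter.get_support(indices=True)]
print('Number of selected features:', len(feature_list), '\n')
print('List of selected features:\n', list(feature_list))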
# Build item matrices from the filtered vocabulary, then item-item cosine similarities.
item_matrix_filtered_words_trainset_loocv = get_item_matrix_with_inner_ids(
    tfidf_df[feature_list].values, movies_df, train_loocv)
cosine_sim_filtered_words_trainset_loocv = cosine_similarity(
    item_matrix_filtered_words_trainset_loocv, item_matrix_filtered_words_trainset_loocv)
item_matrix_filtered_words_trainset = get_item_matrix_with_inner_ids(
    tfidf_df[feature_list].values, movies_df, trainset)
# Use the filtered matrix here (not the all-words one), mirroring the LOOCV lines above.
cosine_sim_filtered_words_trainset = cosine_similarity(
    item_matrix_filtered_words_trainset, item_matrix_filtered_words_trainset)
get_algorithm_report(
    CustomSimKNNAlgorithm, trainset, testset, train_loocv, test_loocv, movies_df,
    target_movie_id='movie_1', target_user_id='user_1', top_k=10,
    algo_kwargs_trainset=dict(similarities=cosine_sim_filtered_words_trainset,
                              sim_options={'user_based': False}),
    algo_kwargs_trainset_loocv=dict(similarities=cosine_sim_filtered_words_trainset_loocv,
                                    sim_options={'user_based': False}))
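
The snippet above first drops low-variance TF-IDF columns and then computes item-item cosine similarities. As a rough illustration of what those two steps compute, here is a minimal plain-Python sketch on a made-up 3x3 matrix. The helper names (`column_variances`, `select_columns`, `cosine`) and the toy data are assumptions for illustration only; they are not part of scikit-learn or of the original gist, and the sketch ignores the numerical-stability details of the library versions.

```python
import math


def column_variances(rows):
    """Population variance (divide by n) of each column."""
    n = len(rows)
    variances = []
    for col in zip(*rows):
        mean = sum(col) / n
        variances.append(sum((x - mean) ** 2 for x in col) / n)
    return variances


def select_columns(rows, threshold):
    """Indices of columns whose variance is strictly above the threshold."""
    return [i for i, v in enumerate(column_variances(rows)) if v > threshold]


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical TF-IDF-like rows: one row per item, one column per word.
toy_rows = [
    [0.5, 0.1, 0.0],
    [0.5, 0.9, 0.0],
    [0.5, 0.1, 1.0],
]

# Column 0 is constant, so it falls below the variance threshold.
kept = select_columns(toy_rows, threshold=0.0002)
filtered = [[row[i] for i in kept] for row in toy_rows]

# Item-item similarity matrix over the remaining columns.
similarities = [[cosine(a, b) for b in filtered] for a in filtered]
```

In this toy case the constant first column is removed, and the first two items end up pointing in the same direction in the filtered space, so their similarity is (up to floating-point error) 1.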