@jaidevd
Created June 5, 2015 11:40
Make sklearn.metrics.pairwise.cosine_similarity optionally return sparse output.
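#! /usr/bin/env python
# -*- coding: utf-8 -*-
# vim:fenc=utf-8
#
# Copyright © 2015 jaidev <jaidev@newton>
#
# Distributed under terms of the MIT license.
from sklearn.metrics.pairwise import check_pairwise_arrays
from sklearn.preprocessing import normalize
from sklearn.utils.extmath import safe_sparse_dot


def cosine_similarity(X, Y=None, dense_output=True):
    """Compute the cosine similarity between the rows of X and Y.

    If dense_output is False and both inputs are sparse, the result
    stays sparse instead of being converted to a dense array.
    """
    # Validate the inputs; Y becomes X itself when it is None.
    X, Y = check_pairwise_arrays(X, Y)
    # L2-normalize each row so a plain dot product yields the cosine.
    X_normalized = normalize(X, copy=True)
    if X is Y:
        # Same data on both sides: reuse the normalized copy.
        Y_normalized = X_normalized
    else:
        Y_normalized = normalize(Y, copy=True)
    K = safe_sparse_dot(X_normalized, Y_normalized.T, dense_output=dense_output)
    return K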
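A minimal usage sketch, assuming SciPy and scikit-learn are installed. The function is repeated in compact form only so the snippet runs on its own, and the toy matrix `X` is hypothetical data invented for illustration. It shows that sparse input with `dense_output=False` should give a sparse result holding the same values as the dense one.

```python
import numpy as np
from scipy import sparse
from sklearn.metrics.pairwise import check_pairwise_arrays
from sklearn.preprocessing import normalize
from sklearn.utils.extmath import safe_sparse_dot


def cosine_similarity(X, Y=None, dense_output=True):
    # Same logic as the gist, repeated so this snippet is self-contained.
    X, Y = check_pairwise_arrays(X, Y)
    X_normalized = normalize(X, copy=True)
    Y_normalized = X_normalized if X is Y else normalize(Y, copy=True)
    return safe_sparse_dot(X_normalized, Y_normalized.T, dense_output=dense_output)


# Hypothetical toy data: three rows over four features, stored as CSR.
# Rows 0 and 2 point in the same direction; row 1 is orthogonal to both.
X = sparse.csr_matrix(np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 2.0, 0.0, 0.0],
    [3.0, 0.0, 0.0, 0.0],
]))

K_dense = cosine_similarity(X)                       # dense NumPy array
K_sparse = cosine_similarity(X, dense_output=False)  # stays sparse

print(type(K_dense).__name__, sparse.issparse(K_sparse))
print(K_sparse.toarray())
```

Note that the flag only matters for sparse input: with dense arrays on both sides the product is dense whatever `dense_output` is set to.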