@nlpjoe
Created April 13, 2018 12:53
IDF code
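    # Requires `import math` at module level; `self.idf_dict` is a dict used as an IDF cache.
    def _tf(self, word, count):
        # Term frequency: occurrences of `word` divided by the total token count of the document.
        return count[word] / sum(count.values())

    def _idf(self, word, count_list):
        # Return the cached value if this word's IDF was already computed.
        if word in self.idf_dict:
            return self.idf_dict[word]
        # Document frequency: number of documents whose counts contain `word`.
        c_idf = sum(1 for q_id in count_list.keys() if word in count_list[q_id])
        # Smoothed IDF: log(N / (1 + df)), where N is the number of documents.
        self.idf_dict[word] = math.log(len(count_list) / (1 + c_idf))
        return self.idf_dict[word]

    def _tf_idf(self, word, count, count_list):
        # TF-IDF score of `word` in one document, given the counts for the whole corpus.
        return self._tf(word, count) * self._idf(word, count_list)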
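A minimal sketch of how these methods might be wired together and called. The gist shows only the three methods, so the wrapper class name `TfIdf`, its `__init__`, and the toy corpus below are assumptions made for illustration. The sketch also assumes `count_list` maps a document id to a `collections.Counter` of tokens, and it takes the logarithm of the ratio N / (1 + df), the usual smoothed IDF form.

```python
import math
from collections import Counter


class TfIdf:
    """Hypothetical wrapper class; not part of the original gist."""

    def __init__(self):
        # Cache mapping word -> IDF value (assumed to be a plain dict).
        self.idf_dict = {}

    def _tf(self, word, count):
        # Share of the document's tokens that are `word`.
        return count[word] / sum(count.values())

    def _idf(self, word, count_list):
        if word in self.idf_dict:
            return self.idf_dict[word]
        # Number of documents that contain `word`.
        c_idf = sum(1 for q_id in count_list if word in count_list[q_id])
        self.idf_dict[word] = math.log(len(count_list) / (1 + c_idf))
        return self.idf_dict[word]

    def _tf_idf(self, word, count, count_list):
        return self._tf(word, count) * self._idf(word, count_list)


# Made-up toy corpus: document id -> Counter of tokens.
count_list = {
    "q1": Counter("the cat sat on the mat".split()),
    "q2": Counter("the dog barked".split()),
    "q3": Counter("the bird sang".split()),
    "q4": Counter("a fish swam".split()),
}

model = TfIdf()
score_cat = model._tf_idf("cat", count_list["q1"], count_list)
score_the = model._tf_idf("the", count_list["q1"], count_list)
print(score_cat, score_the)
```

With this smoothing, a word that appears in N - 1 of the N documents gets an IDF of exactly zero, and a word that appears in every document gets a slightly negative one. Because `idf_dict` caches per word, the cache is only valid for a single corpus; a new `count_list` would need a fresh instance.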