Created
June 20, 2018 21:08
Reading notes: 『情報検索の基礎』 (Introduction to Information Retrieval), Chapter 1 (1)
# p. 3: Boolean retrieval model
import os

# Assumption: `utils` is the author's local helper module; it is not shown in this gist.
import utils

DATA_PATH = '../data/Boolean_Retrieval/shakespeare_collection'
DOCUMENTS = [
    'Tragedies/Antony and Cleopatra',
    'Tragedies/Julius Caesar',
    'Comedies/The Tempest',
    'Tragedies/Hamlet',
    'Tragedies/Othello',
    'Tragedies/Macbeth',
]

# Build the term-document incidence matrix, which records
# which terms occur in which documents.
matrix = utils.create_incidence_matrix(
    [os.path.join(DATA_PATH, d) for d in DOCUMENTS])
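
The gist does not show `utils.create_incidence_matrix`, so the following is only a minimal sketch of what a term-document incidence matrix could look like and how a Boolean query is answered from it. Everything here is an assumption for illustration: the names `tokenize`, `build_incidence_matrix`, and `boolean_query` are hypothetical, the tokenizer is deliberately crude, and `toy_docs` holds made-up word lists rather than the actual play texts.

```python
import re


def tokenize(text):
    # Crude tokenizer (assumption): lowercase, keep runs of letters/apostrophes.
    return re.findall(r"[a-z']+", text.lower())


def build_incidence_matrix(docs):
    """Hypothetical stand-in for the author's helper.

    docs maps a document name to its text. Returns (names, matrix), where
    matrix[term] is a list of 0/1 flags, one per document, in `names` order.
    """
    names = list(docs)
    vocab = sorted({t for text in docs.values() for t in tokenize(text)})
    matrix = {term: [0] * len(names) for term in vocab}
    for col, name in enumerate(names):
        for term in set(tokenize(docs[name])):
            matrix[term][col] = 1
    return names, matrix


def boolean_query(names, matrix, must=(), must_not=()):
    """Return names of documents containing all `must` terms and no `must_not` terms."""
    n = len(names)
    result = [1] * n
    for term in must:
        row = matrix.get(term.lower(), [0] * n)
        result = [a & b for a, b in zip(result, row)]
    for term in must_not:
        row = matrix.get(term.lower(), [0] * n)
        result = [a & (1 - b) for a, b in zip(result, row)]
    return [name for name, flag in zip(names, result) if flag]


# Toy stand-ins for the six plays (made-up word lists, not the real texts).
toy_docs = {
    "Antony and Cleopatra": "Antony Brutus Caesar Cleopatra mercy worser",
    "Julius Caesar": "Antony Brutus Caesar Calpurnia",
    "The Tempest": "mercy worser",
    "Hamlet": "Brutus Caesar mercy worser",
    "Othello": "Caesar mercy worser",
    "Macbeth": "Antony Caesar mercy",
}

names, toy_matrix = build_incidence_matrix(toy_docs)
hits = boolean_query(names, toy_matrix,
                     must=["Brutus", "Caesar"], must_not=["Calpurnia"])
print(hits)  # ['Antony and Cleopatra', 'Hamlet']
```

Each row of the matrix is a bit vector over the documents, so `Brutus AND Caesar AND NOT Calpurnia` reduces to element-wise AND of two rows with the complement of a third. A dense matrix like this is fine for six plays but grows with vocabulary size times document count, so it would not be a sensible choice for a large collection.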