Created
April 26, 2018 21:56
Precision (Ukr. "точність"): What fraction of the returned results are relevant to the information need?
    tp / len(chosen) = tp / (tp + fp)
Recall (Ukr. "повнота"): What fraction of the relevant documents in the collection were returned by the system?
    == sensitivity:
    tp / (tp + fn)
Accuracy (Ukr. "правильність"): What fraction of all relevant / non-relevant decisions are correct?
    (tp + tn) / (tp + tn + fp + fn)
F measure (here the balanced F measure, F1): harmonic mean of precision p and recall r
    2pr / (p + r), in [0, 1]
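A minimal sketch of the four formulas above, computed from raw counts. The function name `evaluation_metrics` and the convention of returning 0.0 when a denominator is zero are assumptions made for illustration, not part of the notes.

```python
def evaluation_metrics(tp, fp, fn, tn):
    """Return precision, recall, accuracy and balanced F measure from counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives. Empty denominators yield 0.0 (an assumed convention).
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f1": f1}


if __name__ == "__main__":
    # 50 documents returned, 40 of them relevant; 60 relevant in total.
    print(evaluation_metrics(tp=40, fp=10, fn=20, tn=30))
```

Accuracy is usually a poor measure for retrieval because the non-relevant class dominates: a system that returns nothing can still score a high accuracy.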
Rocchio:
    user point of view:
        the user submits a query
        the user marks relevant documents
        the query is expanded (new words are added)
        new results are returned
    internally:
        the query vector q is moved away from non-relevant docs and toward relevant docs
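The internal step can be sketched as the usual Rocchio update, q_new = alpha * q + beta * centroid(relevant) - gamma * centroid(non-relevant). The weights alpha=1.0, beta=0.75, gamma=0.15 are commonly quoted defaults and are an assumption here, as is representing vectors as plain lists of term weights.

```python
def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Move the query vector toward relevant docs and away from non-relevant ones.

    query is a list of term weights; relevant and non_relevant are lists of
    document vectors with the same dimensionality. Negative weights in the
    result are clipped to 0.
    """
    dims = len(query)

    def centroid(docs):
        # Mean vector of a document set; the zero vector if the set is empty.
        if not docs:
            return [0.0] * dims
        return [sum(doc[i] for doc in docs) / len(docs) for i in range(dims)]

    rel_centroid = centroid(relevant)
    non_centroid = centroid(non_relevant)
    return [max(0.0, alpha * q + beta * r - gamma * n)
            for q, r, n in zip(query, rel_centroid, non_centroid)]


if __name__ == "__main__":
    # Dimension 1 gains weight from the relevant docs (query expansion);
    # dimension 2 occurs only in the non-relevant doc and is clipped to 0.
    print(rocchio([1, 0, 0], [[1, 1, 0], [1, 1, 0]], [[0, 0, 1]]))
```

The second dimension of the result is what the user sees as "new words are added": a term absent from the original query acquires a positive weight.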
Pseudo relevance feedback:
    assume the top k retrieved documents are relevant
    apply Rocchio
Indirect relevance feedback:
    treat user clicks, etc. as evidence of relevance
champion lists:
    for each term, precompute the r top docs (highest weights for that term)
    >> so r is chosen in advance, at index construction time
    r can be larger for rare terms
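A small sketch of how champion lists might be built and used, assuming the per-term weights (for example tf-idf) are already available as a nested dict. The names `build_champion_lists` and `candidate_docs` are hypothetical.

```python
import heapq


def build_champion_lists(weights, r):
    """weights maps term -> {doc_id: weight}; keep the r heaviest docs per term."""
    return {term: heapq.nlargest(r, postings, key=postings.get)
            for term, postings in weights.items()}


def candidate_docs(query_terms, champions):
    """Union of the champion lists of the query terms.

    Only these candidates are scored at query time instead of every doc
    that contains a query term.
    """
    candidates = set()
    for term in query_terms:
        candidates.update(champions.get(term, []))
    return candidates


if __name__ == "__main__":
    weights = {
        "car": {1: 0.9, 2: 0.1, 3: 0.5},
        "insurance": {2: 0.7, 4: 0.3},
    }
    champions = build_champion_lists(weights, r=2)
    print(champions)
    print(candidate_docs(["car", "insurance"], champions))
```

Because r is fixed before any query is seen, a query asking for more than r results per term may find too few candidates, which is one reason to allow a larger r for rare terms.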
high and low lists:
    two postings lists per term t, holding disjoint sets of docs
    each list is sorted by static quality
    high: the m docs with the highest tf for t
    low: all other docs containing t
number:      0  1   2    3     4      5
unary code:  0  10  110  1110  11110  111110
gamma code:  -  0   100  101   11000  11001
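The two rows above can be reproduced with a short sketch. It assumes the convention used in the table: unary(n) is n ones followed by a zero, and the gamma code of n >= 1 is the unary code of the offset length followed by the offset (the binary form of n without its leading 1). Function names are illustrative.

```python
def unary(n):
    """Unary code: n ones terminated by a zero."""
    return "1" * n + "0"


def gamma_encode(n):
    """Gamma code of a positive integer: unary(len(offset)) + offset."""
    if n < 1:
        raise ValueError("gamma codes are defined only for n >= 1")
    offset = bin(n)[3:]  # strip the '0b' prefix and the leading 1 bit
    return unary(len(offset)) + offset


def gamma_decode(bits):
    """Decode a concatenation of gamma codes back into a list of integers."""
    numbers = []
    i = 0
    while i < len(bits):
        length = 0
        while bits[i] == "1":  # read the unary-coded offset length
            length += 1
            i += 1
        i += 1  # skip the terminating zero of the unary part
        offset = bits[i:i + length]
        i += length
        numbers.append(int("1" + offset, 2))  # restore the leading 1 bit
    return numbers


if __name__ == "__main__":
    codes = [gamma_encode(n) for n in range(1, 6)]
    print(codes)
    print(gamma_decode("".join(codes)))
```

Gamma codes are prefix-free, so a postings list of gaps can be stored as one bit string and decoded without separators.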
permuterm index:
    term: hello
    rotations stored in the index:
        hello$
        ello$h
        llo$he
        lo$hel
        o$hell
        $hello
    wildcard query: he*o
        rotate he*o$ until * is at the end:
        $he*o
        o$he*
        then look up every rotation that starts with o$he
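The rotation scheme above can be sketched as follows, assuming queries with exactly one `*` and a plain dict from rotation to term in place of the B-tree a real index would use. All function names are hypothetical.

```python
def permuterm_rotations(term):
    """All rotations of term + '$', e.g. hello -> hello$, ello$h, ..., $hello."""
    augmented = term + "$"
    return [augmented[i:] + augmented[:i] for i in range(len(augmented))]


def rotate_query(query):
    """Rotate query + '$' so that the single '*' ends up at the end."""
    augmented = query + "$"
    star = augmented.index("*")
    return augmented[star + 1:] + augmented[:star + 1]


def build_permuterm_index(terms):
    """Map every rotation of every term back to the original term."""
    return {rotation: term
            for term in terms
            for rotation in permuterm_rotations(term)}


def wildcard_lookup(query, index):
    """Terms matching a single-wildcard query, found by prefix match on rotations."""
    prefix = rotate_query(query)[:-1]  # drop the trailing '*'
    return {term for rotation, term in index.items()
            if rotation.startswith(prefix)}


if __name__ == "__main__":
    index = build_permuterm_index(["hello", "help", "hero"])
    print(permuterm_rotations("hello"))
    print(rotate_query("he*o"))
    print(sorted(wildcard_lookup("he*o", index)))
```

The linear scan in `wildcard_lookup` keeps the sketch short; with the rotations kept in sorted order the same prefix match becomes a range lookup.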
SPIMI:
    uses terms directly, so no map from terms to termIDs is needed (unlike BSBI)
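A simplified, in-memory sketch of the SPIMI idea: postings are collected in a dictionary keyed by the term itself, a block is flushed when it is full, and the blocks are merged at the end. Real SPIMI writes each block to disk; here the blocks are kept in a list, and `max_postings` is a stand-in for "memory is full". Both are assumptions for illustration.

```python
from collections import defaultdict


def spimi_invert(token_stream, max_postings):
    """Consume (term, doc_id) pairs and return a list of sorted blocks.

    Each block is a list of (term, postings) pairs sorted by term. Terms
    are used directly as dictionary keys, with no term-to-termID mapping.
    """
    blocks = []
    dictionary = defaultdict(list)
    count = 0
    for term, doc_id in token_stream:
        postings = dictionary[term]
        if not postings or postings[-1] != doc_id:
            postings.append(doc_id)
            count += 1
        if count >= max_postings:
            # Block is full: sort the terms and flush (to disk in real SPIMI).
            blocks.append(sorted(dictionary.items()))
            dictionary = defaultdict(list)
            count = 0
    if dictionary:
        blocks.append(sorted(dictionary.items()))
    return blocks


def merge_blocks(blocks):
    """Merge the per-block postings lists into one final index."""
    merged = defaultdict(list)
    for block in blocks:
        for term, postings in block:
            merged[term].extend(postings)
    return {term: sorted(set(postings))
            for term, postings in sorted(merged.items())}


if __name__ == "__main__":
    stream = [("a", 1), ("b", 1), ("a", 2), ("c", 2), ("b", 3)]
    blocks = spimi_invert(stream, max_postings=2)
    print(blocks)
    print(merge_blocks(blocks))
```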
MapReduce:
    master, parsers, inverters
    the master identifies idle machines and assigns a role (parser or inverter) to each
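The division of labour can be simulated sequentially in one process: parsers (map phase) turn document splits into (term, doc_id) pairs grouped by term partition, and inverters (reduce phase) build postings lists per partition. The two partitions "a-m" and "n-z", whitespace tokenization, and the function names are assumptions for illustration; there is no real master or scheduling here.

```python
from collections import defaultdict


def parse(split):
    """Parser (map phase): emit (term, doc_id) pairs for one split of documents."""
    return [(term, doc_id)
            for doc_id, text in split
            for term in text.lower().split()]


def partition_key(term):
    """Route a term to one of two assumed term partitions."""
    return "a-m" if term[0] <= "m" else "n-z"


def invert(pairs):
    """Inverter (reduce phase): collect the pairs of one partition into postings lists."""
    postings = defaultdict(set)
    for term, doc_id in pairs:
        postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in sorted(postings.items())}


def build_index(splits):
    """Stand-in for the master: hand each split to a parser, each partition to an inverter."""
    segments = defaultdict(list)
    for split in splits:
        for term, doc_id in parse(split):
            segments[partition_key(term)].append((term, doc_id))
    return {key: invert(pairs) for key, pairs in segments.items()}


if __name__ == "__main__":
    splits = [[(1, "Caesar came")], [(2, "Caesar won")]]
    print(build_index(splits))
```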
???: kappa statistic:
                         document:  relevant   non-relevant
term present   x_t = 1              p_t        u_t
term absent    x_t = 0              1 - p_t    1 - u_t
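Zipf's law: the collection frequency of the i-th most common term is proportional to 1/i (the approximate number of distinct terms in a collection is estimated by Heaps' law, M = k * T^b)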