
@mschoch
Created July 4, 2016 20:17
Introduction to Bleve Scoring

Scoring

Scoring of complex queries involves a few different pieces. To understand how they fit together, it's helpful to first look at a simple query, then add in complexities one at a time until we have the complete picture.

Term Query

The simplest query we can run is a term query for a single term. Consider a term search for light in the field description of the beer-search dataset. The top scoring hit is iron_city_brewing_co-ic_light and when we ask Bleve to explain the scoring we get:

"explanation": {
  "value": 2.024174152548743,
  "message": "fieldWeight(description:light in iron_city_brewing_co-ic_light), product of:",
  "children": [
    {
      "value": 1.4142135623730951,
      "message": "tf(termFreq(description:light)=2"
    },
    {
      "value": 0.3333333432674408,
      "message": "fieldNorm(field=description, doc=iron_city_brewing_co-ic_light)"
    },
    {
      "value": 4.293921680740409,
      "message": "idf(docFreq=270, maxDocs=7303)"
    }
  ]
}

Reproduce: curl -XPOST http://localhost:8094/api/search -d '{"size":10,"explain":true,"highlight":{},"query":{"term":"light","field":"description"}}'

As you can see there are 3 components to the score of this document:

  • TF (term frequency) - how often does the term appear in this field of the document? The more frequent the term, the higher the score. In this implementation, the actual value used is the square root of the term frequency. In this example, the term frequency was 2, thus the value used in the computation is 1.414.

  • Norm (field normalization) - how many tokens are in the field? If the field has many tokens, we want to lower the score. Consider 2 documents that both use the term "light" twice: one has 10 tokens, the other has 1000. We want the document using it twice out of 10 tokens to score higher than the document that used it twice out of 1000 tokens. In this implementation, the actual value used is 1/sqrt(number_of_tokens_in_the_field). In this example, the field contained 9 tokens, thus the value used is 0.333. (NOTE: in the future, index time field boosting would be included in this computation, but bleve currently only supports query time boosting)

  • IDF (inverse document frequency) - out of the whole population of documents, how many of them used this term? The idea is that a term appearing in only a few documents is more significant than a term which appears in many/all documents. In this implementation, the actual value used is 1 + ln(total_num_docs/(doc_freq + 1)). In this example, the total number of docs was 7303 and the document frequency was 270, thus the computed value is 4.294. (NOTE: the +1's just avoid divide by 0 cases and don't significantly alter the computation)

So, with these 3 pieces calculated, we compute the final score by multiplying them: 1.414*0.333*4.294=2.024
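To make the arithmetic concrete, here is a minimal Go sketch of this computation. The function and variable names (termScore and friends) are illustrative, not Bleve's actual API:

package main

import (
	"fmt"
	"math"
)

// termScore is a hypothetical helper reproducing the TF * fieldNorm * IDF
// product described above.
//   freq:      occurrences of the term in the field (2 for "light")
//   numTokens: total tokens in the field (9 for this description)
//   docFreq:   number of documents containing the term (270)
//   maxDocs:   total number of documents in the index (7303)
func termScore(freq, numTokens, docFreq, maxDocs float64) float64 {
	tf := math.Sqrt(freq)                    // 1.414
	fieldNorm := 1 / math.Sqrt(numTokens)    // 0.333
	idf := 1 + math.Log(maxDocs/(docFreq+1)) // 4.294
	return tf * fieldNorm * idf
}

func main() {
	fmt.Println(termScore(2, 9, 270, 7303)) // prints ~2.024
}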

Searching Multiple Terms

Now that we've seen how to calculate the score of a search for a single term, we can look at the more common case where a user searches for multiple terms of the form A or B or C...

Consider a match search for the terms light and water in the field description of the beer-search dataset. We also want to boost the importance of water by a factor of 3. The top scoring hit is iron_city_brewing_co-ic_light and when we ask Bleve to explain the scoring we get:

"explanation": {
  "value": 2.2412700681905235,
  "message": "product of:",
  "children": [
    {
      "value": 2.2412700681905235,
      "message": "sum of:",
      "children": [
        {
          "value": 0.5248131710762932,
          "message": "weight(description:light^1.000000 in iron_city_brewing_co-ic_light), product of:",
          "children": [
            {
              "value": 0.2592727361998342,
              "message": "queryWeight(description:light^1.000000), product of:",
              "children": [
                {
                  "value": 1,
                  "message": "boost"
                },
                {
                  "value": 4.293921680740409,
                  "message": "idf(docFreq=270, maxDocs=7303)"
                },
                {
                  "value": 0.060381337964955,
                  "message": "queryNorm"
                }
              ]
            },
            {
              "value": 2.024174152548743,
              "message": "fieldWeight(description:light in iron_city_brewing_co-ic_light), product of:",
              "children": [
                {
                  "value": 1.4142135623730951,
                  "message": "tf(termFreq(description:light)=2"
                },
                {
                  "value": 0.3333333432674408,
                  "message": "fieldNorm(field=description, doc=iron_city_brewing_co-ic_light)"
                },
                {
                  "value": 4.293921680740409,
                  "message": "idf(docFreq=270, maxDocs=7303)"
                }
              ]
            }
          ]
        },
        {
          "value": 1.7164568971142304,
          "message": "weight(description:water^3.000000 in iron_city_brewing_co-ic_light), product of:",
          "children": [
            {
              "value": 0.9658041459133684,
              "message": "queryWeight(description:water^3.000000), product of:",
              "children": [
                {
                  "value": 3,
                  "message": "boost"
                },
                {
                  "value": 5.331692310152274,
                  "message": "idf(docFreq=95, maxDocs=7303)"
                },
                {
                  "value": 0.060381337964955,
                  "message": "queryNorm"
                }
              ]
            },
            {
              "value": 1.7772308230163623,
              "message": "fieldWeight(description:water in iron_city_brewing_co-ic_light), product of:",
              "children": [
                {
                  "value": 1,
                  "message": "tf(termFreq(description:water)=1"
                },
                {
                  "value": 0.3333333432674408,
                  "message": "fieldNorm(field=description, doc=iron_city_brewing_co-ic_light)"
                },
                {
                  "value": 5.331692310152274,
                  "message": "idf(docFreq=95, maxDocs=7303)"
                }
              ]
            }
          ]
        }
      ]
    },
    {
      "value": 1,
      "message": "coord(2/2)"
    }
  ]
}

Reproduce: curl -XPOST http://localhost:8094/api/search -d '{"size":10,"explain":true,"highlight":{},"query":{"disjuncts":[{"term":"light","field":"description"},{"term":"water","field":"description","boost":3.0}]}}'

At first this looks very complicated, but the first thing to recognize is that two of the sub-sections (the ones with messages starting "fieldWeight") are the kind we just looked at (note that the 2.024 value matches the one we computed above). This value, and the corresponding 1.777 for the other term, are the raw scores for the two individual terms.

But when searching multiple terms, there are 2 other factors that come into play.

  • Boosting - the idea of boosting is that when searching for more than one term, sometimes you want to increase the relative importance of (or boost) one term over another. You could just add a step where you multiply by the boost factor, but that would make scores involving boost wildly different from those that don't. To account for this, a normalization is done across all the boost factors. This allows boost factors to alter the relative importance of documents without drastically altering the final scores.

So, first as an intermediate step we compute the sum of the squared weights of all the queries at this level. In this case, there are 2 term queries, one for light and one for water. Each of them computes its boost * IDF, squares that value, and returns it. Then these values are summed for all queries at this level.

  • light - has IDF of 4.293 and boost of 1.0 - multiplied and squared this is 18.437
  • water - has IDF of 5.331 and boost of 3.0 - multiplied and squared this is 255.842

Now, we sum these, take the square root, then take the inverse, and get 0.060. This value is known as the queryNorm.
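Continuing the Go sketch from above (queryNorm is again an illustrative name, not Bleve's API), this step looks like:

// queryNorm normalizes across the boost*IDF weights of all queries at
// this level: 1 / sqrt(sum of squared weights).
func queryNorm(boosts, idfs []float64) float64 {
	sumSquares := 0.0
	for i := range boosts {
		w := boosts[i] * idfs[i]
		sumSquares += w * w
	}
	return 1 / math.Sqrt(sumSquares)
}

// queryNorm([]float64{1, 3}, []float64{4.294, 5.332})
// => 1 / sqrt(18.437 + 255.842) = ~0.060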

Now, when each of the underlying term queries computes its score, it also calculates a weight based on its boost, its IDF, and the queryNorm that we just calculated.

  • light - boost of 1.0, IDF of 4.293, and queryNorm 0.060 => 0.259
  • water - boost of 3.0, IDF of 5.331, and queryNorm 0.060 => 0.965

So, the final score for each term is then this new weight multiplied by the simple term score that we learned how to compute above.

  • light - weight 0.259, term score 2.024 => 0.524
  • water - weight 0.965, term score 1.777 => 1.716

At this stage, these adjusted term scores are summed, yielding the value 2.241.
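A sketch of these last two steps, again with illustrative names:

// weightedSum combines each term's query weight (boost * idf * queryNorm)
// with its raw field weight (the simple term score), then sums the results.
func weightedSum(boosts, idfs, fieldWeights []float64, qNorm float64) float64 {
	sum := 0.0
	for i := range boosts {
		queryWeight := boosts[i] * idfs[i] * qNorm
		sum += queryWeight * fieldWeights[i]
	}
	return sum
}

// For light (boost 1.0, IDF 4.294, field weight 2.024) and
// water (boost 3.0, IDF 5.332, field weight 1.777) with qNorm 0.060:
// 0.259*2.024 + 0.966*1.777 = 0.525 + 1.716 = ~2.241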

  • Coordination Factor - the other major consideration when scoring searches of this form is a factor to account for how many of the terms you searched for matched. The idea is that no matter how well any single term matched, you want to reward documents that match a larger number of the terms you searched. To do this we multiply the score by what we call the coordination factor. It is simply the number of search terms that were satisfied out of the total number of search terms (at this level). In this example, the document satisfied both of the terms we searched for, thus 2/2 or a coordination factor of 1.0. Had only one of the terms matched, the score would have been cut in half due to a coordination factor of 0.5.
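The final multiplication is simple. As a sketch (coord is an illustrative name; the explain output above labels this value "coord(2/2)"):

// coord rewards documents that satisfy more of the searched terms.
func coord(matched, total int) float64 {
	return float64(matched) / float64(total)
}

// finalScore := weightedSum(...) * coord(2, 2) // 2.241 * 1.0 = 2.241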

Background Material

Lucene TFIDF Similarity

Introduction To Information Retrieval, Chapter 6


mschoch commented Jul 31, 2020

@vijaygopal-couchbase the wording of this documentation is wrong/confusing. It is not the length of the token, but the number of tokens in the field. Clearly, using the phrase "token length" seems wrong, but the rest of the surrounding text is clearly talking about the number of tokens.


vijaygopal-couchbase commented Jul 31, 2020

Then it would mean the word "light" appears 9 times in the document "iron_city_brewing_co-ic_light". Hence 1/sqrt(9) results in 0.33. Is my understanding correct?


mschoch commented Jul 31, 2020

No, the number of times the word occurs in the document is the frequency. The norm is calculated from the length of the field, as measured by the number of tokens. Let's look at an example. I believe this is the original document corresponding to the example above:

{
  "name": "IC Light",
  "abv": 4.15,
  "ibu": 0,
  "srm": 0,
  "upc": 0,
  "type": "beer",
  "brewery_id": "iron_city_brewing_co",
  "updated": "2010-07-22 20:00:20",
  "description": "IC Light is an original, not a watered down copy, and brewed to be light from start to finish.",
  "style": "American-Style Light Lager",
  "category": "North American Lager"
}

After the analyzer removes stop words, the tokens in the description field are:

Pos  Term      Start  End
1    ic        1      3
2    light     4      9
5    original  16     24
8    watered   32     39
10   copy      45     49
12   brewed    55     61
15   light     68     73
17   start     79     84
19   finish    88     94

You can run this yourself here: http://analysis.blevesearch.com/analysis

As you can see, light has frequency 2, and the field has length 9.
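Plugging those two numbers into the formulas from the write-up above (a trivial Go fragment, using the math and fmt packages):

fmt.Println(math.Sqrt(2))     // tf for frequency 2: ~1.414
fmt.Println(1 / math.Sqrt(9)) // fieldNorm for a 9-token field: ~0.333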

I hope this helps.

@vijaygopal-couchbase

Thanks a lot Marty for solving the mystery :-)
