@manisnesan
Created February 22, 2022 00:15
Rescoring examples
# Rescoring
## Delete the index
DELETE /searchml_test
## Create our index
PUT /searchml_test
{
"settings": {
"index": {
"query": {
"default_field": "body"
}
}
},
"mappings": {
"properties": {
"title": {"type": "text", "analyzer": "english"},
"body": {"type": "text", "analyzer": "english"},
"in_stock": {"type": "boolean"},
"category": {"type": "keyword", "ignore_above": 256},
"price": {"type": "float"}
}
}
}
## Index some documents
PUT /searchml_test/_doc/doc_a
{
"id": "doc_a",
"title": "Fox and Hounds",
"body": "The quick red fox jumped over the lazy brown dogs.",
"price": "5.99",
"in_stock": "true",
"category": "childrens"}
PUT /searchml_test/_doc/doc_b
{
"id": "doc_b",
"title": "Fox wins championship",
"body": "Wearing all red, the Fox jumped out to a lead in the race over the Dog.",
"price": "15.13",
"in_stock": "true",
"category": "sports"}
PUT /searchml_test/_doc/doc_c
{
"id": "doc_c",
"title": "Lead Paint Removal",
"body": "All lead must be removed from the brown and red paint.",
"price": "150.21",
"in_stock": "false",
"category": "instructional"}
PUT /searchml_test/_doc/doc_d
{
"id": "doc_d",
"title": "The Three Little Pigs Revisited",
"price": "3.51",
"in_stock": "true",
"body": "The big, bad wolf huffed and puffed and blew the house down. The end.",
"category": "childrens"}
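## Optional step, not in the original walkthrough: force a refresh so the newly indexed documents are searchable immediately instead of after the next automatic refresh (about 1 second by default).
POST /searchml_test/_refresh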
## Confirm that all four documents are searchable
GET /searchml_test/_search
{
"query": {
"match_all": {}
}
}
## Baseline: a match_all query filtered to the "childrens" category, which should return two documents (doc_a and doc_d), each with a score of 1.0
GET /searchml_test/_search
{
"query": {"bool": {
"must": [
{"match_all": {}}
],
"filter": [
{"term": {
"category": "childrens"
}}
]
}}
}
## Run a rescoring query
### Scores from the original query are multiplied by "query_weight" (1x) and scores from the rescore query by "rescore_query_weight" (2x); the two are then added. "window_size": 1 limits rescoring to the top-ranked document (per shard).
### Assuming doc_a ranks first: (1 x 1) + (2 x price) = 1 + (2 x 5.99) = 12.98
POST /searchml_test/_search
{
"query": {
"bool": {
"must": [
{
"match_all": {}
}
],
"filter": [
{
"term": {
"category": "childrens"
}
}
]
}
},
"rescore": {
"query": {
"rescore_query": {
"function_score": {
"field_value_factor": {
"field": "price",
"missing": 1
}
}
},
"query_weight": 1.0,
"rescore_query_weight": 2.0
},
"window_size": 1
}
}
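## A sketch of the same request with a wider window (a variation, not part of the original walkthrough). With "window_size": 2 both "childrens" documents are rescored, so the expected scores are roughly doc_a: 1 + (2 x 5.99) = 12.98 and doc_d: 1 + (2 x 3.51) = 8.02.
POST /searchml_test/_search
{
  "query": {
    "bool": {
      "must": [{"match_all": {}}],
      "filter": [{"term": {"category": "childrens"}}]
    }
  },
  "rescore": {
    "query": {
      "rescore_query": {
        "function_score": {
          "field_value_factor": {"field": "price", "missing": 1}
        }
      },
      "query_weight": 1.0,
      "rescore_query_weight": 2.0
    },
    "window_size": 2
  }
}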
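## Cascading rescorers: each rescorer re-ranks the top "window_size" results left by the previous one
## Baseline for comparison: without rescoring, match_all gives every document a score of 1.0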
POST searchml_test/_search
{
"query": {
"match_all": {}
}
}
## Rescore with two cascaded rescorers
## The cascade rescores the top 2 results, then the top 1. Ranks 3 ("doc_c") and 4 ("doc_d") therefore keep their original score of 1.0. Rank 2 ("doc_b") is boosted only by the phrase match ("Fox jumped"). Rank 1 ("doc_a") is boosted by both the phrase match and the price.
POST searchml_test/_search
{
"query": {"match_all": {}},
"rescore": [
{
"query": {
"rescore_query":{
"match_phrase": {"body": {"query": "Fox jumped"}}
},
"query_weight": 1.0,
"rescore_query_weight": 2.0
},
"window_size": 2
},
{
"query": {
"rescore_query":{
"function_score": {"field_value_factor": {"field": "price", "missing": 1}}
},
"query_weight": 1.0,
"rescore_query_weight": 4.0
},
"window_size": 1
}
]
}
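## Worked scores for the cascade above, as a sketch. phrase(x) is shorthand introduced here for the match_phrase score of document x, and doc_a is assumed to rank first after the first pass:
##   doc_c, doc_d: 1.0 (outside both windows)
##   doc_b: (1.0 x 1.0) + (2.0 x phrase(doc_b))
##   doc_a: 1.0 x [(1.0 x 1.0) + (2.0 x phrase(doc_a))] + (4.0 x 5.99) = 24.96 + (2.0 x phrase(doc_a))
## To inspect the actual per-document breakdown, the first rescorer can be re-run with "explain": true at the top level of the body:
POST /searchml_test/_search
{
  "explain": true,
  "query": {"match_all": {}},
  "rescore": {
    "query": {
      "rescore_query": {
        "match_phrase": {"body": {"query": "Fox jumped"}}
      },
      "query_weight": 1.0,
      "rescore_query_weight": 2.0
    },
    "window_size": 2
  }
}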
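## Manually specified (pinned) results
### This example pins doc_d to the top of the results while still executing the rest of the search. The constant_score clause adds 10000 to doc_d's score. The "id" field is not in the explicit mapping, so "id.keyword" relies on the keyword sub-field created by dynamic mapping.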
POST searchml_test/_search
{
"query": {
"bool": {
"should": [
{
"constant_score": {
"filter": {"term": {
"id.keyword": "doc_d"
}},
"boost": 10000
}
},
{"match_all": {}}
]
}
},
"rescore": [
{
"query": {
"rescore_query":{
"match_phrase": {"body": {"query": "Fox jumped"}}
},
"query_weight": 1.0,
"rescore_query_weight": 2.0
},
"window_size": 2
},
{
"query": {
"rescore_query":{
"function_score": {"field_value_factor": {"field": "price", "missing": 1}}
},
"query_weight": 1.0,
"rescore_query_weight": 4.0
},
"window_size": 1
}
]
}
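## A possible variation, assuming the documents keep the _id values used in the PUT requests above: an "ids" query matches on _id directly, so the pin does not depend on the dynamically mapped "id.keyword" sub-field. The rescorers are omitted here for brevity.
## Expected first-pass scores: doc_d = 10000 + 1 = 10001, every other document = 1.0.
POST /searchml_test/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "constant_score": {
            "filter": {"ids": {"values": ["doc_d"]}},
            "boost": 10000
          }
        },
        {"match_all": {}}
      ]
    }
  }
}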