Created
June 26, 2018 01:03
So if I understand correctly, you want your docs to be sorted by something like
f(textual document content, whether some field contains a specific value).
You did not make it clear whether you were looking for a lexicographical sort
f(text, somefield) -> (text, somefield)
or whether you wanted to sort by the field alone
f(text, somefield) -> somefield
or whether you wanted a best-effort solution ...
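To make the difference between the first two options concrete, here is a minimal sketch in plain Rust (standard library only, no tantivy). The `Hit` struct and both function names are hypothetical, and the field is assumed to be an integer; it only illustrates the two orderings, not how tantivy sorts internally.

```rust
/// Hypothetical search hit: a text relevance score plus one field value.
#[derive(Debug, Clone, PartialEq)]
pub struct Hit {
    pub doc_id: u32,
    pub text_score: f32,
    pub some_field: u64,
}

/// f(text, somefield) -> (text, somefield):
/// highest text score first, the field only breaks ties.
pub fn sort_lexicographic(hits: &mut [Hit]) {
    hits.sort_by(|a, b| {
        b.text_score
            .total_cmp(&a.text_score)
            .then_with(|| b.some_field.cmp(&a.some_field))
    });
}

/// f(text, somefield) -> somefield:
/// highest field value first, the text score is ignored.
pub fn sort_by_field(hits: &mut [Hit]) {
    hits.sort_by(|a, b| b.some_field.cmp(&a.some_field));
}
```

With the lexicographical sort the field matters only between documents with equal text scores, whereas sorting by the field alone can put a weakly matching document on top.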
# Solution 1: best-effort solution.
You can integrate your field in the query as a should clause.
For instance: +software +engineer freshly_posted=true will return all documents
that contain both software and engineer. Documents that have the field freshly_posted set to true
get an extra boost in scoring.
You can get that effect by manipulating strings before sending them to the query parser (ugly)
or by building a BooleanQuery with two clauses, which looks like
[(Must, yourquery), (Should, TermQuery(freshly_posted, "true"))]
In that case, your extra field needs to be indexed.
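The Must/Should behaviour can be modelled with a toy scorer. This is a sketch in plain Rust under simplifying assumptions, not tantivy's BooleanQuery API: the `Occur`, `Clause` and `score` names are hypothetical, a document is just a set of terms, and every matching clause is worth one point.

```rust
use std::collections::HashSet;

/// Whether a clause is required or merely boosts the score.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Occur {
    Must,
    Should,
}

/// Hypothetical clause: a single term that a document may contain.
pub struct Clause {
    pub occur: Occur,
    pub term: &'static str,
}

/// Returns None when a Must clause does not match (the document is rejected).
/// Otherwise returns one point per matching clause, so a matching Should
/// clause raises the score without being required.
pub fn score(doc_terms: &HashSet<&str>, clauses: &[Clause]) -> Option<u32> {
    let mut total = 0;
    for clause in clauses {
        let matched = doc_terms.contains(clause.term);
        match (clause.occur, matched) {
            (Occur::Must, false) => return None,
            (_, true) => total += 1,
            (Occur::Should, false) => {}
        }
    }
    Some(total)
}
```

In this model a document containing software and engineer is returned either way, and it scores one point higher when the freshly_posted term is also present. A real engine would weight each clause by its relevance score rather than a flat point.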
# Solution 2: Hack a collector
There is a more flexible solution, at the cost of a little more code.
You can code your own collector. A field that needs to be accessed in a collector
needs to be set as a "fast field" in the schema.
On each call of `set_segment` you need to get the fast field reader from the
segment, and store it in some internal field of your collector.
You can then do whatever you like in the `fn collect` function.
The [`int_facet_collector`](https://github.com/tantivy-search/tantivy/blob/master/src/collector/int_facet_collector.rs)
is rather simple and shows how to use a fast field in a collector.
Once you have computed your score, you can push it to a BinaryHeap, as done in the TopCollector
https://github.com/tantivy-search/tantivy/blob/master/src/collector/top_collector.rs
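
The shape of such a collector can be sketched in plain Rust. This is a standalone model, not tantivy's Collector trait: `Segment` and `TopByFieldCollector` are hypothetical, the fast field is modelled as a plain `Vec<u64>` indexed by doc id, and the text score is taken as an integer so that tuples can go straight into a BinaryHeap (a float score would need an ordered wrapper).

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Hypothetical stand-in for a segment: one fast-field value per document.
pub struct Segment {
    pub fast_field: Vec<u64>,
}

/// Keeps the `limit` best (score, segment_id, doc) triples seen so far.
pub struct TopByFieldCollector {
    limit: usize,
    segment_id: u32,
    /// Fast field values of the current segment (the "internal field").
    fast_field: Vec<u64>,
    /// `Reverse` turns the max-heap into a min-heap on the score.
    heap: BinaryHeap<Reverse<(u64, u32, u32)>>,
}

impl TopByFieldCollector {
    pub fn new(limit: usize) -> Self {
        TopByFieldCollector {
            limit,
            segment_id: 0,
            fast_field: Vec::new(),
            heap: BinaryHeap::new(),
        }
    }

    /// Called once per segment: fetch the fast field values for that segment.
    pub fn set_segment(&mut self, segment_id: u32, segment: &Segment) {
        self.segment_id = segment_id;
        self.fast_field = segment.fast_field.clone();
    }

    /// Called for each matching document of the current segment.
    pub fn collect(&mut self, doc: u32, text_score: u64) {
        // Any formula works here; this one adds the field value to the text score.
        let score = text_score + self.fast_field[doc as usize];
        self.heap.push(Reverse((score, self.segment_id, doc)));
        if self.heap.len() > self.limit {
            self.heap.pop(); // evict the current lowest score
        }
    }

    /// Consumes the collector and returns the results, best score first.
    pub fn into_sorted(self) -> Vec<(u64, u32, u32)> {
        let mut out: Vec<(u64, u32, u32)> = self.heap.into_iter().map(|r| r.0).collect();
        out.sort_by(|a, b| b.cmp(a));
        out
    }
}
```

Keeping a min-heap capped at `limit` entries means each collected document costs O(log limit), and memory stays bounded no matter how many documents match.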