Last active October 21, 2015 18:14
SOLR Config - AutoPhrasing
| cel-2000 | |
| CEL-2000 | |
| CEL 2000 | |
| CEL2000 |
| document with an entity_name of 'CEL-2000' |
| <fieldType name="text_autophrase" class="solr.TextField" positionIncrementGap="100"> | |
| <analyzer type="index"> | |
| <tokenizer class="solr.KeywordTokenizerFactory" /> | |
| <filter class="solr.LowerCaseFilterFactory" /> | |
| <filter class="com.lucidworks.analysis.AutoPhrasingTokenFilterFactory" phrases="autophrases.txt" includeTokens="true" replaceWhitespaceWith="_" /> | |
| <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" /> | |
| <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true" /> | |
| </analyzer> | |
| <analyzer type="query"> | |
| <tokenizer class="solr.KeywordTokenizerFactory" /> | |
| <filter class="solr.LowerCaseFilterFactory" /> | |
| <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" /> | |
| <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true" /> | |
| </analyzer> | |
| </fieldType> |
| webapp=/solr path=/autophrase params={q="cel-2000"&defType=dismax&qf=entity_name^100.0+content+entity_author&pf=entity_name+content&rows=100&wt=json&debugQuery=true} |
| <requestHandler name="/autophrase" class="solr.SearchHandler"> | |
| <lst name="defaults"> | |
| <str name="echoParams">explicit</str> | |
| <int name="rows">10</int> | |
| <str name="df">_text_</str> | |
| </lst> | |
| <lst name="invariants"> | |
| <str name="defType">autophrasingParser</str> | |
| </lst> | |
| </requestHandler> |
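The `defType` invariant in the handler above names a parser called `autophrasingParser`, which has to be registered in `solrconfig.xml` before the handler can resolve it. A minimal sketch of that registration follows. It assumes the parser class shipped alongside `com.lucidworks.analysis.AutoPhrasingTokenFilterFactory` is `com.lucidworks.analysis.AutoPhrasingQParserPlugin` and that it takes a `phrases` parameter pointing at the same `autophrases.txt` used by the index analyzer; neither detail appears in this gist.

```xml
<!-- Sketch only: the class name and the "phrases" parameter are assumptions,
     not taken from this gist. -->
<queryParser name="autophrasingParser"
             class="com.lucidworks.analysis.AutoPhrasingQParserPlugin">
  <!-- Assumed to be the same phrase list the index-time filter reads -->
  <str name="phrases">autophrases.txt</str>
</queryParser>
```

Because `defType` sits under `invariants`, a `defType` supplied on the request (such as the `defType=dismax` in the logged query) cannot override it, so the handler should always route through this parser.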
| CEL-2000,CEL-SCI,CEL_2000,CEL2000 |
| - CEL-2000 - pass, but also returns many unrelated results that match only '2000' or 'CEL' | |
| - "CEL-2000" - pass, only matching record found | |
| - CEL 2000 - fail (no results) | |
| - "CEL 2000" - pass, only matching record found | |
| - CEL2000 - fail (no results) | |
| - "CEL2000" - fail (no results) |
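One possible way to make the hyphenated, spaced, and joined spellings converge is to strip the separators before tokenizing. The sketch below is not part of the original configuration, and the field type name `text_entity_normalized` is hypothetical. It only suits whole-value fields such as `entity_name`, since the keyword tokenizer keeps the entire value as a single token.

```xml
<!-- Sketch only: "text_entity_normalized" is a hypothetical name,
     not part of the original configuration. -->
<fieldType name="text_entity_normalized" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Remove whitespace, underscores and hyphens before tokenizing, so that
         "CEL-2000", "CEL 2000" and "CEL_2000" all become "CEL2000" -->
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="[\s_\-]+" replacement="" />
    <!-- Keep the whole normalized value as one token -->
    <tokenizer class="solr.KeywordTokenizerFactory" />
    <!-- Lowercase last, giving "cel2000" on both the index and query side -->
    <filter class="solr.LowerCaseFilterFactory" />
  </analyzer>
</fieldType>
```

Depending on the query parser and Solr version, an unquoted `CEL 2000` may still be split on whitespace before analysis runs, so this is more likely to help the quoted and single-term cases than the unquoted one.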