@mattfield11
Created November 13, 2023 13:55
Analyzer for SKUs in Elasticsearch
## Analyzer for SKUs
The analyzer below strips every character matched by the regex. In this case I am removing `-`, `#`, `_`, `.` and any whitespace.
You can adapt the regex as appropriate. The search WILL be case sensitive, because the analyzer has no lowercase filter.
Because a keyword tokenizer is used, the text is not split into "words"; the whole value is kept as a single token.
```
PUT my-index-000001
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "keyword",
          "char_filter": [
            "my_char_filter"
          ]
        }
      },
      "char_filter": {
        "my_char_filter": {
          "type": "pattern_replace",
          "pattern": "[#_.-]|\\s",
          "replacement": ""
        }
      }
    }
  }
}
```
You can simulate how the analyzer works by using the call below:
```
POST my-index-000001/_analyze
{
  "analyzer": "my_analyzer",
  "text": "Mysku123 456-789"
}
```
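If you want to sanity-check the regex without a cluster, the character filter can be approximated locally. The snippet below is only a rough sketch in Python, not part of Elasticsearch: `normalize_sku` is a hypothetical helper name, and it assumes Python's `re` module treats this simple pattern the same way Elasticsearch's Java regex engine does for ASCII input (handling of Unicode whitespace may differ).

```python
import re

# Same pattern as "my_char_filter", written as a Python regex.
# Assumption: for ASCII input this matches what the Java regex engine would match.
SKU_STRIP_PATTERN = re.compile(r"[#_.-]|\s")


def normalize_sku(text):
    """Approximate my_analyzer: remove the matched characters and keep the
    rest as one unsplit value, as the keyword tokenizer would."""
    return SKU_STRIP_PATTERN.sub("", text)


if __name__ == "__main__":
    # The sample text from the _analyze call
    print(normalize_sku("Mysku123 456-789"))  # Mysku123456789
    # Differently punctuated spellings collapse to the same value
    print(normalize_sku("AB-123") == normalize_sku("AB 123"))  # True
    # There is no lowercasing, so case still matters
    print(normalize_sku("ab-123") == normalize_sku("AB-123"))  # False
```

The real `_analyze` response also carries offsets and position data, which this sketch does not try to reproduce.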