Copy of the topic in the Algolia community: https://discourse.algolia.com/t/searching-for-multiple-matches-in-a-single-item-at-once/13743
Hi, we are looking for a way to use Algolia for a use case that doesn't seem to be supported, so I'm seeking advice. Maybe someone has already had a similar use case and found a solution.
UPDATED with a solution in attempt #6
We manage a catalog of products that are "connectors" between cylindrical tubes. The slots for these tubes have a diameter. Any product can have up to 10 slots.
Here's a reduced test case of data, with 3 products having 3 slots each, and 1 product having 4 slots:
[
{
"objectID": "product-1",
"D1": 4.75,
"D2": 8,
"D3": 4.75
},
{
"objectID": "product-2",
"D1": 8,
"D2": 4.75,
"D3": 8
},
{
"objectID": "product-3",
"D1": 8,
"D2": 4.75,
"D3": 11.2
},
{
"objectID": "product-4",
"D1": 3,
"D2": 4.75,
"D3": 11.2,
"D4": 2.8
}
]
People using the search would like to find products to connect several tubes.
Each tube has a diameter.
When searching for a connector for a given number of tubes, diameters must match, and products with more slots are fine too.
Let's say for example we want to perform these searches:
- find any product that has an 11.2 mm slot
- find any product that has an 8 mm slot and a 4.75 mm slot
- find any product that has an 8 mm slot, a 4.75 mm slot, and a second 4.75 mm slot
multiple-dimensions-0.json: https://gist.github.com/nhoizey/a087d5b6ae91517eb13f16f8e29ee35b#file-multiple-dimensions-0-json
Simple for search #1: {"filters":"D1=11.2 OR D2=11.2 OR D3=11.2 OR D4=11.2"}
But complex even for only 2 dimensions. For search #2: {"filters":"(D1=8 AND (D2=4.75 OR D3=4.75 OR D4=4.75)) OR (D2=8 AND (D1=4.75 OR D3=4.75 OR D4=4.75)) OR (D3=8 AND (D1=4.75 OR D2=4.75 OR D4=4.75)) OR (D4=8 AND (D1=4.75 OR D2=4.75 OR D3=4.75))"}
And this is not supported by Algolia, which reports: filters: filter (X AND Y) OR Z is not allowed, only (X OR Y) AND Z is allowed.
See https://www.algolia.com/doc/api-reference/api-parameters/filters/#boolean-operators
Idea: the order of slot diameters (D1, D2, D3, etc.) is not important.
multiple-dimensions-1.json: https://gist.github.com/nhoizey/a087d5b6ae91517eb13f16f8e29ee35b#file-multiple-dimensions-1-json
[
{
"objectID": "product-1",
"D": [8, 4.75, 4.75]
},
{
"objectID": "product-2",
"D": [4.75, 8, 8]
},
{
"objectID": "product-3",
"D": [8, 4.75, 11.2]
},
{
"objectID": "product-4",
"D": [3, 4.75, 2.8, 11.2]
}
]
search | Algolia filter | product-1 | product-2 | product-3 | product-4 |
---|---|---|---|---|---|
11.2 | {"filters":"D=11.2"} | | | 👍 | 👍 |
8.0 and 4.75 | {"filters":"D=8 AND D=4.75"} | 👍 | 👍 | 👍 | |
8.0, 4.75 and 4.75 | {"filters":"D=8 AND D=4.75 AND D=4.75"} | 👍 | ❌ | ❌ | |
Legend:
- 👍: value returned as expected
- ❌: returned value that should not be returned
This error is expected: nothing indicates that D=4.75 and D=4.75 must match two different slots.
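To make the failure concrete, here is a minimal sketch in plain Python (it does not call Algolia). It assumes each D=value condition only tests whether the value is present in the array, which is the behavior observed in the table above. The function names are hypothetical.

```python
from collections import Counter


def matches_presence_only(slots, wanted):
    # Each `D=value` condition only asks "is this value present?",
    # so asking for 4.75 twice adds no extra constraint.
    return all(value in slots for value in wanted)


def matches_with_counts(slots, wanted):
    # What we actually want: every requested value must be available
    # at least as many times as it was requested.
    have, need = Counter(slots), Counter(wanted)
    return all(have[value] >= count for value, count in need.items())


product_2 = [4.75, 8, 8]
wanted = [8, 4.75, 4.75]
print(matches_presence_only(product_2, wanted))  # True
print(matches_with_counts(product_2, wanted))    # False
```

The first function reproduces the ❌ for product-2; the second one is the behavior the search needs.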
Idea: Keep numbered slots, but compute all possible combinations in the index
multiple-dimensions-2.json: https://gist.github.com/nhoizey/a087d5b6ae91517eb13f16f8e29ee35b#file-multiple-dimensions-2-json
[
{
"objectID": "product-1",
"D": [
{
"D1": 8,
"D2": 4.75,
"D3": 4.75
},
{
"D1": 4.75,
"D2": 8,
"D3": 4.75
},
{
"D1": 4.75,
"D2": 4.75,
"D3": 8
}
]
},
{
"objectID": "product-2",
"D": [
{
"D1": 4.75,
"D2": 8,
"D3": 8
},
{
"D1": 8,
"D2": 4.75,
"D3": 8
},
{
"D1": 8,
"D2": 8,
"D3": 4.75
}
]
},
…
]
In theory, the number of combinations is the factorial of the number of slots: product-4 has 24 possible combinations. But here product-1 and product-2 each have two identical values, so they have half as many distinct combinations (3 instead of 6).
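The combination counts can be checked with a short sketch that enumerates the distinct orderings of a product's slot diameters. The helper name distinct_orderings is hypothetical.

```python
from itertools import permutations


def distinct_orderings(slots):
    # permutations() yields n! tuples; the set removes the duplicates
    # caused by identical diameters.
    return sorted(set(permutations(slots)))


print(len(distinct_orderings([8, 4.75, 4.75])))       # 3
print(len(distinct_orderings([4.75, 8, 8])))          # 3
print(len(distinct_orderings([3, 4.75, 2.8, 11.2])))  # 24
```

With up to 10 slots per product, the worst case is 10! = 3,628,800 orderings for a single record, which is why this approach does not scale.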
search | Algolia filter | product-1 | product-2 | product-3 | product-4 |
---|---|---|---|---|---|
11.2 | {"filters":"D.D1=11.2"} | | | 👍 | 👍 |
8.0 and 4.75 | {"filters":"D.D1=8 AND D.D2=4.75"} | 👍 | 👍 | 👍 | |
8.0, 4.75 and 4.75 | {"filters":"D.D1=8 AND D.D2=4.75 AND D.D3=4.75"} | 👍 | ❌ | ❌ | |
Legend:
- 👍: value returned as expected
- ❌: returned value that should not be returned
The issue here, with product-2 for example, is that D.D2=4.75 matches on { "D1": 8, "D2": 4.75, "D3": 8 } while D.D3=4.75 matches on the different { "D1": 8, "D2": 8, "D3": 4.75 }, but Algolia selects product-2 as matching anyway.
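The difference between the two behaviors can be sketched in plain Python. This only models the matching described above (each condition is satisfied by any element of the D array, independently of the others); it is not a description of Algolia's internals, and both function names are hypothetical.

```python
def independent_match(record, conditions):
    # Observed behavior: every condition may be satisfied by a
    # different element of the "D" array.
    return all(
        any(combo.get(key) == value for combo in record["D"])
        for key, value in conditions.items()
    )


def same_element_match(record, conditions):
    # Desired behavior: one single element must satisfy all conditions.
    return any(
        all(combo.get(key) == value for key, value in conditions.items())
        for combo in record["D"]
    )


product_2 = {
    "objectID": "product-2",
    "D": [
        {"D1": 4.75, "D2": 8, "D3": 8},
        {"D1": 8, "D2": 4.75, "D3": 8},
        {"D1": 8, "D2": 8, "D3": 4.75},
    ],
}
search_3 = {"D1": 8, "D2": 4.75, "D3": 4.75}
print(independent_match(product_2, search_3))   # True
print(same_element_match(product_2, search_3))  # False
```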
multiple-dimensions-3.json: https://gist.github.com/nhoizey/a087d5b6ae91517eb13f16f8e29ee35b#file-multiple-dimensions-3-json
[
{
"objectID": "product-1",
"DC1": {
"D1": 8,
"D2": 4.75,
"D3": 4.75
},
"DC2": {
"D1": 4.75,
"D2": 8,
"D3": 4.75
},
"DC3": {
"D1": 4.75,
"D2": 4.75,
"D3": 8
}
},
…
]
search | Algolia filter | product-1 | product-2 | product-3 | product-4 |
---|---|---|---|---|---|
11.2 | {"filters":"DC1.D1=11.2 OR DC2.D1=11.2 OR DC3.D1=11.2 OR DC4.D1=11.2"} | | | 👍 | 👍 |
8.0 and 4.75 | {"filters":"(DC1.D1=8 AND DC1.D2=4.75) OR (DC2.D1=8 AND DC2.D2=4.75) OR (DC3.D1=8 AND DC3.D2=4.75) OR (DC4.D1=8 AND DC4.D2=4.75) OR …"} | error | | | |
8.0, 4.75 and 4.75 | {"filters":"(DC1.D1=8 AND DC1.D2=4.75 AND DC1.D3=4.75) OR …"} | error | | | |
Search #2 would need 24 clauses of the form (DCn.D1=8 AND DCn.D2=4.75), and search #3 would need 24 clauses of the form (DCn.D1=8 AND DCn.D2=4.75 AND DCn.D3=4.75).
In any case, this is not supported by Algolia, as in attempt #0.
A lot less data to manage: no more than in the source.
multiple-dimensions-4.json: https://gist.github.com/nhoizey/a087d5b6ae91517eb13f16f8e29ee35b#file-multiple-dimensions-4-json
[
{
"objectID": "product-1",
"D1": 4.75,
"D2": 4.75,
"D3": 8
},
…
]
search | Algolia filter | product-1 | product-2 | product-3 | product-4 |
---|---|---|---|---|---|
11.2 | {"filters":"D1=11.2 OR D2=11.2 OR D3=11.2 OR D4=11.2"} | | | 👍 | 👍 |
8.0 and 4.75 | {"filters":"(D1=4.75 AND (D2=8 OR D3=8 OR D4=8)) OR (D2=4.75 AND (D3=8 OR D4=8)) OR (D3=4.75 AND D4=8)"} | error | | | |
8.0, 4.75 and 4.75 | {"filters":"…"} | error | | | |
The search string is a little simpler because the values are sorted. For search #2, there is no need to check D1 and D2 for the 8 value if D3=4.75 matches.
But the same Algolia limitation occurs with such a mix of AND and OR.
multiple-dimensions-5.json: https://gist.github.com/nhoizey/a087d5b6ae91517eb13f16f8e29ee35b#file-multiple-dimensions-5-json
[
{
"objectID": "product-1",
"D": "ø4.75øø4.75øø8ø"
},
{
"objectID": "product-2",
"D": "ø4.75øø8øø8ø"
},
{
"objectID": "product-3",
"D": "ø4.75øø8øø11.2ø"
},
{
"objectID": "product-4",
"D": "ø2.8øø3øø4.75øø11.2ø"
}
]
We use ø as a separator to prevent Algolia from splitting the string, as it would on punctuation.
The format is more compact, but a lot less readable.
With SQL, we could use LIKE with % to match parts of the string.
But Algolia cannot search inside a string, let alone with other characters in between.
For example, it is impossible to search for the strings ø3ø and ø11.2ø inside ø2.8øø3øø4.75øø11.2ø.
To prevent the issue with multiple equal values that we had in attempt #1, we store the number of occurrences together with the value, like "2-4.75". To allow people to look for only one slot with this value, we also add "1-4.75".
This solution was already provided by @sylvain.huprelle from Algolia, I forgot about it… 🤦♂️
multiple-dimensions-6.json: https://gist.github.com/nhoizey/a087d5b6ae91517eb13f16f8e29ee35b#file-multiple-dimensions-6-json
[
{
"objectID": "product-1",
"D": ["1-8", "1-4.75", "2-4.75"]
},
{
"objectID": "product-2",
"D": ["1-4.75", "1-8", "2-8"]
},
{
"objectID": "product-3",
"D": ["1-8", "1-4.75", "1-11.2"]
},
{
"objectID": "product-4",
"D": ["1-3", "1-4.75", "1-2.8", "1-11.2"]
}
]
search | Algolia filter | product-1 | product-2 | product-3 | product-4 |
---|---|---|---|---|---|
11.2 | {"filters":"D:'1-11.2'"} | | | 👍 | 👍 |
8.0 and 4.75 | {"filters":"D:'1-8' AND D:'1-4.75'"} | 👍 | 👍 | 👍 | |
8.0, 4.75 and 4.75 | {"filters":"D:'1-8' AND D:'2-4.75'"} | 👍 | | | |
It works!
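A minimal sketch of how the "count-value" tokens and the matching filter string could be generated. The helper names (fmt, index_tokens, build_filter) are hypothetical, and the code only builds strings: it does not call the Algolia API.

```python
from collections import Counter


def fmt(value):
    # Render 8 or 8.0 as "8" and 4.75 as "4.75" (6 significant digits at most).
    return f"{value:g}"


def index_tokens(slots):
    # A diameter present k times yields "1-d", "2-d", ..., "k-d".
    counts = Counter(slots)
    return [
        f"{n}-{fmt(value)}"
        for value, total in counts.items()
        for n in range(1, total + 1)
    ]


def build_filter(wanted):
    # A diameter requested k times only needs the single token "k-d".
    counts = Counter(wanted)
    return " AND ".join(f"D:'{n}-{fmt(value)}'" for value, n in counts.items())


print(index_tokens([8, 4.75, 4.75]))  # ['1-8', '1-4.75', '2-4.75']
print(build_filter([8, 4.75, 4.75]))  # D:'1-8' AND D:'2-4.75'
```

Because every search becomes a plain conjunction of single tokens, the filter never needs the (X AND Y) OR Z shape that was rejected in the earlier attempts.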
[
{
"objectID": "product-1",
"D1": { "min": 4.7, "max": 4.8 },
"D2": 8,
"D3": 4.75
},
{
"objectID": "product-2",
"D1": 8,
"D2": { "min": 4.7, "max": 4.8 },
"D3": { "min": 7.5, "max": 8.5 }
},
{
"objectID": "product-3",
"D1": 8,
"D2": 4.75,
"D3": 11.2
},
{
"objectID": "product-4",
"D1": 3,
"D2": 4.75,
"D3": 11.2,
"D4": 2.8
}
]
For example, the { "min": 7.5, "max": 8.5 } range gives "1-7" and "1-8".
[
{
"objectID": "product-1",
"D1": { "min": 4.7, "max": 4.8 },
"D2": 8,
"D3": 4.75,
"D": ["1-4", "2-4", "1-8"]
},
{
"objectID": "product-2",
"D1": 8,
"D2": { "min": 4.7, "max": 4.8 },
"D3": { "min": 7.5, "max": 8.5 },
"D": ["1-8", "2-8", "1-4", "1-7"]
},
{
"objectID": "product-3",
"D1": 8,
"D2": 4.75,
"D3": 11.2,
"D": ["1-8", "1-4", "1-11"]
},
{
"objectID": "product-4",
"D1": 3,
"D2": 4.75,
"D3": 11.2,
"D4": 2.8,
"D": ["1-3", "1-4", "1-2", "1-11"]
}
]
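The index above can be reproduced with a short sketch. It assumes that "rounding" means rounding down to the integer below (4.75 gives 4 and 2.8 gives 2 in the data above) and that a range covers every integer from its rounded-down minimum to its rounded-down maximum. The helper names are hypothetical.

```python
import math
from collections import Counter


def slot_buckets(slot):
    # A slot is either a single diameter or a {"min": ..., "max": ...} range.
    if isinstance(slot, dict):
        return list(range(math.floor(slot["min"]), math.floor(slot["max"]) + 1))
    return [math.floor(slot)]


def index_tokens(slots):
    # Count how many slots can cover each integer bucket, then emit
    # "1-b", "2-b", ... up to that count.
    counts = Counter(bucket for slot in slots for bucket in slot_buckets(slot))
    return [
        f"{n}-{bucket}"
        for bucket, total in counts.items()
        for n in range(1, total + 1)
    ]


product_2 = [8, {"min": 4.7, "max": 4.8}, {"min": 7.5, "max": 8.5}]
print(index_tokens(product_2))  # ['1-8', '2-8', '1-4', '1-7']
```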
The search should return at least the right products, and maybe more, which then need to be filtered out in the front end (legend: ⏳).
search | Algolia filter | product-1 | product-2 | product-3 | product-4 |
---|---|---|---|---|---|
11.2 | {"filters":"D:'1-11'"} | | | 👍 | 👍 |
8.0 and 4.75 | {"filters":"D:'1-8' AND D:'1-4'"} | 👍 | 👍 | 👍 | |
8.0, 4.75 and 4.75 | {"filters":"D:'1-8' AND D:'2-4'"} | 👍 | | | |
8.2, 4 | {"filters":"D:'1-8' AND D:'1-4'"} | ⏳ | ⏳ | ⏳ | |
8.2, 4.75 | {"filters":"D:'1-8' AND D:'1-4'"} | ⏳ | 👍 | ⏳ | |
It "works", but filtering out results in the front end means we lose the accuracy of facet counts and of pagination.
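The front-end filtering step could look like the sketch below, shown in Python for consistency with the other examples. It assumes a single-value slot must match the requested diameter exactly, that a range slot accepts any diameter within its bounds, and that each slot can hold only one tube. The function names are hypothetical, and the brute-force search is only reasonable because a product has at most 10 slots.

```python
from itertools import permutations


def slot_accepts(slot, diameter):
    # Range slots accept anything within [min, max]; plain slots need equality.
    if isinstance(slot, dict):
        return slot["min"] <= diameter <= slot["max"]
    return slot == diameter


def really_matches(slots, wanted):
    # Try every way of assigning distinct slots to the requested diameters.
    return any(
        all(slot_accepts(slot, diameter) for slot, diameter in zip(chosen, wanted))
        for chosen in permutations(slots, len(wanted))
    )


product_1 = [{"min": 4.7, "max": 4.8}, 8, 4.75]
product_2 = [8, {"min": 4.7, "max": 4.8}, {"min": 7.5, "max": 8.5}]
print(really_matches(product_1, [8.2, 4.75]))  # False: returned by the filter, then dropped
print(really_matches(product_2, [8.2, 4.75]))  # True
```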
Rounding values to 1/10th instead of integers would generate much more data, but fewer "bad" results.
For example, the { "min": 7.5, "max": 8.5 } range gives "1-7.5", "1-7.6", "1-7.7", "1-7.8", "1-7.9", "1-8", "1-8.1", "1-8.2", "1-8.3", "1-8.4" and "1-8.5".
[
{
"objectID": "product-1",
"D1": { "min": 4.7, "max": 4.8 },
"D2": 8,
"D3": 4.75,
"D": ["1-4.7", "2-4.7", "1-4.8", "1-8"]
},
{
"objectID": "product-2",
"D1": 8,
"D2": { "min": 4.7, "max": 4.8 },
"D3": { "min": 7.5, "max": 8.5 },
"D": [
"1-8",
"1-4.7",
"1-4.8",
"1-7.5",
"1-7.6",
"1-7.7",
"1-7.8",
"1-7.9",
"2-8",
"1-8.1",
"1-8.2",
"1-8.3",
"1-8.4",
"1-8.5"
]
},
{
"objectID": "product-3",
"D1": 8,
"D2": 4.75,
"D3": 11.2,
"D": ["1-8", "1-4.7", "1-11.2"]
},
{
"objectID": "product-4",
"D1": 3,
"D2": 4.75,
"D3": 11.2,
"D4": 2.8,
"D": ["1-3", "1-4.7", "1-2.8", "1-11.2"]
}
]
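A sketch of the same token generation at 1/10th granularity, again assuming values are rounded down (4.75 gives 4.7 in the data above). It uses Decimal to avoid floating-point surprises when stepping by 0.1. STEP and the helper names are hypothetical.

```python
from collections import Counter
from decimal import Decimal, ROUND_FLOOR

STEP = Decimal("0.1")  # bucket size; Decimal("0.01") would give 1/100th buckets


def steps_below(value):
    # Number of whole steps at or below the value, e.g. 4.75 -> 47.
    return int((Decimal(str(value)) / STEP).to_integral_value(rounding=ROUND_FLOOR))


def slot_buckets(slot):
    # A slot is either a single diameter or a {"min": ..., "max": ...} range.
    if isinstance(slot, dict):
        low, high = steps_below(slot["min"]), steps_below(slot["max"])
    else:
        low = high = steps_below(slot)
    # normalize() drops trailing zeros so that 8.0 is written "8".
    return [format((Decimal(i) * STEP).normalize(), "f") for i in range(low, high + 1)]


def index_tokens(slots):
    counts = Counter(bucket for slot in slots for bucket in slot_buckets(slot))
    return [
        f"{n}-{bucket}"
        for bucket, total in counts.items()
        for n in range(1, total + 1)
    ]


product_1 = [{"min": 4.7, "max": 4.8}, 8, 4.75]
print(index_tokens(product_1))  # ['1-4.7', '2-4.7', '1-4.8', '1-8']
```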
search | Algolia filter | product-1 | product-2 | product-3 | product-4 |
---|---|---|---|---|---|
11.2 | {"filters":"D:'1-11.2'"} | | | 👍 | 👍 |
8.0 and 4.75 | {"filters":"D:'1-8' AND D:'1-4.7'"} | 👍 | 👍 | 👍 | |
8.0, 4.75 and 4.75 | {"filters":"D:'1-8' AND D:'2-4.7'"} | 👍 | | | |
8.2, 4 | {"filters":"D:'1-8.2' AND D:'1-4'"} | | | | |
8.2, 4.75 | {"filters":"D:'1-8.2' AND D:'1-4.7'"} | | 👍 | | |
8, 4.77 | {"filters":"D:'1-8' AND D:'1-4.7'"} | 👍 | 👍 | ⏳ | |
This is much better. We could round to 1/100th to limit "bad" results even more, if data volume is not an issue.
Let's try with users looking for ranges.
The source data is the same as before, and the index is the same as in attempt #7.
search | Algolia filter | product-1 | product-2 | product-3 | product-4 |
---|---|---|---|---|---|
11-11.4 | {"filters":"D:'1-11'"} | | | 👍 | 👍 |
8.0 and 4.5-5 | {"filters":"D:'1-8' AND (D:'1-4' OR D:'1-5')"} | 👍 | 👍 | 👍 | |
8.0, 4.75 and 4.5-5 | {"filters":"D:'1-8' AND (D:'2-4' OR (D:'1-4' AND D:'1-5'))"} | error | | | |
8.1-8.5, 4 | {"filters":"D:'1-8' AND D:'1-4'"} | ⏳ | ⏳ | ⏳ | |
8.2, 8.1-8.5 and 4.75 | {"filters":"D:'2-8' AND D:'1-4'"} | | ⏳ | | |
7.1-9.5 | {"filters":"D:'1-7' OR D:'1-8' OR D:'1-9'"} | 👍 | 👍 | 👍 | |
We once again get the error because {"filters":"D:'1-8' AND (D:'2-4' OR (D:'1-4' AND D:'1-5'))"} mixes OR and AND in a way Algolia doesn't support.
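For the range searches that do stay within the supported (X OR Y) AND Z shape, the filter could be built as in the sketch below. It assumes the integer buckets of attempt #7 and deliberately refuses searches where two requested items share a bucket, since those are the cases that need counted tokens or the nested form shown above. The names buckets and build_filter are hypothetical.

```python
import math


def buckets(wanted):
    # A requested item is either a number or a (min, max) tuple.
    low, high = wanted if isinstance(wanted, tuple) else (wanted, wanted)
    return list(range(math.floor(low), math.floor(high) + 1))


def build_filter(wanted_items):
    groups = [buckets(item) for item in wanted_items]
    seen = set()
    for group in groups:
        if seen & set(group):
            # Overlapping items are not handled by this sketch.
            raise ValueError("requested items share a bucket")
        seen |= set(group)
    clauses = []
    for group in groups:
        ors = " OR ".join(f"D:'1-{bucket}'" for bucket in group)
        # Parentheses are only needed when an OR group sits inside an AND.
        clauses.append(f"({ors})" if len(group) > 1 and len(groups) > 1 else ors)
    return " AND ".join(clauses)


print(build_filter([(7.1, 9.5)]))     # D:'1-7' OR D:'1-8' OR D:'1-9'
print(build_filter([8.0, (4.5, 5)]))  # D:'1-8' AND (D:'1-4' OR D:'1-5')
```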