Each index is split into segments of 8 sorted 32-bit integers (8 only to keep the example small enough to read). Each integer is a record_id that matches the term. Each segment is stored as one value in a Key/Value store.
Index A represents the rows that have the term "Canada" in a Country column. Index B represents the rows that have the term "Ontario" in a Province column.
Segments from both indexes will be read off disk through the Key/Value store and intersected to evaluate a conjunction (AND) query.
Index A | Index B
-----------------
Segment 1
-----------------
2 | 18 <--- Should skip intersection until Segment 2 of Index A is decoded?
4 | 20
6 | 22
8 | 24
10 | 26
12 | 28
14 | 30
16 | 48 <--- Notice this record_id.
-----------------
Segment 2
-----------------
18 |
20 |
22 |
24 |
26 |
28 |
30 |
32 |
-----------------
Segment 3
-----------------
34 |
36 |
38 |
40 |
42 |
44 |
46 |
48 |
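For reference, here is a minimal sketch of the intersection I have in mind, written against the example data above. It assumes the largest record_id of each segment can be known without decoding the segment (for example by storing it next to the key); that is an assumption, not something the store gives me today. The function name `intersect_segments` and the in-memory lists are purely illustrative.

```python
from typing import List

# Example data from the table above: Index A has three segments, Index B has one.
index_a = [
    [2, 4, 6, 8, 10, 12, 14, 16],
    [18, 20, 22, 24, 26, 28, 30, 32],
    [34, 36, 38, 40, 42, 44, 46, 48],
]
index_b = [
    [18, 20, 22, 24, 26, 28, 30, 48],
]


def intersect_segments(a_segments: List[List[int]],
                       b_segments: List[List[int]]) -> List[int]:
    """Intersect two indexes stored as lists of sorted segments."""
    # Flatten B for simplicity; a real reader would stream its segments.
    b_values = [v for seg in b_segments for v in seg]
    result = []
    j = 0  # cursor into B; it only ever moves forward
    for seg in a_segments:
        if j >= len(b_values):
            break  # B is exhausted, nothing more can match
        # Assumption: seg[-1] (the segment maximum) is available as metadata,
        # so a segment entirely below B's cursor is skipped without decoding.
        if seg[-1] < b_values[j]:
            continue
        for a in seg:
            # Advance B until it reaches or passes the current A value.
            while j < len(b_values) and b_values[j] < a:
                j += 1
            if j == len(b_values):
                break
            if b_values[j] == a:
                result.append(a)
                j += 1
    return result


matches = intersect_segments(index_a, index_b)
```

In this sketch Segment 1 of Index A is skipped because its maximum (16) is below the first value of Index B (18), and the cursor into Index B carries over from Segment 2 to Segment 3, so Index B is walked once rather than once per segment.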
Questions:
- Index B contains record_ids that fall in both Segment 2 and Segment 3 of Index A. Do I have to intersect Index B twice, once against each of those segments?
- If the indexes are gigabytes in size, how do I know which segments need to be rewritten when a row is modified?
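
To make the second question concrete, here is a rough sketch of one possible lookup, assuming each segment is keyed by the first record_id it held when it was written and that those keys are kept in a sorted list. The helpers `find_segment` and `remove_record` and the `first_ids` list are hypothetical names for illustration, not part of any real Key/Value store API.

```python
import bisect
from typing import List, Optional

# Index A from the example, plus the (assumed) sorted list of segment keys.
segments = [
    [2, 4, 6, 8, 10, 12, 14, 16],
    [18, 20, 22, 24, 26, 28, 30, 32],
    [34, 36, 38, 40, 42, 44, 46, 48],
]
first_ids = [2, 18, 34]  # lower bound of each segment, used as its key


def find_segment(first_ids: List[int], record_id: int) -> int:
    """Return the position of the segment whose range covers record_id."""
    # The rightmost key <= record_id identifies the segment.
    pos = bisect.bisect_right(first_ids, record_id) - 1
    return max(pos, 0)  # ids below the first key fall into segment 0


def remove_record(segments: List[List[int]], first_ids: List[int],
                  record_id: int) -> Optional[int]:
    """Remove record_id and return the one segment that must be rewritten."""
    i = find_segment(first_ids, record_id)
    seg = segments[i]
    k = bisect.bisect_left(seg, record_id)
    if k < len(seg) and seg[k] == record_id:
        del seg[k]
        return i  # only this segment changed
    return None  # record_id was not in the index


touched = remove_record(segments, first_ids, 20)
```

With this layout a modified row touches a single segment per index, found by binary search over the keys. The key acts only as a lower bound for the segment, so it stays valid even if the first record_id in the segment is later removed.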