Created
September 17, 2020 17:31
-
-
Save monadplus/ccfece014a4c0822ef55ea97786c9ce5 to your computer and use it in GitHub Desktop.
Use the Index Luke
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| _ _ _ _ _ _ _ _ | |
| | | | | | | | | (_) | | | | | | | |
| | | | |___ ___ | |_| |__ ___ _ _ __ __| | _____ __ | | _ _| | _____ | |
| | | | / __|/ _ \ | __| '_ \ / _ \ | | '_ \ / _` |/ _ \ \/ / | | | | | | |/ / _ \ | |
| | |_| \__ \ __/ | |_| | | | __/ | | | | | (_| | __/> < _ | |___| |_| | < __/ | |
| \___/|___/\___| \__|_| |_|\___| |_|_| |_|\__,_|\___/_/\_( ) \_____/\__,_|_|\_\___| https://use-the-index-luke.com/sql/ | |
| |/ | |
| Data is stored in heap with no order. Index are redundant data structures for fast lookup. | |
| Index are composed by: | |
| - Index Leaf Nodes (double-linked list): | |
| - Logical order (not actually physical order) | |
| - Pointer as key to the corresponding database entry | |
| - B-Tree (balanced tree) | |
| - Each index leaf is stored in a block (a unit of storage) | |
| - The biggest index leaf of the block is stored in another block | |
| - We end up with a block that contains pointers to block ... until the leafs | |
| Index lookup steps: | |
| 1. - Tree traversal INDEX UNIQUE SCAN | |
| 2. - Following the leaf node chain INDEX RANGE SCAN | |
| 3. - Fetching the table data TABLE ACCESS BY INDEX ROWID | |
| The where clause: | |
| - Order of index columns matters ! | |
| - Do not overuse indexes, they are expensive to maintain | |
| - index on last_name doens't work for UPPER(last_name) | |
| - You need to CREATE INDEX index_up_name ON users (UPPER(last_name)) | |
| - current time [SYSDATE] (directly or by function) doesn't work | |
| - index must be deterministics | |
| - Rule of thumb: index for equality first—then for ranges. | |
| - Acces predicate: reduce the range index scan // this is what we want | |
| - Filter predicates: don't | |
| - LIKE: only characters before % 'wildcard' are used | |
| - Avoid LIKE expressions with leading wildcards (e.g., '%TERM'). | |
| SELECT first_name, last_name, date_of_birth | |
| FROM employees | |
| WHERE UPPER(last_name) < ? | |
| AND date_of_birth < ? | |
| - Possible solutions: | |
| 1) Create an index for (UPPER(last_name), date_of_birth) | |
| 2) Create one index for UPPER(last_name) and another for date_of_birth | |
| - One index scan is faster than two 😇 | |
| - Partial index: | |
| SELECT message | |
| FROM messages | |
| WHERE processed = 'N' | |
| AND receiver = ? | |
| CREATE INDEX messages_todo | |
| ON messages (receiver) | |
| WHERE processed = 'N' | |
| - The Oracle database does not include rows in an index if all indexed columns are NULL. | |
| That means that every index is a partial index—like having a where clause: | |
| CREATE INDEX idx | |
| ON tbl (A, B, C, ...) | |
| WHERE A IS NOT NULL | |
| OR B IS NOT NULL | |
| OR C IS NOT NULL | |
| ... | |
| - If all date_of_birth are NULL this index won't work: | |
| SELECT first_name, last_name | |
| FROM employees | |
| WHERE date_of_birth IS NULL | |
| - SOLUTION 😄 | |
| CREATE INDEX emp_dob ON employees (date_of_birth, '1') | |
| - Dates: | |
| SELECT ... | |
| FROM sales | |
| WHERE TRUNC(sale_date) = TRUNC(sysdate - INTERVAL '1' DAY) | |
| Solution: | |
| CREATE INDEX index_name | |
| ON sales (TRUNC(sale_date)) | |
| - Numeric strings: DO NOT USE THEM ! | |
| numeric is access predicate but string is not | |
| Implicit conversion wont use the index | |
| - Math: | |
| SELECT a, b | |
| FROM table_name | |
| WHERE 3*a + 5 = b | |
| Solution CREATE INDEX math ON table_name (3*a - b) | |
| SELECT a, b | |
| FROM table_name | |
| WHERE 3*a - b = -5 | |
| - Index clustering: | |
| Suppose we have an index for subsidiary_id | |
| SELECT first_name, last_name, subsidiary_id, phone_number | |
| FROM employees | |
| WHERE subsidiary_id = ? // access predicate | |
| AND UPPER(last_name) LIKE '%INA%' // Table access | |
| Let's cluster the data to avoid full table access | |
| CREATE INDEX empsubupnam ON employees | |
| (subsidiary_id, UPPER(last_name)) | |
| Now the query is faster because the UPPER(last_name) is in order. | |
| - Avoid Table access, Index-Only Scan: | |
| SELECT SUM(eur_value) | |
| FROM sales | |
| WHERE subsidiary_id = ? | |
| - Solution: | |
| CREATE INDEX sales_sub_eur | |
| ON sales | |
| ( subsidiary_id, eur_value ) | |
| - Be aware ! This query won't use the index at all ! | |
| SELECT SUM(eur_value) | |
| FROM sales | |
| WHERE subsidiary_id = ? | |
| AND sale_date > ? | |
| - Order by: an order by clause do not need to sort the result explicitly if the relevant index already delivers the rows in the required order. | |
| CREATE INDEX sales_dt_pr ON sales (sale_date, product_id) | |
| SELECT sale_date, product_id, quantity | |
| FROM sales | |
| WHERE sale_date = TRUNC(sysdate) - INTERVAL '1' DAY | |
| ORDER BY sale_date, product_id | |
| ASC/DESC matters ? No, if all the statements follow the same order | |
| - Databases can read indexes in both directions. | |
| Mixed-order: | |
| - CREATE INDEX sales_dt_pr ON sales (sale_date ASC, product_id DESC) | |
| - Group by: same as Order by wihtout ASC/DESC issues |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment