makeittotop · October 12, 2016 06:02
diff --git a/sql performance explained b/sql performance explained
 Question
 Is the following SQL good or bad practice from a performance perspective?

 Searching for all rows with the year 2012:

 CREATE INDEX tbl_idx ON tbl (date_column);

 SELECT text, date_column
  FROM tbl
 WHERE YEAR(date_column) = '2012';
 Given Answer
 Good practice
 There is no major improvement possible.
 Explanation
 Wrapping the table column in a function renders the index useless for this query.

 Write queries for continuous periods as explicit range condition:

 SELECT text, date_column
  FROM tbl
 WHERE date_column >= STR_TO_DATE('2012-01-01', '%Y-%m-%d')
   AND date_column <  STR_TO_DATE('2013-01-01', '%Y-%m-%d');
 See also: Using DATE columns

 Question
 Is the following SQL good or bad practice from a performance perspective?

 To find the most recent row

 CREATE INDEX tbl_idx ON tbl (a, date_column);

 SELECT id, date_column
  FROM tbl
 WHERE a = ?
 ORDER BY date_column DESC
 LIMIT 1;
 Given Answer
 Good practice
 There is no major improvement possible.
 Explanation
 The statement can be executed as an indexed Top-N query. It performs just like a B-Tree traversal only so it's very efficient.

 The trick is that the index supports the where as well as the order by clause. The database uses the index to find the last entry that matches the where clause and takes it as result. There is no need to actually perform a sort for the order by.

 See also: Querying Top-N Rows in my Book SQL Performance Explained

 Question
 Is the following SQL good or bad practice from a performance perspective?

 Two queries, searching by a common column:

 CREATE INDEX tbl_idx ON tbl (a, b);

 SELECT id, a, b
  FROM tbl
 WHERE a = 123
   AND b = 42;


 SELECT id, a, b
  FROM tbl
 WHERE b = 42;
 Given Answer
 Good practice
 There is no major improvement possible.
 Explanation
 The index covers the first query only, the second query cannot use the index to the best extent possible.

 Changing the column order makes the index suitable for both queries—without additional overhead. The index should therefore look like this (columns exchanged):

 CREATE INDEX tbl_idx ON tbl (b, a);
 See also: Multi-Column Indexes

 Question
 Is the following SQL troublesome or bulletproof from a performance perspective?

 Searching within a string:

 CREATE INDEX tbl_idx ON tbl (text);

 SELECT id, text
  FROM tbl
 WHERE text LIKE '%TERM%';
 Given Answer
 Troublesome
 There is high risk for performance problems.
 Explanation
 LIKE expressions starting with a wildcard cannot use an index to locate the matching entries. There is no simple way to tune such a query. Use another access path if possible (e.g., additional where conditions). Otherwise consider using a full-text index.

 See also: A visual explanation why SQL's LIKE is slow

 Question
 How will the change affect query performance?

 Current situation, selecting about hundred rows out of a million

 CREATE INDEX tab_idx ON tbl (a, date_column);

 SELECT date_column, count(*)
  FROM tbl
 WHERE a = 123
 GROUP BY date_column;
 Changed query, selecting about ten rows out of a million

 SELECT date_column, count(*)
  FROM tbl
 WHERE a = 123
   AND b = 42
 GROUP BY date_column;
 Given Answer
 The query will be much slower (impact >10%)
 Explanation
 The query will be much slower—regardless of the data. The original query is executed as an index-only scan. It doesn't need to access the table because the index covers the entire query—all referenced columns are covered in the index. Although the additional where clause reduces the number of returned rows, it requires a table access to fetch the column B, which is not included in the index. That means that the new query cannot run as an index-only scan; it must access the table as well. This access is additional work that slows the query down—regardless whether the final result is smaller due to this filter.

 See also: Index-Only Scan in my book SQL Performance Explained.
	Question
	Is the following SQL good or bad practice from a performance perspective?

	Searching for all rows with the year 2012:

	CREATE INDEX tbl_idx ON tbl (date_column);

	SELECT text, date_column
	FROM tbl
	WHERE YEAR(date_column) = '2012';
	Given Answer
	Good practice
	There is no major improvement possible.
	Explanation
	Wrapping the table column in a function renders the index useless for this query.

	Write queries for continuous periods as explicit range condition:

	SELECT text, date_column
	FROM tbl
	WHERE date_column >= STR_TO_DATE('2012-01-01', '%Y-%m-%d')
	AND date_column < STR_TO_DATE('2013-01-01', '%Y-%m-%d');
	See also: Using DATE columns

	Question
	Is the following SQL good or bad practice from a performance perspective?

	To find the most recent row

	CREATE INDEX tbl_idx ON tbl (a, date_column);

	SELECT id, date_column
	FROM tbl
	WHERE a = ?
	ORDER BY date_column DESC
	LIMIT 1;
	Given Answer
	Good practice
	There is no major improvement possible.
	Explanation
	The statement can be executed as an indexed Top-N query. It performs just like a B-Tree traversal only so it's very efficient.

	The trick is that the index supports the where as well as the order by clause. The database uses the index to find the last entry that matches the where clause and takes it as result. There is no need to actually perform a sort for the order by.

	See also: Querying Top-N Rows in my Book SQL Performance Explained

	Question
	Is the following SQL good or bad practice from a performance perspective?

	Two queries, searching by a common column:

	CREATE INDEX tbl_idx ON tbl (a, b);

	SELECT id, a, b
	FROM tbl
	WHERE a = 123
	AND b = 42;


	SELECT id, a, b
	FROM tbl
	WHERE b = 42;
	Given Answer
	Good practice
	There is no major improvement possible.
	Explanation
	The index covers the first query only, the second query cannot use the index to the best extent possible.

	Changing the column order makes the index suitable for both queries—without additional overhead. The index should therefore look like this (columns exchanged):

	CREATE INDEX tbl_idx ON tbl (b, a);
	See also: Multi-Column Indexes

	Question
	Is the following SQL troublesome or bulletproof from a performance perspective?

	Searching within a string:

	CREATE INDEX tbl_idx ON tbl (text);

	SELECT id, text
	FROM tbl
	WHERE text LIKE '%TERM%';
	Given Answer
	Troublesome
	There is high risk for performance problems.
	Explanation
	LIKE expressions starting with a wildcard cannot use an index to locate the matching entries. There is no simple way to tune such a query. Use another access path if possible (e.g., additional where conditions). Otherwise consider using a full-text index.

	See also: A visual explanation why SQL's LIKE is slow

	Question
	How will the change affect query performance?

	Current situation, selecting about hundred rows out of a million

	CREATE INDEX tab_idx ON tbl (a, date_column);

	SELECT date_column, count(*)
	FROM tbl
	WHERE a = 123
	GROUP BY date_column;
	Changed query, selecting about ten rows out of a million

	SELECT date_column, count(*)
	FROM tbl
	WHERE a = 123
	AND b = 42
	GROUP BY date_column;
	Given Answer
	The query will be much slower (impact >10%)
	Explanation
	The query will be much slower—regardless of the data. The original query is executed as an index-only scan. It doesn't need to access the table because the index covers the entire query—all referenced columns are covered in the index. Although the additional where clause reduces the number of returned rows, it requires a table access to fetch the column B, which is not included in the index. That means that the new query cannot run as an index-only scan; it must access the table as well. This access is additional work that slows the query down—regardless whether the final result is smaller due to this filter.

	See also: Index-Only Scan in my book SQL Performance Explained.