Last active
November 19, 2020 06:39
-
-
Save AndreKR/1e1570633fa4daf97e98566149a30e80 to your computer and use it in GitHub Desktop.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
To help you with the process of converting a MySQL app to PostgreSQL, I collected a list of differences between MySQL and PostgreSQL (PG). | |
Important changes: | |
* Strings are quoted with '...' or with $token$...$token$. Single-quotes are escaped with single-quotes. | |
* Identifiers are folded to lowercase, unless they are quoted with "..." which makes them case-sensitive | |
* The max identifier length is 63 | |
* There are some system columns that cannot be used as an identifier (probably not a problem) | |
* Expressions are evaluated in arbitrary order, so WHERE x > 0 AND y/x > 1.5 can lead to a division by zero and has to be replaced with WHERE CASE WHEN y > 0 THEN y/x > 1.5 ELSE FALSE END | |
* SELECT CASE WHEN x > 0 THEN x ELSE 1/0 END would still lead to a division by zero because the constant 1/0 is evaluated even before the query starts | |
* Permissions can only be managed on the database level. (And row level with Row Security Policies.) | |
Types: | |
* SERIAL instead of AUTO_INCREMENT | |
* VARCHAR (for text) doesn't need a length limit, is called text and does not allow 0-bytes and invalid UTF-8 characters | |
* VARCHAR (for binary data) doesn't need a length limit and is called bytea | |
* DATETIME is called timestamp | |
* Casts are done by simply prepending the type (or appending ::type) | |
* There's a proper boolean type | |
* ENUM types have to be created explicitly or use text (see below) | |
* There is a jsonb type (and json which preserves the original text) | |
* There is a GUID type | |
* There are IP address types | |
* There are geometric types | |
* There are range types (time ranges, price ranges, etc.) that can be unioned, etc. | |
Features we should use to replace equivalent MySQL constructs: | |
* timestamp with time zone | |
* FULL OUTER JOIN | |
* Foreign keys (`something_id integer references somethings` points to the primary key of `somethings`) | |
* FILTER for aggregate functions | |
* Additional constraints with CHECK | |
* VALUES() subselects instead of SELECT 'a' UNION SELECT 'b' ... | |
* Condition pushdown and subquery caching | |
* Functionally dependent columns are properly detected, so we don't have to give all columns in GROUP BY if we give a unique key | |
* Grouping sets are WITH ROLLUP on steroids | |
* We can define distinctness with SELECT DISTINCT ON ... | |
* Besides UNION we also have intersection and non-symmetric diff | |
* Expression indexes | |
* Partial indexes | |
* RECURSIVE queries | |
* DELETE/UPDATE + RETURNING | |
* DELETE/UPDATE + RETURNING + WITH (note that they are running at the same time) | |
* Arbitrary precision (but slow) math with NUMERIC (= DECIMAL) | |
* NUMERIC does not need to be limited in scale and precision | |
* BETWEEN SYMMETRIC (like BETWEEN but works backwards) | |
* IS DISTINCT FROM combines = with IS NULL | |
* There are more string operators and fuctions | |
* Hahaha, there is a square root operator and it's |/ | |
* JSON instead of GROUP_CONCAT | |
* Index sort order is honored | |
* OR is not ridiculously expensive because indexes can be combined to some extent | |
Additional features: | |
* Data migrations with USING | |
* UNIQUE indexes can contain more than one NULL value | |
* Default values can be expressions | |
* RANK() and other aggregate functions | |
* VIEWs | |
* Indexes with operator classes | |
* ARRAYs and unnest() | |
* Multiple unnest()s has two modes: same number of elements and different number of elements | |
* JOINs to set returning functions like unnest() or generate_series() | |
* Foreign keys can have some NULL values (see `MATCH`) | |
* Bit string, text search and XML types | |
* There are array types | |
* There are custom composite types (structs) | |
* We can run EXPLAIN on UPDATE queries if we use ROLLBACK | |
Finicky features: | |
* Window functions (GROUPs over other rows that share some property) | |
* Table inheritance | |
* Row security policies | |
* Table subselects can be named with WITH | |
Useless features: | |
* money type | |
Other features: | |
* Owners | |
* If permission has been given with WITH GRANT OPTION, it is revoked when the GRANT OPTION is taken away | |
ENUMs: | |
There are three typical ways to represent ENUMs in PostgreSQL: | |
* User-defined ENUM type | |
* text with CHECK constraint | |
* ID and lookup table | |
The text solution isn't as bad as one might think: | |
ENUM text | |
Key ---------------------------------- ----------------------------------- | |
Length Rows Storage* Index Toast Table Storage* Index Toast Table Loss of text | |
-------- --------- ---------- ------- ------- ------- ---------- ------- ------- ------- -------------- | |
1 1000000 4 21 0 42 2 21 0 42 0 MB | |
4 1000000 4 21 0 42 5 21 0 42 0 MB | |
7 1000000 4 21 0 42 8 21 0 42 0 MB | |
10 1000000 4 21 0 42 11 21 0 50 8 MB | |
7 5000000 4 107 0 211 8 107 0 211 0 MB | |
* Storage requirement according to docs |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment