I stumbled upon a rather uncommon feature of PostgreSQL recently, namely, clause RETURNING
. Let's say I have a database which contains a table products
. I needed to copy some products directly in SQL, and then to invoke a function with the ids of both the old and the new products (for some side-effects irrelevant here).
At first I felt quite confident that a simple INSERT
would do the job, but how to get the new ids in the most elegant way? As the doc states:
INSERT conforms to the SQL standard, except that the RETURNING clause is a PostgreSQL extension
So here it is:
INSERT INTO products (name,
account_id,
is_active)
SELECT
name,
1337, -- overload account_id for all newly created products.
is_active
FROM products
WHERE id IN (1, 2, 3, 4) -- ordered, but is that relevant?
RETURNING id;
which returns
id |
---|
5 |
6 |
7 |
8 |
The question is: can we assume that product #1 has been copied onto product #5, and so on? In other words: does INSERT … RETURNING
guarantee the order, or provide a to guarantee it?
First, let's have a simple smoke test. If its result looks right, it doesn't formally guarantee that everything will be alright, but if something gets wrong, it strongly means there here a problem.
SELECT
name,
id
FROM products
WHERE id IN (
1,
2,
3,
4, -- ids of copied products
5,
6,
7,
8 -- ids of the copies
)
ORDER BY id ASC;
which returns:
name | id |
---|---|
'Product A' | 1 |
'Product B' | 2 |
'Product C' | 3 |
'Product D' | 4 |
'Product D' | 5 |
'Product C' | 6 |
'Product A' | 7 |
'Product B' | 8 |
Obvisouly, the order isn't guaranteed. Why that? Because it depends on how postreSQL finds the rows to operate on in the table. the condition is the id
belongs to a collection but this doesn't put any constraint on the order of the result, which more or less depends on the way the table is indexed.
The trick here is to add a clause ORDER BY
in the select. For the sake of clarity, let's pretend we reset the DB to its initial state so we get the same returned ids and they're easier to reason about.
INSERT INTO products (name,
account_id,
is_active)
SELECT
name,
1337,
is_active
FROM products
WHERE id IN (4, 2, 1, 3) -- not ordered
ORDER BY id ASC -- add this
RETURNING id;
which returns
id |
---|
5 |
6 |
7 |
8 |
and now, let's perform the same smoke test:
SELECT
name,
id
FROM products
WHERE id IN (
1,
2,
3,
4, -- ids of copied products
5,
6,
7,
8 -- ids of the copies
)
ORDER BY id ASC;
which outputs:
name | id |
---|---|
'Product A' | 1 |
'Product B' | 2 |
'Product C' | 3 |
'Product D' | 4 |
'Product A' | 5 |
'Product B' | 6 |
'Product C' | 7 |
'Product D' | 8 |
However, this is merely a smoke test. As previously stated, it can only show obvious mistakes, not guarantee any ordering. Let's get more confident about this and read the documentation again:
If the INSERT command contains a RETURNING clause, the result will be similar to that of a SELECT statement containing the columns and values defined in the RETURNING list, computed over the row(s) inserted or updated by the command.
According to the manual, it's enough to add ORDER BY id ASC
to the clause SELECT
in order to guaranteee the order of returned ids. Good to know!
(actually I got a boring bug in production because of this. Now I'm aware of how RETURNING
works ^^)