Skip to content

Instantly share code, notes, and snippets.

@davetapley
Last active December 23, 2015 00:19
Show Gist options
  • Save davetapley/6553216 to your computer and use it in GitHub Desktop.
Save davetapley/6553216 to your computer and use it in GitHub Desktop.
Seemingly identical queries returning different results
-- In first version WHERE is outside JOIN:
SELECT * FROM t1
INNER JOIN t2 ON t1.t2_id = t2.id
WHERE t2.some_boolean;
-- In second version condition is lifted into JOIN condition
SELECT * FROM t1
INNER JOIN t2 ON t1.t2_id = t2.id
AND t2.some_boolean;
-- First version:
=# WITH q_geocodes(name, id, utc_offset) AS (VALUES ('SF', 6096070, -7),('NYC', 6164533, -4))
SELECT
q_geocodes.name,
COUNT(events.*)
FROM q_geocodes
INNER JOIN places ON q_geocodes.id = places.geocode_id
INNER JOIN events AS events
ON events.place_id = places.id
AND events.category_id = 8
AND ((EXTRACT(EPOCH FROM end_time) - EXTRACT(EPOCH FROM start_time)) BETWEEN (4*3600) AND (12*3600)
AND EXTRACT(HOUR FROM end_time) BETWEEN (4 - utc_offset) AND (12 - utc_offset)) AND EXTRACT(DOW FROM end_time) IN (6,0)
WHERE places.category_id = 2 AND (place_events_count > 2)
GROUP BY q_geocodes.name, q_geocodes.utc_offset;
name | count
------+-------
SF | 3423
NYC | 2950
(2 rows)
-- Second version:
=# WITH q_geocodes(name, id, utc_offset) AS (VALUES ('SF', 6096070, -7),('NYC', 6164533, -4))
SELECT
q_geocodes.name,
COUNT(events.*)
FROM q_geocodes
INNER JOIN places
ON q_geocodes.id = places.geocode_id
AND places.category_id = 2 AND place_events_count > 2
INNER JOIN events AS events
ON events.place_id = places.id
AND ((EXTRACT(EPOCH FROM end_time) - EXTRACT(EPOCH FROM start_time)) BETWEEN (4*3600) AND (12*3600)
AND EXTRACT(HOUR FROM end_time) BETWEEN (4 - utc_offset) AND (12 - utc_offset)) AND EXTRACT(DOW FROM end_time) IN (6,0)
GROUP BY q_geocodes.name, q_geocodes.utc_offset;
name | count
------+-------
SF | 3933
NYC | 3478
(2 rows)
@davetapley
Copy link
Author

The real queries are identical except that places.category_id = 2 AND place_events_count > 2 is given either as a WHERE, or an ON.

To my mind this makes it the same as the simplified example, yet I'm getting different results.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment