Looks like filters are pushed down when filtering an "UNION ALL". Also an example on how to use EXPLAIN
in clickhouse and a different view of seeing what is going on with the traces, this lines show how much data clickhouse is reading:
Selected 1 parts by date, 1 parts by key, 2 marks by primary key, 2 marks to read from 1 ranges
Reading approx. 16384 rows with 1 streams
The example
➜ tests clickhouse client -q "select version()"
20.8.2.3
➜ tests cat union_all_push_down.sql
drop table if exists a;
drop table if exists b;
create table a (
a Date,
b Int32
) Engine = MergeTree() order by a;
create table b (
a Date,
b Int32
) Engine = MergeTree() order by a;
insert into a select today() - toIntervalDay(number/1000) as a, number as b from numbers(1000000);
insert into b select today() - toIntervalDay(number/1000) as a, number as b from numbers(1000000);
explain plan
select count() from (
select * from a
union all
select * from b
)
where a > today() - interval 3 day;
➜ tests cat union_all_push_down.sql | clickhouse client --send_logs_level=trace -mn 2>&1 | grep SelectExecutor
[xyz] 2020.11.18 13:23:13.662368 [ 46219 ] {e3edaa54-3a22-4e06-8119-4e6a2b2912c3} <Debug> default.a (SelectExecutor): Key condition: (column 0 in [18582, +inf))
[xyz] 2020.11.18 13:23:13.662407 [ 46219 ] {e3edaa54-3a22-4e06-8119-4e6a2b2912c3} <Trace> default.a (SelectExecutor): Used optimized inclusion search over index for part all_1_1_0 with 16 steps
[xyz] 2020.11.18 13:23:13.662416 [ 46219 ] {e3edaa54-3a22-4e06-8119-4e6a2b2912c3} <Debug> default.a (SelectExecutor): Selected 1 parts by date, 1 parts by key, 2 marks by primary key, 2 marks to read from 1 ranges
[xyz] 2020.11.18 13:23:13.662453 [ 46219 ] {e3edaa54-3a22-4e06-8119-4e6a2b2912c3} <Trace> default.a (SelectExecutor): Reading approx. 16384 rows with 1 streams
[xyz] 2020.11.18 13:23:13.662567 [ 46219 ] {e3edaa54-3a22-4e06-8119-4e6a2b2912c3} <Debug> default.b (SelectExecutor): Key condition: (column 0 in [18582, +inf))
[xyz] 2020.11.18 13:23:13.662588 [ 46219 ] {e3edaa54-3a22-4e06-8119-4e6a2b2912c3} <Trace> default.b (SelectExecutor): Used optimized inclusion search over index for part all_1_1_0 with 16 steps
[xyz] 2020.11.18 13:23:13.662600 [ 46219 ] {e3edaa54-3a22-4e06-8119-4e6a2b2912c3} <Debug> default.b (SelectExecutor): Selected 1 parts by date, 1 parts by key, 2 marks by primary key, 2 marks to read from 1 ranges
[xyz] 2020.11.18 13:23:13.662618 [ 46219 ] {e3edaa54-3a22-4e06-8119-4e6a2b2912c3} <Trace> default.b (SelectExecutor): Reading approx. 16384 rows with 1 streams
➜ tests cat union_all_push_down.sql | clickhouse client -mn
Expression (Projection)
Expression (Before ORDER BY and SELECT)
Aggregating
Expression (Before GROUP BY)
Filter (WHERE)
Union
Expression (Projection)
Expression (Before ORDER BY and SELECT)
Filter (WHERE)
ReadFromStorage (Read from MergeTree)
Expression (Projection)
Expression (Before ORDER BY and SELECT)
Filter (WHERE)
ReadFromStorage (Read from MergeTree)