Skip to content

Instantly share code, notes, and snippets.

View den-crane's full-sized avatar
πŸ™€
What's happening?

Denny [DBA at Innervate] den-crane

πŸ™€
What's happening?
View GitHub Profile
https://fiddle.clickhouse.com/fabe3448-a682-4866-af76-90bdf816d74a
CREATE TABLE dummy (date Date, id String )
ENGINE = Null;
CREATE TABLE t (
date Date,
idstate13 AggregateFunction(uniqCombined64(13), String),
idstate17 AggregateFunction(uniqCombined64(17), String),
https://fiddle.clickhouse.com/afd6bac1-a4c1-4d51-bed8-2043012861c1
insert into function file(currentDatabase()||'_parallelize_output_from_storages_dedup_test.tsv', 'TSV')
select number, toString(arrayMap(x->cityHash64(x), range(number%11)))
from numbers(10000) order by cityHash64(number);
set min_insert_block_size_bytes=1000000,
min_chunk_bytes_for_parallel_parsing = 10000;
set parallelize_output_from_storages = 0;
@den-crane
den-crane / Clickhouse_index_leverage_monotonic_functions.md
Created January 31, 2025 23:39
Clickhouse index leverage for monotonic functions

Let me explain how ClickHouse can utilize an index in this case, despite the expression intDiv(X, 10) = ? appearing non-sargable at first glance.

The key lies in ClickHouse's ability to perform monotonic function analysis. Here's how it works:

First, let's understand what makes intDiv(X, 10) special. The integer division function is monotonic - as X increases, intDiv(X, 10) either increases or stays the same, never decreases. For example:

X = 45 -> intDiv(X, 10) = 4
X = 46 -> intDiv(X, 10) = 4
X = 50 -> intDiv(X, 10) = 5
@den-crane
den-crane / gist:4676ad82e7c6db0db4866ddbd111159b
Last active June 2, 2024 22:52
uuid4, uuid7, SnowflakeID, ULID compression rate

https://fiddle.clickhouse.com/35bf07da-101e-4e55-a408-1a597099f99b

CREATE TABLE t(
  int64 UInt64 codec(Delta, ZSTD(3)),
  ulid FixedString(26) default generateULID() codec(ZSTD(3)),
  uuid4 UUID default generateUUIDv4() codec(ZSTD(3)),
  Snowflake UInt64 default generateSnowflakeID()  codec(Delta,ZSTD(3)),
  uuid7 UUID default generateUUIDv7() codec(ZSTD(3))
)ENGINE = MergeTree() order by int64;
create table t Engine=Memory empty as select * from format('JSONEachRow', '{
"ID": "<id>",
"FrameUri": "<uri>",
"Detections": [
{
"CategoryId": 1,
"ID": "<id>",
"SegmentationPoints": [
{
"X": 1,
2024.03.08 03:54:38.022312 [ 141144 ] {} <Trace> SystemLog (system.query_log): Flushing system log, 2 entries to flush up to offset 86304192
2024.03.08 03:54:38.024139 [ 140747 ] {} <Trace> BaseDaemon: Received signal 11
2024.03.08 03:54:38.024287 [ 930727 ] {} <Fatal> BaseDaemon: ########## Short fault info ############
2024.03.08 03:54:38.024679 [ 930727 ] {} <Fatal> BaseDaemon: (version 24.1.3.31 (official build), build id: E65ACEFD4C4A4F209A1529998C6032754B52A0FC, git hash: 135b08cbd28a5832e9e70c3b7d09dd4134845ed3) (from thread 141144) Received signal 11
2024.03.08 03:54:38.024693 [ 930727 ] {} <Fatal> BaseDaemon: Signal description: Segmentation fault
2024.03.08 03:54:38.024700 [ 930727 ] {} <Fatal> BaseDaemon: Address: 0x70. Access: read. Address not mapped to object.
2024.03.08 03:54:38.024705 [ 930727 ] {} <Fatal> BaseDaemon: Stack trace: 0x0000000007234f2e 0x0000000010d3bdd3 0x00000000114c0215 0x00000000114c39f2 0x000000001170b275 0x000000000c931314 0x000000000c8ee83e 0x00007f5233abdfd4 0x00007f5233b
@den-crane
den-crane / wierd_compression.sql
Created October 30, 2023 21:20
wierd_compression
CREATE TABLE t
(
key_zstd1 Int64 CODEC(ZSTD(1)),
key_lz4 Int64 CODEC(LZ4),
key_t64_zstd Int64 CODEC(T64,ZSTD(1)),
key_int8 Int8 CODEC(ZSTD(1)),
r Int32
)
ENGINE = MergeTree
ORDER BY (key_zstd1, r);
@den-crane
den-crane / uniqHLL12_vs_uniqTheta
Created October 25, 2023 21:48
uniqHLL12 vs uniqTheta
create table x (a Int64, b Int64, c Int64,
u AggregateFunction(uniqHLL12, String),
t AggregateFunction(uniqTheta, String))
Engine=MergeTree order by (a,b,c);
insert into x select number%11111 a, number%44 b, number%3 c, uniqHLL12State(number::String), uniqThetaState(number::String)
from numbers(1e7) group by a,b,c;
optimize table x final;
@den-crane
den-crane / ttl_merge_of_wide_rows_mem_usage.md
Last active September 6, 2023 22:05
ttl_merge_of_wide_rows_mem_usage
CREATE TABLE wide_rows2(`A` Int64, `D` Date, `S` String)
ENGINE = MergeTree PARTITION BY toYYYYMM(D) ORDER BY A
settings merge_with_ttl_timeout=300, materialize_ttl_recalculate_only = 1;

insert into wide_rows2 select number, toDate('2020-01-01')+ number%20, arrayMap(i-> cityHash64((number*i)%999), range(2000)) from numbers(1e5);
insert into wide_rows2 select number, toDate('2020-01-01')+ number%20, arrayMap(i-> cityHash64((number*i)%899), range(2000)) from numbers(1e5);
insert into wide_rows2 select number, toDate('2020-01-01')+ number%20, arrayMap(i-> cityHash64((number*i)%799), range(2000)) from numbers(1e5);
insert into wide_rows2 select number, toDate('2020-01-01')+ number%20, arrayMap(i-> cityHash64((number*i)%699), range(2000)) from numbers(1e5);
set materialize_ttl_after_modify=0;
@den-crane
den-crane / ch_vs_sr.md
Last active July 20, 2023 21:54
ch vs sr

CH 1-node

SELECT
    count(),
    min(paramkey),
    max(paramkey),
    min(ind),
    max(ind)
FROM data_table
β”Œβ”€β”€β”€count()─┬────────min(paramkey)─┬───────max(paramkey)─┬─min(ind)─┬──max(ind)─┐