Skip to content

Instantly share code, notes, and snippets.

@ryogrid
Last active November 13, 2024 16:25
Show Gist options
  • Select an option

  • Save ryogrid/4496db0e9fffa5695a7b6f13f3536f9a to your computer and use it in GitHub Desktop.

Select an option

Save ryogrid/4496db0e9fffa5695a7b6f13f3536f9a to your computer and use it in GitHub Desktop.
citusをdocker-composeで動かす

環境

  • WSL Ubuntu 20.04 (WSL2)
  • Docker version 24.0.5, build ced0996

Dockerイメージを作る

docker-composeでクラスタを起動

  • docker-compose -p citus up --scale worker=5
    • "${POSTGRES_PASSWORD}" となっている箇所を "${POSTGRES_PASSWORD:-postgres}" に書き換えた
    • 修正は必要なかったかもしれない

構築したクラスタに接続

  • psql -h 127.0.0.1 -p 5432 -U postgres
    • パスワード入力は不要
    • psqlコマンドは使えるようにしておくこと
      • psql --version
        • psql (PostgreSQL) 12.16 (Ubuntu 12.16-0ubuntu0.20.04.1)

お試し手順を実行してみる

 
postgres=# SELECT master_get_active_worker_nodes();
 master_get_active_worker_nodes
--------------------------------
 (citus_worker_2,5432)
 (citus_worker_4,5432)
 (citus_worker_5,5432)
 (citus_worker_3,5432)
 (citus_worker_1,5432)
(5 rows)

postgres=# -- insert some events
postgres=# INSERT INTO events (device_id, data)
postgres-# SELECT s % 100, ('{"measurement":'||random()||'}')::jsonb FROM generate_series(1,1000000) s;
INSERT 0 1000000

postgres=# -- get the last 3 events for device 1, routed to a single node
postgres=# SELECT * FROM events WHERE device_id = 1 ORDER BY event_time DESC, event_id DESC LIMIT 3;
 device_id | event_id |          event_time           |                 data
-----------+----------+-------------------------------+--------------------------------------
         1 |   999901 | 2024-11-13 14:05:31.833112+00 | {"measurement": 0.916419629801811}
         1 |   999801 | 2024-11-13 14:05:31.833112+00 | {"measurement": 0.5658473618927078}
         1 |   999701 | 2024-11-13 14:05:31.833112+00 | {"measurement": 0.04750641050129345}
(3 rows)

postgres=# SELECT count(*) FROM events;
  count
---------
 1000000
(1 row)

postgres=# -- explain plan for a query that is parallelized across shards, which shows the plan for a query one of the shards and how the aggregation across shards is done
postgres=# EXPLAIN (VERBOSE ON) SELECT count(*) FROM events;
                                               QUERY PLAN
---------------------------------------------------------------------------------------------------------
 Aggregate  (cost=250.00..250.02 rows=1 width=8)
   Output: COALESCE((pg_catalog.sum(remote_scan.count))::bigint, '0'::bigint)
   ->  Custom Scan (Citus Adaptive)  (cost=0.00..0.00 rows=100000 width=8)
         Output: remote_scan.count
         Task Count: 32
         Tasks Shown: One of 32
         ->  Task
               Query: SELECT count(*) AS count FROM public.events_102008 events WHERE true
               Node: host=citus_worker_1 port=5432 dbname=postgres
               ->  Aggregate  (cost=727.00..727.01 rows=1 width=8)
                     Output: count(*)
                     ->  Seq Scan on public.events_102008 events  (cost=0.00..652.00 rows=30000 width=0)
                           Output: device_id, event_id, event_time, data
(13 rows)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment