Index | Index Size | Execution Time | Time % |
---|---|---|---|
(No Index) | Data: 62 GB | 251,086.270 ms (約 4 分 11 秒) | 100.00% |
pg0206_pk(a,t,p) | 18 GB | 1,376,481.224 ms (約 22 分 56 秒) | 548.21% |
pg0206_idx1(a) | 6.1 GB | 22,179.966 ms (約 22.2 秒) | 8.83% |
pg0206_idx2(r) | 1.4 GB | 5,447.446 ms (約 5.4 秒) | 2.17% |
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=1944764.47..1944764.48 rows=1 width=8) (actual time=5411.797..5446.435 rows=1 loops=1)
-> Gather (cost=1944764.05..1944764.46 rows=4 width=8) (actual time=5411.520..5446.421 rows=5 loops=1)
Workers Planned: 4
Workers Launched: 4
-> Partial Aggregate (cost=1943764.05..1943764.06 rows=1 width=8) (actual time=5361.116..5361.117 rows=1 loops=5)
-> Parallel Index Only Scan using pg0206_idx2 on pg0206 (cost=0.57..1809782.59 rows=53592584 width=0) (actual time=0.308..3485.449 rows=42874123 loops=5)
Heap Fetches: 16004
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=2631471.56..2631471.57 rows=1 width=8) (actual time=22078.116..22178.107 rows=1 loops=1)
-> Gather (cost=2631471.14..2631471.55 rows=4 width=8) (actual time=22077.746..22178.083 rows=5 loops=1)
Workers Planned: 4
Workers Launched: 4
-> Partial Aggregate (cost=2630471.14..2630471.15 rows=1 width=8) (actual time=21974.459..21974.460 rows=1 loops=5)
-> Parallel Index Only Scan using pg0206_idx1 on pg0206 (cost=0.57..2496490.09 rows=53592420 width=0) (actual time=10.389..19741.532 rows=42874035 loops=5)
Heap Fetches: 12451
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=8769687.97..8769687.98 rows=1 width=8) (actual time=251075.617..251083.959 rows=1 loops=1)
-> Gather (cost=8769687.55..8769687.96 rows=4 width=8) (actual time=251075.601..251083.948 rows=5 loops=1)
Workers Planned: 4
Workers Launched: 4
-> Partial Aggregate (cost=8768687.55..8768687.56 rows=1 width=8) (actual time=251006.759..251006.761 rows=1 loops=5)
-> Parallel Seq Scan on pg0206 (cost=0.00..8634469.44 rows=53687244 width=0) (actual time=9.456..245508.069 rows=42873831 loops=5)
Planning Time: 0.151 ms
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=4329749.33..4329749.34 rows=1 width=8) (actual time=1376247.195..1376479.448 rows=1 loops=1)
-> Gather (cost=4329748.91..4329749.32 rows=4 width=8) (actual time=1376246.901..1376479.423 rows=5 loops=1)
Workers Planned: 4
Workers Launched: 4
-> Partial Aggregate (cost=4328748.91..4328748.92 rows=1 width=8) (actual time=1376206.398..1376206.405 rows=1 loops=5)
-> Parallel Index Only Scan using pg0206_pk on pg0206 (cost=0.57..4194530.80 rows=53687244 width=0) (actual time=0.053..1371797.942 rows=42873650 loops=5)
Heap Fetches: 6314
匯入檔案資料, 如果在 INSERT 時有遇到唯一性的衝突時, 就會改成 UPDATE 該筆資料
- 建立暫存資料表
- 先把 csv 內容載入到暫存資料表
- 找出具有 UNIQUE 限制的欄位 (包含 PRIMARY KEY 和 UNIQUE INDEX)
- 依序比對暫存資料表與目標資料表的內容, 刪除目標資料表中重覆 KEY 值的資料列
- 將暫存資料庫所有資料 INSERT 到目標資料表
- 移除暫存資料表
- 使用 podman 管理 PostgreSQL Service
- 透過 systemctl 管理 podman
- 外部 volume path: /data/postgres.5555/data
- port 5555 可自訂, 用於突顯內外不同 port
- selinux disable 比較方便
$ sudo systemctl enable postgres.5555
$ sudo systemctl start postgres.5555
- 使用ssh跳版機(Bastion server)連線到 remote host
ssh -J bastion_user@bastion_host remote_user@remote_host
- 使用ssh跳版機執行 scp
scp -oProxyJump=bastion_user@bastion_host remote_user@remote_host:/tmp/test.txt .
- 把指定來源的 repository 檔案都抓回來
wget -r -np --no-check-certificate http://fedora.cs.nctu.edu.tw/epel/8/Everything/x86_64/
sudo mokutil --disable-validation