Index | Index Size | Execution Time | Time % |
---|---|---|---|
(No Index) | Data: 62 GB | 251,086.270 ms (約 4 分 11 秒) | 100.00% |
pg0206_pk(a,t,p) | 18 GB | 1,376,481.224 ms (約 22 分 56 秒) | 548.21% |
pg0206_idx1(a) | 6.1 GB | 22,179.966 ms (約 22.2 秒) | 8.83% |
pg0206_idx2(r) | 1.4 GB | 5,447.446 ms (約 5.4 秒) | 2.17% |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
-- https://www.postgresql.org/docs/current/infoschema-columns.html | |
SELECT | |
cols.column_name, | |
( | |
SELECT | |
pg_catalog.col_description(c.oid, cols.ordinal_position::int) | |
FROM pg_catalog.pg_class c | |
WHERE | |
c.oid = ((cols.table_schema||'.'||cols.table_name)::regclass::oid) AND | |
c.relname = cols.table_name |
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=1944764.47..1944764.48 rows=1 width=8) (actual time=5411.797..5446.435 rows=1 loops=1)
-> Gather (cost=1944764.05..1944764.46 rows=4 width=8) (actual time=5411.520..5446.421 rows=5 loops=1)
Workers Planned: 4
Workers Launched: 4
-> Partial Aggregate (cost=1943764.05..1943764.06 rows=1 width=8) (actual time=5361.116..5361.117 rows=1 loops=5)
-> Parallel Index Only Scan using pg0206_idx2 on pg0206 (cost=0.57..1809782.59 rows=53592584 width=0) (actual time=0.308..3485.449 rows=42874123 loops=5)
Heap Fetches: 16004
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=2631471.56..2631471.57 rows=1 width=8) (actual time=22078.116..22178.107 rows=1 loops=1)
-> Gather (cost=2631471.14..2631471.55 rows=4 width=8) (actual time=22077.746..22178.083 rows=5 loops=1)
Workers Planned: 4
Workers Launched: 4
-> Partial Aggregate (cost=2630471.14..2630471.15 rows=1 width=8) (actual time=21974.459..21974.460 rows=1 loops=5)
-> Parallel Index Only Scan using pg0206_idx1 on pg0206 (cost=0.57..2496490.09 rows=53592420 width=0) (actual time=10.389..19741.532 rows=42874035 loops=5)
Heap Fetches: 12451
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=8769687.97..8769687.98 rows=1 width=8) (actual time=251075.617..251083.959 rows=1 loops=1)
-> Gather (cost=8769687.55..8769687.96 rows=4 width=8) (actual time=251075.601..251083.948 rows=5 loops=1)
Workers Planned: 4
Workers Launched: 4
-> Partial Aggregate (cost=8768687.55..8768687.56 rows=1 width=8) (actual time=251006.759..251006.761 rows=1 loops=5)
-> Parallel Seq Scan on pg0206 (cost=0.00..8634469.44 rows=53687244 width=0) (actual time=9.456..245508.069 rows=42873831 loops=5)
Planning Time: 0.151 ms
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=4329749.33..4329749.34 rows=1 width=8) (actual time=1376247.195..1376479.448 rows=1 loops=1)
-> Gather (cost=4329748.91..4329749.32 rows=4 width=8) (actual time=1376246.901..1376479.423 rows=5 loops=1)
Workers Planned: 4
Workers Launched: 4
-> Partial Aggregate (cost=4328748.91..4328748.92 rows=1 width=8) (actual time=1376206.398..1376206.405 rows=1 loops=5)
-> Parallel Index Only Scan using pg0206_pk on pg0206 (cost=0.57..4194530.80 rows=53687244 width=0) (actual time=0.053..1371797.942 rows=42873650 loops=5)
Heap Fetches: 6314
匯入檔案資料, 如果在 INSERT 時有遇到唯一性的衝突時, 就會改成 UPDATE 該筆資料
- 建立暫存資料表
- 先把 csv 內容載入到暫存資料表
- 找出具有 UNIQUE 限制的欄位 (包含 PRIMARY KEY 和 UNIQUE INDEX)
- 依序比對暫存資料表與目標資料表的內容, 刪除目標資料表中重覆 KEY 值的資料列
- 將暫存資料庫所有資料 INSERT 到目標資料表
- 移除暫存資料表
- 使用 podman 管理 PostgreSQL Service
- 透過 systemctl 管理 podman
- 外部 volume path: /data/postgres.5555/data
- port 5555 可自訂, 用於突顯內外不同 port
- selinux disable 比較方便
$ sudo systemctl enable postgres.5555
$ sudo systemctl start postgres.5555
- 使用ssh跳版機(Bastion server)連線到 remote host
ssh -J bastion_user@bastion_host remote_user@remote_host
- 使用ssh跳版機執行 scp
scp -oProxyJump=bastion_user@bastion_host remote_user@remote_host:/tmp/test.txt .
- 把指定來源的 repository 檔案都抓回來
wget -r -np --no-check-certificate http://fedora.cs.nctu.edu.tw/epel/8/Everything/x86_64/