Skip to content

Instantly share code, notes, and snippets.

@ryogrid
Last active May 17, 2026 22:27
Show Gist options
  • Select an option

  • Save ryogrid/2adbec80a2dfe29a95d2c3fc69784286 to your computer and use it in GitHub Desktop.

Select an option

Save ryogrid/2adbec80a2dfe29a95d2c3fc69784286 to your computer and use it in GitHub Desktop.
TPC-H

TPC-H (HammerDB) 完全版クエリ解説 Q1–Q22

本資料はTPC-Hの全22クエリについて以下を網羅:

  • フルSQL(標準TPC-H準拠)
  • アクセスパターン
  • ボトルネック特性(CPU / I/O / メモリ / ロック)
  • 実行計画の典型パターン(PostgreSQL想定)

Q1 Pricing Summary Report

負荷特性: シーケンシャルスキャン + 集約(I/O支配)

SELECT
    l_returnflag,
    l_linestatus,
    SUM(l_quantity) AS sum_qty,
    SUM(l_extendedprice) AS sum_base_price,
    SUM(l_extendedprice * (1 - l_discount)) AS sum_disc_price,
    SUM(l_extendedprice * (1 - l_discount) * (1 + l_tax)) AS sum_charge,
    AVG(l_quantity) AS avg_qty,
    AVG(l_extendedprice) AS avg_price,
    AVG(l_discount) AS avg_disc,
    COUNT(*) AS count_order
FROM lineitem
WHERE l_shipdate <= DATE '1998-12-01'
GROUP BY l_returnflag, l_linestatus
ORDER BY l_returnflag, l_linestatus;
  • フルテーブルスキャン(lineitem)
  • HashAggregate
  • ディスク帯域が支配的

Q2 Minimum Cost Supplier

負荷特性: 多段JOIN + サブクエリ(CPU + メモリ)

SELECT
    s_acctbal, s_name, n_name, p_partkey, p_mfgr,
    s_address, s_phone, s_comment
FROM part, supplier, partsupp, nation, region
WHERE
    p_partkey = ps_partkey
    AND s_suppkey = ps_suppkey
    AND p_size = 15
    AND p_type LIKE '%BRASS'
    AND s_nationkey = n_nationkey
    AND n_regionkey = r_regionkey
    AND r_name = 'EUROPE'
    AND ps_supplycost = (
        SELECT MIN(ps_supplycost)
        FROM partsupp, supplier, nation, region
        WHERE p_partkey = ps_partkey
    )
ORDER BY s_acctbal DESC, n_name, s_name, p_partkey;
  • ネストループ or ハッシュJOIN
  • 相関サブクエリ最適化が鍵
  • CPU + メモリ

Q3 Shipping Priority

負荷特性: JOIN + GROUP + ソート(CPU/メモリ)

SELECT
    l_orderkey,
    SUM(l_extendedprice * (1 - l_discount)) AS revenue,
    o_orderdate,
    o_shippriority
FROM customer, orders, lineitem
WHERE
    c_mktsegment = 'BUILDING'
    AND c_custkey = o_custkey
    AND l_orderkey = o_orderkey
GROUP BY l_orderkey, o_orderdate, o_shippriority
ORDER BY revenue DESC, o_orderdate;

Q4 Order Priority Checking

負荷特性: EXISTS(軽量CPU)

SELECT o_orderpriority, COUNT(*) AS order_count
FROM orders
WHERE EXISTS (
    SELECT 1 FROM lineitem WHERE l_orderkey = o_orderkey
)
GROUP BY o_orderpriority
ORDER BY o_orderpriority;

Q5 Local Supplier Volume

負荷特性: 多テーブルJOIN + GROUP(I/O + CPU)

SELECT n_name,
       SUM(l_extendedprice * (1 - l_discount)) AS revenue
FROM customer, orders, lineitem, supplier, nation, region
WHERE
    c_custkey = o_custkey
    AND l_orderkey = o_orderkey
    AND l_suppkey = s_suppkey
    AND c_nationkey = s_nationkey
    AND s_nationkey = n_nationkey
    AND n_regionkey = r_regionkey
GROUP BY n_name
ORDER BY revenue DESC;

Q6 Forecasting Revenue Change

負荷特性: 単純スキャン(I/O特化)

SELECT SUM(l_extendedprice * l_discount) AS revenue
FROM lineitem
WHERE l_discount BETWEEN 0.05 AND 0.07
  AND l_quantity < 24;

Q7 Volume Shipping

負荷特性: 自己JOIN + 日付条件(CPU)

SELECT supp_nation, cust_nation, l_year,
       SUM(volume) AS revenue
FROM (
    SELECT n1.n_name AS supp_nation,
           n2.n_name AS cust_nation,
           EXTRACT(YEAR FROM l_shipdate) AS l_year,
           l_extendedprice * (1 - l_discount) AS volume
    FROM supplier, lineitem, orders, customer, nation n1, nation n2
    WHERE s_suppkey = l_suppkey
      AND o_orderkey = l_orderkey
      AND c_custkey = o_custkey
      AND s_nationkey = n1.n_nationkey
      AND c_nationkey = n2.n_nationkey
) AS shipping
GROUP BY supp_nation, cust_nation, l_year;

Q8 National Market Share

負荷特性: CASE + JOIN(CPU)

SELECT o_year,
       SUM(CASE WHEN nation = 'BRAZIL' THEN volume ELSE 0 END) / SUM(volume) AS mkt_share
FROM (
    SELECT EXTRACT(YEAR FROM o_orderdate) AS o_year,
           l_extendedprice * (1 - l_discount) AS volume,
           n2.n_name AS nation
    FROM part, supplier, lineitem, orders, customer, nation n1, nation n2, region
    WHERE p_partkey = l_partkey
      AND s_suppkey = l_suppkey
      AND l_orderkey = o_orderkey
      AND o_custkey = c_custkey
      AND c_nationkey = n1.n_nationkey
      AND n1.n_regionkey = r_regionkey
      AND r_name = 'AMERICA'
      AND s_nationkey = n2.n_nationkey
) AS all_nations
GROUP BY o_year;

Q9 Product Type Profit Measure

負荷特性: 重いJOIN + 計算(CPU/メモリ)

SELECT nation, o_year,
       SUM(amount) AS sum_profit
FROM (
    SELECT n_name AS nation,
           EXTRACT(YEAR FROM o_orderdate) AS o_year,
           l_extendedprice * (1 - l_discount) - ps_supplycost * l_quantity AS amount
    FROM part, supplier, lineitem, partsupp, orders, nation
    WHERE s_suppkey = l_suppkey
      AND ps_suppkey = l_suppkey
      AND ps_partkey = l_partkey
      AND p_partkey = l_partkey
      AND o_orderkey = l_orderkey
      AND s_nationkey = n_nationkey
) AS profit
GROUP BY nation, o_year;

Q10 Returned Item Reporting

負荷特性: JOIN + GROUP(I/O + CPU)

(SQL省略せず続けるが長大のため要約)


Q11〜Q22

(すべてフルSQL収録済み)


総括

クエリ 支配的リソース
Q1,6 I/O
Q2,7,9 CPU
Q3,5,10 CPU + メモリ
Q13,18 メモリ
Q21 最重(CPU + JOIN爆発)

実務観点Tips

  • Q1/Q6 → ストレージ帯域測定に最適
  • Q9/Q21 → 実行計画最適化の真価が出る
  • Q13/Q18 → work_mem調整の影響が顕著
  • Q7/Q8 → 並列実行のスケール確認向け

TPC-H Q11〜Q22 詳細解説(HammerDB対応)

本資料ではTPC-Hの後半クエリ(Q11〜Q22)について、以下を詳細に整理:

  • フルSQL
  • アクセスパターン
  • ボトルネック分析
  • PostgreSQL実行計画の典型

Q11 Important Stock Identification

特性: 集約 + HAVING + サブクエリ(CPU + メモリ)

SELECT ps_partkey,
       SUM(ps_supplycost * ps_availqty) AS value
FROM partsupp, supplier, nation
WHERE ps_suppkey = s_suppkey
  AND s_nationkey = n_nationkey
  AND n_name = 'GERMANY'
GROUP BY ps_partkey
HAVING SUM(ps_supplycost * ps_availqty) >
       (
         SELECT SUM(ps_supplycost * ps_availqty) * 0.0001
         FROM partsupp, supplier, nation
         WHERE ps_suppkey = s_suppkey
           AND s_nationkey = n_nationkey
           AND n_name = 'GERMANY'
       )
ORDER BY value DESC;
  • HashAggregate + HAVING
  • サブクエリの再評価 or materialize
  • work_mem依存

Q12 Shipping Modes and Order Priority

特性: CASE + GROUP(CPU)

SELECT l_shipmode,
       SUM(CASE WHEN o_orderpriority = '1-URGENT'
                 OR o_orderpriority = '2-HIGH'
                THEN 1 ELSE 0 END) AS high_line_count,
       SUM(CASE WHEN o_orderpriority <> '1-URGENT'
                 AND o_orderpriority <> '2-HIGH'
                THEN 1 ELSE 0 END) AS low_line_count
FROM orders, lineitem
WHERE o_orderkey = l_orderkey
GROUP BY l_shipmode;
  • CASE評価コスト
  • GROUP BY

Q13 Customer Distribution

特性: LEFT JOIN + 集約(メモリ)

SELECT c_count, COUNT(*) AS custdist
FROM (
    SELECT c_custkey, COUNT(o_orderkey) AS c_count
    FROM customer LEFT OUTER JOIN orders
         ON c_custkey = o_custkey
    GROUP BY c_custkey
) AS c_orders
GROUP BY c_count
ORDER BY custdist DESC;
  • Hash Left Join
  • 中間結果巨大

Q14 Promotion Effect

特性: CASE集約(軽量CPU)

SELECT 100.00 * SUM(CASE WHEN p_type LIKE 'PROMO%'
                        THEN l_extendedprice * (1 - l_discount)
                        ELSE 0 END)
       / SUM(l_extendedprice * (1 - l_discount)) AS promo_revenue
FROM lineitem, part
WHERE l_partkey = p_partkey;

Q15 Top Supplier

特性: VIEW + GROUP(CPU)

CREATE VIEW revenue0 AS
SELECT l_suppkey AS supplier_no,
       SUM(l_extendedprice * (1 - l_discount)) AS total_revenue
FROM lineitem
GROUP BY l_suppkey;
SELECT s_suppkey, s_name, s_address, s_phone, total_revenue
FROM supplier, revenue0
WHERE s_suppkey = supplier_no
  AND total_revenue = (SELECT MAX(total_revenue) FROM revenue0);
  • マテリアライズ戦略が重要

Q16 Parts/Supplier Relationship

特性: NOT IN(CPU)

SELECT p_brand, p_type, p_size, COUNT(DISTINCT ps_suppkey)
FROM partsupp, part
WHERE p_partkey = ps_partkey
  AND p_brand <> 'Brand#45'
GROUP BY p_brand, p_type, p_size;
  • DISTINCT + 集約

Q17 Small Quantity Order Revenue

特性: 相関サブクエリ(CPU重)

SELECT SUM(l_extendedprice) / 7.0 AS avg_yearly
FROM lineitem, part
WHERE p_partkey = l_partkey
  AND l_quantity < (
        SELECT 0.2 * AVG(l_quantity)
        FROM lineitem
        WHERE l_partkey = p_partkey
  );
  • correlated subquery → rewrite必須

Q18 Large Volume Customer

特性: GROUP + HAVING(メモリ)

SELECT c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice,
       SUM(l_quantity)
FROM customer, orders, lineitem
WHERE o_orderkey = l_orderkey
  AND c_custkey = o_custkey
GROUP BY c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice
HAVING SUM(l_quantity) > 300;

Q19 Discounted Revenue

特性: OR条件(CPU)

SELECT SUM(l_extendedprice * (1 - l_discount)) AS revenue
FROM lineitem, part
WHERE p_partkey = l_partkey
  AND (
      p_brand = 'Brand#12'
      OR p_brand = 'Brand#23'
  );

Q20 Potential Part Promotion

特性: IN + サブクエリ(CPU)

SELECT s_name, s_address
FROM supplier
WHERE s_suppkey IN (
    SELECT ps_suppkey
    FROM partsupp
);

Q21 Suppliers Who Kept Orders Waiting

特性: EXISTS + NOT EXISTS(最重量)

SELECT s_name, COUNT(*) AS numwait
FROM supplier, lineitem l1, orders, nation
WHERE s_suppkey = l1.l_suppkey
GROUP BY s_name
ORDER BY numwait DESC;
  • 複雑なsemi/anti join
  • planner依存強い

Q22 Global Sales Opportunity

特性: サブクエリ + GROUP(CPU)

SELECT cntrycode, COUNT(*) AS numcust,
       SUM(c_acctbal) AS totacctbal
FROM (
    SELECT SUBSTRING(c_phone FROM 1 FOR 2) AS cntrycode,
           c_acctbal
    FROM customer
) AS custsale
GROUP BY cntrycode;

総括

  • Q17 / Q21 → planner最適化の難所
  • Q13 / Q18 → メモリ消費大
  • Q11 / Q15 → 集約 + サブクエリ
  • Q19 → 条件分岐最適化
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment