Skip to content

Instantly share code, notes, and snippets.

@hui-zheng
hui-zheng / progressive-vs-conservative.md
Last active May 15, 2025 21:44
进步主义和保守主义底层逻辑的简单介绍与对比

进步主义 vs 保守主义

政治光谱上的两棵大树

  • 西方社会和文明在当今的两个主流政治思潮
  • 左边:进步主义(Progressivism)和右边:保守主义(Conservatism)
  • 政治是文化和信仰的下游 (当然,进步主义者不一定怎么认为)
@hui-zheng
hui-zheng / cat_with_filenames.sh
Last active May 27, 2020 17:01
bash script recipes for all
# cat multiple files and show filenames
grep ^ /dev/null $@
@hui-zheng
hui-zheng / find_and_display_duplicates
Created March 14, 2020 19:49
[BigQuery Advanced SQL] the most flexible script to detect and display duplicate records and remove duplicates (dedup)
-- base_time has t
WITH rows_by_key AS(
SELECT
surrogate_key,
array_agg(base_table) as _rows,
count(*) as _count
FROM `gcp_project.data_set.original_table` as base_table
WHERE stamp BETWEEN "2020-03-12T00:00:00" AND "2020-03-14T00:00:00"
GROUP BY surrogate_key
)
@hui-zheng
hui-zheng / clean_kubernetes_jobs.sh
Last active March 5, 2020 16:39
kubernetes DevOps Operation Script
BY_LIST_FILE="NONE"
COMPLETED="1/."
AGE="3"
NAME_PATTERN=".*"
while [[ $# -gt 0 ]]; do
key="$1"
case $key in
@hui-zheng
hui-zheng / SQL non-null greatest for multiple columns
Last active February 7, 2024 04:01
[BigQuery Advanced SQL] find greatest/largest/max non-null values among multiple columns
-- Below is a fancy version of non-null-greatest() for multi-columns.
-- it is more extensible for more two columns.
WITH base AS (
SELECT
(SELECT ARRAY_AGG (x IGNORE NULLS) AS Y FROM UNNEST ([col_1, col_2, col_3, col_4]) AS x)
AS array,
FROM source_table AS nl
)
SELECT
(SELECT MAX(y) FROM UNNEST(array) AS Y
@hui-zheng
hui-zheng / BQ_partition_dedup.sql
Last active December 13, 2024 00:57
This list provides BigQuery SQL templates that remove duplicates for large size timestamp partitioned table (using MERGE statement) and for small size table or a non-partition table (Using REPLACE TABLE statement)
-- WARNING: back up the table before this operation
-- FOR large size timestamp partitioned table
-- -------------------------------------------
-- -- To de-duplicate rows of a given range of a partition table, using surrage_key as unique id
-- -------------------------------------------
DECLARE dt_start DEFAULT TIMESTAMP("2019-09-17T00:00:00", "America/Los_Angeles") ;
DECLARE dt_end DEFAULT TIMESTAMP("2019-09-22T00:00:00", "America/Los_Angeles");
MERGE INTO `gcp_project`.`data_set`.`the_table` AS INTERNAL_DEST