Skip to content

Instantly share code, notes, and snippets.

@zbalkan
Last active May 11, 2025 16:39
Show Gist options
  • Save zbalkan/b6b6af0981698ae8c314a384aa3cbed3 to your computer and use it in GitHub Desktop.
Save zbalkan/b6b6af0981698ae8c314a384aa3cbed3 to your computer and use it in GitHub Desktop.
Creates a case-specific DuckDB view over compressed .json.gz log files matching a given date pattern.
#!/usr/bin/env bash
set -o errexit
set -o nounset
set -o pipefail
if [[ "${TRACE-0}" == "1" ]]; then
set -o xtrace
fi
if [[ "${1-}" =~ ^-*h(elp)?$ || $# -lt 2 ]]; then
echo "Usage: $0 <case-name> <date-pattern>"
echo "Example: $0 SI-801 2025-04-0*"
exit 1
fi
cd "$(dirname "$0")"
case_name="$1"
date_pattern="$2"
# Normalize case name to lowercase, snake_case
view_name="$(echo "$case_name" | tr '[:upper:]' '[:lower:]' | tr '-' '_' | tr ' ' '_')"
# Database file
db_file="investigations.db"
# Resolve file paths
file_paths=$(ls /archives/${date_pattern}-siem*.json.gz 2>/dev/null | sed 's/^/"/;s/$/"/' | paste -sd, -)
if [[ -z "$file_paths" ]]; then
echo "No files matched pattern: /archives/${date_pattern}-siem*.json.gz"
exit 1
fi
# Create the view in DuckDB
duckdb "$db_file" <<EOF
CREATE OR REPLACE VIEW $view_name AS
SELECT
CAST(timestamp AS TIMESTAMP) AS timestamp,
agent.id AS agent_id,
agent.name AS agent_name,
location,
decoder.name AS decoder,
full_log
FROM read_ndjson_auto([${file_paths}], ignore_errors = true);
EOF
echo "View '$view_name' created in $db_file"
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment