Last active
August 15, 2024 19:11
-
-
Save infotroph/f9fc28c9741d883b4d338576e169acba to your computer and use it in GitHub Desktop.
Enabling Postgres logging during a PEcAn test run
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
* build a new local postgis image that wraps mdillon/postgis but invokes `postgres -c log_statement=all` at run | |
(see ckb-notes/postgis_logging/) | |
``` | |
docker build -t pecan/postgis:9.5_log -<<EOF | |
FROM mdillon/postgis:9.5 | |
CMD ["postgres", "-c", "log_statement=all"] | |
EOF | |
``` | |
* modify actions files to use pecan/postgis:9.5_log instead of mdillon/postgis:9.5 | |
* run act with `--pull=false` so it's willing to use local image without hassling with upload | |
`time act --pull=false -s GITHUB_TOKEN="$(gh auth token)" -W .github/workflows/ci.yml -j test --matrix R:4.2` | |
* *while* that's running, stream docker logs from postgis container into a persistent file | |
`docker logs -f $(docker container ls | grep 'postgis:9.5_log' | cut -w -f1) > postgis_run_log.txt` | |
* To analyze: | |
- rejoin hard-wrapped lines for easier filtering. | |
I did this in Sublime Text by searching and replacing `\n\t+` => ' ', | |
to avoid remembering how to do multiline sed/awk patterns. | |
- First ~6k lines are logging initial read-in of the dump. Delete everything up to | |
"PostgreSQL database dump complete" | |
- remove frequently-repeated lines that aren't informative about which records are accessed | |
- health check produces many, many lines of "FATAL: role "root" does not exist' (~1000/9983 lines) | |
- ~every query produces a separate `select attnotnull` for each column of the table (1527 lines) | |
- not interested in set statements (mostly "set datestyle to ISO") (~300 lines) | |
- nor in selections from pg_type | |
- ?? I don't think we care about inserts (at least assuming we'll continue to create all tables empty) | |
- All called in test job appear to be either initial machines/models setup, or inserts into workflows table | |
sed -E \ | |
-e '1,/PostgreSQL database dump complete/d' \ | |
-e '/role.*does not exist/d' \ | |
-e '/select attnotnull/d' \ | |
-e '/SELECT.*FROM pg_type/d' \ | |
-e '/statement: set/d' -e '/statement: SET/d' \ | |
-e '/statement: +INSERT/d' -e '/: +INSERT/d'\ | |
postgis_run_log_unwrapped.txt \ | |
> postgis_run_log_trimmed.txt | |
- now look at just the unique lines: | |
cat postgis_run_log_trimmed.txt | sort | uniq -c | sort -n \ | |
> ckb_notes/postgis_logging/postgis_run_log_trimmed_uniq.txt |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment