Last active
October 25, 2023 18:10
-
-
Save MichaelLeeHobbs/67980d165fc68880eb2ab283c673244b to your computer and use it in GitHub Desktop.
Mirth Postgres Table Size By Channel
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
-- 20201125T1403 EST - Fix bug in xpath not correctly matching values to correct channel | |
-- 20201125T1430 EST - Updated formatting only using: http://poorsql.com/ with minor hand edits | |
-- 20201125T1432 EST - Removed CID from final output as it was only there for debugging xpath issues | |
-- 20210321T1140 EST - Postgres 13+ XPATH works differently see: https://gist.github.com/MichaelLeeHobbs/40b4b7cf70ecbe30b73eed763367e626 | |
-- | |
-- Jon Bartels | |
-- Similar query for SQL Server is here - https://stackoverflow.com/questions/58942371/get-space-used-by-mirth-per-channel-in-sql-server | |
-- | |
SELECT NAME AS CHANNEL_NAME | |
, TOTAL_BYTES | |
, CHANNEL_ID | |
, PRUNE_ENABLED | |
, PRUNE_DAYS | |
, ARCHIVE_ENABLED | |
, TABLE_SCHEMA | |
, TABLE_NAME | |
, MIRTH_ID | |
, MIRTH_TYPE | |
, ROW_ESTIMATE | |
, INDEX_BYTES | |
, TOAST_BYTES | |
, TABLE_BYTES | |
FROM ( | |
WITH RECURSIVE PG_INHERIT(INHRELID, INHPARENT) AS ( | |
SELECT INHRELID, INHPARENT | |
FROM PG_INHERITS | |
UNION | |
SELECT CHILD.INHRELID, PARENT.INHPARENT | |
FROM PG_INHERIT CHILD, PG_INHERITS PARENT | |
WHERE CHILD.INHPARENT = PARENT.INHRELID | |
) | |
, PG_INHERIT_SHORT AS ( | |
SELECT * | |
FROM PG_INHERIT | |
WHERE INHPARENT NOT IN (SELECT INHRELID FROM PG_INHERIT) | |
) | |
SELECT TABLE_SCHEMA | |
, TABLE_NAME | |
, NULLIF(REGEXP_REPLACE(TABLE_NAME, '\D', '', 'g'), '')::NUMERIC AS MIRTH_ID | |
, SUBSTRING(TABLE_NAME FROM 'd_(\D+)\d+') AS MIRTH_TYPE | |
, ROW_ESTIMATE | |
, PG_SIZE_PRETTY(TOTAL_BYTES) AS TOTAL_BYTES | |
, PG_SIZE_PRETTY(INDEX_BYTES) AS INDEX_BYTES | |
, PG_SIZE_PRETTY(TOAST_BYTES) AS TOAST_BYTES | |
, PG_SIZE_PRETTY(TABLE_BYTES) AS TABLE_BYTES | |
FROM ( | |
SELECT *, TOTAL_BYTES - INDEX_BYTES - COALESCE(TOAST_BYTES, 0) AS TABLE_BYTES | |
FROM ( | |
SELECT C.OID | |
, NSPNAME AS TABLE_SCHEMA | |
, RELNAME AS TABLE_NAME | |
, SUM(C.RELTUPLES) OVER (PARTITION BY PARENT) AS ROW_ESTIMATE | |
, SUM(PG_TOTAL_RELATION_SIZE(C.OID)) OVER (PARTITION BY PARENT) AS TOTAL_BYTES | |
, SUM(PG_INDEXES_SIZE(C.OID)) OVER (PARTITION BY PARENT) AS INDEX_BYTES | |
, SUM(PG_TOTAL_RELATION_SIZE(RELTOASTRELID)) OVER (PARTITION BY PARENT) AS TOAST_BYTES | |
, PARENT | |
FROM ( | |
SELECT PG_CLASS.OID | |
, RELTUPLES | |
, RELNAME | |
, RELNAMESPACE | |
, PG_CLASS.RELTOASTRELID | |
, COALESCE(INHPARENT, PG_CLASS.OID) PARENT | |
FROM PG_CLASS | |
LEFT JOIN PG_INHERIT_SHORT ON INHRELID = OID | |
WHERE RELKIND IN ('r', 'p') | |
) C | |
LEFT JOIN PG_NAMESPACE N ON N.OID = C.RELNAMESPACE | |
ORDER BY TOTAL_BYTES DESC | |
) A | |
WHERE OID = PARENT | |
) A | |
) TABLE_SIZES | |
LEFT JOIN D_CHANNELS ON D_CHANNELS.LOCAL_CHANNEL_ID = TABLE_SIZES.MIRTH_ID | |
LEFT JOIN CHANNEL ON CHANNEL.ID = D_CHANNELS.CHANNEL_ID | |
LEFT JOIN ( | |
SELECT (XPATH('string/text()', ENTRY)) [1]::TEXT AS CID | |
, (XPATH('com.mirth.connect.model.ChannelMetadata/enabled/text()', ENTRY)) [1]::TEXT::boolean AS PRUNE_ENABLED | |
, (XPATH('com.mirth.connect.model.ChannelMetadata/pruningSettings/pruneMetaDataDays/text()', ENTRY)) [1]::TEXT::INT AS PRUNE_DAYS | |
, (XPATH('com.mirth.connect.model.ChannelMetadata/pruningSettings/archiveEnabled/text()', ENTRY)) [1]::TEXT::boolean AS ARCHIVE_ENABLED | |
FROM ( | |
SELECT UNNEST(XPATH('/map/entry', VALUE::XML)) AS ENTRY | |
FROM CONFIGURATION | |
WHERE CATEGORY = 'core' AND NAME = 'channelMetadata' | |
) X | |
) AS M ON M.CID = D_CHANNELS.CHANNEL_ID; |
Adding order by pg_size_bytes(total_bytes) DESC
makes this query show the largest tables first.
Similar query for SQL Server is here - https://stackoverflow.com/questions/58942371/get-space-used-by-mirth-per-channel-in-sql-server
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Posting an idea here - I'll play with it as I have time.
A slack user was facing high memory usage issues in MC. One common cause of this is having large objects in map data. This causes high memory usage because ALL of those maps are in memory.
This is relevant to the size query here because the
d_mc
table size could be computed bycontent_type
. This would allow an engineer analysing space usage to determine if:The enum for
content_type
is:I think this would be best implemented as a stored procedure that takes a
local_channel_id
and a date range as an argument and returns a JSON object with the size breakdown bycontent_type
. I worry this query would be inefficient since it has to actually look at table data and not just PG internal data to calculate size.I implemented this:
The content type is part of a composite primary key, so it's indexed.
The query plan still has to do a parallel sequence scan though. I don't think it can be avoided
For my stated use case, perhaps computing the average map content length would be more useful to identify which channels are storing more data in maps on average than in total: