Skip to content

Instantly share code, notes, and snippets.

@lordmatt
Last active February 16, 2026 09:56
Show Gist options
  • Select an option

  • Save lordmatt/31f4f0b508b2dcf353c36ab23f907eea to your computer and use it in GitHub Desktop.

Select an option

Save lordmatt/31f4f0b508b2dcf353c36ab23f907eea to your computer and use it in GitHub Desktop.
Search WordPress comments for the umber of WebMentions per domain excluding anything that came from Twitter., bluesky via brid.gy or Mastodon via brid.gy. Change REFIX for your table prefix. This version uses the source URL in the comment meta for more accurate data.
SELECT
SUBSTRING_INDEX(REPLACE(SUBSTRING_INDEX(`PREFIX_commentmeta`.`meta_value`,'//',-1),'www.',''),'/',1) AS domain,
COUNT(`d8_comments`.`comment_ID`) AS counted
FROM
`REFIX_comments` LEFT OUTER JOIN `REFIX_commentmeta` ON `REFIX_comments`.`comment_ID` = `REFIX_commentmeta`.`comment_id`
WHERE
`PREFIX_comments`.`comment_type`='Mention'
AND `REFIX_comments`.`comment_author_url` NOT LIKE '%twitter.com%'
AND `REFIX_comments`.`comment_author_url` NOT LIKE '%brid.gy%'
AND `REFIX_commentmeta`.`meta_value` NOT LIKE '%brid.gy%'
AND `REFIX_commentmeta`.`meta_key` = 'webmention_source_url'
GROUP BY domain
ORDER BY counted DESC
LIMIT 500;
@lordmatt
Copy link
Author

@pfefferle
Copy link

Your instinct about needing to join the meta table twice is correct. Since version 5.0.0, the plugin no longer stores all webmentions with a single comment_type. Instead, each webmention gets a semantic type (mention, like, repost, bookmark, comment, etc.), and the protocol meta key (set to webmention) is what identifies them as webmentions.

So comment_type = 'Mention' in your query won't work — it would miss most webmentions on a current install.

Here's a corrected version:

SELECT
    SUBSTRING_INDEX(
        REPLACE(SUBSTRING_INDEX(src.meta_value, '//', -1), 'www.', ''),
        '/', 1
    ) AS domain,
    COUNT(c.comment_ID) AS counted
FROM wp_comments c
INNER JOIN wp_commentmeta proto
    ON c.comment_ID = proto.comment_id
    AND proto.meta_key = 'protocol'
    AND proto.meta_value = 'webmention'
INNER JOIN wp_commentmeta src
    ON c.comment_ID = src.comment_id
    AND src.meta_key = 'webmention_source_url'
WHERE
    c.comment_author_url NOT LIKE '%twitter.com%'
    AND c.comment_author_url NOT LIKE '%brid.gy%'
    AND c.comment_author_url NOT LIKE '%bsky.app%'
GROUP BY domain
ORDER BY counted DESC
LIMIT 500;

The main changes:

  • Uses protocol = 'webmention' to find webmentions (the stable, canonical way)
  • Two meta joins: one for protocol, one for webmention_source_url — this is the right approach
  • Filters on comment_author_url rather than source URL for excluding bridged content — brid.gy sets the author URL to the original platform (twitter.com, etc.), so this catches the actual origin
  • INNER JOIN instead of LEFT OUTER JOIN since we only want comments that have these meta values

The protocol and webmention_source_url meta keys are stable and the intended way to query webmentions — no deprecation concerns there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment