Skip to content

Instantly share code, notes, and snippets.

@westonruter
Last active December 10, 2024 21:40
Show Gist options
  • Save westonruter/c96cde34119957ecd06fa5283bd81934 to your computer and use it in GitHub Desktop.
Save westonruter/c96cde34119957ecd06fa5283bd81934 to your computer and use it in GitHub Desktop.
Query list of URLs in HTTP Archive which use Image Prioritizer
CREATE TEMPORARY FUNCTION IS_CMS(technologies ARRAY<STRUCT<technology STRING, categories ARRAY<STRING>, info ARRAY<STRING>>>, cms STRING, version STRING) RETURNS BOOL AS (
EXISTS(
SELECT * FROM UNNEST(technologies) AS technology, UNNEST(technology.info) AS info
WHERE technology.technology = cms
AND (
version = ""
OR ENDS_WITH(version, ".x") AND (STARTS_WITH(info, RTRIM(version, "x")) OR info = RTRIM(version, ".x"))
OR info = version
)
)
);
SELECT DISTINCT
page,
is_root_page,
rank
FROM
`httparchive.all.pages`
WHERE
date = '2024-11-01'
AND IS_CMS(technologies, 'WordPress', '')
AND IS_CMS(technologies, 'Image Prioritizer', '')
ORDER BY rank ASC
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment