Skip to content

Instantly share code, notes, and snippets.

@tjmonsi
Last active June 26, 2019 12:24
Show Gist options
  • Save tjmonsi/01511617cbaea6954d62256e920a1d26 to your computer and use it in GitHub Desktop.
Save tjmonsi/01511617cbaea6954d62256e920a1d26 to your computer and use it in GitHub Desktop.
Counts the sites that delivers hosted fonts
@rviscomi
Copy link

Since these are JSON payloads, I'd like to see the JSON_EXTRACT function used to improve the resiliency.

@tjmonsi
Copy link
Author

tjmonsi commented Jun 26, 2019

@rviscomi I am checking the payload right now...
would it be ok to use Sec-Fetch-Site header to check if it is hosted somewhere else then get the host url for counting?

@tjmonsi
Copy link
Author

tjmonsi commented Jun 26, 2019

@rviscomi I just checked this and it returns all requests, including the sum of all requests from one origin. Is this count correct?

@rviscomi
Copy link

Here's an example of the approach I'd take:

SELECT value AS host, count
FROM (
  SELECT APPROX_TOP_COUNT(NET.HOST(url), 25) AS font_host
  FROM `httparchive.sample_data.summary_requests_desktop_1k`
  WHERE type = 'font'
), UNNEST(font_host)

It finds the 25 most common hosts of font resources.

@rviscomi
Copy link

You can modify this query to handle other metrics, like find the top 25 font formats by querying the ext field:

SELECT value AS ext, count
FROM (
  SELECT APPROX_TOP_COUNT(LOWER(ext), 25) AS font_extension
  FROM `httparchive.sample_data.summary_requests_desktop_1k`
  WHERE type = 'font'
), UNNEST(font_extension)

@tjmonsi
Copy link
Author

tjmonsi commented Jun 26, 2019

Wow, i made a lot more hehe... But thanks for this :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment