Analysis of news sites for Speedometer using HTTPArchive
- create-dataset.sql creates a smaller table containing both desktop and mobile results so that analysis is cheaper to run. Generating this table processes about 5 TB of data. The set of URLs is gathered from the external links on the pages beneath https://en.wikipedia.org/wiki/Wikipedia:News_sources. The data may be skewed because the Wikipedia lists do not necessarily reflect the most popular content, include some non-news sites, and include URL paths, which are ignored. The list could be swapped out for a different one and the queries re-run.
- query-dataset.sql creates a summary table the first time it runs, then queries that table and produces some reports.
- running-the-same-custom-metrics-in-the-ui.js is meant to be used as a "Custom Metric" in the WebPageTest UI, following the instructions at https://github.com/HTTPArchive/custom-metrics/tree/8497c859ef0a7c99924981f369bb53eb3441bd6c#testing
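
To illustrate the note that URL paths in the Wikipedia lists are ignored, here is a minimal sketch of how a list of external links might be reduced to unique site roots before being matched against HTTP Archive pages. It is not taken from create-dataset.sql: the helper name `toUniqueOrigins`, the sample links, and the choice to normalize to `origin + "/"` are all assumptions made for illustration.

```javascript
// Hypothetical helper (not part of this repo): reduce external links to
// unique site roots, dropping paths, non-HTTP(S) links, and Wikipedia's own pages.
function toUniqueOrigins(links, excludedHost = "wikipedia.org") {
  const origins = new Set();
  for (const link of links) {
    let url;
    try {
      url = new URL(link);
    } catch {
      continue; // skip anything that is not a parseable absolute URL
    }
    if (url.protocol !== "http:" && url.protocol !== "https:") continue;
    const host = url.hostname;
    if (host === excludedHost || host.endsWith("." + excludedHost)) continue;
    // Assumption: sites are matched by their root page, so the path is discarded.
    origins.add(url.origin + "/");
  }
  return [...origins].sort();
}

// Made-up sample links for illustration only.
const sample = [
  "https://www.example.com/world/article-1",
  "https://www.example.com/sport",
  "https://news.example.org/",
  "https://en.wikipedia.org/wiki/Wikipedia:News_sources",
  "not a url",
];
console.log(toUniqueOrigins(sample));
```

Swapping in a different source list would then only mean changing the input array and re-running the queries against the resulting roots.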
The following custom metrics are used to