Created November 28, 2011 13:39
Top 10 domains in the scribots
scribot=# select count(1) as c, 'http://' || domain || '/' as url from (select substring(url from '.*://([^/]*)') as domain from url) as monkey group by domain order by c desc limit 10;
  c   |            url
------+----------------------------
 8578 | http://news.bbc.co.uk/
 3014 | http://www.flickr.com/
 2260 | http://t.co/
 2069 | http://en.wikipedia.org/
 1934 | http://www.guardian.co.uk/
 1293 | http://www.bbc.co.uk/
 1277 | http://twitter.com/
 1061 | http://www.youtube.com/
  704 | http://flickr.com/
  636 | http://rjp.frottage.org/
(10 rows)
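
For anyone wanting to run the same count outside the database, here is a rough Python sketch of what the query above does: pull the host out of each URL with the same regular expression, tally the hosts, and print the top entries. The function name `top_domains` and the sample URLs are made up for illustration and are not part of the scribot schema. One assumed difference: URLs that do not match the pattern are skipped here, whereas the SQL would group them under a NULL domain.

```python
import re
from collections import Counter

# Same pattern as substring(url from '.*://([^/]*)') in the query:
# skip past "://", then capture everything up to the next "/".
DOMAIN_RE = re.compile(r'.*://([^/]*)')


def top_domains(urls, limit=10):
    """Return (count, 'http://<domain>/') pairs, most frequent first."""
    counts = Counter()
    for url in urls:
        match = DOMAIN_RE.match(url)
        if match:  # assumption: non-matching strings are ignored
            counts[match.group(1)] += 1
    # Rebuild the display URL the way the query does, always with http://
    return [(c, 'http://' + domain + '/') for domain, c in counts.most_common(limit)]


if __name__ == '__main__':
    # Hypothetical sample data, not taken from the real url table
    sample = [
        'http://news.bbc.co.uk/1/hi/uk/1.stm',
        'http://news.bbc.co.uk/1/hi/uk/2.stm',
        'https://www.flickr.com/photos/example',
        'no scheme here',
    ]
    for count, url in top_domains(sample):
        print(count, '|', url)
```

As in the query, `www.flickr.com` and `flickr.com` are counted as separate domains, since the grouping is on the raw host string.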
Massive rise in t.co links, due to Twitter's enforcement of t.co and the introduction of twittermoo.