@chilts
Created October 30, 2013 09:27
Getting the Alexa top 1 million sites directly from the server, unzipping the archive, parsing the CSV, and getting each line as an array.
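// Stream the Alexa top-1m list: download, unzip, parse the CSV, log each row.
var request = require('request');
var unzip = require('unzip');
var csv2 = require('csv2');

request.get('http://s3.amazonaws.com/alexa-static/top-1m.csv.zip')
  // unzip.Parse() emits an 'entry' event for each file in the archive.
  .pipe(unzip.Parse())
  .on('entry', function (entry) {
    // Each parsed CSV row arrives as a 'data' event and is logged.
    entry.pipe(csv2()).on('data', console.log);
  });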
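To illustrate what "each line as an array" means without touching the network, here is a minimal sketch that uses only plain JavaScript. It assumes the list uses a simple `rank,domain` layout with no quoted fields. The helper name `parseTopSites` and the sample rows are hypothetical and are not part of the gist or of `csv2`.

```javascript
// Hypothetical helper (not from the gist): split "rank,domain" CSV text
// into one array per line. Assumes no quoted fields or embedded commas.
function parseTopSites(csvText) {
  return csvText
    .split(/\r?\n/)
    .filter(function (line) { return line.trim() !== ''; })
    .map(function (line) {
      var comma = line.indexOf(',');
      // Lines without a comma are returned as a single-element array.
      if (comma === -1) { return [line]; }
      return [line.slice(0, comma), line.slice(comma + 1)];
    });
}

// Made-up sample rows, for illustration only.
var sample = '1,example.com\n2,example.org\n';
parseTopSites(sample).forEach(function (row) {
  console.log(row);
});
```

Both fields stay as strings here; convert the rank with `Number(row[0])` if a numeric value is needed.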
@evilpie
evilpie commented Jul 28, 2024

This is a link to the Cisco Umbrella popularity list. Archive.org has luckily archived the zip: https://web.archive.org/web/20230401000000*/https://s3.amazonaws.com/alexa-static/top-1m.csv.zip

@muserk1977
You can check our website for this notebookdepo
