@summer4096
Created June 6, 2012 21:21
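This snippet downloads the May 2012 English Wikipedia dump, decompresses it, converts it to SQL, and feeds it to MySQL in a single streaming pipeline, so every stage runs at the same time and no intermediate files are written to disk.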
curl http://dumps.wikimedia.org/enwiki/20120502/enwiki-20120502-pages-articles-multistream.xml.bz2 | bzcat | ./import.pl | mysql -f -uuser -ppass --default-character-set=utf8 wikipedia
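The one-liner gives no indication when a stage is missing or the download fails partway. Below is a rough sketch of the same pipeline wrapped in a script that checks its prerequisites first. The function names `require_tools` and `import_dump` and the variable `DUMP_URL` are made up for illustration and are not part of the original snippet. The added curl flags (`-f`, `-sS`, `-L`) and `set -o pipefail` are suggestions, not something the original uses, and `pipefail` needs a shell that supports it, such as bash. `./import.pl` is assumed to be the same converter script the command above expects in the current directory.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the one-liner above; all names are illustrative.

DUMP_URL="http://dumps.wikimedia.org/enwiki/20120502/enwiki-20120502-pages-articles-multistream.xml.bz2"

# Return non-zero (and name the culprit on stderr) if any listed tool is missing.
require_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing required tool: $tool" >&2
      return 1
    fi
  done
}

# Run the same four-stage pipeline, but report failure if any stage fails.
import_dump() {
  require_tools curl bzcat mysql || return 1
  if [ ! -x ./import.pl ]; then
    echo "./import.pl not found or not executable" >&2
    return 1
  fi
  (
    # Without pipefail, only the exit status of the last stage (mysql) is reported.
    set -o pipefail
    curl -fsSL "$DUMP_URL" \
      | bzcat \
      | ./import.pl \
      | mysql -f -uuser -ppass --default-character-set=utf8 wikipedia
  )
}
```

Nothing runs until `import_dump` is called, so the file can be sourced safely; the credentials and database name are the placeholders from the original command and need to be replaced before use.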