Skip to content

Instantly share code, notes, and snippets.

@alexstorer
Created November 13, 2013 21:13
Show Gist options
  • Save alexstorer/7456498 to your computer and use it in GitHub Desktop.
Save alexstorer/7456498 to your computer and use it in GitHub Desktop.

SQL Dump

Goal: Convert .sql file ("sql dump") into CSV files.

Get the data

The data lives in a zip file here:

https://api.whitehouse.gov/v1/downloads/data.sql.zip

If you download it and extract the sql file, you can look at its contents (only first few lines shown)

	-- MySQL dump 10.13  Distrib 5.5.24, for Linux (x86_64)
	--
	-- Host: dbmaster-42    Database: petitions
	-- ------------------------------------------------------
	-- Server version	5.5.24-55-log

	/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
	/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
	/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
	/*!40101 SET NAMES utf8 */;
	/*!40103 SET @OLD_TIME_ZONE=@@TIME_ZONE */;
	/*!40103 SET TIME_ZONE='+00:00' */;
	/*!40014 SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0 */;
	/*!40014 SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0 */;
	/*!40101 SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE='NO_AUTO_VALUE_ON_ZERO' */;
	/*!40111 SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0 */;

	--
	-- Table structure for table wtp_data_issues
	--

	DROP TABLE IF EXISTS wtp_data_issues;
	/*!40101 SET @saved_cs_client     = @@character_set_client */;
	/*!40101 SET character_set_client = utf8 */;
	CREATE TABLE wtp_data_issues (
	  id int(11) NOT NULL DEFAULT '0' COMMENT 'TODO: please describe this field!',
	  name varchar(64) NOT NULL DEFAULT '' COMMENT 'TODO: please describe this field!',
	  PRIMARY KEY (id)
	) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='TODO: please describe this table!';
	/*!40101 SET character_set_client = @saved_cs_client */;

	--
	-- Dumping data for table wtp_data_issues
	--

	LOCK TABLES wtp_data_issues WRITE;
	/*!40000 ALTER TABLE wtp_data_issues DISABLE KEYS */;
	set autocommit=0;
	INSERT INTO wtp_data_issues VALUES (1,'Agriculture'),(2,'Arts and Humanities'),(3,'Budget and Taxes'),(4,'Civil Rights and Liberties'),(8,'Climate Change'),(9,'Consumer Protections'),(10,'Criminal Justice and Law Enforcement'),(12,'Defense'),(13,'Disabilities'),(16,'Economy'),(18,'Education'),(19,'Energy'),(20,'Environment'),(21,'Family'),(22,'Firearms'),(24,'Foreign Policy'),(25,'Health Care'),(26,'Homeland Security and Disaster Relief'),(27,'Housing'),(28,'Human Rights'),(29,'Immigration'),(30,'Innovation'),(97,'Job Creation'),(103,'Labor'),(109,'Natural Resources'),(115,'Postal Service'),(121,'Poverty'),(127,'Regulatory Reform'),(133,'Rural Policy'),(139,'Science and Space Policy'),(145,'Social Security'),(151,'Small Business'),(157,'Technology and Telecommunications'),(163,'Trade'),(169,'Transportation and Infrastructure'),(175,'Urban Policy'),(181,'Veterans and Military Families'),(187,'Women\'s Issues'),(193,'Government Reform');
	/*!40000 ALTER TABLE wtp_data_issues ENABLE KEYS */;
	UNLOCK TABLES;
	commit;
	...

These are all SQL commands which can be used to recreate the original database. Unfortunately, not every version of SQL is interchangeable - these commands will work in mysql.

Local MySQL on a Mac

To install MySQL on a Mac, it's easiest to use homebrew. Install the software, then start the MySQL server, then start the MySQL client as follows:

brew install mysql
mysql.server start
mysql -uroot

Importing the dump

Once in MySQL, you must enter the commands to make a new database and use it:

mysql> CREATE DATABASE pres;
mysql> USE PRES;

And finally execute the code in the sql dump:

mysql> \. data.sql

Exporting the data

We can see what tables are present by using the following command inside MySQL:

mysql> SHOW TABLES;
+-----------------------------+
| Tables_in_pres              |
+-----------------------------+
| wtp_data_issues             |
| wtp_data_petition_issues    |
| wtp_data_petition_responses |
| wtp_data_petitions          |
| wtp_data_responses          |
| wtp_data_signatures         |
+-----------------------------+
6 rows in set (0.00 sec)

For each of the tables, we can export the tables as following:

mysql> SELECT * FROM wtp_data_signatures INTO OUTFILE '/tmp/wtp_data_signatures.csv' FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\n';

This saves the data in the directory /tmp, and this command must be repeated for each table with a different name for each.

I zipped these csv files, and they come out to 200mb or so.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment