- United States on GitHub. They have a great US module with state abbreviations, names, etc. See also the O'Reilly article about the project.
- The State Decoded
- Legislative Documents in XML at the United States House of Representatives
- US Government Web Services and XML Data Sources
If you were to give recommendations to your "little brother/sister" on things that they need to do to become a data scientist, what would those things be?
I think the "Data Science Venn Diagram" (http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram) is a great place to start. You need three things to be a good data scientist:
- Statistical knowledge
- Programming/hacking skills
- Domain expertise
library(mgcv)
library(ggplot2)
library(dplyr)
library(XML)
library(weatherData)
us.airports.url <- 'http://www.world-airport-codes.com/us-top-40-airports.html'
us.airports <- readHTMLTable(us.airports.url)[[1]] %>%
  filter(!is.na(IATA)) %>%
"""
The MIT License (MIT)
Copyright (c) 2015 Alec Radford
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
/**
 * To get started:
 * git clone https://github.com/twitter/algebird
 * cd algebird
 * ./sbt algebird-core/console
 */
/**
 * Let's get some data. Here is Alice in Wonderland, line by line
 */
# When you're sure of the format, it's much quicker to explicitly convert your dates than use `parse_dates`
# Makes sense; was just surprised by the time difference.
import pandas as pd
from datetime import datetime
to_datetime = lambda d: datetime.strptime(d, '%m/%d/%Y %H:%M')
%time trips = pd.read_csv('data/divvy/Divvy_Trips_2013.csv', parse_dates=['starttime', 'stoptime'])
# CPU times: user 1min 29s, sys: 331 ms, total: 1min 29s
# Wall time: 1min 30s
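The preview above cuts off before the explicit-conversion half of the comparison. As a rough sketch of that idea (not necessarily how the original gist did it), the dates can be read as plain strings and then converted with `pd.to_datetime` and a known format string. The in-memory sample rows and the function names here are hypothetical stand-ins for the Divvy file; only the column names and the format string come from the snippet above.

```python
import io

import pandas as pd

# Hypothetical sample rows standing in for the Divvy CSV (assumption:
# the real file has 'starttime' and 'stoptime' columns in this format).
SAMPLE_CSV = """trip_id,starttime,stoptime
1,06/27/2013 12:11,06/27/2013 12:16
2,06/27/2013 14:44,06/27/2013 14:45
3,12/31/2013 23:58,01/01/2014 00:07
"""

DATE_FORMAT = '%m/%d/%Y %H:%M'  # same format string as the snippet above
DATE_COLUMNS = ['starttime', 'stoptime']


def load_with_inference(csv_text):
    """Let read_csv work out the date format on its own via parse_dates."""
    return pd.read_csv(io.StringIO(csv_text), parse_dates=DATE_COLUMNS)


def load_with_explicit_format(csv_text):
    """Read dates as strings, then convert them with a known format."""
    trips = pd.read_csv(io.StringIO(csv_text))
    for col in DATE_COLUMNS:
        trips[col] = pd.to_datetime(trips[col], format=DATE_FORMAT)
    return trips


trips = load_with_explicit_format(SAMPLE_CSV)
inferred = load_with_inference(SAMPLE_CSV)

# Both routes should give the same timestamps; only the parsing work differs.
durations = trips['stoptime'] - trips['starttime']
print(durations)
```

Wrapping each loader in `%time` in IPython, as the snippet above does, is one way to compare the two on a real file. How large the gap is will depend on the pandas version and the data.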
import numpy as np
import pandas as pd
import datetime
import urllib
from bokeh.plotting import *
from bokeh.models import HoverTool
from collections import OrderedDict
## Read in our data. We've aggregated it by date already, so we don't need to worry about paging