Skip to content

Instantly share code, notes, and snippets.

@JoeGermuska
JoeGermuska / All ACS 2011-1 first column labels by count, excluding those with Total
Last active December 20, 2015 20:59
All first column labels from Census, with count of occurrences. csvgrep -c 4 -r '^1$' acs2011_1yr_Sequence_Number_and_Table_Number_Lookup.csv | csvgrep -c 8 -r '^Total' -i | csvcut -c 8 | sort | uniq -c
1 "Aggregate interest, dividends, or net rental income in the past 12 months (in 2011 inflation-adjusted dollars)"
1 "Aggregate number of vehicles (car, truck, or van) used in commuting:"
1 Aggregate Social Security income in the past 12 months (in 2011 inflation-adjusted dollars)
1 Aggregate Supplemental Security Income (SSI) in the past 12 months (in 2011 inflation-adjusted dollars)
1 Aggregate contract rent
1 Aggregate earnings in the past 12 months (in 2011 inflation-adjusted dollars)
1 Aggregate earnings in the past 12 months (in 2011 inflation-adjusted dollars):
1 Aggregate family income in the past 12 months (in 2011 inflation-adjusted dollars)
3 Aggregate family income in the past 12 months (in 2011 inflation-adjusted dollars):
1 Aggregate gross rent
@JoeGermuska
JoeGermuska / Only tables with 0.5 columsn
Last active December 20, 2015 21:09
non-data columns from ACS 2011-1year
Table ID,Line Number Decimal M Lines,Long Table Title
B01002,,MEDIAN AGE BY SEX
B01002,0.5,Median age --
B01002A,,MEDIAN AGE BY SEX (WHITE ALONE)
B01002A,0.5,Median age --
B01002B,,MEDIAN AGE BY SEX (BLACK OR AFRICAN AMERICAN ALONE)
B01002B,0.5,Median age --
B01002C,,MEDIAN AGE BY SEX (AMERICAN INDIAN AND ALASKA NATIVE)
B01002C,0.5,Median age --
B01002D,,MEDIAN AGE BY SEX (ASIAN ALONE)
Demographics
age
race
seniors
sex (not gender: http://factfinder2.census.gov/help/en/glossary/g/gender.htm)
Social
ancestry
citizenship
disability
education
@JoeGermuska
JoeGermuska / count_impacted_tracts output
Last active December 27, 2015 02:19
Finding fragments of census tracts that are not in a place (in this case, Chicago) when other parts of the tract are in that place. Working against Census Reporter SQL schema as in http://censusreporter.tumblr.com/post/55886690087/using-census-data-in-postgresql
state | county | tract | count
-------+--------+--------+-------
17 | 031 | 010400 | 1
17 | 031 | 020702 | 1
17 | 031 | 030101 | 1
17 | 031 | 030102 | 1
17 | 031 | 030103 | 1
17 | 031 | 030104 | 1
17 | 031 | 030601 | 1
17 | 031 | 030604 | 1
@JoeGermuska
JoeGermuska / poverty_rate.html
Last active January 2, 2016 07:18
Find geographies with a poverty rate of greater than 40% with less than 10% derived MOE (excluding Puerto Rico) -- see https://www.pivotaltracker.com/story/show/63255358 I can't remember what it takes to get HTML rendered in gist, so I included the same data as text also.
<table border="1">
<tr>
<th align="center">geoid</th>
<th align="center">rate</th>
<th align="center">moe</th>
<th align="center">b17001001</th>
<th align="center">b17001002</th>
<th align="center">b17001001_moe</th>
<th align="center">b17001002_moe</th>
</tr>
@JoeGermuska
JoeGermuska / counties_with_low_income_tracts.sql
Created January 5, 2014 22:34
Low earning counties: find all tracts in the US where no one earns more than $50K a year, and list the counties they are contained by and the median income for that county. Source: Table B19001 'Household Income in the Past 12 Months (In 2012 Inflation-adjusted Dollars)' American Community Survey as loaded in Census Reporter Postgres database (h…
select c.geoid, c.name, d2.B19013001, count(*)
from acs2012_5yr.geoheader c,
acs2012_5yr.geoheader g,
acs2012_5yr.B19001 d,
acs2012_5yr.B19013 d2
where B19001011 = 0
and B19001012 = 0
and B19001013 = 0
and B19001014 = 0
and B19001015 = 0

After some discussion about how a map showing that half of US income can be attributed to the major metro areas, I wanted to see if I could work out how public assistance is distributed.

The American Community Survey has a table, B19067, Aggregate Public Assistance Income for Households, which tells the total amount of dollars in public assistance income that go to each geography. As you might guess, the biggest metro areas have the most dollars.

So I downloaded some CSVs from Census Reporter for all CBSA (metro/micropolitan) areas in the US:

@JoeGermuska
JoeGermuska / keybase.md
Created March 7, 2014 20:13
Keybase Proof

Keybase proof

I hereby claim:

  • I am JoeGermuska on github.
  • I am JoeGermuska (https://keybase.io/JoeGermuska) on keybase.
  • I have a public key whose fingerprint is A749 8790 8686 30C1 F76F 91F5 D6AD BFFD 3BB7 20C5

To claim this, I am signing this object:

@JoeGermuska
JoeGermuska / comm_zip_overlap.csv
Created June 6, 2014 15:30
ZIP Code to Chicago Community area coverage map.
zip area_number pct_covered
46320 52 5.65666700010227e-05
46320 55 0.000829988063627611
60007 76 0.0374161100203793
60018 76 0.537940977303293
60068 10 0.000114681511398346
60068 76 6.56825603108841e-06
60068 9 0.00048889250274846
60076 2 0.0021111871228966
60077 12 7.94752653466832e-07
@JoeGermuska
JoeGermuska / gist:38765a72d7b8d443c0ce
Created August 1, 2014 20:01
An attempt to identify tables suitable for use with interpolation method to compute a median for aggregated geographies.

See censusreporter/census-aggregate#1 (comment) for more info.

whee. I think this is a pretty good distillation of tables which might go through the aggregation process which are complicated... this needs more eyes for sure.

Apparent pairs suitable for interpolation B01001 Sex by age (repeat for racial iterations) B01002 Median age by sex

B05003 Sex by Age by Nativity and Citizenship Status B05004 Median Age by Nativity and Citizenship Status by Sex