The CitySDK is a JavaScript library that combines and abstracts three underlying APIs into a single function. As little as possible was "created" to facilitate the simplest abstraction possible. However, some trade-offs were made in v2 to enable the greatest flexibility of the abstraction, without creating an unsustainable maintenance burden for the library custodians.
The biggest of these trade-offs we've made is that we are not aliasing Census variables here. It is the burden of the user to choose the correct variable for each vintage they seek (as the same variable ID can mean different things between vintages). The primary reason for this choice is that the underlying API is still in flux and we don't want to be blamed for any mis-assigned aliases.
That having been said, here we will give you the fastest path for discovering your variables of interest.
The most influential API that underlies the CitySDK is the Data API (api.census.gov/data). This is where the ontology of the parameters comes from and what determined the majority of the CitySDK's API. Thus, before we dive into the CitySDK, we first need to familiarize ourselves with the Data API.
The Census Bureau produces a boat-load of products, so - when I help members of the open data-user community who have never worked with Census data before - I typically refer to a couple of resources to help with discovery. Before this year, I would recommend starting with American Factfinder, but the Census Bureau is phasing that product out in 2019, so the new go-to tool will be data.census.gov.
The first thing you're going to want to do when using this tool is to figure out what topics and levels of geography you're interested in. Let's say (for example) that we're looking for the GINI Index by county (or equivalent) for the US and its territories.
When you begin to type a search term into data.census.gov, there's a nifty type-ahead/smart search result panel that pops up. In this case, the term that pops up based on our query is B19083: GINI INDEX OF INCOME INEQUALITY
If we follow this result, we'll see the most important information we'll need to continue our little journey of discovery right at the top of the page:
- Survey/Program: American Community Survey
- Year: 2017
- Estimate: 5-Year
- TableID: B19083
You can make sure these data are available at your geographic level of interest by expanding the title bar using the dropdown arrow to the right of the info above. Choose "Change Geography" and then make sure the geographic level you're interested in is available (not greyed-out) in the "GEOGRAPHY" column.
Step 2: Go to census.gov/developers
Now that we've got the info above, we can use this to search the "Available APIs". In this case, we're looking for the 2017 vintage of the American Community Survey 5-Year Estimates.
Here we can find some really handy links to get us going. I would recommend you read the API page for your survey of interest in full, just to get acquainted with the ontology of the API, as this - again - is what you will need to use citysdk.
Now, let's remember what we found in data.census.gov:
- [x] Survey/Program: American Community Survey
- [x] Year: 2017
- [x] Estimate: 5-Year
- [ ] TableID: B19083

So far we're on the right track. We've tracked down the first three parts here. Now where might we find the B19083 table? Let's look at the examples:
What we can see from this page is that this survey contains multiple endpoints (aka sourcePaths):
- Detailed Tables: API Call: https://api.census.gov/data/2017/`acs/acs5`?get=NAME,group(B01001)&for=us:1 ...
- Subject Tables: API Call: https://api.census.gov/data/2017/`acs/acs5/subject`?get=NAME,group(S0101)&for=us:1 ...
- Data Profiles: API Call: https://api.census.gov/data/2017/`acs/acs5/profile`?get=group(DP02)&for=us:1 ...
- Comparison Profile: API Call: https://api.census.gov/data/2017/`acs/acs5/cprofile`?get=group(CP05)&for=us:1 ...
Just looking naively at these examples, we can see that the only one containing a variable that starts with a B (like ours) is the first one. The "Detailed Tables".
This gives us one of our citysdk arguments:
sourcePath: [acs, acs5]
Now, in order to track down the actual variable we want, we'll need to take another step in...
I always recommend users try going from this point to using the groups functionality in api.census.gov/data (the discovery tool). For example, for our table (B19083), we can find the associated variables by using the tool like so:
ANATOMY:
https://api.census.gov/data/2017/acs/acs5/groups/B19083.html
- vintage: `2017`
- sourcePath: `acs/acs5`
- Table ID: `B19083`
In our case we get the following back from the discovery tool:
| Name | Label | Concept | Required | Attributes | Limit | Predicate Type | Group | Values |
|---|---|---|---|---|---|---|---|---|
| B19083_001E | Estimate!!Gini Index | GINI INDEX OF INCOME INEQUALITY | predicate-only | 0 | float | B19083 | N/A | |
| B19083_001EA | Annotation of Estimate!!Gini Index | | predicate-only | 0 | string | B19083 | N/A | |
| B19083_001M | Margin of Error!!Gini Index | GINI INDEX OF INCOME INEQUALITY | predicate-only | 0 | float | B19083 | N/A | |
| B19083_001MA | Annotation of Margin of Error!!Gini Index | | predicate-only | 0 | string | B19083 | N/A | |
We'll want the first variable here B19083_001E, which is an estimate of type float (percentage/ratio).
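The four variable IDs in the listing differ only by suffix, and that suffix tells you what kind of value you are requesting. The sketch below is a hypothetical helper (`classifyVariable` is not part of citysdk or the Census API) that reads the suffix pattern visible in the B19083 group. It assumes other ACS detailed tables follow the same naming, which is worth checking against the group page for your own table.

```javascript
// Hypothetical helper: classify an ACS variable ID by its suffix,
// following the pattern visible in the B19083 group listing.
const SUFFIX_KINDS = {
  E: "estimate",
  EA: "annotation of estimate",
  M: "margin of error",
  MA: "annotation of margin of error",
};

function classifyVariable(name) {
  // e.g. "B19083_001EA" -> table "B19083", line "001", suffix "EA"
  const match = /^([A-Z0-9]+)_(\d{3})(EA|MA|E|M)$/.exec(name);
  if (!match) return null; // not a table variable (e.g. "NAME")
  const [, table, line, suffix] = match;
  return { table, line, kind: SUFFIX_KINDS[suffix] };
}

const group = ["B19083_001E", "B19083_001EA", "B19083_001M", "B19083_001MA"];
const estimates = group.filter((id) => classifyVariable(id).kind === "estimate");
console.log(estimates); // [ 'B19083_001E' ]
```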
To look through the available geographic levels, we can use the examples page of the discovery tool, e.g.:
https://api.census.gov/data/2017/acs/acs5/examples.html
For counties, the examples are:
https://api.census.gov/data/2017/acs/acs5?get=B01001_001E,NAME&for=county:*&key=YOUR_KEY_GOES_HERE
https://api.census.gov/data/2017/acs/acs5?get=B01001_001E,NAME&for=county:*&in=state:*&key=YOUR_KEY_GOES_HERE
https://api.census.gov/data/2017/acs/acs5?get=B01001_001E,NAME&for=county:013&in=state:02&key=YOUR_KEY_GOES_HERE
This tells us that we can get all the counties in the US in a single call (example 1), all the counties within a state (example 2, where `state:*` can be replaced with a single state's FIPS code) and a single county - qualified by a single state (example 3).
You can also get a little more detail about the geographic availability of this endpoint by visiting:
https://api.census.gov/data/2017/acs/acs5/geography.json
This will tell you which geographies support wildcards.
Now, let's test out our API call:
https://api.census.gov/data/2017/acs/acs5?get=B19083_001E,NAME&for=county:*
Woohoo! It works!
In summary, we've found everything we need to get started using either the Census API by itself ('raw') or via the citysdk. Either way, you're in a better place. Here's where we are with our discovery:
- [x] Survey/Program: American Community Survey
- [x] Year: 2017
- [x] Estimate: 5-Year
- [x] TableID: B19083
ANATOMY:
https://api.census.gov/data/2017/acs/acs5?get=B19083_001E,NAME&for=county:*
- vintage: `2017`
- sourcePath: `acs/acs5`
- values: `B19083_001E,NAME`
- geoHierarchy: `county:*`

These correspond to the following options in citysdk:
[x] sourcePath: [acs, acs5]
[x] vintage: 2017
[x] values: [B19083_001E]
[x] geoHierarchy: { county: "*" }
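To make the correspondence between the raw URL and the citysdk options concrete, here is a minimal sketch that rebuilds the URL from an options object. `toRawUrl` is a hypothetical helper written for illustration, not something citysdk exports, and it assumes each parent geography is sent as its own `in` clause, as in the county examples above.

```javascript
// Hypothetical helper (not part of citysdk): rebuild the raw Data API URL
// from a citysdk-style options object, to show how the pieces correspond.
function toRawUrl({ vintage, sourcePath, values, geoHierarchy }) {
  const levels = Object.entries(geoHierarchy); // parent -> descendant order
  const [forLevel, forId] = levels[levels.length - 1];
  const parents = levels.slice(0, -1);

  let url = `https://api.census.gov/data/${vintage}/${sourcePath.join("/")}`;
  url += `?get=${values.join(",")}`;
  url += `&for=${forLevel}:${forId}`;
  // Assumption: each parent level is sent as its own `in` clause.
  for (const [level, id] of parents) url += `&in=${level}:${id}`;
  return url;
}

const rawUrl = toRawUrl({
  vintage: 2017,
  sourcePath: ["acs", "acs5"],
  values: ["B19083_001E", "NAME"],
  geoHierarchy: { county: "*" },
});
console.log(rawUrl);
// https://api.census.gov/data/2017/acs/acs5?get=B19083_001E,NAME&for=county:*
```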
npm install citysdk
CitySDK v2.0 exports a single function, which takes two arguments:
- The first is an options object with a set of key/value pair parameters (See "Parameters" below)
- The second is a conventional (error, response) node-style callback, which will be called upon completion of the `census` function and applied to the response
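If you prefer Promises to callbacks, a node-style function like this one can be wrapped. The sketch below is one way to do that; `toPromise` and `fakeCensus` are hypothetical names, and `fakeCensus` merely stands in for the real export so the example runs without network access.

```javascript
// Hypothetical wrapper: turn a node-style (options, callback) function,
// such as the citysdk export, into one that returns a Promise.
function toPromise(censusFn) {
  return (options) =>
    new Promise((resolve, reject) => {
      censusFn(options, (err, res) => (err ? reject(err) : resolve(res)));
    });
}

// Stand-in for the real function so the sketch runs without network access.
const fakeCensus = (options, callback) =>
  callback(null, { vintage: String(options.vintage) });

const censusAsync = toPromise(fakeCensus);
censusAsync({ vintage: 2015 }).then((res) => console.log(res));
// { vintage: '2015' }
```

With the real library you would pass the imported `census` function to `toPromise` instead of the stand-in.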
Brief overview of each argument parameter that can be passed into CitySDK v2.0
| Parameter | Type | Description | Geocodes | Stats | GeoJSON | GeoJSON with Stats |
|---|---|---|---|---|---|---|
| `vintage` | int/str | The reference year (typically release year) of the data | ✔ | ✔ | ✔ | ✔ |
| `geoHierarchy` | object | The geographic scope and hierarchical path to the data | ✔ | ✔ | ✔ | ✔ |
| `sourcePath` | array | Refers to the Census product of interest | | ✔ | | ✔ |
| `values` | array | For statistics, values request counts/estimates via variable IDs | | ✔ | | ✔ |
| `geoResolution` | str | Resolution of GeoJSON ("20m", "5m", and "500k" available) | | | ✔ | ✔ |
| `predicates` | object | Used as a filter available on some values | | ✔\* | | ✔\* |
| `statsKey` | str | You may request a key for Census' statistics API here | | ✔\*\* | | ✔\*\* |
* : optional
** : optional for < 500 requests daily
With the exception of "microdata" statistics (not yet available via Census' API), all Census data is aggregated to geographic areas of different sizes. As such, all of Census' APIs require one or more unique geographic identifiers (AKA: FIPS GEOIDs) to return any data. Given that these identifiers are not common knowledge, the CitySDK provides a way for the user to identify their geographic scope of interest using a geographic coordinate (lat + lng).
Under the hood, this functionality calls the TigerWeb Web Mapping Service with the lat & lng provided and pipes the resulting FIPS codes into your options argument with the appropriate GEOIDs for identifying your geographic area of interest.
For a list of geographies currently available for geocoding with this feature, see the Geographies Available by Vintage section below.
There are two ways to scope your geography using this functionality:
- Request a single geographic area
- Request all of a descendant geography-type of a coordinate-specified geographic area
RETURN TYPE: JSON
You may pass a {"lat" : <float>, "lng" : <float>} object as the first and only value for the geoHierarchy key:
import census from 'citysdk'
census({
"vintage" : 2015, // required
"geoHierarchy" : { // required
"county" : {
"lat" : 28.2639,
"lng" : -80.7214
}
}
},
(err, res) => console.log(res)
)
// result -> {"vintage":"2015","geoHierarchy":{"state":"12","county":"009"}}

Notice how the function prepends an additional geographic component ("state" : "12") to the options object. In order to fully qualify the geographic area (GEOID) associated with the county, the state is needed. In this example the fully qualified GEOID would be 12009, with the first two digits (12) qualifying the state and 009 qualifying the county within that state. This appropriate geographic hierarchy creation is handled by the function for you.
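The GEOID arithmetic described here is plain string concatenation from parent to descendant. A minimal sketch, using a hypothetical `toGeoid` helper that is not part of citysdk:

```javascript
// Hypothetical helper: build a GEOID by concatenating the FIPS codes
// in a resolved geoHierarchy, from parent to descendant.
function toGeoid(geoHierarchy) {
  const codes = Object.values(geoHierarchy); // keeps insertion order
  if (!codes.every((code) => /^\d+$/.test(code))) {
    // Wildcards and unresolved coordinates do not identify a single area.
    throw new Error("Every level must be a resolved FIPS code");
  }
  return codes.join("");
}

console.log(toGeoid({ state: "12", county: "009" })); // 12009
```

Note that the codes are strings: treating `"009"` as a number would drop the leading zeros and produce the wrong identifier.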
RETURN TYPE: JSON
import census from 'citysdk'
census({
"vintage" : "2015", // required
"geoHierarchy" : { // required
"state" : {
"lat" : 28.2639,
"lng" : -80.7214
},
"county" : "*" // <- syntax = "<descendant>" : "*"
}
},
(err, res) => console.log(res)
)
// result -> {"vintage":"2015","geoHierarchy":{"state":"12","county":"*"}}

All Census-defined geographic areas are composed of Census "Blocks". Some of these composed areas - themselves - compose into higher-order areas. These nested relationships between certain geographic areas allow the Census data user to request all descendants of a particular type.
- In this example, we added a second geographic level to our `geoHierarchy` object (`"county" : "*"`). It is important to use the `"*"` expression, signifying that you want all of the specified level of descendants within the geography for which you supply a coordinate. No other expression will work.
- Internally, the CitySDK converts the `geoHierarchy` object to an ordered set, so this part of your request object must be in descending hierarchical order from parent -> descendant. E.g. - in the above - an object that contained `{"county" : "*", "state" : {"lat" : <lat>, "lng" : <lng>}}` will not work.
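Because key order matters, it can help to check a request object before sending it. The following is a rough pre-flight check under the rules stated above; `coordinateIsOnParent` is a hypothetical name and citysdk itself may validate differently or not at all.

```javascript
// Hypothetical pre-flight check (not part of citysdk): a {lat, lng}
// coordinate may only appear on the first (parent) key of geoHierarchy.
function coordinateIsOnParent(geoHierarchy) {
  const isCoordinate = (v) =>
    typeof v === "object" && v !== null && "lat" in v && "lng" in v;
  // Object.values keeps insertion order for string keys like these.
  const values = Object.values(geoHierarchy);
  return values.slice(1).every((v) => !isCoordinate(v));
}

const point = { lat: 28.2639, lng: -80.7214 };
console.log(coordinateIsOnParent({ state: point, county: "*" })); // true
console.log(coordinateIsOnParent({ county: "*", state: point })); // false
```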
This parameter set will call the Census Statistics API and reformat the results with a few highly requested features:
- Census statistics are returned as a standard JSON object rather than the csv-like format of the "raw" API
- Statistical values are translated into properly typed numbers (Integers and Floats instead of strings), whereas all values are returned as strings via the "raw" API
- Annotation values (e.g., American Community Survey error codes) that appear in places where data would be expected are returned as strings (rather than numbers), so that differentiating them from values is a simple type check.
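The reshaping described in this list can be sketched as follows. This is an illustration only: `rowsToObjects` is a hypothetical function, the real citysdk logic may differ, and the input simply mimics the raw API's header-row-plus-data-rows shape using the values from the example result further down.

```javascript
// Sketch of the reshaping described above; the real citysdk logic may differ.
// `rows` mimics the raw API shape: a header row followed by data rows.
function rowsToObjects(rows, values) {
  const [header, ...records] = rows;
  return records.map((record) => {
    const result = {};
    header.forEach((column, i) => {
      const cell = record[i];
      const isStatistic = values.includes(column);
      const isNumeric =
        typeof cell === "string" &&
        cell.trim() !== "" &&
        Number.isFinite(Number(cell));
      // Only requested statistics are cast; FIPS codes such as "009" stay strings.
      result[column] = isStatistic && isNumeric ? Number(cell) : cell;
    });
    return result;
  });
}

const raw = [
  ["ESTAB", "state", "county"],
  ["13648", "12", "009"],
];
console.log(JSON.stringify(rowsToObjects(raw, ["ESTAB"])));
// [{"ESTAB":13648,"state":"12","county":"009"}]
```

A consumer can then separate real values from annotations with `typeof value === "number"`.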
There are two ways to request Census statistics using citysdk:
- Calling for `values` of estimates and other statistical values (required)
- Apply a filter by using `predicates` (optional)
For both of these options, a sourcePath needs to be supplied. This is the fully qualified path to the product. To find the sourcePath for your product of interest, go to the Developers' Microsite and - in any of the examples of making a call - take the path between `<vintage>/` and `?get`. For example, the first example (2017) for the American Community Survey 1-Year shows:
https://api.census.gov/data/2017/acs/acs1?get=NAME,group(B01001)&for=us:1
- vintage: `2017`
- sourcePath: `acs/acs1`
The corresponding sourcePath for this endpoint is ["acs", "acs1"]
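The "take the path between the vintage and `?get`" rule can be expressed in a few lines. `sourcePathFromUrl` below is a hypothetical helper for illustration; it assumes the URL follows the `/data/<vintage>/<sourcePath...>` layout shown in the examples on this page.

```javascript
// Hypothetical helper: split an example API call into vintage and sourcePath.
// Assumes the path layout /data/<vintage>/<sourcePath...> shown above.
function sourcePathFromUrl(apiUrl) {
  const segments = new URL(apiUrl).pathname.split("/").filter(Boolean);
  // segments: ["data", "<vintage>", ...sourcePath]
  const [, vintage, ...sourcePath] = segments;
  return { vintage, sourcePath };
}

console.log(
  sourcePathFromUrl(
    "https://api.census.gov/data/2017/acs/acs1?get=NAME,group(B01001)&for=us:1"
  )
);
// { vintage: '2017', sourcePath: [ 'acs', 'acs1' ] }
```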
RETURN TYPE: JSON
import census from 'citysdk'
census({
"vintage" : 2015, // required
"geoHierarchy" : { // required
"county" : {
"lat" : 28.2639,
"lng" : -80.7214
}
},
"sourcePath" : ["cbp"], // required
"values" : ["ESTAB"] // required
},
(err, res) => console.log(res)
)
// result -> [{"ESTAB":13648,"state":"12","county":"009"}]

Here, we added the parameters for sourcePath (the path to the survey and/or source of the statistics) and values (the identifiers of the statistics we're interested in). By including these parameters within your argument object, you trigger the census function to get statistics. This "deploy on parameter set" strategy is how the census function determines your intent.
You're probably thinking: "How am I supposed to know what codes to use inside those parameters?" - or - "Where did that "cbp" & "ESTAB" stuff come from?" The data sets covered by the CitySDK are vast. As such, this is the steepest part of the learning curve. But, don't worry, there are a number of different resources available to assist you in your quest:
- The Census Developers' Microsite <- START HERE
- The Census Discovery Tool.
- Census Slack and Gitter developer communities.
- Data Experts
RETURN TYPE: JSON
census({
"vintage" : 2015, // required
"geoHierarchy" : { // required
"county" : {
"lat" : 28.2639,
"lng" : -80.7214
}
},
"sourcePath" : ["cbp"], // required
"values" : ["ESTAB"], // required
"statsKey" : "<your key here>" // required for > 500 calls per day
},
(err, res) => console.log(res)
)
// result -> [{"ESTAB":13648,"state":"12","county":"009"}]

RETURN TYPE: JSON

Predicates are used to create a sub-selection of statistical values based on a given range or categorical qualifier.
census({
"vintage" : "2017",
"geoHierarchy" : {
"state" : "51",
"county" : "*"
},
"sourcePath" : ["acs", "acs1"],
"values" : ["NAME"],
"predicates" : {
"B01001_001E" : "0:100000" // number range separated by `:`
},
"statsKey" : "<your key here>"
},
(err, res) => console.log(res)
)
/* result:
[
{
"NAME":"Augusta County, Virginia",
"B01001_001E" : 75144,
"state":"51",
"county":"015"
},
{
"NAME":"Bedford County, Virginia",
"B01001_001E" : 77974,
"state":"51",
"county":"019"
},
...
]
*/

If you'd like to use "timeseries" data, you may do so for statistics only. Mapping timeseries data is currently unsupported. Note that many timeseries products rely heavily on the "predicates" option:
RETURN TYPE: JSON
census({
"vintage" : "timeseries",// required
"geoHierarchy" : { // required
"county" : {
"lat" : 28.2639,
"lng" : -80.7214
}
},
"sourcePath" : ["asm", "industry"], // required
"values" : ["EMP","NAICS_TTL","GEO_TTL"],
"predicates": {"time": "2016", "NAICS": "31-33"}
},
(err, res) => console.log(res)
)
/* result:
[{"EMP": 11112764,
"NAICS_TTL": "Manufacturing",
"GEO_TTL": "United States",
"time": "2016",
"NAICS": "31-33",
"us":"1"}]
*/

For some sources (e.g., the American Community Survey), most of the values can also be used as predicates, though doing so is optional. In others (e.g., International Trade), predicates are a key part of the statistical query. In either case, at least one value within values must be supplied.
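For intuition about what a `predicates` object turns into, the sketch below serializes one as query parameters and splits a `min:max` range string. Both helpers are hypothetical, and the assumption that each predicate maps to its own `name=value` parameter in the raw API call should be checked against the examples page of your endpoint.

```javascript
// Assumption: in the raw API, each predicate becomes its own query parameter.
function predicatesToQuery(predicates) {
  return Object.entries(predicates)
    .map(([name, value]) => `&${name}=${value}`)
    .join("");
}

// Split a range predicate such as "0:100000" into its two bounds.
function parseRange(range) {
  const [min, max] = range.split(":").map(Number);
  return { min, max };
}

console.log(predicatesToQuery({ time: "2016", NAICS: "31-33" }));
// &time=2016&NAICS=31-33
console.log(parseRange("0:100000"));
// { min: 0, max: 100000 }
```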
You can also use the CitySDK to retrieve Cartographic Boundary files, which have been translated into GeoJSON. The only additional parameter you'll need to know is a simple declaration of geoResolution of which there are three options:
| Resolution | Map Scale | Benefits | Costs |
|---|---|---|---|
| 500k | 1:500,000 | Greatest variety of summary levels & Most detailed | largest file sizes |
| 5m | 1:5,000,000 | Balance between size and detectable area size | lowest variety of available area types |
| 20m | 1:20,000,000 | Smallest file sizes | lowest level of detail |
See the full list of available Cartographic GeoJSON in the Geographies Available by Vintage section.
Example: Saving the file locally in Node.js using fs
RETURN TYPE: JSON STRING
var fs = require("fs")
census({
"vintage" : 2017,
"geoHierarchy" : {
"metropolitan statistical area/micropolitan statistical area": "*"
},
"geoResolution" : "500k" // required
},
(err, res) => {
fs.writeFile("./directory/filename.json",
JSON.stringify(res),
() => console.log("done")
)}
)

This would convert the returned GeoJSON to a string, which allows it to be saved via Node.js' file system API.
census({
"vintage" : "2017",
"geoHierarchy" : {
"state": "51",
"county": "*"
},
"geoResolution" : "500k" // required
},
(err, res) => console.log(res)
)

It's important to note that - when querying for these GeoJSON files - you may retrieve a larger area than your request argument specifies. The reason for this is that the files are (currently) stored at two geographic levels: National and by State. Thus, the query above will attempt to resolve, at the state level, all counties, but because counties are stored at the national level in vintage 2017, all the counties in the US will be returned by this query.
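When only the boundaries are needed, one workaround is to trim the returned FeatureCollection on the client. The sketch below assumes each feature carries a `STATEFP` property, as the Census cartographic boundary files do; inspect the `properties` of your own result before relying on it. `keepState` and the tiny sample collection are hypothetical.

```javascript
// Assumption: each feature carries a `STATEFP` property, as in the Census
// cartographic boundary files; check the properties of your own result.
function keepState(featureCollection, stateFips) {
  return {
    ...featureCollection,
    features: featureCollection.features.filter(
      (feature) => feature.properties.STATEFP === stateFips
    ),
  };
}

// Tiny stand-in for a national county file (geometry omitted for brevity).
const national = {
  type: "FeatureCollection",
  features: [
    { type: "Feature", properties: { STATEFP: "51", COUNTYFP: "015" }, geometry: null },
    { type: "Feature", properties: { STATEFP: "12", COUNTYFP: "009" }, geometry: null },
  ],
};

console.log(keepState(national, "51").features.length); // 1
```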
If you wish to get back only those geographies you specify, you may do so by using the last and perhaps most useful feature included in the v2.0 release: Getting GeoJSON with statistics included within the "FeatureCollection" properties object!
RETURN TYPE: JSON
There are a number of reasons you might want to merge your statistics into their GeoJSON/geographic boundaries, all of which are relevant when seeking to map Census data:
- Creating choropleth maps of statistics (e.g., using `values`)
- Mapping only those geographies that meet a certain set of criteria
- Showing a user their current Census geographic context (i.e., leveraging the Geocoding capabilities of CitySDK)
A more dynamic example of using stats merged with GeoJSON on the fly with citysdk can be found here:
Type in a county name and see the unweighted sample count of the population (ACS) for all the Block Groups within that County.
Use Chrome for best results (mapbox-gl geocoder caveat)
census({
"vintage" : "2017",
"geoHierarchy" : {
"county": "*"
},
"sourcePath" : ["acs", "acs5"],
"values" : ["B19083_001E"], // GINI index
"statsKey" : "<your key here>",
"geoResolution" : "500k"
},
(err, res) => console.log(res) // res is the GeoJSON with statistics merged in
)

In this example, we use citysdk to create the payload, save it via Node's fs.writeFileSync, and then serve it via a Mapbox-GL map.
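Once statistics are merged into the features, a choropleth needs a value range and a class for each feature. The sketch below shows one simple equal-interval approach. It assumes the merged statistic sits in `feature.properties` under its variable ID; the helper names and the sample numbers are made up for illustration and are not real Gini values.

```javascript
// Assumption: merged statistics sit in feature.properties under the variable ID.
function valueExtent(featureCollection, variable) {
  const numbers = featureCollection.features
    .map((feature) => feature.properties[variable])
    .filter((value) => typeof value === "number"); // annotations arrive as strings
  return [Math.min(...numbers), Math.max(...numbers)];
}

// Map a value to one of `steps` equal-interval classes (0 .. steps - 1).
function classIndex(value, [min, max], steps) {
  if (max === min) return 0;
  return Math.min(steps - 1, Math.floor(((value - min) / (max - min)) * steps));
}

// Made-up sample values, chosen only to keep the arithmetic exact.
const merged = {
  type: "FeatureCollection",
  features: [0.25, 0.375, 0.5, "N"].map((value) => ({
    type: "Feature",
    properties: { B19083_001E: value },
    geometry: null,
  })),
};

const extent = valueExtent(merged, "B19083_001E");
console.log(extent); // [ 0.25, 0.5 ]
console.log(classIndex(0.375, extent, 4)); // 2
```

The class index can then be used to pick a color from a ramp in your mapping library of choice.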
census({
"vintage" : "2017",
"geoHierarchy" : {
"zip-code-tabulation-area" : "*"
},
"sourcePath" : ["acs", "acs5"],
"values" : ["B19083_001E"], // GINI index
"statsKey" : "<your key here>",
"geoResolution" : "500k"
},
(err, res) => console.log(res) // res is the GeoJSON with statistics merged in
)

This is a very large request - in fact, one of the largest you could possibly make in a single citysdk function call. It is so large that it currently only works on Node, and only if you increase Node's memory limit (node --max-old-space-size=4096). With large merges (such as all counties or ZCTAs), it is recommended not to try to use citysdk dynamically, but - rather - to munge your data beforehand with citysdk and then serve it statically to your mapping library, as was done here:
// Call the WMS only
{
"vintage": 2014,
"geoHierarchy": { "state": { "lat": 28.2639, "lng": -80.7214 }, "county": "*" }
}
// Getting the stats for a single county filtering out any county with population over 100,000
{
"vintage": 2016,
"geoHierarchy": { "county": { "lat": 28.2639, "lng": -80.7214 } },
"sourcePath": [ "acs", "acs5" ],
"values": [ "B01001_001E" ],
"predicates": { "B00001_001E": "0:100000" }
}
// strings are valid as vintages as well
{
"vintage": "2015",
"geoHierarchy": { "county": { "lat": 28.2639, "lng": -80.7214 } },
"sourcePath": [ "cbp" ],
"values": [ "ESTAB" ]
}
// Just geojson for all the counties within a state located by a given coordinate
{
"vintage": 2014,
"geoHierarchy": { "state": { "lat": 28.2639, "lng": -80.7214 }, "county": "*" },
"geoResolution": "500k"
}
// For large requests, expect to have to increase `node --max-old-space-size=4096`
{
"vintage": 2016,
"sourcePath": [ "acs", "acs5" ],
"values": [ "B25001_001E" ],
"geoHierarchy": { "zip-code-tabulation-area": "*" },
"geoResolution": "500k"
}

The Census Bureau publishes both high and low accuracy geographic area files to accommodate the widest possible variety of user needs (within feasibility). Cartography Files are simplified representations of selected geographic areas from the Census Bureau's Master Address File/Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) system. These boundary files are specifically designed for small scale thematic mapping (i.e., for visualizations).
For a while now, we have published our cartography files in the .shp format. More recently, we expanded our portfolio of available formats to .kml. It is with this release that we follow suit with the community at large to release these boundaries in .json (GeoJSON) format.
The most comprehensive set of geographies and vintages can be found within the 500k set.
Some vintages - 103 through 110 - are references to sessions of Congress and only contain a single geographic summary level: "congressional district"
The following tables represent the availability of various geographic summary levels through the remaining vintages:
| Geographic Area Type | 1990 | 2000 | 2010 | 2012 | 2013 - 2015 | 2016 - 2017 |
|---|---|---|---|---|---|---|
"alaska native regional corporation" |
✔ | ✔ | ✔ | ✔ | ✔ | |
"american indian-area/alaska native area/hawaiian home land" |
✔ | ✔ | ✔ | ✔ | ✔ | |
"block group" |
✔ | ✔ | ✔ | ✔ | ✔ | |
"combined new england city and town area" |
✔ | ✔ | ||||
"combined statistical area" |
✔ | ✔ | ✔ | |||
"congressional district" |
✔ | ✔ | ✔ | ✔ | ||
"consolidated cities" |
✔ | ✔ | ✔ | ✔ | ||
"county" |
✔ | ✔ | ✔ | ✔ | ✔ | |
"county subdivision" |
✔ | ✔ | ✔ | ✔ | ✔ | |
"division" |
✔ | ✔ | ✔ | ✔ | ||
"metropolitan statistical area/micropolitan statistical area" |
✔ | ✔ | ✔ | |||
"new england city and town area" |
✔ | ✔ | ✔ | |||
"place" |
✔ | ✔ | ✔ | ✔ | ✔ | |
"public use microdata area" |
✔ | ✔ | ||||
"region" |
✔ | ✔ | ✔ | ✔ | ||
"school district (elementary)" |
✔ | ✔ | ✔ | |||
"school district (secondary)" |
✔ | ✔ | ✔ | |||
"school district (unified)" |
✔ | ✔ | ✔ | |||
"state" |
✔ | ✔ | ✔ | ✔ | ✔ | |
"state legislative district (lower chamber)" |
✔ | ✔ | ✔ | ✔ | ✔ | |
"state legislative district (upper chamber)" |
✔ | ✔ | ✔ | ✔ | ✔ | |
"tract" |
✔ | ✔ | ✔ | ✔ | ✔ | |
"urban area" |
✔ | ✔ | ✔ | ✔ | ✔ | |
"us" |
✔ | ✔ | ✔ | |||
"zip code tabulation area" |
✔ | ✔ | ✔ |
- For more information about the files translated herein please visit the Census Bureau's Cartographic Boundary File Description Page
- For a comparison of the available formats of geographic area files, please visit the Census Bureau's TIGER Products page
- Join us on Gitter
- Join us on Slack
- Send us an email: cnmp.developers.list@census.gov
If you're new to Census data and need some help figuring out which of the many products Census curates for public use, don't hesitate to reach out to these contacts for help:
- Ryan Dolan: ryan.s.dolan@census.gov
- Gerson Vasquez: gerson.d.vasquez@census.gov
- Alexandra Barker: alexandra.s.barker@census.gov
