kgjenkins, 2020-06-26
The US Census has PUMA boundary shapefiles available by state. Here's the directory of all the 2010 PUMAs, 2019 vintage: https://www2.census.gov/geo/tiger/TIGER2019/PUMA/
Download manually one at a time, or automate using wget:
wget -A img -r -l1 ftp://ftp2.census.gov/geo/tiger/TIGER2019/PUMA/
-A
limits to specific file extensions
-r
means recursive
-l1
limits recursiveness to one level
QGIS can read the zipfiles without us having to extract them. (Although extracting is usually recommended if you are not immediately saving to another format.)
- In QGIS, under the Processing menu, open the Toolbox.
- Search for "Merge vector layers" and double-click it.
- For "Input layers", click the
...
button and "Add Directory", selecting the folder where the zipfiles were saved. - Once you have see all the state files listed, click "OK"
- Leave the "Destination CRS" blank (it will use the CRS from the files).
- Leave "Merged" set to "[Create temporary layer]" so we can check the output before saving.
- Click "Run"
After a few seconds, you should see the US on the map, and a new temporary layer called "Merged" in the list of layers.
If it looks good, right-click "Merged" > "Make Permanent..."
- GeoPackage (the default format) is a fine format -- just one file!
- For "File name", click the
...
to specify the output location and filename (something like 'puma2010.gpkg') - Ignore all the other options, and click "OK"
- Right-click "Merged" > "Rename layer" and rename it "PUMA"
The GEOID10 column contains the nationwide unique IDs for each PUMA. We want to add a new column that will contain a list of all the IDs of PUMAs that are adjacent to a given PUMA. We can do this using an aggregate function in the field calculator, as described by Ujaval Gandhi's 2019-05-23 blog post, "Find Neighbor Polygons using Summary Aggregate Function in QGIS".
- Right-click "PUMA" > "Open Attribute Table"
- In the table toolbar, click the "Field Calculator" button (towards the right, looks like an abacus)
- Set "Output field name" = "adjacent"
- Set "Output field type" = "Text (string)"
- Set the expression (large white area on left) to the following:
aggregate(
layer:= 'PUMA',
aggregate:='concatenate',
expression:=GEOID10,
concatenator:=',',
filter:=touches($geometry, geometry(@parent))
)
Note that we use the function touches()
instead of intersects
(which would include the PUMA itself). Census boundary data is pretty good, with adjacent polygons sharing the same exact boundary, so touches
should work well.
Notice the preview below your expression. It should look something like '0200300,0200200,0200102'.
Click "OK" and wait for the calculation to finish. It may take several minutes...