Jason Heppler (hepplerj) — building the history web
hepplerj / census_cleanup.R
Created March 11, 2020 19:36
An example script for census data in R
library(tidyverse)
library(tidycensus)
# My recommendation is to use the tidycensus library to make getting this data
# easier than reading in the data from the Census website.
#
# Before you can begin, you'll need to get an API key from the Census Bureau,
# which you can request on the Census Bureau's website.
#
# Once you have the API key, run the following in RStudio:
hepplerj / messy.R
Created February 20, 2020 20:58
Messy data in R, for teaching the tidyverse
library(charlatan)
library(salty)
library(magrittr)
library(readr)
messydata <- ch_generate('name','job','phone_number', n = 200)
messydata <- messydata %>%
mutate(job = salt_capitalization(job)) %>%
mutate(phone_number = salt_na(phone_number)) %>%
hepplerj / frequency_to_list.R
Created December 11, 2019 18:11
Turn a frequency table into a list of individual items
library(tidyverse)
library(readxl)
data <- readxl::read_xlsx("data.xlsx")
reshaped <- data %>% gather(word, freq, 2:21)
reshaped <- reshaped %>% drop_na()
cleaned <- reshaped %>%
uncount(freq)
hepplerj / geofilter.R
Created November 8, 2019 15:12
Checking points and filtering incorrect or unneeded data.
library(tidyverse)
library(maps)
library(mapdata)
data <- read_csv("~/Desktop/nplsuperfund.csv")
names(data) <- c("lat","lon","date")
# Filter down to USA extent to remove extraneous points
tidy <- data %>%
  filter(lon < -67, lon > -125) %>% # continental US longitude range
hepplerj / hex_logo.R
Last active January 14, 2020 19:32
Hex logo generator for R User Group
library(hexSticker)
library(tidyverse)
library(tidycensus)
library(sf)
library(viridis)
options(tigris_use_cache = TRUE)
nebraska_raw <- get_acs(state = "NE",
geography = "tract",
hepplerj / pandas.py
Created May 23, 2018 14:45
An evolving set of pandas snippets I find useful
# Unique values in a dataframe column
df['column_name'].unique()
# Grab dataframe rows where column = value
df = df.loc[df.column == 'some_value']
# Grab dataframe rows where a column's value is present in a list
value_list = ['value1', 'value2', 'value3']
df = df.loc[df['column'].isin(value_list)]
# Or grab rows where a column's value is not present in the list
df = df.loc[~df['column'].isin(value_list)]
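These snippets can be exercised against a small, made-up DataFrame — the column names and values below are hypothetical, just to show the pattern:

```python
import pandas as pd

# Hypothetical DataFrame for illustration
df = pd.DataFrame({
    "column": ["value1", "value2", "value3", "value1"],
    "score": [10, 20, 30, 40],
})

# Unique values in a dataframe column
unique_vals = df["column"].unique()
print(sorted(unique_vals))  # ['value1', 'value2', 'value3']

# Rows where a column's value is present in a list
value_list = ["value1", "value2"]
subset = df.loc[df["column"].isin(value_list)]
print(len(subset))  # 3

# Rows where a column's value is NOT present in the list
rest = df.loc[~df["column"].isin(value_list)]
print(len(rest))  # 1
```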
hepplerj / README.md
Last active May 10, 2018 20:12
Add leaflet points on click

Click on the map to add points. See the console for lat/long output.
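The README doesn't include the code itself, but the behavior it describes — click the map, drop a marker, log the coordinates — can be sketched with the standard Leaflet API. The map div id, initial view, and tile source here are assumptions, not taken from the original gist:

```html
<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <link rel="stylesheet" href="https://unpkg.com/leaflet/dist/leaflet.css">
  <script src="https://unpkg.com/leaflet/dist/leaflet.js"></script>
</head>
<body>
  <div id="map" style="height: 400px"></div>
  <script>
    var map = L.map('map').setView([41.25, -96.0], 5);
    L.tileLayer('https://tile.openstreetmap.org/{z}/{x}/{y}.png').addTo(map);
    // On click, add a marker and log the coordinates to the console
    map.on('click', function(e) {
      L.marker(e.latlng).addTo(map);
      console.log(e.latlng.lat, e.latlng.lng);
    });
  </script>
</body>
</html>
```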

hepplerj / index.html
Created March 25, 2018 02:30
WebGL Mapping
<!DOCTYPE html>
<head>
<meta charset="utf-8">
<script src="https://d3js.org/d3.v4.min.js"></script>
<script src="http://www.webglearth.com/v2/api.js"></script>
<script>
function map() {
var options = { zoom: 1.5, position: [47.19537,8.524404] };
var earth = new WE.map('earth_div', options);
hepplerj / batch.sh
Created October 20, 2017 17:09
Batch compress PDFs for Omeka
# This requires the use of GhostScript
# On macOS, the easiest way to get started is to install it with Homebrew:
# brew install ghostscript
#
# This file should live in the directory that contains the PDFs. From
# the command line, just running `bash batch.sh` will compress the PDFs
# and fix any issues that might be present with JPEG2000 images. The
# compression process should preserve the OCR and will likely reduce the
# size of the PDF as well.
#
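The preview cuts off before the script itself, but a minimal sketch of such a batch loop might look like this — the `/ebook` quality preset and the `_compressed` output naming are assumptions, not taken from the original gist:

```shell
#!/usr/bin/env bash
# Sketch: compress every PDF in the current directory with Ghostscript's
# pdfwrite device, writing each result alongside the original.
set -eu

# Bail out gracefully if Ghostscript isn't installed
command -v gs >/dev/null || { echo "ghostscript not found; skipping"; exit 0; }

for f in ./*.pdf; do
  [ -e "$f" ] || continue  # no PDFs in this directory
  gs -sDEVICE=pdfwrite \
     -dCompatibilityLevel=1.4 \
     -dPDFSETTINGS=/ebook \
     -dNOPAUSE -dBATCH -dQUIET \
     -sOutputFile="${f%.pdf}_compressed.pdf" "$f"
done
```

Run it from the directory containing the PDFs with `bash batch.sh`.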
hepplerj / asc_crawler.py
Created October 20, 2017 15:33
Using tweepy to crawl for archives, special collections, and library users.
import tweepy
# OAuth is the preferred method for authenticating to Twitter
# Consumer keys are under the application's Details page at
# http://dev.twitter.com/apps
consumer_key = ""
consumer_secret = ""
# Access tokens are found on your application's Details page
# at http://dev.twitter.com/apps.