Skip to content

Instantly share code, notes, and snippets.

@DustinAlandzes
Created September 22, 2018 00:25
Show Gist options
  • Save DustinAlandzes/64e1d8dab58db380f806c95a2fa92a8e to your computer and use it in GitHub Desktop.
Save DustinAlandzes/64e1d8dab58db380f806c95a2fa92a8e to your computer and use it in GitHub Desktop.
attempt to reuse dfsnow's gtfs feed visualization in R: https://gist.github.com/dfsnow/9ec07791771edfaca2cfce9ca304b5a5
R version 3.5.1 (2018-07-02) -- "Feather Spray"
Copyright (C) 2018 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library(tidyverse)
-- Attaching packages --------------------------------------- tidyverse 1.2.1 --
v ggplot2 3.0.0.9000 v purrr 0.2.5
v tibble 1.4.2 v dplyr 0.7.6
v tidyr 0.8.1 v stringr 1.3.1
v readr 1.1.1 v forcats 0.3.0
-- Conflicts ------------------------------------------ tidyverse_conflicts() --
x dplyr::filter() masks stats::filter()
x dplyr::lag() masks stats::lag()
> library(imputeTS)
> library(ggplot2)
> library(gganimate)
> library(gtfsr)
> library(sf)
Linking to GEOS 3.6.1, GDAL 2.2.3, PROJ 4.9.3
>
> # Download the relavant GTFS feed
> url <- "http://www.transitchicago.com/downloads/sch_data/google_transit.zip"
> download.file(url, "gtfs.zip")
trying URL 'http://www.transitchicago.com/downloads/sch_data/google_transit.zip'
Content type 'application/x-zip-compressed' length 36014103 bytes (34.3 MB)
downloaded 34.3 MB
WARNING: You are configured to use the CRAN mirror at https://cran.rstudio.com/. This mirror supports secure (HTTPS) downloads however your system is unable to communicate securely with the server (possibly due to out of date certificate files on your system). Falling back to using insecure URL for this mirror.
To learn more and/or disable this warning message see the "Use secure download method for HTTP" option in Tools -> Global Options -> Packages.
> # Unpack the gtfs feed into separate dataframes
> gtfs <- import_gtfs("gtfs.zip", local = TRUE)
Unzipped the following files to directory 'C:/Users/HYSTOU/Documents/gtfs'...
[1] "agency.txt"
[2] "calendar.txt"
[3] "calendar_dates.txt"
[4] "developers_license_agreement.htm"
[5] "frequencies.txt"
[6] "routes.txt"
[7] "shapes.txt"
[8] "stop_times.txt"
[9] "stops.txt"
[10] "transfers.txt"
[11] "trips.txt"
Reading agency.txt
Reading calendar.txt
Reading calendar_dates.txt
Reading developers_license_agreement.htm
File developers_license_agreement.htm not recognized. No meta data exists. Reading file as csv.
Warning: 181 parsing failures.
row # A tibble: 5 x 5 col row col expected actual file expected <int> <chr> <chr> <chr> <chr> actual 1 1 NA 3 columns 1 colu~ 'C:/Users/HYSTOU/D~ file 2 2 NA 3 columns 1 colu~ 'C:/Users/HYSTOU/D~ row 3 3 NA 3 columns 9 colu~ 'C:/Users/HYSTOU/D~ col 4 4 NA 3 columns 1 colu~ 'C:/Users/HYSTOU/D~ expected 5 5 NA 3 columns 1 colu~ 'C:/Users/HYSTOU/D~
... ................................. ... ................................................... ........ ................................................................................................................................................................................ [... truncated]
Reading frequencies.txt
Reading routes.txt
Reading shapes.txt
|=====================================| 100% 19 MB
Reading stop_times.txt
|=====================================| 100% 173 MB
|=====================================| 100% 173 MB
Reading stops.txt
Reading transfers.txt
Reading trips.txt
...done.
Testing data structure...
...passed. Valid GTFS object.
NOTE: Parsing errors and warnings while importing data can be extracted from any given dataframe with `attr(df, "problems")`.
Warning messages:
1: In unzip_gtfs_files(., quiet = quiet) :
Extraction folder already exists. Overwriting.
2: In file.remove(folder) :
cannot remove file 'C:/Users/HYSTOU/Documents/gtfs', reason 'Permission denied'
3: In dir.create(ex_dir) : 'C:\Users\HYSTOU\Documents\gtfs' already exists
4: In rbind(names(probs), probs_f) :
number of columns of result is not a multiple of vector length (arg 1)
5: Missing column names filled in: 'X7' [7], 'X9' [9]
>
> View(gtfs)
>
> # Create a temp dataframe of routes only (no buses)
> temp_df <- gtfs[["routes_df"]] %>%
+ filter(route_type == 1) %>%
+ select(route_id, route_long_name, route_color)
> # Create an sf dataframe containing static linestrings of routes
> lines_df <- temp_df %>%
+ inner_join(gtfs$trips_df, by = "route_id") %>%
+ distinct(route_id, shape_id) %>%
+ left_join(gtfs$shapes_df, by = "shape_id") %>%
+ st_as_sf(coords = c("shape_pt_lon", "shape_pt_lat"), crs = 4326) %>%
+ group_by(shape_id) %>%
+ summarize(do_union = FALSE) %>%
+ st_cast("LINESTRING") %>%
+ left_join(gtfs$trips_df, by = "shape_id") %>%
+ group_by(route_id) %>%
+ summarize() %>%
+ left_join(temp_df, by = "route_id") %>%
+ mutate(route_color = paste0("#", route_color))
> # Create an sf dataframe with a static point for each train stop
> stops_df <- temp_df %>%
+ inner_join(gtfs$trips_df, by = "route_id") %>%
+ left_join(gtfs$stop_times_df, by = "trip_id") %>%
+ left_join(gtfs$stops_df, by = "stop_id") %>%
+ distinct(route_id, stop_id, route_color, stop_lon, stop_lat, stop_sequence) %>%
+ mutate(route_color = paste0("#", route_color)) %>%
+ rename(
+ lat = stop_lat,
+ lon = stop_lon
+ )
> # Create a sample of points from each train line, these points
> # are then used as interpolation points to keep cars on the tracks
> shape_df <- temp_df %>%
+ inner_join(gtfs$trips_df, by = "route_id") %>%
+ distinct(route_id, shape_id) %>%
+ left_join(gtfs$shapes_df, by = "shape_id") %>%
+ group_by(shape_id) %>%
+ rename(
+ lat = shape_pt_lat,
+ lon = shape_pt_lon,
+ dist = shape_dist_traveled
+ ) %>%
+ select(-shape_pt_sequence)
> # Create a dataframe of all trips, then sequence them by time and group
> # gganimate will cycle through this dataframe according to time
> trips_df <- temp_df %>%
+ inner_join(gtfs$trips_df, by = "route_id") %>%
+ left_join(gtfs$stop_times_df, by = "trip_id") %>%
+ left_join(gtfs$stops_df, by = "stop_id") %>%
+ distinct(
+ route_id, route_color, shape_id, trip_id,
+ stop_lat, stop_lon, arrival_time, shape_dist_traveled) %>%
+ rename(
+ lat = stop_lat,
+ lon = stop_lon,
+ dist = shape_dist_traveled
+ ) %>%
+ mutate(time = as.POSIXct(
+ arrival_time,
+ format = "%H:%M:%S",
+ tz = "UTC")) %>%
+ na.omit()
> # Create a second temp dataframe containing all interpolated
> # points for each trip
> temp_df_2 <- trips_df %>%
+ group_by(trip_id, shape_id) %>%
+ summarize() %>%
+ left_join(shape_df, by = "shape_id")
> # Roll the interpolated points into the main trip dataframe, order by
> # distance traveled and trip, then interpolate the missing times for each
> trips_df <- trips_df %>%
+ bind_rows(temp_df_2) %>%
+ group_by(trip_id) %>%
+ arrange(trip_id, dist) %>%
+ mutate(time = as.POSIXct(
+ na.interpolation(
+ as.numeric(time),
+ option = "stine"),
+ origin = '1970-01-01', tz = 'UTC')
+ ) %>%
+ fill(arrival_time) %>%
+ select(trip_id, arrival_time, lat, lon, time, dist) %>%
+ group_by(trip_id, dist) %>%
+ filter(row_number() == 1)
Error in mutate_impl(.data, dots) :
Evaluation error: Input data needs at least 2 non-NA data point for applying na.interpolation.
> View(trips_df)
> # Roll the interpolated points into the main trip dataframe, order by
> # distance traveled and trip, then interpolate the missing times for each
> trips_df <- trips_df %>%
+ bind_rows(temp_df_2) %>%
+ group_by(trip_id) %>%
+ arrange(trip_id, dist) %>%
+ mutate(time = as.POSIXct(
+ na.interpolation(
+ as.numeric(arrival_time),
+ option = "stine"),
+ origin = '1970-01-01', tz = 'UTC')
+ ) %>%
+ fill(arrival_time) %>%
+ select(trip_id, arrival_time, lat, lon, time, dist) %>%
+ group_by(trip_id, dist) %>%
+ filter(row_number() == 1)
Error in mutate_impl(.data, dots) :
Evaluation error: Input data needs at least 2 non-NA data point for applying na.interpolation.
In addition: Warning message:
In na.interpolation(as.numeric(arrival_time), option = "stine") :
NAs introduced by coercion
> # Use gganimate to create and tween plot frames
> plot <- ggplot() +
+ geom_point(
+ data = stops_df,
+ aes(x = lon, y = lat, color = route_id),
+ size = 2.5,
+ show.legend = FALSE
+ ) +
+ geom_sf(
+ data = lines_df,
+ aes(color = route_id),
+ show.legend = FALSE
+ ) +
+ geom_point(
+ data = trips_df,
+ aes(x = lon, y = lat),
+ size = 1.5,
+ shape = 15
+ ) +
+ scale_color_manual(values = lines_df$route_color) +
+ transition_components(trip_id, time) +
+ ease_aes("sine-in-out") +
+ theme_bw() +
+ labs(
+ title = "Chicago Rail ('L') System Map",
+ subtitle = '{frame_time}') +
+ theme(
+ line = element_blank(),
+ rect = element_blank(),
+ axis.text = element_blank(),
+ axis.title = element_blank(),
+ plot.title = element_text(
+ face = "bold",
+ size = 24,
+ margin = margin(b = -70, t = 42)),
+ plot.subtitle = element_text(
+ size = 18,
+ margin = margin(b = -122, t = 80)),
+ panel.grid.major = element_line(colour = "transparent")
+ ) +
+ annotate(
+ "text",
+ x = -87.9350395,
+ y = 41.734339,
+ label = "Created by Dan Snow \ngithub.com/dfsnow",
+ hjust = 0,
+ size = 5,
+ color = "grey60")
> # Pass the plot object to gganimate, which renders each frame
> # as a single png saved to /tmp
> frames <- as.numeric(length(unique(trips_df$arrival_time))) * 6
> plot_mg <- animate(plot, frames, fps = 50, width = 1024, height = 1024)
Error in mapply(FUN = f, ..., SIMPLIFY = FALSE) :
zero-length inputs cannot be mixed with those of non-zero length
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment