@cwickham
Last active August 16, 2019 13:49
A line-by-line style of reading in the hurricane data
library(tidyverse)
library(xml2)
url <- "http://www.aoml.noaa.gov/hrd/hurdat/hurdat2-nepac.html"
# Import ------------------------------------------------------------------
hurricanes <- read_html(url) %>%
  xml_find_first(".//pre") %>%
  xml_text() %>%
  write_file("hurricanes.csv")
hurricanes_file <- file("hurricanes.csv", open = "r")
readLines(hurricanes_file, n = 1) # first line is empty
# Something to hold the data. We don't know before reading how
# long this should be, but it's less than the total number
# of lines
data_blocks <- vector("list", 2000)
block <- 1
line <- readLines(hurricanes_file, n = 1)
while (length(line) > 0) {
  # parse header line
  header <- scan(text = line,
    what = list(
      id = character(),
      name = character(),
      nrows = integer()),
    sep = ",")
  # get corresponding data lines
  data_lines <- readLines(hurricanes_file,
    n = header$nrows)
  # parse data lines
  data <- read.csv(text = data_lines, header = FALSE,
    stringsAsFactors = FALSE)
  data$hurricane <- header$id
  data$name <- header$name
  data_blocks[[block]] <- data
  # increment
  line <- readLines(hurricanes_file, n = 1)
  block <- block + 1
}
close(hurricanes_file)
hurricanes <- do.call(rbind, data_blocks)
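
# Sketch (not part of the original gist) ----------------------------------
# The same header-then-data-block loop, wrapped in a function and run on a
# small in-memory sample so it can be tried without downloading anything.
# Assumptions: `read_blocks`, `sample_text` and `sample_hurricanes` are
# hypothetical names, the sample rows are invented and only mimic the
# "header line, then nrows data lines" layout, and `strip.white = TRUE` is
# added here to trim the padding around ids and names.
read_blocks <- function(con) {
  blocks <- list()
  line <- readLines(con, n = 1)
  while (length(line) > 0) {
    # header line: id, name, number of data rows that follow
    header <- scan(text = line,
      what = list(id = character(), name = character(), nrows = integer()),
      sep = ",", strip.white = TRUE, quiet = TRUE)
    # read exactly that many data lines and parse them as CSV
    data <- read.csv(text = readLines(con, n = header$nrows),
      header = FALSE, stringsAsFactors = FALSE)
    data$hurricane <- header$id
    data$name <- header$name
    blocks[[length(blocks) + 1]] <- data
    line <- readLines(con, n = 1)
  }
  do.call(rbind, blocks)
}
# made-up sample: two blocks, with 2 and 1 data rows
sample_text <- c(
  "AA011900, FIRST, 2,",
  "19000101, 0000, TD, 10.0N, 100.0W, 25",
  "19000101, 0600, TD, 10.5N, 101.0W, 30",
  "AA021900, SECOND, 1,",
  "19000201, 1200, TS, 12.0N, 105.0W, 40")
con <- textConnection(sample_text)
sample_hurricanes <- read_blocks(con)
close(con)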