Jesus M. Castagnetto jmcastagnetto
jmcastagnetto / test-readr-datatable-vroom.R
Created July 1, 2021 15:39
Testing readr::read_csv(), data.table::fread() and vroom::vroom()
# Test done to check/answer the question at https://stackoverflow.com/questions/68211842/why-is-vroom-so-slow
# Downloaded CSV file on 2021-07-01 from:
# https://www.datosabiertos.gob.pe/dataset/vacunaci%C3%B3n-contra-covid-19-ministerio-de-salud-minsa
# and then compressed it with gzip
# $ zcat vacunas_covid.csv.gz | wc -l
# 7311644
library(readr)
library(vroom)
library(data.table)
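The preview above is cut off before the timing code. A minimal sketch of how such a comparison could be set up follows. It is not the gist's actual code. The temporary file, the `readers` list, and the base R reader are assumptions added for illustration, and each package reader is used only if that package is installed.

```r
# Build a small throwaway CSV so the sketch runs without the original download.
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:1000, value = rnorm(1000)), tmp, row.names = FALSE)

# The base R reader is always available; the others are added only if installed.
readers <- list(base = function(f) utils::read.csv(f))
if (requireNamespace("readr", quietly = TRUE)) {
  readers$readr <- function(f) suppressMessages(readr::read_csv(f))
}
if (requireNamespace("data.table", quietly = TRUE)) {
  readers$data.table <- function(f) data.table::fread(f)
}
if (requireNamespace("vroom", quietly = TRUE)) {
  readers$vroom <- function(f) suppressMessages(vroom::vroom(f))
}

# Elapsed wall-clock seconds for one read per reader.
timings <- vapply(
  readers,
  function(read_fun) system.time(read_fun(tmp))[["elapsed"]],
  numeric(1)
)
print(sort(timings))
unlink(tmp)
```

Note that vroom reads lazily, so timing the call alone may understate its cost. A fairer comparison would also touch every column after reading.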
jmcastagnetto / descargar_datos.R
Created June 18, 2021 00:45
Download the 2021 election data (Peru) from ronderos.pe to CSV
library(tidyverse)
base_url <- "https://ronderos.pe/datasette/elecciones_peru_2021/presidencial.csv?_size=1000&_next={next_val}"
spec <- cols(
.default = col_integer(),
mesa = col_character(),
v2_OBSERVACION = col_character(),
v2_OBSERVACION_TXT = col_character(),
v1_OBSERVACION = col_character(),
library(tidyverse)
library(rvest)
library(V8)
url <- "https://www.greatschools.org/new-york/new-york/schools/?view=table"
xpath <- "/html/head/script[1]"
ctx <- v8()
txt <- read_html(url) %>%
jmcastagnetto / README.md
Created January 19, 2021 03:24
RStudio Global 2021 events
# based on https://gist.github.com/andrewheiss/5cb3ec07be2b1bf5dea8806dfaa755e4
# With minor tweaks and translation to Spanish
library(tidyverse)
library(showtext)
font_add_google("Fira Sans Condensed", "firasanscond")
font_add_google("Fira Sans Extra Condensed", "firasansextracond")
showtext_auto()
jmcastagnetto / test_new_km_survminer.R
Last active March 3, 2020 23:35
Trying to make something like Fig 5 from http://dx.doi.org/10.1136/bmjopen-2019-030215 using gridExtra, cowplot and survminer (ref: https://twitter.com/tmorris_mrc/status/1234946869362601984)
library(tidyverse)
library(survminer)
library(survival)
library(cowplot)
library(gridExtra)
fit <- survfit(Surv(futime, fustat) ~ rx, data = ovarian)
p1 <- ggsurvplot(
fit,
jmcastagnetto / stemming-snowballc-hunspell.R
Created February 12, 2020 21:54
Stemming with SnowballC vs hunspell
#ref: https://github.com/juliasilge/tidytext/issues/17
library(dplyr)
library(hunspell)
library(SnowballC)
w <- tibble(
palabras = c(
"celebra",
"celebré",
jmcastagnetto / 20200127-Parti_ciu_TODOS_DETALLE.csv
Last active January 28, 2020 21:30
Simple plot of electoral absenteeism/abstention rates -- 2020 Congressional Elections (Peru)
#Participación Ciudadana Elecciones Congresales Extraordinarias 2020
#Ubigeo: TODOS
#ACTUALIZADO EL 27/01/2020 A LAS 12:08 h
#,PARTICIPACIÓN,,AUSENTISMO,
"DEPARTAMENTO/CONTINENTE","TOTAL ASISTENTES","% TOTAL ASISTENTES","TOTAL AUSENTES","% TOTAL AUSENTES","ELECTORES HÁBILES"
"AFRICA","0","0.00","0","0.00","0"
"AMAZONAS","157,470","63.68","89,827","36.32","247,297"
"AMERICA","0","0.00","0","0.00","0"
"ANCASH","570,090","73.25","208,165","26.75","778,255"
"APURIMAC","200,529","69.24","89,105","30.76","289,634"
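The file mixes `#` comment lines with quoted numbers that use commas as thousands separators, so it needs some care when read. A minimal base R sketch of one way to load it and recompute the absenteeism rate follows. It is not the gist's own plotting code. The short column names and the `to_number` helper are hypothetical, and only three rows from the preview are embedded.

```r
# A few rows copied from the preview; "\u00c1" is the accented A in the last header.
csv_text <- '#Ubigeo: TODOS
#ACTUALIZADO EL 27/01/2020 A LAS 12:08 h
"DEPARTAMENTO/CONTINENTE","TOTAL ASISTENTES","% TOTAL ASISTENTES","TOTAL AUSENTES","% TOTAL AUSENTES","ELECTORES H\u00c1BILES"
"AMAZONAS","157,470","63.68","89,827","36.32","247,297"
"ANCASH","570,090","73.25","208,165","26.75","778,255"
"APURIMAC","200,529","69.24","89,105","30.76","289,634"
'

# Lines starting with "#" are skipped; everything is read as text first
# because the counts contain thousands separators.
turnout <- read.csv(
  text = csv_text,
  comment.char = "#",
  header = TRUE,
  col.names = c("region", "voters", "pct_voters",
                "absent", "pct_absent", "eligible"),  # hypothetical short names
  colClasses = "character"
)

# Hypothetical helper: drop thousands separators, then convert to numeric.
to_number <- function(x) as.numeric(gsub(",", "", x, fixed = TRUE))
numeric_cols <- setdiff(names(turnout), "region")
turnout[numeric_cols] <- lapply(turnout[numeric_cols], to_number)

# Recompute the absenteeism percentage from the raw counts.
turnout$computed_pct_absent <- round(100 * turnout$absent / turnout$eligible, 2)
print(turnout[, c("region", "pct_absent", "computed_pct_absent")])
```

For these three rows the recomputed percentages agree with the reported `% TOTAL AUSENTES` column, which is a quick check that the separators were stripped correctly.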
jmcastagnetto / peru-uit-cambio.R
Created January 15, 2020 16:34
Change in the value of one UIT in Peru over time
library(tidyverse)
library(rvest)
url <- "http://www.sunat.gob.pe/indicestasas/uit.html"
uit_table <- read_html(url) %>%
html_node(xpath = "/html/body/div[2]/div/div[1]/div[4]/div[1]/div[1]/center/table") %>%
html_table(header = TRUE) %>%
janitor::clean_names()
uit_df <- uit_table %>%