Markus Skyttner mskyttner

@mskyttner
mskyttner / .Rprofile
Created February 12, 2018 12:43
ALA4R config customized to use national Atlas data services
server_config <- list(
max_occurrence_records = 500000,
server_max_url_length = 8150,
brand = "ALA4R",
notify = "Please use https://github.com/AtlasOfLivingAustralia/ALA4R/issues/ or email to [email protected]",
support_email = "[email protected]",
reasons_function = "ala_reasons",
fields_function = "ala_fields",
occurrences_function = "occurrences",
config_function = "ala_config",

Filling out PDF files using staplr

Markus Skyttner 2019-06-10

# to render this into GitHub markdown:
# rmarkdown::render("name_of_this_file.R")

library(staplr)
library(vikingr)
library(dplyr)
library(purrr)
library(stringr)
log <- read_ais_log(vikingr_example("vikingr-visby-2019-ais-2"))
log_tail <-
log$message %>%
str_replace("(.*?,){5}(.*?)", "\\2")
@mskyttner
mskyttner / Dockerfile for plumber:v0.4.6
Last active May 11, 2020 19:03
Dockerfile for rstudio/plumber on r-ver:3.6.1
FROM rocker/r-ver:3.6.1
# these are the trestletech/plumber layers, now on a versioned R base
RUN apt-get update -qq && apt-get install -y --no-install-recommends \
git-core \
libssl-dev \
libcurl4-gnutls-dev \
curl \
libsodium-dev \
@mskyttner
mskyttner / duck_copy_csv.R
Created February 14, 2021 19:46
Load .tsv file into duckdb in chunkwise fashion using R and vroom
library(dplyr)
library(duckdb)
library(vroom)
duckdb_version <- function() {
con <- duckdb::dbConnect(duckdb::duckdb())
on.exit(duckdb::dbDisconnect(con, shutdown = TRUE))
res <- DBI::dbGetQuery(con, "PRAGMA version;")
parse_semver <- function(x) {
re <- "(\\d+)\\.(\\d+)\\.(\\d+).*$"
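The preview above is cut off, so how the pattern is applied is not shown. As a rough illustration of the same idea (three numeric groups for major, minor and patch), here is a minimal shell sketch. The helper name `parse_semver` is reused only for illustration, and the version strings are made-up inputs rather than real `PRAGMA version` output.

```shell
# Hypothetical helper: print major, minor and patch separated by spaces.
# Each group uses [0-9]+ so that multi-digit parts are captured whole.
parse_semver() {
  printf '%s\n' "$1" | sed -E 's/^v?([0-9]+)\.([0-9]+)\.([0-9]+).*$/\1 \2 \3/'
}

# Made-up inputs for illustration only.
parts=$(parse_semver "v0.2.4")
parts_two_digit=$(parse_semver "v10.2.4-dev")

printf '%s\n' "$parts"
printf '%s\n' "$parts_two_digit"
```

Putting the quantifier inside each group matters here: a group written as a single digit followed by `+` would typically keep only the last digit of a multi-digit major version.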
@mskyttner
mskyttner / duckdb-load.sh
Created February 14, 2021 20:07
Bash script to bulk load a tsv file into a duckdb database file
#!/bin/bash
# usage: ./duckdb-load.sh data.tsv duckdb tablename
# for example:
# ./duckdb-load.sh ark/hcaf_species_native.tsv duckdb_database hcaf_species_native
# TODO set pragma journal_mode=off or equiv settings
# if using .import and .sep '\t', an error appears:
# Error: multi-character column separators not allowed for import
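One possible reading of the error quoted above is that `'\t'` reaches the importer as two characters (a backslash and a `t`) rather than as a single tab. That is an assumption, not something the script confirms. The sketch below only shows the shell side of the difference and does not call duckdb; the variable names and the sample header line are illustrative.

```shell
# A backslash followed by the letter t: two characters, not a tab.
escaped='\t'
# printf expands the escape, giving one real tab character.
tab=$(printf '\t')

printf 'escaped length: %s\n' "${#escaped}"
printf 'tab length: %s\n' "${#tab}"

# Count the fields of a tab-separated header line using the real tab.
header=$(printf 'id\tname\tvalue')
n_fields=$(printf '%s\n' "$header" | awk -F "$tab" '{ print NF }')
printf 'fields: %s\n' "$n_fields"
```

If the separator has to be passed on to another tool, handing over the contents of `tab` rather than the quoted escape sequence is one way to make sure a single character arrives.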
@mskyttner
mskyttner / duckstream.R
Last active November 16, 2023 00:34
Shell wrapper for duckdb to support reading from stdin
#!/usr/bin/env Rscript
# usage for example:
# cat data/mydatafile.tsv | head -n 1000 | ./duckstream.R --sql "select mycolumn from stdin;"
library(optparse)
library(readr)
suppressPackageStartupMessages(library(duckdb))
option_list <- list(
@mskyttner
mskyttner / pivot.R
Created October 26, 2022 19:44
Pivot longer and wider using R
# from SO post at https://stackoverflow.com/questions/72922418/create-rows-from-part-of-column-names/72939299
library(readr)
library(tidyr)
library(dplyr)
library(knitr)
so_blurb <-
"id|Date (05/19/2020)|Type (05/19/2020)|Date (06/03/2020)|Type (06/03/2020)|Type (10/23/2020|Date (10/23/2020)|Type (10/23/2020)
10629465|null|null|06/01/2020|E
@mskyttner
mskyttner / duckdb_buenavista_psql.sh
Created January 16, 2023 06:48
duckdb exposed by buenavista proxy and accessed with psql
# start a proxy server for a duckdb in-memory db, using the postgres wire format
docker run -it --rm -p 8080:5433 -e BUENAVISTA_HOST=0.0.0.0 -e BUENAVISTA_PORT=5433 ghcr.io/jwills/buenavista
# somewhere else (or here on the same host), start psql to issue a query against the proxy server
docker run --network host -it --rm postgres:latest psql -h $(hostname) -p 8080 -c "select 42"
@mskyttner
mskyttner / cookies_returned.qmd
Created January 11, 2024 17:54
Cookies returned?
---
title: "Quarto Dashboard Layout"
format: dashboard
---
## Row {height=50%}
### Column {.tabset}
```{r}