@infotroph
infotroph / make_package_stub.R
Last active November 2, 2017 03:16
How to create a package namespace without saving any files! ...Wait, why would you though
#' Generate a minimal fake package namespace
#'
#' Mocks up a tiny package namespace and monkey-patches it into the current R
#' session's namespace registry. This abuses some R internals and has high
#' potential to break things for the remainder of your session. Use it with
#' great caution, or maybe not at all.
#'
#' The intended use case was to provide nonfunctional skeletons of selected
#' functions from packages that are not installed, solely so that they could
#' then be replaced by test stubs. Embarrassingly soon after writing this
infotroph / gist:e54c9a4f945b616701a02be368922dea
Created September 15, 2017 04:10
~10x speedup from lazy-loading standard_vars
library(microbenchmark)
# PEcAn.utils::to_ncvar, revision 79ef207
to_ncvar_current <- function(varname,dims){
  standard_vars <- read.csv(system.file("data/standard_vars.csv", package="PEcAn.utils"), stringsAsFactors = FALSE)
  var <- standard_vars[which(standard_vars$Variable.Name == varname),]
  # check var exists
  if(nrow(var)==0){
    PEcAn.logger::logger.severe(paste("Variable",varname,"not in standard_vars"))
  }
infotroph / Makefile
Last active September 9, 2017 02:52
Guess the result!
# > ls -R ./dirs
# a b c d e
# ./dirs/a:
# file1 file2 file3
# ./dirs/b:
# fileOne fileThree fileTwo
# Overthinking a speed comparison. The task at hand is:
# "if this column contains values greater than 1, assume they're percentages and divide them by 100"
library(microbenchmark)
library(data.table)
library(dplyr)
library(ggplot2)
# We'll generate 20 columns for realistic size, but only column 10 is used in this test
newdata <- function(nrow, max_1 = TRUE){
infotroph / gist:f9b97127d7c35fc14641bbd891d77328
Created August 18, 2017 00:52
trying to understand (mis)initialization
struct A{
    T x;
    A(): x{default_T} {};
    A(T t): x{t} {};
};
// Produces two fully initialized A's,
// with a->x == default_T and a2->x == my_T.
// Good, fine.
A* a = new A{};
# Have: file with observation ids that reset every time the instrument is restarted
# Want: Session identifiers that increment each restart
# Base R approach shown below. Is there a "standard" Tidyverse approach to this?
dat = read.csv(text = "
id,y
RESET
1,1.2
2,3.5
3,2.8
infotroph / gist:9dfd096df14564ef09150fbefb367d3f
Created June 23, 2017 03:52
unnesting multi-column dataframes
# Machine A: Working as desired.
> devtools::session_info()
Session info -------------------------------------------------------------------
 setting  value
 version  R version 3.3.1 (2016-06-21)
 system   x86_64, linux-gnu
 ui       X11
 language (EN)
 collate  en_US.UTF-8
```
library("PEcAn.data.atmosphere")
library("dplyr")
ne = download.Geostreams(
outfolder="~/gstest",
sitename="UIUC Energy Farm - NE",
start_date="2016-03-01",
end_date="2016-03-31")
# Have:
# A scatterplot with points grouped by binning a continuous variable.
(ggplot(mtcars, aes(wt, mpg, color=carb>2))
+geom_point()
+geom_smooth(aes(color=carb>2), method="lm"))
# Also have:
# A scatterplot with the smoothers grouped but points colored continuously
(ggplot(mtcars, aes(wt, mpg, color=carb))
+geom_point()
infotroph / datetimes.md
Last active April 27, 2017 19:31
The R datetime behavior I want (that probably doesn't exist)

What I want

Basically I want a timestring parsing function whose output behaves like the result from

x = as.POSIXct("2001-01-01 01:00:00 -0600", tz="America/Chicago")
y = as.POSIXct("2001-01-01 07:00:00Z", tz="UTC")
z = as.POSIXct("2001-01-01 13:00:00 +0600", tz="Asia/Omsk")
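
For comparison, here is a minimal sketch of what base R can already do with the offsets in those strings. It assumes the input always carries a numeric offset such as `-0600` (a literal `Z` suffix is not handled), and the names `fmt`, `x2`, and `z2` are illustrative only. The `%z` conversion reads the offset, so both strings resolve to the same instant, but the offset itself is not kept on the result; the display zone still has to be chosen separately.

```r
# Format string with %z, which parses a signed +hhmm / -hhmm offset from UTC.
fmt <- "%Y-%m-%d %H:%M:%S %z"

# Parse two of the timestamps above; tz = "UTC" only sets the display zone.
x2 <- as.POSIXct("2001-01-01 01:00:00 -0600", format = fmt, tz = "UTC")
z2 <- as.POSIXct("2001-01-01 13:00:00 +0600", format = fmt, tz = "UTC")

# Both strings name the same instant, so the parsed values compare equal.
x2 == z2

# The parsed offset is not retained on the object; to see local wall-clock
# time again, a zone must be supplied explicitly when formatting.
format(x2, tz = "America/Chicago", usetz = TRUE)
```

This gets the instants right but not the "remember where each timestamp came from" behavior, which is the part that seems to be missing.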