Jonathan Carroll jonocarroll

👨‍💻

Learning all the languages

Recovering theoretical physicist / ongoing coffee addict / continually improving data scientist. I'm interested in open-source data projects, mainly in R.

jonocarroll / mutate_with_attrs.R

Created February 21, 2017 23:52

hack version of dplyr::mutate which preserves custom attributes

	mutate_with_attrs <- function(df, ...) {

	  olddf <- df
	  newdf <- dplyr::mutate(df, ...)

	  # attributes present on the input but missing from the mutate() result
	  lostnames <- setdiff(names(attributes(olddf)), names(attributes(newdf)))

	  # copy each missing attribute back onto the result
	  for (nm in lostnames) attr(newdf, nm) <- attr(olddf, nm)

	  return(newdf)
	}

jonocarroll / purrr_force.R

Created February 23, 2017 22:37

Use the force(x), Vince

	library(purrr)

	foo <- function(x) {
	  function(y) {
	    y + x
	  }
	}

	args <- list(1, 2)
	foos_map <- map(args, foo)
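
The title points at `force()`, which `foo` in the preview above does not call. A minimal base-R sketch of why that can matter, assuming closures are built in a `for` loop (the names `make_adder_lazy` and `make_adder_forced` are hypothetical and not from the gist; the sketch makes no claim about how `purrr::map` behaves):

```r
# Hypothetical helper names; this illustrates lazy evaluation in base R only.
make_adder_lazy <- function(x) {
  function(y) y + x   # x is still an unevaluated promise when this returns
}

make_adder_forced <- function(x) {
  force(x)            # evaluate x now, while the loop variable has this value
  function(y) y + x
}

lazy_adders <- list()
forced_adders <- list()
for (i in 1:2) {
  lazy_adders[[i]] <- make_adder_lazy(i)
  forced_adders[[i]] <- make_adder_forced(i)
}

# The lazy closures look up i only when first called, after the loop has ended.
lazy_results <- c(lazy_adders[[1]](0), lazy_adders[[2]](0))
forced_results <- c(forced_adders[[1]](0), forced_adders[[2]](0))

print(lazy_results)    # 2 2
print(forced_results)  # 1 2
```

Without `force(x)`, both lazy closures see the final value of `i`, because the promise for `x` is only evaluated on first use.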

jonocarroll / warnings.Rprofile

Created March 9, 2017 12:02

get outta here, warnings

	if ("stringi" %in% utils::installed.packages()) {
	  flip <- stringi::stri_unescape_unicode('(\\u256f\\u00b0\\u25a1\\u00b0\\uff09\\u256f\\ufe35 \\u253b\\u2501\\u253b')
	  warnings <- function(...) base::warnings(flip)
	}

jonocarroll / emoticons.Rmd

Last active March 10, 2017 01:48

Emoticons in rmarkdown (which interprets direct HTML)

	---
	title: "☢ FYI, rmarkdown allows direct HTML ☠"
	author: "⚛"
	date: "10 March 2017"
	output: html_document
	---

	```{r setup, include=FALSE}
	knitr::opts_chunk$set(echo = TRUE)
	```

jonocarroll / mean_var.R

Created March 14, 2017 02:45

Historical curiosity: why does var() allow as.character(<numeric>) input?

	# using a data.frame with a character column (not uncommon)
	d <- data.frame(name = c("alpha", "beta", "gamma"),
	                x = runif(3),
	                y = runif(3))
	d
	#>    name         x         y
	#> 1 alpha 0.8577527 0.4611686
	#> 2  beta 0.2989644 0.9660795
	#> 3 gamma 0.7221540 0.1730588

jonocarroll / assertr_regex.R

Created April 3, 2017 12:01

Regex on assertr output

	library(dplyr)
	library(assertr)

	# if we have the command
	data.frame(x = 1:2) %>% assert(in_set(1), x)
	#> Column 'x' violates assertion 'in_set(1)' 1 time
	#>   index value
	#> 1     2     2
	#> Error: assertr stopped execution

jonocarroll / SA4 Population by Age Group - March 2017.csv

Last active May 23, 2017 01:56

geofacet AUS grid example

Region Name	Age Group	Population	Population Distribution (%)
New South Wales	15 to 24	1,005,900	16.0
New South Wales	25 to 34	1,121,900	17.8
New South Wales	35 to 44	1,028,300	16.3
New South Wales	45 to 54	993,100	15.8
New South Wales	55 to 64	910,900	14.5
New South Wales	65 and over	1,239,900	19.7
Victoria	15 to 24	804,500	16.2
Victoria	25 to 34	939,500	18.9
Victoria	35 to 44	827,600	16.6

jonocarroll / read.tscv.R

Created November 20, 2017 06:10

Read a transposed (variables in rows) CSV file into R correctly

	## Based on
	## https://stackoverflow.com/a/17289991/4168169
	read.tcsv = function(file, header=TRUE, sep=",", ...) {

	  n = max(count.fields(file, sep=sep), na.rm=TRUE)
	  x = readLines(file)

	  .splitvar = function(x, sep, n) {
	    var = unlist(strsplit(x, split=sep))
	    length(var) = n
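
The preview above is cut off mid-function. As a separate, self-contained illustration of the general idea (not the gist's actual implementation), here is a rough sketch that assumes a simple delimited file with no quoted fields containing the separator. The function name `read_transposed_csv` is hypothetical:

```r
# Hypothetical sketch: read a CSV whose variables are stored in rows.
# Assumes plain fields (no quoted separators); short rows are padded with NA.
read_transposed_csv <- function(file, sep = ",") {
  lines <- readLines(file)
  fields <- strsplit(lines, split = sep, fixed = TRUE)

  # pad every row to the same number of fields
  n <- max(lengths(fields))
  padded <- lapply(fields, function(f) {
    length(f) <- n
    f
  })

  # each original row becomes a column, so rows of `mat` are now records
  mat <- do.call(cbind, padded)

  # rebuild ordinary CSV text and let read.csv() handle type conversion
  text <- apply(mat, 1, paste, collapse = sep)
  read.csv(text = text, sep = sep, header = TRUE)
}

# Usage with a small temporary file (made-up values)
tmp <- tempfile(fileext = ".csv")
writeLines(c("name,alpha,beta", "x,1,2", "y,3,4"), tmp)
d <- read_transposed_csv(tmp)
print(d)
```

Padding with `NA` means missing trailing fields are written as the text `NA`, which `read.csv()` reads back as missing values by default.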

jonocarroll / keep_levels.R

Created March 7, 2018 06:03

Functions keep_levels and discard_levels to filter with validation

	#' Keep only certain groups/levels of a column
	#'
	#' @param .d a data.frame or tibble
	#' @param .g column containing groups/levels to be kept or discarded
	#' @param .l groups/levels to be kept or discarded as character vector
	#'
	#' @return a data.frame or tibble with the same or fewer rows after filtering
	#' @export
	#' @importFrom rlang enquo
	#'

jonocarroll / boyermoor.py

Last active November 26, 2021 22:41

Boyer-Moore Implementations for @coolbutuseless' comparisons

	def alphabet_index(c):
	    """
	    Returns the index of the given character in the English alphabet, counting from 0.
	    """
	    return ord(c.lower()) - 97  # 'a' is ASCII character 97

	def match_length(S, idx1, idx2):
	    """
	    Returns the length of the match of the substrings of S beginning at idx1 and idx2.
	    """
