Robert M Flight rmflight

Communication and Education via Social Media

I actively maintain Twitter and GitHub accounts, as well as a blog, to interact with other scientists, both junior and senior. I share links to works that may be useful to others and reply to questions about programming in the R statistical language, often advising users I do not know who use the #rstats hashtag. Interactions on Twitter directly resulted in my (and others) suggesting improvements to the draft version of Ten Simple Rules for Taking Advantage of Git and GitHub in PLOS Comp Bio, resulting in my co-authorship of a paper with 19,000 views and 9 citations. My Twitter activities also directly led to my involvement in the rOpenSci organization. I was accepted to attend the rOpenSci un-conference in May 2017, which resulted in the generation of the testRmd R package (authored with 3 others), as well as my open review of the gitlabr R package, which is expected to improve an R package already used by many to interface with the GitLab softwa

	fit_square_root = function(x, y){
	  sr_x = sqrt(x)
	  # assumes you have an intercept, so need to add the ones to the matrix
	  X = matrix(c(rep(1, length(x)), sr_x), nrow = length(x), ncol = 2, byrow = FALSE)

	  sr_fit = stats::lm.fit(X, y)
	  names(sr_fit$coefficients) = NULL

	  sr_fit$coefficients
	}
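As a quick sanity check of the function above, the following sketch compares its output with `stats::lm()` on simulated data. The data, the seed, and the true coefficients (2 and 3) are assumptions made only for illustration.

```r
# Same function as above, repeated so the example runs on its own
fit_square_root = function(x, y){
  sr_x = sqrt(x)
  # column of ones for the intercept, then the square-root-transformed x
  X = matrix(c(rep(1, length(x)), sr_x), nrow = length(x), ncol = 2, byrow = FALSE)
  sr_fit = stats::lm.fit(X, y)
  names(sr_fit$coefficients) = NULL
  sr_fit$coefficients
}

# simulated data (illustrative assumption): y = 2 + 3 * sqrt(x) + noise
set.seed(1234)
x = runif(100, min = 1, max = 50)
y = 2 + 3 * sqrt(x) + rnorm(100, sd = 0.1)

fast_coef = fit_square_root(x, y)
lm_coef = unname(stats::coef(stats::lm(y ~ sqrt(x))))

# both approaches should agree up to numerical tolerance
all.equal(fast_coef, lm_coef)
```

Calling `lm.fit()` directly skips the formula and model-frame handling done by `lm()`, which is presumably why it is used here when the fit is repeated many times.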

	# so I have this code in two different packages now, and I'm thinking of making a single
	# package that they (and other packages) could easily depend on, that lets the developer
	# enable the use of furrr::future_map when a user has multi-processing available.
	#
	# basically the way this works right now, is after loading the package, if you want to use furrr::future_map
	# you do:
	# set_internal_map(furrr::future_map)
	# future::plan(future::multisession)  # plan(multiprocess) is deprecated in recent versions of future
	# and magically you have multiprocessing everywhere I have
	# internal_map$map_function()
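One way the mechanism described above could be implemented is with an environment that stores the mapping function currently in use. This is only a sketch under assumptions: the names `internal_map`, `set_internal_map`, and `map_function` come from the comments above, but their bodies, the base R `lapply` default, and the helper `square_all` are hypothetical, not the packages' actual code.

```r
# Hypothetical sketch: an environment holding the mapping function in use.
internal_map = new.env()

# Default mapper: sequential base R, with the same (.x, .f, ...) calling
# shape as purrr::map / furrr::future_map.
default_map = function(.x, .f, ...) lapply(.x, .f, ...)
assign("map_function", default_map, envir = internal_map)

# Swap the stored mapping function; calling with no argument restores the default.
set_internal_map = function(map_function = NULL){
  if (is.null(map_function)) {
    map_function = default_map
  }
  stopifnot(is.function(map_function))
  assign("map_function", map_function, envir = internal_map)
  invisible(map_function)
}

# Package code always calls through the stored function, so it never needs
# to know whether the work runs sequentially or in parallel.
square_all = function(values){
  internal_map$map_function(values, function(v) v^2)
}

square_all(1:3)

# A user with multiprocessing available could then do (not run here):
# set_internal_map(furrr::future_map)
# future::plan(future::multisession)
```

Because environments have reference semantics, the swap made by `set_internal_map()` is visible to every function that reads `internal_map$map_function`, without re-loading the package.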

	ex_data = data.frame(A = c("A", "C", "E", "F", "G", "H", "I"),
	B = c("B", "D", "A", "E", "I", "J", "K"),
	C = "C",
	stringsAsFactors = FALSE)

	irow = 2
	consider_cols = c("A", "B")
	all_entries = unlist(ex_data[1, consider_cols], use.names = FALSE)
	while (irow <= nrow(ex_data)) {
	message(c(irow, nrow(ex_data)))

	x = rnorm(500)
	library(microbenchmark)
	microbenchmark(
	replicate(5000, sample(x)),
	do.call(c, purrr::map(seq(1, 5000), function(.x){sample(x)}))
	)

	#Unit: milliseconds
	#expr
	#replicate(5000, sample(x))

	#!/usr/bin/Rscript
	#
	# Installation:
	#
	# Copy this file to an accessible location, and then do a chmod u+x last_modified_files
	#
	# Make sure you have docopt installed: install.packages("docopt")
	#
	# License: MIT. Copyright Robert M Flight, 2018.
	#

	---
	title: "Vignette Title"
	author: "Vignette Author"
	package: PackageName
	output:
	  BiocStyle::html_document2
	vignette: >
	  %\VignetteIndexEntry{Vignette Title}
	  %\VignetteEngine{knitr::rmarkdown}
	  %\VignetteEncoding{UTF-8}

	library(dplyr)

	# tbl_df() is deprecated in current dplyr; as_tibble() is the replacement
	data <- as_tibble(data.frame(values = rnorm(100), id = rep(c("a", "b"), 50)))
	data

	group_by(data, id) %>% summarise(mean = mean(values))

	> library(UpSetR)

	> library(org.Hs.eg.db)

	> all_genes <- keys(org.Hs.eg.db)

	> n_gene <- c(2000, 500, 1000, 900)

	> # create a list, where each entry is the vector of Gene IDs that were diff
	> # expressed in that condition
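The comment above describes building a list with one vector of gene IDs per condition. A minimal sketch of that step follows, assuming the genes are drawn at random from the full set. A made-up vector of IDs stands in for `keys(org.Hs.eg.db)` so that the example runs without Bioconductor, and the names `gene_list` and `cond1` to `cond4` are hypothetical.

```r
set.seed(1234)

# stand-in for keys(org.Hs.eg.db); these IDs are made up for illustration
all_genes <- as.character(seq_len(20000))
n_gene <- c(2000, 500, 1000, 900)

# one vector of "differentially expressed" gene IDs per condition,
# sampled without replacement from the full gene set
gene_list <- lapply(n_gene, function(n) sample(all_genes, n))
names(gene_list) <- paste0("cond", seq_along(n_gene))

lengths(gene_list)

# With UpSetR installed, the overlaps could then be plotted (not run here):
# UpSetR::upset(UpSetR::fromList(gene_list))
```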

	testit:
	  script:
	    - R CMD INSTALL .
	    - Rscript run_tests.R