Making your R code easier to reuse

Konrad Rudolph klmr

@klmr
klmr / gist:6152098
Last active December 20, 2015 15:09
When not to use point-free style: Composed functions don’t always auto-vectorise well. As a consequence, we’re left with monsters such as this one. It might be worth investigating whether the function composition tools could be changed (transparently) to make auto-vectorisation more universal. In particular, the problem lies with `fapply`, which…
capitalize <-
    p(fapply, toupper %.% p(substring, 1, 1), p(substring, 2)) %|>%
    p(lapply, p(paste, collapse = '')) %|>% unlist
# versus
capitalize <- function (str) paste(toupper(substring(str, 1, 1)),
                                   substring(str, 2), sep = '')
# Usage (in both cases):
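The explicit version vectorises without extra work because `substring`, `toupper` and `paste` are each vectorised in base R. A minimal sketch of that behaviour follows; the input words are made up for illustration and are not taken from the gist.

```r
# Explicit (non-point-free) definition, repeated here so the block is self-contained.
capitalize <- function (str)
    paste(toupper(substring(str, 1, 1)), substring(str, 2), sep = '')

# Every base function involved is vectorised, so a character vector just works.
capitalize(c('hello', 'world'))
# [1] "Hello" "World"
```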
@klmr
klmr / resize.sh
Last active December 24, 2015 10:48
Resize input files – but only if they exceed 800px width
#!/usr/bin/env bash
maxwidth=800
resize () {
    convert "$1" -resize $maxwidth "$2"
}
for image in large/*; do
    width=$(identify -format '%w' "$image")
@klmr
klmr / download-transcripts.R
Last active December 25, 2015 02:29
Downloading and storing on disk a Fasta record from an archived Ensembl BioMart
xsource(rcane.functional) # for `%|%`
# !!! IMPORTANT !!!
# The Ensembl Biomart archive server is extremely slow. Therefore this code is
# FOR EXPOSITION ONLY. Use the transcripts file shipped with the data for this
# code instead.
downloadTranscripts <- function (target) {
    require(biomaRt)
    ensMart <- useMart('ENSEMBL_MART_ENSEMBL',
                       'mmusculus_gene_ensembl',
@klmr
klmr / pvalue.R
Created December 29, 2013 16:14
Pretty-print p-values for plots in R
#' Print relevant digits of a p-value.
#'
#' The idea is, under the assumption that H0 holds, to print as many digits of
#' p as are strictly necessary to demonstrate H0. Turning this around, if we
#' need to print many digits this probably means that we can reject H0.
#'
#' @param pvalue a single p-value
#' @param max_precision limit the maximum number of leading zeros to display
#' @param p typographic representation of the variable “p”
pretty_p <- function (pvalue, max_precision = 5, p = bquote(italic(p))) {
@klmr
klmr / vim-notes.md
Created January 22, 2014 11:46
Notes for the seminar “Using Vim” for the Predoc Lunch Seminar series at EMBL-EBI.

Vim Notes

Setup

The Vim on the server is horribly outdated. Many of the things below won’t work, or will require extensive setup. In order to ease the pain, I suggest using the version installed and maintained by Micha via the [EBI-predoc config][config].

@klmr
klmr / bm.R
Created January 30, 2014 20:44
Benchmarking performance of row selection via `-which(…)` vs. `!…`.
> library(microbenchmark)
> m <- data.frame(a = sample.int(10, 100000, replace=T),
+                 b = sample.int(10, 100000, replace=T))
> microbenchmark(m[-which(m[, 1] == 4 & m[, 2] == 5), ], m[! (m[, 1] == 4 & m[, 2] == 5), ])
Unit: milliseconds
                                   expr      min       lq   median       uq      max neval
 m[-which(m[, 1] == 4 & m[, 2] == 5), ] 27.59480 29.64628 30.09091 30.69976 42.04620   100
      m[!(m[, 1] == 4 & m[, 2] == 5), ] 29.00876 29.70103 30.41829 33.10700 42.29671   100
> m <- as.matrix(m)
> microbenchmark(m[-which(m[, 1] == 4 & m[, 2] == 5), ], m[! (m[, 1] == 4 & m[, 2] == 5), ])
@klmr
klmr / a.rb
Created January 31, 2014 14:35
$ ninja examples
[4/7] AR
FAILED: ar rcs bin/libnonius.a
ar: no archive members specified
usage: ar -d [-TLsv] archive file ...
ar -m [-TLsv] archive file ...
ar -m [-abiTLsv] position archive file ...
ar -p [-TLsv] archive [file ...]
ar -q [-cTLsv] archive file ...
ar -r [-cuTLsv] archive file ...
@klmr
klmr / a.rb
Created January 31, 2014 15:04
$ ninja examples
[5/6] LINK obj/examples/example1.o
FAILED: g++ -Wall -Wextra -pedantic -Werror -std=c++11 -g -flto obj/examples/example1.o -o bin/examples/example1 -lboost_system
Undefined symbols for architecture x86_64:
"__istype(int, unsigned long)", referenced from:
std::ctype<char>::is(unsigned long, char) const in example1.o
ld: symbol(s) not found for architecture x86_64
collect2: error: ld returned 1 exit status
[5/6] LINK obj/examples/example2.o
FAILED: g++ -Wall -Wextra -pedantic -Werror -std=c++11 -g -flto obj/examples/example2.o -o bin/examples/example2 -lboost_system
@klmr
klmr / brew-config
Created January 31, 2014 16:15
Files related to gcc48 issue on 10.9
HOMEBREW_VERSION: 0.9.5
ORIGIN: https://github.com/Homebrew/homebrew.git
HEAD: 8d8763699f3b7d35b27e571e856984f49392b741
HOMEBREW_PREFIX: /usr/local
HOMEBREW_CELLAR: /usr/local/Cellar
CPU: quad-core 64-bit sandybridge
OS X: 10.9.1-x86_64
Xcode: 5.0.2
CLT: 5.0.1.0.1.1382131676
Clang: 5.0 build 500
@klmr
klmr / Review.markdown
Last active August 29, 2015 13:56 — forked from mschubert/unix.R

Bugfixes

  • cmdescape fixes the use of parameters that contain spaces or characters that are special to the terminal.
  • Regression: the updated version does not work with custom-specified arguments. If such arguments are desired, there should be an optional list parameter for this purpose.

Style fixes & Improvements

  • Single expressions do not need (and should not have) parentheses. This goes universally (and in particular, since R is a functional programming language, for functions).
  • There’s no need to forward %sed% to sed (the same goes for %grep%); it can be declared as an alias. In fact, the original function is somewhat redundant, since operators can be called as functions.
  • Use `…` instead of '…' to specify unusual names: the former is syntactically a name while the latter is a string! It only “happens to work” because R has special lookup rules for functions when encountering a string. The same won’t work for other (non-function) objects though.
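
The last two points can be sketched as follows. The `sed` function here is a hypothetical stand-in written for this illustration, not the code from the reviewed unix.R, and `my value` is an arbitrary example name.

```r
# Hypothetical stand-in for the reviewed function; not the code from unix.R.
sed <- function (text, expr) sub(expr[1], expr[2], text)

# An alias is enough; no wrapper that forwards its arguments is needed.
`%sed%` <- sed

a <- 'hello world' %sed% c('world', 'R')
# Operators are ordinary functions and can be called as such.
b <- `%sed%`('hello world', c('world', 'R'))
identical(a, b)  # TRUE

# Backticks produce a name, quotes produce a string.
`my value` <- 42
`my value`   # 42: looks up the object bound to that name
'my value'   # "my value": just a character string
```

Because the operator is only a second name for the same function object, any later change to `sed` has to be followed by re-creating the alias.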