caching.md

Suppose you have a long-running calculation:

f <- function(x) {
  message("Evaluating slow function")
  Sys.sleep(5) # sleep 5 seconds to simulate long running time
  x
}

Which is used like so:

f(10)

However, you only want to rerun f() sometimes (say, when an upstream data source changes). I usually do something like this:

run.cached <- function(expr, filename, regenerate=FALSE) {
  if ( file.exists(filename) && !regenerate ) {
    res <- readRDS(filename)
  } else {
    res <- eval.parent(substitute(expr))
    saveRDS(res, file=filename)
  }
  res
}

This is a simple caching function: it tries to load the .rds file indicated by filename if it exists; otherwise it runs the expression in expr and saves the output to filename. If you specify regenerate=TRUE, it always reruns the expression and overwrites the cached file.

So you can do this:

run.cached(f(5), 'mycache.rds') # runs the slow function
run.cached(f(5), 'mycache.rds') # won't run, returns cached result
run.cached(f(10), 'mycache.rds', TRUE) # runs the slow function
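
One thing worth keeping in mind with calls like these: the cache is keyed on the filename alone, not on the expression. The sketch below illustrates this under a few assumptions: `g()` is a hypothetical fast stand-in for `f()` (so the example finishes immediately), the cache lives in a temporary file, and `run.cached()` is repeated from above only so the block runs on its own.

```r
# run.cached() as defined above, repeated so this block is self-contained
run.cached <- function(expr, filename, regenerate=FALSE) {
  if ( file.exists(filename) && !regenerate ) {
    res <- readRDS(filename)
  } else {
    res <- eval.parent(substitute(expr))
    saveRDS(res, file=filename)
  }
  res
}

# Hypothetical fast stand-in for f(); not part of the original code
g <- function(x) {
  message("Evaluating g")
  x
}

cache.file <- tempfile(fileext=".rds")  # temporary cache file for the demo

first <- run.cached(g(5), cache.file)         # no cache yet: evaluates g(5), writes the file
stale <- run.cached(g(10), cache.file)        # cache exists: g(10) is never evaluated
fresh <- run.cached(g(10), cache.file, TRUE)  # regenerate=TRUE forces evaluation
```

Here `stale` holds the result cached from `g(5)`, because the second call finds the file and never looks at the new expression. Using a different filename per expression, or passing `regenerate=TRUE`, avoids picking up a stale result.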

When I want to make sure everything works correctly for the final published version, I delete the .rds files, which forces everything to be recalculated.
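
Deleting the cache files can itself be scripted. This is only a sketch, assuming all caches sit in a single directory; `cache.dir` and the two demo files are hypothetical and are created in a temporary location so the example is safe to run.

```r
# Assumption: every cache file lives in one directory (a temp dir for this demo)
cache.dir <- file.path(tempdir(), "cache-demo")
dir.create(cache.dir, showWarnings=FALSE)

# Create two throwaway cache files to stand in for real results
saveRDS(1:3, file.path(cache.dir, "a.rds"))
saveRDS(letters, file.path(cache.dir, "b.rds"))

# Find every .rds file in the cache directory and delete it
cache.files <- list.files(cache.dir, pattern="\\.rds$", full.names=TRUE)
unlink(cache.files)

# Nothing should be left, so the next run.cached() calls recompute everything
remaining <- list.files(cache.dir, pattern="\\.rds$")
```

Pointing `cache.dir` at a real project directory would remove every .rds file there, including any that are not caches, so it is safer to keep cached results in a directory of their own.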

There are a variety of packages on CRAN that do this already, apparently: R.cache, SOAR, and (for Sweave) cacheSweave. These may be more robust!

richfitz/caching.md

