devtools is a terrific package that makes creating, developing, and debugging R packages fast and easy. I'll step through a ten-minute introduction here. This is a short talk for labmates demonstrating how easy it is to create packages (with the right tools); most of this is just pointing to Hadley's terrific book on R packages.
First, I add the following line to my ~/.zshrc file:
alias rpkg="Rscript -e 'library(devtools); create(commandArgs(trailingOnly=TRUE)[1], rstudio=FALSE)'"
This allows me to do the following from the command line:
$ rpkg mstools
This creates an empty package skeleton for an R package called mstools.
Next, fill in the DESCRIPTION file; see the directions in Hadley's R Packages book. Beware of the difference between Depends and Imports -- there's a good section on this in Hadley's book. Basically:
- Depends: These packages must be installed for your package to work, and they will be attached when your package is loaded. Usually, you should use Imports rather than Depends to avoid namespace collisions. The exception is if your package really does build off another package extensively, e.g. I often have packages that Depend on GenomicRanges. This is because my package isn't just calling one function from GenomicRanges, it's building off of it.
- Imports: Packages that must be installed for your package to work. The difference between Depends and Imports is that packages in Imports are not attached (e.g. with library()) when you load your R package. If your package's functions need to use a function from an imported package, you'll need to use the syntax package::function(). If you call a function a lot, you can explicitly import this function using namespaces.
Again, this is all really simple. Let's step through an example.
I'll download an R function I wrote to parse MS output from Gist into R/:
$ curl https://gist.githubusercontent.com/vsbuffalo/6e78546735bd1006f66f/raw/7a5cc4d8e408c4882fcee9c7b6ef8e0df39e8386/parseMS.R > R/parse.R
Then, let's document this using roxygen2:
#' Parse output from MS
#'
#' \code{parseMS} parses results from an MS simulation, returning a list of results.
#'
#' @param file filename to MS simulation results.
#'
#' @return A list containing each simulation's data.
#'
#' @examples
#' res <- parseMS(system.file("extdata", "ms-01.sim", package="mstools"))
#' summary(sapply(res, function(x) x$segsites))
#'
#' @export
parseMS <- function(file) { ... }
See more about roxygen2 syntax in Hadley's book. Karl Broman's tutorial is good too.
Then, we just need to reload our package and create documentation. We do this with:
> load_all() # from root package directory
> document()
I've noticed that sometimes devtools gets angry when creating NAMESPACE.
Since this file is generated programmatically, just rm NAMESPACE and rerun
load_all() and document(); this usually takes care of it.
We can run our example with run_examples().
If you want your data as .RData files, use devtools::use_data(data1, data2, ...). If your data are large, set LazyData: true in your DESCRIPTION file.
In our case, we want to package an MS simulation for testing. We put raw data
like this in inst/extdata.
$ mkdir -p inst/extdata
$ ms 10 300 -t 10 > inst/extdata/ms-01.sim
The contents of inst/ are copied to the root package directory when the package is
installed. Since where your package is installed depends on your system, you'll
need to refer to these data using the function system.file():
> system.file("extdata", "ms-01.sim", package="mstools")
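As a rough sketch of how system.file() behaves, the following uses the stats package as a stand-in for mstools (an assumption, so that it runs without mstools installed); the file name no-such-file.sim is made up for illustration.

```r
# Sketch: how system.file() resolves paths inside an installed package.
# `stats` stands in for mstools here so the example runs anywhere.

# Every installed package has a DESCRIPTION file at its root.
found <- system.file("DESCRIPTION", package = "stats")
print(file.exists(found))

# A file that does not exist yields an empty string, not an error,
# so a typo in the path fails silently.
missing <- system.file("extdata", "no-such-file.sim", package = "stats")
print(identical(missing, ""))

# mustWork = TRUE turns that silent "" into an error, which is caught here.
result <- tryCatch(
  system.file("extdata", "no-such-file.sim", package = "stats",
              mustWork = TRUE),
  error = function(e) "error raised"
)
print(result)
```

Passing mustWork = TRUE in package code is a reasonable default when a missing data file should stop execution rather than pass an empty path downstream.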
Remember, you need to document your data too! See Hadley's book for more details. It's a good book, and thorough.
Checking and installing your package is next. Again, devtools makes this painfully simple:
> check()
> install()
If you have test code, you can test it with test(). You can also build your package using:
> build()
bash() can be used to open up a Bash prompt to interact with Git. I usually
prefer to have another terminal tab open. You can use
gh to quickly create a GitHub repository.
$ git init
$ git add DESCRIPTION NAMESPACE R/parse.R README.md inst/extdata/ms-01.sim man/parseMS.Rd
$ git status
$ echo "*seedms" > .gitignore && git add .gitignore
$ gh create -d "some tools I use in working with MS results" -h "" # APIs are beautiful, aren't they?
$ git commit -am "initial import"
[master (root-commit) 3013ca6] initial import
7 files changed, 4283 insertions(+)
create mode 100644 .gitignore
create mode 100644 DESCRIPTION
create mode 100644 NAMESPACE
create mode 100644 R/parse.R
create mode 100644 README.md
create mode 100644 inst/extdata/ms-01.sim
create mode 100644 man/parseMS.Rd
$ git push origin master
Counting objects: 13, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (8/8), done.
Writing objects: 100% (13/13), 37.30 KiB | 0 bytes/s, done.
Total 13 (delta 0), reused 0 (delta 0)
To git@github.com:vsbuffalo/mstools.git
* [new branch] master -> master
Programmers are lazy people, and we can all benefit from this. I pushed this
README.md file to Gist using gist.
$ gist README.md