devtools is a terrific package that makes creating, developing, and debugging R packages fast and easy. I'll step through a ten-minute introduction here. This is a short talk for labmates demonstrating how easy it is to create packages (with the right tools); most of this is just pointing to Hadley's terrific book on R packages.
First, I add the following line to my ~/.zshrc file:
alias rpkg="Rscript -e 'library(devtools); create(commandArgs(trailingOnly=TRUE)[1], rstudio=FALSE)'"
This allows me to do the following from the command line:
$ rpkg mstools
This creates an empty package skeleton for an R package called mstools.
See the directions in Hadley's R Packages book. Beware of the difference between Depends and Imports; there's a good section in Hadley's book. Basically:
- Depends: These packages must be installed for your package to work, and they will be attached when your package is loaded. Usually, you should use Imports rather than Depends to avoid namespace collisions. The exception is if your package really does build off another package extensively, e.g. I often have packages that Depend on GenomicRanges. This is because my package isn't just calling one function from GenomicRanges, it's building off of it.
- Imports: Packages that must be installed for your package to work. The difference between Depends and Imports is that packages in Imports are not attached (e.g. with library()) when you load your R package. If your package's functions need to use a function from an imported package, you'll need to use the syntax package::function(). If you call a function a lot, you can explicitly import this function using namespaces.
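To make the attached-versus-imported distinction concrete, here is a minimal sketch in plain R. It uses the tools package (which ships with R but is not attached in a default session) as a stand-in for something you would list under Imports; the helper is_attached() is a hypothetical name made up for this illustration.

```r
# "tools" ships with base R but is not attached by default, so it stands
# in here for a package you would list under Imports.
# is_attached() is a hypothetical helper, not part of devtools.
is_attached <- function(pkg) paste0("package:", pkg) %in% search()

# Calling through :: loads the namespace without attaching the package.
ext <- tools::file_ext("ms-01.sim")

ext                         # the file extension, "sim"
isNamespaceLoaded("tools")  # TRUE: the namespace is loaded...
is_attached("tools")        # ...but not attached, unless you ran library(tools)
```

If you call a function often, a roxygen2 line such as #' @importFrom tools file_ext above your function lets you write file_ext() without the tools:: prefix.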
Again, this is all really simple. Let's step through an example.
I'll download an R function I wrote to parse MS output from Gist into R/:
$ curl https://gist.githubusercontent.com/vsbuffalo/6e78546735bd1006f66f/raw/7a5cc4d8e408c4882fcee9c7b6ef8e0df39e8386/parseMS.R > R/parse.R
Then, let's document this using roxygen2:
#' Parse output from MS
#'
#' \code{parseMS} parses results from an MS simulation, returning a list of results.
#'
#' @param file filename to MS simulation results.
#'
#' @return A list containing each simulation's data.
#'
#' @examples
#' res <- parseMS(system.file("extdata", "ms-01.sim", package="mstools"))
#' summary(sapply(res, function(x) x$segsites))
#'
#' @export
parseMS <- function(file) { ... }
See more about roxygen2 syntax from Hadley's book. Karl Broman's tutorial is good too.
Then, we just need to reload our package and create documentation. We do this with:
> load_all() # from root package directory
> document()
I've noticed that sometimes devtools gets angry when creating NAMESPACE. Since this file is generated programmatically, just rm NAMESPACE and rerun load_all() and document(); this usually takes care of it.
We can run our example with run_examples().
If you want your data as .RData files, use devtools's function devtools::use_data(data1, data2, ...). If your data are large, set LazyData: true in your DESCRIPTION file.
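As a rough sketch of what ends up on disk, the base R below saves an object to an .rda file and loads it back. It assumes use_data() writes one such file per object under data/; rather than calling devtools, it mimics that layout in a temporary directory, and the data frame is made up purely for illustration.

```r
# Made-up toy data, for illustration only
data1 <- data.frame(segsites = c(12L, 9L, 15L))

# Mimic a package's data/ directory inside a temporary directory
data_dir <- file.path(tempdir(), "data")
dir.create(data_dir, showWarnings = FALSE)
rda <- file.path(data_dir, "data1.rda")
save(data1, file = rda)

# load() restores the object under its original name; a fresh
# environment keeps it from overwriting anything in the workspace
restored <- new.env()
load(rda, envir = restored)
identical(restored$data1, data1)
```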
In our case, we want to package an MS simulation for testing. We put raw data like this in inst/extdata.
$ mkdir -p inst/extdata
$ ms 10 300 -t 10 > inst/extdata/ms-01.sim
Files in inst/ are copied to the root of the package directory when the package is installed. Since where your package is installed depends on your system, you'll need to refer to these data using the function system.file():
> system.file("extdata", "ms-01.sim", package="mstools")
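One thing worth knowing about system.file() is what it does when the file isn't there. The sketch below uses the stats package, which is installed with every R, as a stand-in for mstools; the file name no-such-file.sim is deliberately made up.

```r
# "stats" is installed with every R, so it stands in for mstools here.
path <- system.file("DESCRIPTION", package = "stats")
file.exists(path)

# A file that does not exist gives "" rather than an error...
absent <- system.file("extdata", "no-such-file.sim", package = "stats")
absent == ""

# ...unless you ask for an error with mustWork = TRUE.
result <- tryCatch(
  system.file("extdata", "no-such-file.sim", package = "stats",
              mustWork = TRUE),
  error = function(e) "error"
)
result
```

Passing mustWork = TRUE in examples and tests makes a misspelled path fail loudly instead of handing an empty string to your parser.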
Remember, you need to document your data too! See Hadley's book for more details. It's a good book, and thorough.
Again, devtools makes checking and installing your package painfully simple:
> check()
> install()
If you have test code, you can test it with test(). You can also build your package using:
> build()
bash() can be used to open up a Bash prompt to interact with Git. I usually prefer to have another terminal tab open. You can use gh to quickly create a GitHub repository.
$ git init
$ git add DESCRIPTION NAMESPACE R/parse.R README.md inst/extdata/ms-01.sim man/parseMS.Rd
$ git status
$ echo "*seedms" > .gitignore && git add .gitignore
$ gh create -d "some tools I use in working with MS results" -h "" # APIs are beautiful, aren't they?
$ git commit -am "initial import"
[master (root-commit) 3013ca6] initial import
7 files changed, 4283 insertions(+)
create mode 100644 .gitignore
create mode 100644 DESCRIPTION
create mode 100644 NAMESPACE
create mode 100644 R/parse.R
create mode 100644 README.md
create mode 100644 inst/extdata/ms-01.sim
create mode 100644 man/parseMS.Rd
$ git push origin master
Counting objects: 13, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (8/8), done.
Writing objects: 100% (13/13), 37.30 KiB | 0 bytes/s, done.
Total 13 (delta 0), reused 0 (delta 0)
To git@github.com:vsbuffalo/mstools.git
* [new branch] master -> master
Programmers are lazy people, and we can all benefit from this. I pushed this README.md file to Gist using gist.
$ gist README.md