As you all know, R users tend to install packages from CRAN using install.packages(). Making Distributed R (DR) available there would greatly help adoption. There are three ways we could package it:
- Source package
- Binary package
- Hybrid
Source package: we build everything from source. This includes the third-party libraries and our own code.
Binary package: we ship prebuilt binaries. We can generate them on an old distro (e.g. CentOS 5) for better compatibility. If we compile on a newer distro, the binaries will depend on newer glibc symbol versions that are not available on older distros.
Hybrid: we ship binaries of the third-party libraries but compile our own platform code.
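For the binary option, one way to check how new a glibc a compiled library requires is to look at the versioned symbols it imports. The following is a minimal sketch, assuming GNU binutils (objdump) and a grep that supports -o are available. The helper name max_glibc_version is made up for illustration and is not part of the project.

```shell
# Read `objdump -T` output on stdin and print the highest GLIBC_x.y[.z]
# symbol version it references. Hypothetical helper for illustration only.
max_glibc_version() {
    # Pull out tokens such as GLIBC_2.14, drop the prefix, then sort the
    # dotted version numbers field by field and keep the largest one.
    grep -o 'GLIBC_[0-9][0-9.]*' \
        | sed 's/^GLIBC_//' \
        | sort -t. -k1,1n -k2,2n -k3,3n \
        | tail -n 1
}
```

Typical use would be `objdump -T some_library.so | max_glibc_version`, where some_library.so stands for whichever shared library we build. If the printed version is newer than the glibc on the oldest distro we want to support, the binary will not load there.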
The package can be installed directly from GitHub from the R console like this:
library(devtools)
devtools::install_github('vertica/distributedR')
Installing from GitHub does not need approval from R-core.
To be published on CRAN, the package needs to be approved, as required by CRAN policy. It seems CRAN only accepts binaries for Windows and Mac, not Linux, and our package is Linux-only.
- Move all sources (or binaries/shared libraries) under platform/master.
- Change Makevars to reference only internal directories (under platform/master).
- Fix R warnings (tons). These are reported by running R CMD check --as-cran.
- Fix C++ warnings (many). Already done.
- Get rid of /opt/hp/distributed and replace all references to /opt/hp/distributed in the code with <r_packages_dir>/distributedR/.
- Fix errors in vignettes. For example, dL <- dlist(partitions = 3) fails when sourcing ‘Tutorial.R’ with: Error: unused argument (partitions = 3)
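To track progress on removing the hard-coded install prefix, it may help to list the files that still mention it. This is a minimal sketch using only standard grep and sort. The function name find_hardcoded_prefix is hypothetical, and the default prefix is simply the path named in the list above.

```shell
# List every file under a source tree that still mentions the hard-coded
# install prefix. Hypothetical helper for illustration only.
#   $1: root of the source tree to scan
#   $2: prefix to look for (defaults to /opt/hp/distributed)
find_hardcoded_prefix() {
    src_root=$1
    prefix=${2:-/opt/hp/distributed}
    # -r: recurse, -l: print file names only, -F: match the prefix literally
    grep -rlF -- "$prefix" "$src_root" | sort
}
```

Running `find_hardcoded_prefix platform/master` from the repository root should print nothing once the cleanup is complete. On the R side, system.file(package = "distributedR") returns the installed package directory, which is one way to obtain <r_packages_dir>/distributedR/ at run time.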
- Fixing the build system and C++: I can do this myself; it will take 1 or 2 days.
- Fixing R warnings and vignettes: Any volunteers?
- Getting the package into CRAN: We will probably need a number of iterations and some convincing, so it will likely take a few weeks.