@HenrikBengtsson
Last active November 17, 2017 16:45
Local source-only mirrors of CRAN and Bioconductor R package repositories

R Package Repositories - Local Mirrors

by Henrik Bengtsson (2017-11-09 -- 2017-11-17)

Usage

make ## same as 'make sync-all'
make debug
make summary
make sync-all
make sync-cran
make sync-bioconductor

CRAN

Calling

$ make sync-cran

will create/update a local CRAN package repository under cran/ (in the current directory) with the following structure:

$ tree cran/
cran/
└── src
    └── contrib
        ├── PACKAGES
        ├── PACKAGES.gz
        ├── PACKAGES.in
        ├── PACKAGES.rds
        ├── A3_1.0.0.tar.gz
        ├── abbyyR_0.5.1.tar.gz
        :
        └── zyp_0.10-1.tar.gz
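
One way to sanity-check a synced repository is to compare the PACKAGES index against the tarballs on disk. The sketch below is not part of the Makefile. The function names index_tarballs and missing_tarballs are made up for illustration, and it assumes the usual DCF layout, with a "Package:" line followed by a "Version:" line in each record and all tarballs sitting directly next to the index.

```shell
#!/bin/sh
# List "<Package>_<Version>.tar.gz" for every record in <dir>/PACKAGES.
# Assumes each record has a "Package:" line followed by a "Version:" line.
index_tarballs() {
    awk '
        /^Package:/ { pkg = $2 }
        /^Version:/ { print pkg "_" $2 ".tar.gz" }
    ' "$1/PACKAGES"
}

# Print every tarball that the index names but the directory lacks.
missing_tarballs() {
    index_tarballs "$1" | while read -r tarball; do
        [ -f "$1/$tarball" ] || echo "$tarball"
    done
}

# Example (after 'make sync-cran'):
#   missing_tarballs cran/src/contrib
```

An empty result means every indexed package has a matching source tarball in the directory.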

Bioconductor

Calling

$ make sync-bioconductor

will create/update three local Bioconductor package repositories under bioconductor/$BIOC_VERSION/ (in the current directory), where $BIOC_VERSION is the Bioconductor version (as automatically queried from Bioconductor):

  • BioCsoft: Bioconductor Software packages (bioconductor/$BIOC_VERSION/bioc)
  • BioCann: Bioconductor Annotation Data packages (bioconductor/$BIOC_VERSION/data/annotation)
  • BioCexp: Bioconductor Experimental Data packages (bioconductor/$BIOC_VERSION/data/experiment)

Example of tree structure:

$ tree bioconductor/$BIOC_VERSION/bioc/
bioconductor/$BIOC_VERSION/bioc/
└── src
    └── contrib
        ├── PACKAGES
        ├── PACKAGES.gz
        ├── PACKAGES.in
        ├── PACKAGES.rds
        ├── a4_1.26.0.tar.gz
        ├── a4Base_1.26.0.tar.gz
        :
        └── zlibbioc_1.24.0.tar.gz
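
The version lookup can be tried on its own, outside of make. The sketch below wraps the same grep/sed pipeline that the Makefile uses for BIOC_VERSION in a function that reads config.yaml from standard input. The function name bioc_release_version and the two-line sample input are illustrative assumptions, not taken from Bioconductor.

```shell
#!/bin/sh
# Extract the release version from a Bioconductor config.yaml read on stdin.
# Assumes a line of the form: release_version: "<digits and dots>"
bioc_release_version() {
    grep -F "release_version:" |
        sed -E 's/.*release_version:[ ]*"([0-9.]+)".*/\1/'
}

# Offline try-out with a made-up excerpt:
printf 'release_version: "3.6"\ndevel_version: "3.7"\n' | bioc_release_version

# Live query (needs network access):
#   curl --silent https://www.bioconductor.org/config.yaml | bioc_release_version
```

Reading from standard input keeps the parsing step separate from the download, so a change in the YAML layout can be checked without touching the network.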

Statistics

As of 2017-11-09, the above package repositories contain:

  • CRAN: 11,795 packages, 5.8 GiB disk space
  • BioCsoft: 1,476 packages, 4.2 GiB disk space
  • BioCann: 910 packages, 61 GiB disk space
  • BioCexp: 324 packages, 31 GiB disk space
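
These figures can be re-derived for any one repository with find and du, the same tools the summary target of the Makefile relies on. A minimal sketch, where the function name repo_stats is hypothetical:

```shell
#!/bin/sh
# Print "<n> packages, <size>" for one repository directory.
repo_stats() {
    # Count source tarballs; 'tr' strips the padding some 'wc' versions add.
    count=$(find "$1" -type f -name '*.tar.gz' | wc -l | tr -d ' ')
    # Total disk usage in human-readable units (first column of 'du').
    size=$(du -s -h "$1" | cut -f 1)
    echo "$count packages, $size"
}

# Example (after 'make sync-cran'):
#   repo_stats cran
```

The reported size depends on the file system and on the du implementation, so it may differ slightly from the numbers listed above.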
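
Makefile

## Query Bioconductor for the current release version
BIOC_VERSION := $(shell curl --silent https://www.bioconductor.org/config.yaml | grep -F "release_version:" | sed -E 's/.*release_version:[ ]*"([0-9.]+)".*/\1/g')

## Extra rsync options; override on the command line for a trial run,
## e.g. 'make sync-cran OPTS=--dry-run'
#OPTS = --dry-run
OPTS =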

.PHONY: all debug summary sync-all sync-cran sync-bioconductor

all: sync-all

debug:
	@echo BIOC_VERSION=$(BIOC_VERSION)
	du -c -h .

## The '*.tar.gz' patterns are quoted so that the shell passes them to find unexpanded
summary:
	@echo "CRAN: $(shell find cran -type f -name '*.tar.gz' | wc -l) packages ($(shell du -s -h cran | cut -f 1))"
	@echo "BioCsoft $(BIOC_VERSION): $(shell find bioconductor/$(BIOC_VERSION)/bioc -type f -name '*.tar.gz' | wc -l) packages ($(shell du -s -h bioconductor/$(BIOC_VERSION)/bioc | cut -f 1))"
	@echo "BioCann $(BIOC_VERSION): $(shell find bioconductor/$(BIOC_VERSION)/data/annotation -type f -name '*.tar.gz' | wc -l) packages ($(shell du -s -h bioconductor/$(BIOC_VERSION)/data/annotation | cut -f 1))"
	@echo "BioCexp $(BIOC_VERSION): $(shell find bioconductor/$(BIOC_VERSION)/data/experiment -type f -name '*.tar.gz' | wc -l) packages ($(shell du -s -h bioconductor/$(BIOC_VERSION)/data/experiment | cut -f 1))"

sync-all: sync-cran sync-bioconductor

## Mirror only src/contrib/*.tar.gz and the PACKAGES* index files from CRAN
sync-cran:
	mkdir -p cran/
	rsync --verbose --human-readable \
	  $(OPTS) \
	  --times \
	  --recursive \
	  --delete \
	  --include 'src/' --include 'src/contrib/' \
	  --include 'src/contrib/*.tar.gz' \
	  --include 'src/contrib/PACKAGES*' \
	  --exclude '*' \
	  cran.r-project.org::CRAN \
	  cran/

## Mirror the same files for the bioc, data/annotation and data/experiment repositories
sync-bioconductor:
	mkdir -p bioconductor/$(BIOC_VERSION)/
	rsync --verbose --human-readable \
	  $(OPTS) \
	  --times \
	  --recursive \
	  --delete \
	  --include 'src/' \
	  --include 'src/contrib/' \
	  --include 'src/contrib/*.tar.gz' \
	  --include 'src/contrib/PACKAGES*' \
	  --include 'bioc/' \
	  --include 'data/' \
	  --include 'data/annotation/' \
	  --include 'data/experiment/' \
	  --exclude '*' \
	  master.bioconductor.org::release \
	  bioconductor/$(BIOC_VERSION)/