Robert's Data Science Blog

Building R Packages

When putting an R package into a CRAN-like repository, the package must be built, that is, packaged into a compressed format. In this post I show a few tricks for building R packages. To make things concrete I use my example R package arithmetic1.

From within R (at least when a package is also an RStudio project opened with RStudio) a package can be built with the pkgbuild package (which I usually call through the devtools package).

Source build

A source build of a package can be made with the R command devtools::build(). From the command line in the folder containing arithmetic1 this command does the same thing:

R CMD build arithmetic1

The output is:

* checking for file ‘arithmetic1/DESCRIPTION’ ... OK
* preparing ‘arithmetic1’:
* checking DESCRIPTION meta-information ... OK
* checking for LF line-endings in source and make files and shell scripts
* checking for empty or unneeded directories
* building ‘arithmetic1_0.0.1.tar.gz’

At the time of writing the package version in arithmetic1’s DESCRIPTION file is 0.0.1 and the result of either of these commands is the file arithmetic1_0.0.1.tar.gz.
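The file name follows the pattern package name, underscore, version, both taken from DESCRIPTION. As a rough sketch of that naming rule, the shell function below predicts the tarball name for a package folder. The helper name source_tarball_name is my own invention and not part of R, and it assumes a plain DESCRIPTION file with one "Package:" and one "Version:" line.

```shell
# Sketch: predict the file name that `R CMD build` produces for a package
# folder by reading the Package and Version fields of its DESCRIPTION file.
# source_tarball_name is a hypothetical helper, not part of R.
source_tarball_name() {
  desc="$1/DESCRIPTION"
  # Split each line on ": " and print the value of the first matching field
  pkg=$(awk -F': *' '$1 == "Package" { print $2; exit }' "$desc")
  ver=$(awk -F': *' '$1 == "Version" { print $2; exit }' "$desc")
  printf '%s_%s.tar.gz\n' "$pkg" "$ver"
}
```

Called as source_tarball_name arithmetic1 from the folder containing the package, this should print the same name that appears on the last line of the build output.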

This tar.gz file allows us to check the package. From the command line this is done with

R CMD check arithmetic1_0.0.1.tar.gz

R CMD check runs a long list of checks and writes its log to the folder arithmetic1.Rcheck.

From within R this is achieved with devtools::check().

The package can be installed with

R CMD INSTALL arithmetic1_0.0.1.tar.gz

The output is

* installing to library ‘/home/robert/R/x86_64-pc-linux-gnu-library/3.6.1’
* installing *source* package ‘arithmetic1’ ...
** using staged installation
** R
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded from temporary location
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* DONE (arithmetic1)
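The three commands for the source build (build, check, install) are usually run in that order, so they can be chained in a small shell function. This is only a sketch: the function names run and build_check_install and the DRY_RUN variable are my own and not part of R, and the package name and version are passed in by hand rather than read from DESCRIPTION.

```shell
# run: print the command when DRY_RUN=1, otherwise execute it.
# (Hypothetical helper so the sketch can be tried without touching anything.)
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# build_check_install PACKAGE VERSION
# Builds the source tarball, checks it and installs it.
# The && chain stops at the first step that fails.
build_check_install() {
  pkg="$1"
  version="$2"
  tarball="${pkg}_${version}.tar.gz"
  run R CMD build "$pkg" &&
    run R CMD check "$tarball" &&
    run R CMD INSTALL "$tarball"
}
```

With DRY_RUN=1 set, build_check_install arithmetic1 0.0.1 only prints the three commands shown in this section. Without it, the commands are executed from the folder containing arithmetic1.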

Binary build

A binary build carries out the operations performed when installing the package, but saves the result in a compressed file instead of in R’s package library. From within R we can use the command devtools::build(binary = TRUE). From the command line we use

R CMD INSTALL --build arithmetic1

R now does much more work:

* installing to library ‘/home/robert/R/x86_64-pc-linux-gnu-library/3.6.1’
* installing *source* package ‘arithmetic1’ ...
** using staged installation
** R
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** testing if installed package can be loaded from temporary location
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path
* creating tarball
packaged installation of ‘arithmetic1’ as ‘arithmetic1_0.0.1_R_x86_64-pc-linux-gnu.tar.gz’
* DONE (arithmetic1)

Most of this output is identical to the output of R CMD INSTALL arithmetic1_0.0.1.tar.gz.

If the package contains compiled code, this is also compiled during the binary build. This is the crucial feature that makes installing R packages from CRAN so much faster on Windows and Mac than on Linux: the binary builds are already compiled.

The result is the file arithmetic1_0.0.1_R_x86_64-pc-linux-gnu.tar.gz. Note that my platform information is included in the file name.
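To make the structure of that file name explicit, here is a small sketch that splits it into package name, version and platform with plain shell parameter expansion. The function name parse_binary_name is hypothetical, and the pattern is inferred only from the file name produced on my machine, so treat it as an assumption rather than a documented format.

```shell
# parse_binary_name FILE
# Splits a name like pkg_version_R_platform.tar.gz into its three parts.
# Hypothetical helper; the pattern is inferred from the file name above.
parse_binary_name() {
  file=$(basename "$1")
  stem=${file%.tar.gz}       # drop the extension
  pkg=${stem%%_*}            # everything before the first underscore
  rest=${stem#*_}            # version_R_platform
  version=${rest%%_R_*}      # everything before "_R_"
  platform=${rest#*_R_}      # everything after "_R_"
  echo "$pkg $version $platform"
}
```

This relies on R package names not containing underscores, so the first underscore always ends the package name. The platform part may contain underscores (as in x86_64), which is why it is taken as whatever follows the first "_R_".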

The binary build can be installed just like the source build:

R CMD INSTALL arithmetic1_0.0.1_R_x86_64-pc-linux-gnu.tar.gz

The main difference is that it is much faster, yielding only this output:

* installing to library ‘/home/robert/R/x86_64-pc-linux-gnu-library/3.6.1’
* installing *binary* package ‘arithmetic1’ ...
* DONE (arithmetic1)

It is not possible to run the check on the binary build, but the binary build itself tests that the installed package can be loaded, as the output of R CMD INSTALL --build above shows.

A note about CRAN

A CRAN repository always contains source builds of a package and it should also contain binary builds for Windows and Mac. As demonstrated here it is certainly possible to make binary builds on Linux, but they depend on the distribution and architecture of the computer, of which there are many combinations. This is the reason that the official CRAN mirrors do not contain binary versions for Linux.

If we always use identical Linux distributions it is certainly possible to use binary builds. One particular use case is running R in a Docker container.
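Before installing a binary build in such a setup, it can be worth guarding against an obvious mismatch. The sketch below compares the platform tag in the file name with a platform string given as the second argument. The function name binary_matches_platform is hypothetical. Note that the tag only covers the architecture and operating system, not the Linux distribution, so a match is necessary but not sufficient.

```shell
# binary_matches_platform FILE PLATFORM
# Succeeds if FILE ends in _R_PLATFORM.tar.gz, fails otherwise.
# Hypothetical helper; it does not detect differences between distributions.
binary_matches_platform() {
  file=$(basename "$1")
  case "$file" in
    *"_R_$2.tar.gz") return 0 ;;
    *) return 1 ;;
  esac
}
```

The platform of the running R can be obtained with Rscript -e 'cat(R.version$platform)', so a guarded install could look like binary_matches_platform "$file" "$(Rscript -e 'cat(R.version$platform)')" && R CMD INSTALL "$file".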

A very cool feature in RStudio’s Package Manager is that it contains these binary builds for various Linux distributions, which are obtained by using R installed in a number of different Docker images.