A C library in an R Package
There are good online tutorials for how to get started with the Rcpp package – in particular the documentation for Rcpp.
But what if you want to use the functionality of a C(++) library in an R package?
This simple demonstration package implements a mymean and a mysum function for vectors using the Rcpp package.
The mysum function is what I ultimately want in its own library, while mymean is a function in the R package that uses mysum.
The body of the mysum function stays the same throughout this post; only its location changes.
In this post I show how we move from one big cpp file to a mysum library in a separate folder.
If you want an example of including a large C library in an R package, check out the GitHub repo for the haven package.
Create the basic package
Create a minimal package with Rcpp, either in RStudio (File > New Project... > R Package using Rcpp) or with these commands:
devtools::create("mypkg")
usethis::use_rcpp()
I prefer to use the roxygen2 package for documentation.
(If the project was created through the RStudio menus, I first delete the generated NAMESPACE file.)
I therefore add the following to (e.g.) R/utils.R so that the NAMESPACE file is updated correctly when running devtools::document():
#' @useDynLib mypkg, .registration = TRUE
#' @importFrom Rcpp sourceCpp
NULL
With the mymean.cpp that is introduced shortly, the folder hierarchy in the package directory is now:
mypkg
├── DESCRIPTION
├── man
├── mypkg.Rproj
├── NAMESPACE
├── R
│   └── utils.R
└── src
    └── mymean.cpp
To install the package, Hadley Wickham recommends the Build & Reload button in RStudio’s Build pane.
In the first build two extra files are automatically generated by Rcpp: R/RcppExports.R and src/RcppExports.cpp.
The build part related to the C++ code is (sans the special compiler flags):
g++ ... -c RcppExports.cpp -o RcppExports.o
g++ ... -c mymean.cpp -o mymean.o
g++ ... -o mypkg.so RcppExports.o mymean.o
This reads as follows:
mymean.cpp and RcppExports.cpp are each compiled to an object file. The object files are then linked into a shared object file mypkg.so that R can call.
We will see how these compiler commands change during the post.
Only one C++ file
Initially, src/mymean.cpp contains two functions: one that sums the elements of a vector and one that computes the average of the elements in a vector:
#include <stddef.h>
#include <Rcpp.h>
using namespace Rcpp;

double mysum(size_t n, double *X) {
    double s = 0.0;
    for (size_t i = 0; i < n; ++i) {
        s += X[i];
    }
    return s;
}

//' @export
// [[Rcpp::export]]
double mymean(NumericVector x) {
    size_t n = x.size();
    double total = mysum(n, x.begin());
    return total / n;
}
There is one small trick here: mysum's second argument X is a pointer to an array of doubles. This is the same as the pointer to the first element of x in mymean, which is available as x.begin().
Using size_t (and therefore also the stddef header) for the size of X is probably overkill for this demo, but it feels more “C like”.
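The pointer trick can also be tried outside R. The following is a minimal sketch, assuming std::vector<double> as a stand-in for NumericVector so that it compiles without R or Rcpp; mymean_vec is a hypothetical name that does not appear in the package. Here data() plays the role that x.begin() plays for a NumericVector: both give a double* to the first element of contiguous storage.

```cpp
#include <stddef.h>
#include <vector>

// Same C-style summation as in the post: a length plus a pointer to the first element.
double mysum(size_t n, double *X) {
    double s = 0.0;
    for (size_t i = 0; i < n; ++i) {
        s += X[i];
    }
    return s;
}

// Hypothetical stand-in for mymean(): std::vector<double> replaces NumericVector.
double mymean_vec(std::vector<double> x) {
    size_t n = x.size();
    // data() points at the first element, like x.begin() for a NumericVector.
    double total = mysum(n, x.data());
    return total / n;
}
```

Because mysum only sees a length and a raw pointer, it does not depend on Rcpp at all, which is what makes it possible to move it into its own library later.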
Include library in separate file
By default, any cpp file in the src folder is compiled when running devtools::install.
You need a header file to make the functions available between files as in any C(++) project, but that is all.
Add the following as src/mysum.cpp:
#include <stddef.h>

double mysum(size_t n, double *X) {
    double s = 0.0;
    for (size_t i = 0; i < n; ++i) {
        s += X[i];
    }
    return s;
}
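The header file src/mysum.h declares the mysum function inside an include guard of the same name. It includes stddef.h itself so that size_t is known no matter which file includes the header:
#ifndef MYSUM
#define MYSUM
#include <stddef.h>
double mysum(size_t n, double *X);
#endif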
In src/mymean.cpp we replace the mysum function with an include of the header file:
#include <Rcpp.h>
using namespace Rcpp;
#include "mysum.h"
//' @export
// [[Rcpp::export]]
double mymean(NumericVector x) {
    size_t n = x.size();  // size_t matches the type of mysum's first argument
    double total = mysum(n, x.begin());
    return total / n;
}
Now mysum.cpp is compiled separately and the object file mysum.o is included in the shared object file.
g++ ... -c RcppExports.cpp -o RcppExports.o
g++ ... -c mymean.cpp -o mymean.o
g++ ... -c mysum.cpp -o mysum.o
g++ ... -o mypkg.so RcppExports.o mymean.o mysum.o
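The division of labour between the three files can be sketched in a single file. This is only an illustration, not part of the package: mean_of is a hypothetical plain C++ stand-in for mymean that takes a raw array instead of a NumericVector, so the sketch compiles without Rcpp. The point is that the caller only needs the declaration to compile; in the package the definition sits in a different object file and the linker connects the two.

```cpp
#include <stddef.h>

// What mysum.h contributes: a declaration only, no function body.
double mysum(size_t n, double *X);

// What mymean.cpp needs: it can call mysum() knowing only the declaration.
// mean_of is a hypothetical stand-in for mymean() without Rcpp types.
double mean_of(size_t n, double *X) {
    return mysum(n, X) / n;
}

// What mysum.cpp contributes: the definition. In the package it is compiled
// to mysum.o and matched to the call above when mypkg.so is linked.
double mysum(size_t n, double *X) {
    double s = 0.0;
    for (size_t i = 0; i < n; ++i) {
        s += X[i];
    }
    return s;
}
```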
Include library in separate folder
Next, we move mysum into a subfolder of src.
Include library as C++
Now mysum.cpp is moved to the folder src/sum. The header can also be moved to src/sum, but it is not required.
A Makevars file is now needed to tell R's build system which files to compile, what the object files are called, and which paths to search for headers. The file is called Makevars on *nix and Makevars.win on Windows, and it lives in the src folder:
CPPFILES = $(wildcard *.cpp sum/*.cpp)
SOURCES = $(CPPFILES)
OBJECTS = $(CPPFILES:.cpp=.o)
PKG_CXXFLAGS = -Isum
The CPPFILES are all the cpp files in src and src/sum.
The OBJECTS files have the same base name as the CPPFILES, but their filetype is o instead of cpp.
Finally, if the header file mysum.h is moved to src/sum, this directory must be added to the compiler’s include path, which is what -Isum in PKG_CXXFLAGS does.
The only difference in the compiler commands is the change of location for the mysum files:
g++ ... -c RcppExports.cpp -o RcppExports.o
g++ ... -c mymean.cpp -o mymean.o
g++ ... -c sum/mysum.cpp -o sum/mysum.o
g++ ... -o mypkg.so mymean.o RcppExports.o sum/mysum.o
Include library as C
To build the library as C, src/sum/mysum.cpp is renamed to src/sum/mysum.c. In src/Makevars we now have a list of C++ files and a list of C files.
The union of these is the SOURCES list.
CFILES = $(wildcard sum/*.c)
CPPFILES = $(wildcard *.cpp)
SOURCES = $(CFILES) $(CPPFILES)
OBJECTS = $(CFILES:.c=.o) $(CPPFILES:.cpp=.o)
PKG_CXXFLAGS = -Isum
Calling a C library from C++ code requires a few special lines in the header file, src/sum/mysum.h. The extern "C" block is only seen by the C++ compiler and tells it that mysum has C linkage:
#ifndef MYSUM
#define MYSUM
#include <stddef.h>
#ifdef __cplusplus
extern "C" {
#endif
double mysum(size_t n, double *X);
#ifdef __cplusplus
}
#endif
#endif
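Why these lines matter: a C++ compiler normally encodes the argument types into the symbol name of a function, while a C compiler does not, so without extern "C" the linker would look for a symbol that the C object file never provides. The following single-file sketch shows the same guard at work. It is an illustration under simplifying assumptions: the definition sits in the same file rather than in sum/mysum.c, and mean_of is a hypothetical caller standing in for mymean.

```cpp
#include <stddef.h>

// Same guard as in the header: only a C++ compiler defines __cplusplus,
// so a C compiler sees the plain declaration and nothing else.
#ifdef __cplusplus
extern "C" {
#endif
double mysum(size_t n, double *X);
#ifdef __cplusplus
}
#endif

// In the package this body lives in sum/mysum.c and is built by the C compiler.
// Defined here, it keeps the C linkage given by the declaration above.
double mysum(size_t n, double *X) {
    double s = 0.0;
    for (size_t i = 0; i < n; ++i) {
        s += X[i];
    }
    return s;
}

// Hypothetical C++ caller standing in for mymean().
double mean_of(size_t n, double *X) {
    return mysum(n, X) / n;
}
```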
In the compiler commands, the C compiler gcc is now used for mysum.c instead of the C++ compiler, while the remaining files and the linking step still use g++:
gcc ... -c sum/mysum.c -o sum/mysum.o
g++ ... -c mymean.cpp -o mymean.o
g++ ... -c RcppExports.cpp -o RcppExports.o
g++ ... -o mypkg.so sum/mysum.o mymean.o RcppExports.o
The final file structure in mypkg:
mypkg
├── DESCRIPTION
├── man
├── mypkg.Rproj
├── NAMESPACE
├── R
│   ├── RcppExports.R
│   └── utils.R
└── src
    ├── Makevars
    ├── mymean.cpp
    ├── RcppExports.cpp
    └── sum
        ├── mysum.c
        └── mysum.h