RStudio As An IDE For C++

Peter von Rohr

2016-07-11

Disclaimer

The aim of this post is to see whether RStudio can be used as an IDE (Integrated Development Environment) for C/C++

Introduction

C++ is a widely used compiled language. As in any other compiled language, software development is structured as a write-compile-link-execute-debug loop. Except for the writing part, IDEs (Integrated Development Environments) can help software developers to ease the burdon of all other steps of the development cycle.

Known IDEs for C++

The choice of IDE can depend on the operating system on which the programs are written and executed. On Mac OS X, XCode is probably the most widely used IDE.

XcodeLogo

XcodeLogo

On Linux there is Kdevelop coming with the KDE window manager. DevC++ is a very convenient and light-weigt IDE on windows. Eclipse CDT is a version of Eclipse specially tailored for C++ development that can be used on any of the mentioned platforms.

EclipseLogo

EclipseLogo

Similarly to Eclipse, there exist C++ plugins for NetBeans which is the open source IDE originally developed by SUN and now distributed by Oracle.

NetBeansLogo

NetBeansLogo

Downside of IDEs

The downside of most IDEs is that they are rather heavy weight. Most users who just want to write small command line programs are lost in the massive number of features offered by the above listed IDEs.

Alternatives

Alternatives to heavy-weight IDEs are slim editors like TextWrangler on the Mac or Notepad++ on Windows. Those editors provide a small set of functionalities but it feels that they are easier to use than the large-scale IDEs.

RStudio - a small IDE

Recently, RStudio has become quite well known in the community of R developers. Since R has several interfaces to be used with C/C++ it is quite likely that R developers also write C/C++ code.

The aim of this post is to make a few tests and check how useful RStudio might be as an IDE for C/C++. Our tests are based on a few assumptions which might affect the relevance of the tests and the usefulnes of this post. Here are our assumptions

  1. Development of C/C++ code using RStudio cannot be done without R. We also assume that fairly recent versions of R and RStudio are installed and available.
  2. The R packages Rcpp and inline are required to bridge between R and C/C++. Those packages can easily be installed using the function install.packages(pkgs = c("Rcpp", "inline")).
  3. Software programs are organized in projects and the structure of a project is determined by the R package functionality. But no further knowledge about R packages is required. The complete overhead that comes with R packages is hidden away by RStudio.

Prerequisites

To run a C++ program, a compiler must be installed on the machine that we want to run the program on. The basic installation of R does not come with a C++ compiler. Since compilers are platform-specific, each operating system platform has different requirements. On Linux, in most cases a C++ compiler is already pre-installed. On MacOs X, the command-line tools of XCode must be available. On Windows, we recommend to install the Rtools program.

First Steps

The easiest way to start a C/C++ - project with RStudio is to create a new R-package and thereby selecting the option ``with Rcpp’’.

PackageTypeWithRcpp

PackageTypeWithRcpp

This automatically creates the directory src inside the project directory. Inside that src directory there are two files called rcpp_hello.cpp and RcppExports.cpp. These files contain some Hello-World example code in C++. The content of rcpp_hello.cpp is shown below

#include <Rcpp.h>
using namespace Rcpp;

// This is a simple function using Rcpp that creates an R list
// containing a character vector and a numeric vector.
//
// Learn more about how to use Rcpp at:
//
//   http://www.rcpp.org/
//   http://adv-r.had.co.nz/Rcpp.html
//
// and browse examples of code using Rcpp at:
// 
//   http://gallery.rcpp.org/
//

// [[Rcpp::export]]
List rcpp_hello() {
  CharacterVector x = CharacterVector::create("foo", "bar");
  NumericVector y   = NumericVector::create(0.0, 1.0);
  List z            = List::create(x, y);
  return z;
}

Call to rcpp_hello()

The function rcpp_hello() can be called from the R console using the function Rcpp::sourceCpp() which makes the content of the source file rcpp_hellp.cpp available to the R system. Please note that in this document the variable sHwCppFn was assigned earlier to the name of the C++ source file which corresponds to ../src/rcpp_hello.cpp.

Rcpp::sourceCpp(sHwCppFn)
## Warning in normalizePath(path.expand(path), winslash, mustWork):
## path[1]="C:/Daten/GitHub/charlotte-ngs/RStudioAsCppEditor/master/
## RStudioAsCppEditor/src/../inst/include": Das System kann den angegebenen
## Pfad nicht finden
rcpp_hello()

[[1]][1] “foo” “bar”

[[2]][1] 0 1

Alternatively, we can load the content of the whole R-package using function load_all() from package devtools.

Create Your Own

Running the Hello World example is fine at the beginning, but we are definitely aiming at something more interesting. We want to write our own functions. This can be done by adding a new .cpp file. New .cpp files can be added by clicking on File > New > C++-File as shown on the screen-shot below

NewCppFile

NewCppFile

This creates an empty file with an example function which looks as follows.

#include <Rcpp.h>
using namespace Rcpp;

// This is a simple example of exporting a C++ function to R. You can
// source this function into an R session using the Rcpp::sourceCpp
// function (or via the Source button on the editor toolbar). Learn
// more about Rcpp at:
//
//   http://www.rcpp.org/
//   http://adv-r.had.co.nz/Rcpp.html
//   http://gallery.rcpp.org/
//

// [[Rcpp::export]]
NumericVector timesTwo(NumericVector x) {
  return x * 2;
}


// You can include R code blocks in C++ files processed with sourceCpp
// (useful for testing and development). The R code will be automatically
// run after the compilation.
//

/*** R
timesTwo(42)
*/

The content of this file can be adjusted according to our needs. Then the file with the new C++ source code is saved under a different name. As an example, we can create a C++ function called vecSquare(x). The function vecSquare(x) computes the scalar square of its argument x. This function might look as follows.

#include <Rcpp.h>
using namespace Rcpp;

// This is a simple example of exporting a C++ function to R. You can
// source this function into an R session using the Rcpp::sourceCpp
// function (or via the Source button on the editor toolbar). Learn
// more about Rcpp at:
//
//   http://www.rcpp.org/
//   http://adv-r.had.co.nz/Rcpp.html
//   http://gallery.rcpp.org/
//

// [[Rcpp::export]]
NumericVector vecSquare(NumericVector x) {
  return x * x;
}


// You can include R code blocks in C++ files processed with sourceCpp
// (useful for testing and development). The R code will be automatically
// run after the compilation.
//

/*** R
vecSquare(42)
*/

The following block of statement first sources the newly created cpp-function using Rcpp’s function sourceCpp(). This immediately executes the R-statement at the end of the source file. calls function vecSquare() with two different arguments. As seen from the last statement in the following code block, the function vecSquare() also accepts a vector as an argument.

cat(" * Source file: ", sVecSquareCppFn, "\n")
Rcpp::sourceCpp(sVecSquareCppFn)
## Warning in normalizePath(path.expand(path), winslash, mustWork):
## path[1]="C:/Daten/GitHub/charlotte-ngs/RStudioAsCppEditor/master/
## RStudioAsCppEditor/src/../inst/include": Das System kann den angegebenen
## Pfad nicht finden

vecSquare(42) [1] 1764

vecSquare(c(1:3))

[1] 1 4 9

Convenient Functionality

Like other IDEs RStudio comes with a set of convenitent functionalities. When it comes to working with cpp-source files, the main focus is on the source pane. When a cpp-source file is open the function bar changes to show specialized icons related to workig with cpp-source files. The icon representing the most relevant functionality is the right most icon on the left side of the source pane. The screen shot below shows the set of functionalities that is available.

CppSourceFunctionality

CppSourceFunctionality

The functionalities range from Code Completion to finding function definition and function usages to refactoring code to formatting comments and code to showing diagnostics. The search for functio definition and usage does also work accross language barriers, i.e., we can search for the definition of a cpp-function out of an R-script. This can easily be done using the Find in Files function under the menu Edit.

FindInFiles

FindInFiles

Seamless R and C++

Dirk Edelbuettel has written a monograph (Eddelbuettel (2013)) on the package Rcpp which was already used in this post. In what follows some examples of that book are reproduced to illustrate how R can be extended using the C++ language.

In the following few sections, some of the material of the book Eddelbuettel (2013) is used to explain, how C++ code can be developed using RStudio and how this code can effectively be used from R.

Using inline - An even simpler method of integrating C++ code

As mentioned in Section 1.2 of Eddelbuettel (2013), the R-package inline provides a wrapper around the compiling, linking and loading steps that are required when using externally compiled code in R. Package inline provides function cxxfunction() as a single entry point to transform a character string into an executable function. How this works is shown below using the example of computing Fibonacci numbers.

incltxt <- '
int cpp_fib(const int x){
  if (x == 0) return(0);
  if (x == 1) return(1);
  return(cpp_fib(x-1) + cpp_fib(x-2));
}
'
# above snippets is used in the follwing function definition
fibR <- inline::cxxfunction(signature(xs = "int"),
                    plugin = "Rcpp",
                    incl = incltxt,
                    body = '
  int x = Rcpp::as<int>(xs);
  return Rcpp::wrap( cpp_fib(x) );'
)

Two arguments are supplied to function cxxfunction()

  1. the pure C++ function which is passed as argument includes. This allows to specify more include directives or even function or class definitions.
  2. the wrapper function which is specified as argument body.

Once the cxxfunction() has been run successfully, the resulting function fibR can be called like any other ordinary R-function.

Using Rcpp attributes

The inline package is very convenient, as it allows to extend functionality by compiled code from within an R session. More recently, inline has been complemented by a new approach that is based on an upcoming feature in the new C++ standard called the “attributes”. As of version 0.10.0 of Rcpp, attributes are implemented internally. The only thing the programmer has to do, is to declare certain “attributes”, notably whether a certain function has to be exported for use from R or from other C++ functions or both.

This approach of declaring attributes in C++ source files was already show at the beginning of this section. RStudio uses “Rcpp attributes” as its default method for making C++ source code available to R when a new source file is added via the RStudio menu “File > New File > C++ File”.

A simple example of using the “Rcpp attributes” framework is shown below.

#include <Rcpp.h>
using namespace Rcpp;

// This is a simple example of exporting a C++ function to R. You can
// source this function into an R session using the Rcpp::sourceCpp
// function (or via the Source button on the editor toolbar). Learn
// more about Rcpp at:
//
//   http://www.rcpp.org/
//   http://adv-r.had.co.nz/Rcpp.html
//   http://gallery.rcpp.org/
//

// [[Rcpp::export]]
int fibonacci(const int x) {
  if (x < 2)
    return x;
  else
    return (fibonacci(x - 1)) + fibonacci(x - 2);
}

The key element in the above code snippet is the “[[Rcpp::export]]” attribute preceding the function declaration. As already shown at the beginning of this section, the C++ function “fibonacci()” can be called like an ordinary R-function.

Rcpp::sourceCpp( file = "../src/fibonacci.cpp" )
## Warning in normalizePath(path.expand(path), winslash, mustWork):
## path[1]="C:/Daten/GitHub/charlotte-ngs/RStudioAsCppEditor/master/
## RStudioAsCppEditor/src/../inst/include": Das System kann den angegebenen
## Pfad nicht finden
fibonacci(10)

[1] 55

A second solution to Fibonacci using memoization

The problem with computing Fibonacci numbers with simple recursions is that many values are computed several times. The idea behind memoization is to store all values as soon as they are computed. New function values are taken from the memory, if they have been computed before. The following piece of code uses memoization for computing Fibonacci numbers.

mincltxt <- '
#include <algorithm>
#include <vector>
#include <stdexcept>
#include <cmath>
#include <iostream>

class Fib {
public:
  Fib(unsigned int n = 1000) {
    memo.resize(n); // reserve n elements
    std::fill( memo.begin(), memo.end(), NAN ); // set to NaN
    memo[0] = 0.0; // initialize for
    memo[1] = 1.0; // n=0 and n=1
  }
  double fibonacci(int x) {
    if (x < 0) // guard against bad input
      return( (double) NAN );
    if (x >= (int) memo.size())
      throw std::range_error(\"x too large for implementation\");
    if (! std::isnan(memo[x]))
      return(memo[x]); // if exist, reuse values
    // build precomputed value via recursion
    memo[x] = fibonacci(x-2) + fibonacci(x-1);
    return( memo[x] ); // and return
  }
private:
  std::vector< double > memo; // internal memory for precomp.
};
'
## now use the snippet above as well as one argument conversion
## in as well as out to provide Fibonacci numbers via C++
mfibRcpp <- inline::cxxfunction(signature(xs="int"),
                        plugin="Rcpp",
                        includes=mincltxt,
                        body='
  int x = Rcpp::as<int>(xs);
  Fib f;
  return Rcpp::wrap( f.fibonacci(x-1) );
')

The above shown code block defines a simple C++ class called “Fib” consisting of three elements.

  1. a constructor which is called upon initialisation
  2. a public function that computes the Fibonacci numbers and
  3. a private vector that holds the memoization values

Hence this example shows how C++ classes can be used in connection with inline. The actual wrapper function just instantiates an object “f” of class “Fib” and invokes the public function to computed the Fibonacci numbers.

A call to this last function can be done the same way as any other R-function is called using

mfibRcpp(14)

[1] 233

A third solution - Iteration

No matter what, the recursive solution has drawbacks whether it is implemented in R or in C++. Those can only be addressed when using an iterative approach to compute the Fibonacci numbers. The following code block shows a solution in C++, again using the function cxxfunction() from package inline.

## linear / iterative solution
fibRcppIter <- inline::cxxfunction(signature(xs="int"),
                           plugin="Rcpp",
                           body='
  int n = Rcpp::as<int>(xs);
  double first = 0;
  double second = 1;
  double third = 0;
  for (int i=0; i<n; i++) {
    third = first + second;
    first = second;
    second = third; 
  }
  return Rcpp::wrap(first);
')

The above shown function definition shows that the iterative computation of the Fibonacci numbers can directly be implemented in the wrapper function (argument body of inline::cxxfunction()). No further includes are required here, hence the argument includes is missing in the above shown definition of fibRcppIter(). This last solution is bound to be the fastest, as loops in C++ are certainly faster than recursions. A rather quick and dirty comparison shows the following results

nFibNum <- 30
system.time(fibR(nFibNum))

user system elapsed 0 0 0

system.time(fibRcppIter(nFibNum))

user system elapsed 0 0 0

A second example

The second example taken from Eddelbuettel (2013) considers so called vector autoregressive processes (VAR). The simplest case of a two-dimensional VAR of order one can be specified as

\[\mathbf{x}_t = A \mathbf{x}_{t-1} + \mathbf{u}_t\]

where \(\mathbf{x}_t\), \(\mathbf{x}_t\) and \(\mathbf{u}_t\) are all vectors of length two and \(A\) is a two by two matrix. Subscripts \(t\) and \(t-1\) stand for two consecutive time points.

A solution in R

VAR systems are usually studied by simulation. For that reason, suitable data has to be generated. Due to the interdependence of the elements in the VAR model, it cannot be vectorized in an easy way. As a result, we have to loop explicitly.

## parameter and error terms used throughout
set.seed(9876)
a <- matrix(c(0.5,0.1,0.1,0.5),nrow=2)
u <- matrix(rnorm(10000),ncol=2)
## Let’s start with the R version
rSim <- function(coeff, errors) {
  simdata <- matrix(0, nrow(errors), ncol(errors))
  for (row in 2:nrow(errors)) {
    simdata[row,] = coeff %*% simdata[(row-1),] + errors[row,]
  }
  return(simdata)
}

rData <- rSim(a, u)
dim(rData)

[1] 5000 2

References

Eddelbuettel, Dirk. 2013. Seamless R and C++ Integration with Rcpp. New York: Springer.