Step by step guide for creating a package that depends on RStan

stopifnot(require(knitr))
opts_chunk$set(
  comment=NA,
  eval = params$EVAL
)

Introduction

In this vignette we will walk through the steps necessary for creating an R package that depends on Stan by creating a package with one function that fits a simple linear regression. Before continuing, we recommend that you first read the other vignette Guidelines for Developers of R Packages Interfacing with Stan.

Creating the package skeleton

To start off, we use rstan_package_skeleton to initialise a bare-bones package directory. The name of our demo package will be rstanlm; it will fit a simple linear regression model using Stan.

library("rstantools")
rstan_package_skeleton(path = 'rstanlm')
library("rstantools")
td <- tempdir()
PATH <- file.path(td, "rstanlm")
rstan_package_skeleton(path = PATH, rstudio=FALSE, open=FALSE)

If we had existing .stan files to include with the package we could use the optional stan_files argument to rstan_package_skeleton to include them. Another option, which we'll use below, is to add the Stan files once the basic structure of the package is in place.

We can now set the new working directory to the new package directory and view the contents. (Note: if using RStudio then by default the newly created project for the package will be opened in a new session and you will not need the call to setwd().)

setwd("rstanlm")
list.files(all.files = TRUE)
list.files(PATH, all.files = TRUE)
file.show("DESCRIPTION")
DES <- readLines(file.path(PATH, "DESCRIPTION"))
cat(DES, sep = "\n")

Some of the sections in the DESCRIPTION file need to be edited by hand (Title, Author, Maintainer, and Description), but rstan_package_skeleton() has added the necessary packages and versions to Depends, Imports, and LinkingTo. It also added the SystemRequirements and NeedsCompilation fields.

Read-and-delete-me file

Before deleting the Read-and-delete-me file in the new package directory make sure to read it because it contains some important instructions about customizing your package:

file.show("Read-and-delete-me")
cat(readLines(file.path(PATH, "Read-and-delete-me")), sep = "\n")

You can move this file out of the directory, delete it, or list it in the .Rbuildignore file if you want to keep it in the directory.

file.remove('Read-and-delete-me')
file.remove(file.path(PATH, 'Read-and-delete-me'))

Stan files

Our package will call RStan's sampling method to use MCMC to fit a simple linear regression model for an outcome variable y with a single predictor x. After writing the necessary Stan program, the file should be saved with a .stan extension in the src/stan_files/ subdirectory. We'll save the following program to src/stan_files/lm.stan:

// Save this file as src/stan_files/lm.stan
data {
  int<lower=1> N;
  vector[N] x;
  vector[N] y;
}
parameters {
  real intercept;
  real beta;
  real<lower=0> sigma;
}
model {
  // ... priors, etc.

  y ~ normal(intercept + beta * x, sigma);
}
stan_prog <- "
data {
  int<lower=1> N;
  vector[N] x;
  vector[N] y;
}
parameters {
  real intercept;
  real beta;
  real<lower=0> sigma;
}
model {
  // ... priors, etc.

  y ~ normal(intercept + beta * x, sigma);
}
"
cat(stan_prog, file = file.path(PATH, "src", "stan_files", "lm.stan"))

The src/stan_files subdirectory can contain additional Stan programs if required by your package. During installation, all Stan programs will be compiled and saved in the list stanmodels that can then be used by R function in the package. The rule is that the Stan program compiled from the model code in src/stan_files/foo.stan is stored as list element stanmodels$foo.

R files

We next create the file R/lm_stan.R where we define the function lm_stan in which our compiled Stan model is being used. A comment block in roxygen2 syntax ensures that the function has a help file and that it is added to the NAMESPACE:

# Save this file as `R/lm_stan.R`

#' Bayesian linear regression with Stan
#'
#' @export
#' @param x Numeric vector of input values.
#' @param y Numberic vector of output values.
#' @param ... Arguments passed to `rstan::sampling` (e.g. iter, chains).
#' @return An object of class `stanfit` returned by `rstan::sampling`
#'
lm_stan <- function(x, y, ...) {
  standata <- list(x = x, y = y, N = length(y))
  out <- rstan::sampling(stanmodels$lm, data = standata, ...)
  return(out)
}
Rcode <- "
#' Bayesian linear regression with Stan
#'
#' @export
#' @param x Numeric vector of input values.
#' @param y Numberic vector of output values.
#' @param ... Arguments passed to `rstan::sampling`.
#' @return An object of class `stanfit` returned by `rstan::sampling`
lm_stan <- function(x, y) {
  out <- rstan::sampling(stanmodels$lm, data=list(x=x, y=y, N=length(y)))
  return(out)
}
"
cat(Rcode, file = file.path(PATH, "R", "lm_stan.R"))

The top-level package file R/rstanlm-package.R has already been created by rstan_package_skeleton() but needs to be modified to decribe the functionality of our package:

file.show(file.path("R", "rstanlm-package.R"))
cat(readLines(file.path(PATH, "R", "rstanlm-package.R")), sep = "\n")

The description section needs to be manually edited but the necessary roxygen2 tags for specifying important parts of the NAMESPACE file have already been included.

Documentation

To update the NAMESPACE file and the rest of the documentation to include lm_stan we need to regenerate the documentation using roxygen2::roxygenise (or devtools::document):

roxygen2::roxygenise(clean=TRUE)
roxygen2::roxygenise(PATH, clean=TRUE)

Install and use

Finally, the package can be installed:

devtools::install(local=FALSE)
devtools::load_all(PATH, recompile=TRUE)

The argument local=FALSE is necessary if you want to recompile the Stan models. If you only made a small change to the R code or the documentation, you can set local=TRUE to speed up the process.

The package can now be loaded and used like any other R package:

library("rstanlm")
fit <- lm_stan(y = rnorm(10), x = rnorm(10), iter = 500)
print(fit)
unlink(PATH, recursive = TRUE)

Links



Try the rstantools package in your browser

Any scripts or data that you put into this service are public.

rstantools documentation built on April 17, 2018, 9:04 a.m.