This is a development version of the scmc
package for the R
programming language.
The package isn’t available on CRAN so the only way to install the
package is to use the devtools
package and run
devtools::install_github("blaza/scmc")
The main function currently implemented is univariate_sampler
which is
a flexible implementation of the method (and thus with a bit more
complicated interface) for generating univariate distributions. We’ll
cover here a couple of basic examples which give an overall picture of
the package capabilities.
We’ll generate variates from the Logistic distribution.
The SCMC method implies interpolating , where
is the target random variable and is a random variable which can
be efficiently generated, and generating the samples
using the formula ,
where are variates from the distribution. In this
example we’ll use the standard normal variable . By default, we use
the RcppZiggurat::zrnorm
function to generate normal variates.
The code to generate the Logistic distribution in the scmc
package is
library(scmc)
# create the sampler
sampler <- univariate_sampler(qlogis, gaussian_nodes(7))
## Loading required package: RcppZiggurat
# generate 10000 random variates
smp <- sampler(1e5)
In its basic form, the univariate_sampler
function requires the
inverse (i.e. the quantile function of ) as the first
argument, and the nodes for the interpolation. In cases where normally
distributed are used, optimal nodes for interpolation are the
nodes of the Gaussian quadrature with respect to the weight function
(density of ). The third argument to
univariate_sampler
is xdist
which is by default "norm"
, indicating
the standard normal distribution.
The quality of the generated sample can be visualized with it’s density
# plot the sample density
plot(density(smp))
# add a plot of the theoretical logistic density
curve(dlogis(x), add = TRUE, col = "green")
The curves are nearly the same, so the approximation is good.
For the next example, we’ll use the gamma
distribution,
specifically . This is a positive distribution, so we
would like to transform it using the log
transform to get a real
valued random variable and then upon sampling use the exp
transform to
get a sample from the original distribution. Basically, we model
instead of and sample to get . The
code example follows
library(scmc)
# create the sampler
sampler <- univariate_sampler(function(x) qgamma(x, 5, 2),
gaussian_nodes(7),
transform = log, # the transformation of
# the quantile function
inv_transform = exp) # the inverse of transform
# generate 10000 random variates
smp <- sampler(1e5)
We can check the quality of the sample distribution by plotting the empirical cdf and the theoretical cdf of or the histogram overlayed with the density.
par(mfrow=c(1, 2))
smp_ecdf <- ecdf(smp)
curve(smp_ecdf(x), xlim = c(0, 7))
curve(pgamma(x, 5, 2), add = TRUE, col = "green")
hist(smp, breaks = 100, probability = TRUE)
curve(dgamma(x, 5, 2), add = TRUE, col = "green")
Again, the approximation is excellent.
Now we’ll deomnstrate using the grid stretching technique for
heavy-tailed distributions. We use the students t
distribution
with 2 degrees of freedom. All that’s needed is to add a gss
argument
which specifies the value in the technique (section 4.1.1. in
Grzelak et al. 2014).
library(scmc)
# create the sampler
sampler <- univariate_sampler(function(x) qt(x, df = 2),
gaussian_nodes(15),
gss = 1.657)
# generate 10000 random variates
smp <- sampler(1e5)
We’ll use the Kolmogorov-Smirnov test now to test the distribution
ks.test(smp, function(x) pt(x, df = 2))
##
## One-sample Kolmogorov-Smirnov test
##
## data: smp
## D = 0.0024044, p-value = 0.6098
## alternative hypothesis: two-sided
A large p-value indicates a good distribution.
This time we demonstrate sampling from a bounded distribution. A good
example is the beta
distribution. It comes
in a wide variety of shapes, and we’ll use the
distribution, with a distinct “U” shape. For this example we use the
chebyshev points for interpolation, and interpolate directly
. This is equivalent to using a uniformly distributed ,
so we’ll set the xdist="unif"
argument.
library(scmc)
# create the sampler
sampler <- univariate_sampler(function(x) qbeta(x, 0.5, 0.5),
chebyshev_nodes(15),
xdist = "unif")
# generate 10000 random variates
smp <- sampler(1e5)
And we visualise the distribution with the histogram
hist(smp, breaks = 100, probability = TRUE)
curve(dbeta(x, 0.5, 0.5), add = TRUE, col = "green")
and run
ks.test
ks.test(smp, function(x) pbeta(x, 0.5, 0.5))
##
## One-sample Kolmogorov-Smirnov test
##
## data: smp
## D = 0.0037263, p-value = 0.1244
## alternative hypothesis: two-sided
which confirms a good approximation.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.