StatComp21064 is a simple R package developed to compare the performance of R and C++ (via the R package Rcpp) for the 'Statistical Computing' course. One function is considered, namely gibbs (generate random numbers using a Gibbs sampler); both an R version and a C++ version are provided, called gibbsR and gibbsC respectively. In addition, as required by the course project, the package contains two further R functions, gradient.descent and EM.
The R package 'microbenchmark' can be used to benchmark the above R and C++ functions.
The source R code for gibbsR is as follows:
gibbsR <- function(N, thin, n, aa, bb) {
  mat <- matrix(nrow = N, ncol = 2)
  x <- y <- 0
  for (i in 1:N) {
    for (j in 1:thin) {
      x <- rbinom(1, n, y)
      y <- rbeta(1, x + aa, n - x + bb)
    }
    mat[i, ] <- c(x, y)
  }
  mat
}
The above R code involves two nested loops, which could be very slow even in R 3.0.1 or any later version. The corresponding C++ code is as follows.
#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
NumericMatrix gibbsC(int N, int thin, int n, int aa, int bb) {
  NumericMatrix mat(N, 2);
  double x = 0, y = 0;
  for (int i = 0; i < N; i++) {
    for (int j = 0; j < thin; j++) {
      x = rbinom(1, n, y)[0];               // x | y ~ Binomial(n, y)
      y = rbeta(1, x + aa, n - x + bb)[0];  // y | x ~ Beta(x + aa, n - x + bb)
    }
    mat(i, 0) = x;
    mat(i, 1) = y;
  }
  return mat;
}
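When experimenting outside the installed package, the same C++ function can be compiled on the fly with Rcpp; a minimal sketch, assuming the code above is saved in a file called gibbsC.cpp (the file name is only illustrative):

library(Rcpp)
# compile gibbsC.cpp and make gibbsC() available in the current R session
sourceCpp("gibbsC.cpp")
head(gibbsC(100, 10, 22, 2, 2))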
The R code for benchmarking gibbsR and gibbsC is as follows.
library(StatComp21064)
library(microbenchmark)
tm2 <- microbenchmark(
  vR = gibbsR(1e2, 10, 22, 2, 2),
  vC = gibbsC(1e2, 10, 22, 2, 2)
)
knitr::kable(summary(tm2)[, c(1, 3, 5, 6)])
The results again show an evident computational speed gain of C++ over R.
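Besides the timing, one can informally check that the two implementations target the same distribution; a minimal sketch (the chain length of 2000 and the Q-Q comparison are only illustrative choices):

set.seed(123)
matR <- gibbsR(2000, 10, 22, 2, 2)
matC <- gibbsC(2000, 10, 22, 2, 2)
# compare the marginal samples of y from the two implementations
qqplot(matR[, 2], matC[, 2], xlab = "gibbsR", ylab = "gibbsC")
abline(0, 1)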
This function implements the gradient descent method: the descent direction is the (negative) gradient direction. The step size can either be fixed or chosen by a backtracking line search. The idea of backtracking is to draw, at the current point, a secant line on f controlled by the parameter alpha; if the point reached after the step lies below this line, the step size is kept unchanged, otherwise it is shrunk by the factor b. The backtracking test actually used in the code is written out in symbols after the listing below. The source R code for gradient.descent is as follows:
gradient.descent <- function(x, y, epsilon = 1e-14, maxit = 1000, stepsize = 1e-3,
                             alpha = 0.25, b = 0.8, stepmethod = "backtracking",
                             verbose = TRUE, plotLoss = TRUE) {
  stopifnot(stepsize <= 1e-3)
  m <- nrow(x)
  x <- cbind(rep(1, m), x)            # add the intercept column
  n <- ncol(x)
  beta <- numeric(n)
  iter <- 1
  error <- 1
  loss <- crossprod(x %*% beta - y)   # residual sum of squares
  while ((error > epsilon) && (iter < maxit)) {
    h <- x %*% beta
    grad <- crossprod(x, (h - y))     # gradient direction
    beta.new <- beta - stepsize * grad
    loss.new <- crossprod(x %*% beta.new - y)
    # backtracking: shrink the step size if the new loss lies above the secant line
    if (stepmethod == "backtracking")
      if (loss.new > (loss[iter] + alpha * stepsize * sum(grad * grad)))
        stepsize <- stepsize * b
    # update for the next iteration
    error <- max(abs(beta.new - beta))
    loss <- append(loss, loss.new)
    beta <- beta.new
    iter <- iter + 1
    if (verbose && iter %% 10 == 0) message(paste('Iteration:', iter))
  }
  if (plotLoss) plot(loss, type = "l", bty = "n")
  return(list(par = beta,
              RSE = sqrt(crossprod(x %*% beta.new - y) / (nrow(x) - ncol(x))),
              iter = iter))
}
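In symbols (a restatement of the code above, with $s$ the current step size and $L$ the residual sum of squares), each iteration computes
$$ g = X^{T}(X\beta - y), \qquad \beta^{new} = \beta - s\,g, \qquad L(\beta) = \|X\beta - y\|^{2}, $$
and under the backtracking rule the step size is multiplied by $b$ whenever
$$ L(\beta^{new}) > L(\beta) + \alpha\, s\, \|g\|^{2}. $$
(The classical Armijo condition has a minus sign in front of the $\alpha\, s\, \|g\|^{2}$ term; the condition above is the one actually coded, which is slightly looser.)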
n <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- 1 + 0.5 * x1 + x2 + rnorm(n, sd = 0.1)
x <- cbind(x1, x2)
gradient.descent(x, y)
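Since the simulated model is an ordinary linear regression, the estimates can be compared with the least-squares fit; a minimal check, reusing the x and y generated above:

fit <- gradient.descent(x, y, plotLoss = FALSE)
fit$par           # gradient-descent estimates of (intercept, beta1, beta2)
coef(lm(y ~ x))   # ordinary least-squares estimates, for comparison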
Normal mixture model:
$$f(x)=\sum^C_{i=1}\pi_i\phi(x;\mu_i,\Sigma_i),$$
where $\phi(x;\mu_i,\Sigma_i)$ is the bivariate normal density with mean $\mu_i$ and covariance matrix $\Sigma_i$, and the parameters are $(\pi_i,\mu_i,\Sigma_i),\ i=1,\cdots,C$. Let $x_j,\ j=1,\cdots,n$ denote the $n$ observed samples, and let $z_{ij}=P(x_j\sim \phi(x;\mu_i,\Sigma_i)),\ i=1,\cdots,C,\ j=1,\cdots,n$ be the missing values.

In the $t$-th iteration of the EM algorithm, given $(\pi^t_i,\mu^t_i,\Sigma^t_i),\ i=1,\cdots,C$, the E-step is:
$$
\begin{aligned}
z^t_{kl}&=E[z_{kl}\mid \pi^t_i,\mu^t_i,\Sigma^t_i,x_j,\ i=1,\cdots,C,\ j=1,\cdots,n]\\
&=P(x_l\sim \phi(x;\mu_k,\Sigma_k)\mid \pi^t_i,\mu^t_i,\Sigma^t_i,x_l,\ i=1,\cdots,C)\\
&=\frac{\pi^t_k\phi(x_l;\mu^t_k,\Sigma^t_k)}{\sum^C_{i=1}\pi^t_i\phi(x_l;\mu^t_i,\Sigma^t_i)}
\end{aligned}
$$
The M-step is:
$$
\begin{aligned}
\pi^{t+1}_i&=\frac{\sum^n_{j=1}z^t_{ij}}{n}\\
\mu^{t+1}_i&=\frac{\sum^n_{j=1}z^t_{ij}x_j}{\sum^n_{j=1}z^t_{ij}}\\
\Sigma^{t+1}_i&=\frac{\sum^n_{j=1}z^t_{ij}(x_j-\mu^{t+1}_i)(x_j-\mu^{t+1}_i)^T}{\sum^n_{j=1}z^t_{ij}}
\end{aligned}
$$
An important practical point is to plot x first (if the dimension is greater than 2, plot it pairwise in two dimensions, although this is not suitable for very high dimensions), judge from the plot how many normal components the data is approximately a mixture of in order to determine C, and choose mu0 approximately at the cluster centres. This matters: mu0 must not be too far from the final true value, otherwise an error will be reported. The source R code for EM is as follows:
library(mvtnorm)
EM <- function(x, C, iter, mu0) {
  n <- nrow(x)
  p <- ncol(x)
  # initialize Z, mu, Sig, pi; the C covariance matrices are stacked row-wise in Sig,
  # so rows (p*(i-1)+1):(p*i) hold the covariance matrix of the i-th component
  Z <- matrix(c(rep(1, C), rep(0, (n - 1) * C)), n, C)
  Sig <- matrix(rep(c(1, 0, 0, 1), C), ncol = p, byrow = TRUE)
  pi <- rep(1 / C, C)
  mu <- mu0
  # start EM here
  for (t in 1:iter) {
    # E-step: update the responsibilities Z
    for (k in 1:n) {
      for (l in 1:C) {
        Z[k, l] <- pi[l] * mvtnorm::dmvnorm(x[k, ], mu[l, ],
                                            Sig[(p * (l - 1) + 1):(p * l), ])
      }
      Z[k, ] <- Z[k, ] / sum(Z[k, ])
    }
    # M-step (1): update pi
    pi <- colMeans(Z)
    for (i in 1:C) {
      # M-step (2): update mu
      mu[i, ] <- t(Z[, i]) %*% x / sum(Z[, i])
      # M-step (3): update Sigma
      sumsig <- Z[1, i] * (x[1, ] - mu[i, ]) %*% t(x[1, ] - mu[i, ])
      for (k in 2:n) {
        sumsig <- sumsig + Z[k, i] * (x[k, ] - mu[i, ]) %*% t(x[k, ] - mu[i, ])
      }
      Sig[(p * (i - 1) + 1):(p * i), ] <- sumsig / sum(Z[, i])
    }
  }
  return(list(mu = mu, Sig = Sig))
}
As an illustration, apply EM to the R built-in data set "faithful"; mu0 is read off from a scatterplot of the data (the plot is reproduced after the call below):
library(datasets)
x <- as.matrix(datasets::faithful)
EM(x, 2, 20, matrix(c(2, 50, 4, 80), 2, 2, byrow = TRUE))
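The scatterplot from which C = 2 and mu0 are read off can be produced, for example, as follows; the two clusters are roughly centred at (2, 50) and (4, 80), which motivates the mu0 used above.

# scatterplot of the faithful data, showing the two clusters
plot(faithful$eruptions, faithful$waiting,
     xlab = "eruptions", ylab = "waiting")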