R/atmcmc.R
In atmcmc: Automatically Tuned Markov Chain Monte Carlo

#' Runs a MCMC algorithm tuned via adaption and corresponding diagnostics
#' 
#' A symmetric random walk Metropolis algorithm with Gaussian proposals is automatically tuned and run and 
#' diagnosed to converge to a target distribution and estimate the stationary mean of a functional
#' 
#' The algorithm automatically tunes the covariance matrix of the proposal distribution \eqn{N(X_n, \Sigma_p)} of a symmetric random walk Metropolis algorithm. 
#' The algorithm can be broken down into four main phases: a 1st adaption phase, transient phase, 2nd adaption phase and sampling phase. The 1st adaption phase employs the 
#' Adaptive Metropolis-within-Gibbs algorithm from Roberts and Rosenthal (2009), and the diagnostics to end this phase is to check whether the acceptance rate for every 
#' coordinate comes into the desired range. The transient phase runs a Metropolis-within-Gibbs algorithm, and this runs until the chain reaches the mode of the target 
#' distribution. The purpose of the transient phase is to avoid including bad X values when tuning for \eqn{\Sigma_p = mult * \Sigma_n}, where \eqn{\Sigma_n} is the 
#' empirical covariance matrix of the target distribution calculated from the values generated by the Markov chain. The diagnostics to end this phase is to fit a simple 
#' linear regression to see if the chain values are trending. The 2nd adaption phase employs a slightly modified version of the Adaptive Metropolis algorithm from Haario et 
#' al. (2001) or Roberts and Rosenthal (2009). This phase updates \eqn{\Sigma_p} at every iteration by calculating the empirical covariance matrix of the target 
#' distribution from the chain values. The diagnostics to confirm whether this phase is indeed improving the chain is to see if the squared jumping distance between every 
#' consecutive iteration is increasing. Again, a simple linear regression is used to see if the squared jumping distances are increasing. After all this, a symmetric random 
#' walk Metropolis algorithm is run and Gelman-Rubin diagnostics is used to verify convergence of the Markov chain. Note that 2nd half of the sampling phase is what we 
#' take as a sample.\cr \cr
#' For a target distribution that is considered to be `strongly multimodal', the basic structure of the algorithm is still the same, but multiple chains are run in the 1st adaption phase 
#' and transient phase until each chain reaches different mode. The algorithm leaves only one chain for each unique local mode and deletes others. It considers two modes 
#' are different when, in at least one coordinate, the absolute value of the difference of two means is lesser than the smaller of the standard deviation of two. A 2nd 
#' adaption phase is run for each remaining chain, and after the 2nd adaption phase, the algorithm confirms whether each chain has different mode. For the sampling 
#' phase, at each iteration, the chain either moves inside one local mode or jumps to another mode at a fixed probability. Again, Gelman-Rubin diagnostics is used to check 
#' for convergence. 
#' 
#' @references Heikki Haario, Eero Saksman, and Johanna Tamminen. An adaptive metropolis algorithm. \emph{Bernoulli,} 7(2):223-242, 2001.\cr
#' @references Gareth O. Roberts and Jeffrey S. Rosenthal. Examples of adaptive MCMC. \emph{Journal of Computational and Graphical Statistics,} 18(2):349-367, 2009.\cr
#' @references Andrew Gelman and Donald B. Rubin. Inference from iterative simulation using multiple sequences. \emph{Statistical science,} 7(4):457-472, 1992.\cr
#' @references Stephen P. Brooks and Andrew Gelman. General methods for monitoring convergence of iterative simulations. \emph{Journal of computational and graphical statistics,} 7(4):434-455, 1998.\cr
#' @param g log of target density function
#' @param dim dimension of target density function
#' @param X0 initial value of Markov chain
#' @param support support of target density function. Argument takes a `dim' x 2 matrix 
#' @param multimodal whether to assume target density function is strongly multimodal. Argument takes T or F
#' @param functional function to estimate an expectation with respect to target density function 
#' @param maxiter maximum iteration number for a full MCMC run
#' @param mult constant multiplied to the covariance matrix of the proposal distribution in the 2nd adaption phase and the sampling phase
#' @param mrep number of chains created to find multiple modes if multimodal=T
#' @param nrep number of replicative chains for Gelman-Rubin diagnostics (convergence diagnostics) 
#' @param batchwidth number of iterations in a batch. Is used to find the average X values in the transient phase and the average squared jumping distances in the 2nd adaption phase. 
#' @param holdup holdup*batchwidth is the number of iterations between the start of the sampling phase and the first computation of Gelman-Rubin diagnostics (to check for convergence of the chain) in the sampling phase 
#' @param batchwidth.adp1 number of iterations in a batch initially used to check for the acceptance rate in the 1st adaption phase
#' @param scale.adj amount added to or subtracted from the log of standard deviation of the univariate Gaussian proposal for each coordinate in the 1st adaption phase
#' @param endbatch.adp1 batchwidth.adp1 x 2^(endbatch.adp1) iterations needs to have the desired level of acceptance rate to end the 1st adaption phase
#' @param minaccpt minimum acceptable acceptance rate to end the 1st adaption phase
#' @param maxaccpt maximum acceptable acceptance rate to end the 1st adaption phase
#' @param nreg number of distinct points in the simple linear regression to check if there is a linear trend in average X values in the transient phase or in average squared jumping distances in the 2nd adaption phase
#' @param startdist number multiplied to `max X value - min X value' in the 2nd adaption + flat part of transient phase (or just 2nd adaption phase if multimodal=T). Is used to determine over-disposed starting distribution for the sampling phase
#' @param minR minimum acceptable R value to end the sampling phase
#' @param maxR maximum acceptable R value to end the sampling phase
#' @param CI.alpha (1-CI.alpha)x100\% confidence interval is constructed for R_interval in the sampling phase
#' @param nimprob.X number of consecutive g(X)=-Inf iterations required to break off the full algorithm. This is to prevent the chain from drifting to a wrong direction 
#' @param minaccpt.adp2 minimum acceptable acceptance rate in the 2nd adaption phase to keep the value of `mult' as it is. If the acceptance rate is less than `minaccpt.adp2', `mult' is cut down to 'mult'/max(2,'dim')
#' @param batchwidth.adp2 first n number of iterations used in the 2nd adaption phase to check for the acceptance rate to decide whether to decrease `mult'
#' @param jumpprob probability of a jump between modes at each iteration in the sampling phase
#' @param displayfreq how frequently to display the iteration number as `atmcmc' runs. Every `displayfreq'th iteration number is printed on the screen
#' @param plot whether to display traceplots of coordinate 1 (& mode 1 for multimodal=T) as each phase ends. Takes argument T or F
#' @param m every `m'th iteration is plotted. Has to be a factor of `batchwidth'/2
#' @return  A list consisting of
#'  \item{Xveclistdim1, Xveclistdim2,...}{matrix of X values saved from the sampling phase. Each matrix contains X values of each coordinate}
#'  \item{Xveclistbase}{matrix of X values saved from the 1st adaption ,transient, and 2nd adaption phase. Includes only one chain for each unique local mode for multimodal=T}
#'  \item{nummodes}{number of unique local modes found for multimodal=T}
#'  \item{dim}{dimension of target density function}
#'  \item{batchwidth}{number of iterations in a batch. Is used to find the average X values in the transient phase and the average squared jumping distances in the 2nd adaption phase.}
#'  \item{means}{average of X values from the 2nd half of sampling phase}
#'  \item{functionalmeans}{average of functional X values from the 2nd half of sampling phase}
#'  \item{sumchain}{string of iteration numbers to show when each phase has ended. For the sampling phase, this shows when the 1st half of sampling phase has ended also}
#'  \item{acceptrate}{acceptance rate of the 2nd half of sampling phase}
#'  \item{runtime}{runtime of the full MCMC}
#' It also prints values of estimates, estimates_of_functional, acceptance_rate, time_elapsed, and phase_length. For details, see `summarymcmc' section
#' @examples dim = 3  #dimension of target density function
#' X0 = rep(0.1,dim)  #initial X value
#' 
#' tmpmat = rbind(c(-0.7, 1.2, 1.6),c(0.9, 1.1, -0.1),c(0.2, 0.3, -1.5))
#' targSigma = t(tmpmat) %*% tmpmat
#' targMean = c(22, -10,  15)
#' #log of target density function
#' g = function(X){-0.5*t(X-targMean)%*%solve(targSigma)%*%(X-targMean)}
#'   
#' output = atmcmc(g, dim, X0)
#' plotmcmc(output, name = "project1")
#' summarymcmc(output, name = "project1")
#' @export
atmcmc <- function ( g , dim, X0, support = cbind(rep(-Inf,dim), rep(Inf,dim)), multimodal=F, functional = function(X){X}, maxiter=2000000, mult = (2.38)^2 / (dim), mrep =10, nrep = 10, batchwidth = 200,
                     holdup=10, batchwidth.adp1 = 100, scale.adj=0.05, endbatch.adp1=2, minaccpt=0.28, maxaccpt=0.6, nreg=5, startdist=1.5, minR=0.9, maxR=1.1, CI.alpha=0.05, nimprob.X=100, 
                     minaccpt.adp2=0.02, batchwidth.adp2=200, jumpprob=0.05, displayfreq=100, plot=T, m=10) { 
  if ((multimodal!=F)&(multimodal!=T)) {stop("Argument multimodal should be either T or F", sep="")}
  
  rmvtnorm <-function(n, mu, sigma){
    d = length(mu)
    temp = matrix(rnorm(n*d,0,1),d,n)
    chol = t(chol(sigma))
    rvector= mu+chol%*%temp
    return(rvector)
  }
  
  retrieve <- function(argString, argInt) {return( eval(as.name(paste(argString, argInt, sep=""))))}
  
  if(multimodal==F){
    ptm <- proc.time()
    
    X0=as.matrix(X0)
    support=as.matrix(support)
    if (length(dim)!=1){stop("Argument dim should be scalar", sep="")}
    if ((dim(X0)[1]!=dim)|(dim(X0)[2]!=1)){stop("Argument X0 should be ",dim," x ",1," matrix", sep="")}
    if ((dim(support)[1]!=dim)|(dim(support)[2]!=2)){stop("Argument support should be ",dim," x ",2," matrix", sep="")}
    if (length(maxiter)!=1){stop("Argument maxiter should be scalar", sep="")}
    if (length(mult)!=1){stop("Argument mult should be scalar", sep="")}
    if (length(mrep)!=1){stop("Argument mrep should be scalar", sep="")}
    if (length(nrep)!=1){stop("Argument nrep should be scalar", sep="")}
    if (length(batchwidth)!=1){stop("Argument batchwidth should be scalar", sep="")}
    if (length(holdup)!=1){stop("Argument holdup should be scalar", sep="")}
    if (length(batchwidth.adp1)!=1){stop("Argument batchwidth.adp1 should be scalar", sep="")}
    if (length(scale.adj)!=1){stop("Argument scale.adj should be scalar", sep="")}
    if (length(endbatch.adp1)!=1){stop("Argument endbatch.adp1 should be scalar", sep="")}
    if (length(minaccpt)!=1){stop("Argument minaccpt should be scalar", sep="")}
    if (length(maxaccpt)!=1){stop("Argument maxaccpt should be scalar", sep="")}
    if (length(nreg)!=1){stop("Argument nreg should be scalar", sep="")}
    if (length(startdist)!=1){stop("Argument startdist should be scalar", sep="")}
    if (length(minR)!=1){stop("Argument minR should be scalar", sep="")}
    if (length(maxR)!=1){stop("Argument maxR should be scalar", sep="")}
    if (length(CI.alpha)!=1){stop("Argument CI.alpha should be scalar", sep="")}
    if (length(nimprob.X)!=1){stop("Argument nimprob.X should be scalar", sep="")}
    if (length(minaccpt.adp2)!=1){stop("Argument minaccpt.adp2 should be scalar", sep="")}
    if (length(batchwidth.adp2)!=1){stop("Argument batchwidth.adp2 should be scalar", sep="")}
    if (length(jumpprob)!=1){stop("Argument jumpprob should be scalar", sep="")}
    if (length(displayfreq)!=1){stop("Argument displayfreq should be scalar", sep="")}
    if ((plot!=F)&(plot!=T)) {stop("Argument plot should be either T or F", sep="")}
    if (length(m)!=1){stop("Argument m should be scalar", sep="")}
    if (((batchwidth/2) %% m)!=0){stop("m=",m, " has to be a factor of batchwidth/2=",batchwidth/2, sep="")}
    
    title <- bquote(bold(paste("Trace Plot Coordinate 1"))) 
    
    X=X0
    B = holdup*batchwidth  # hold-up before applying Gelman-Rubin diagnostics for the sampling phase
    count=0  #count of consecutive pi(X)==0
    Xveclist = matrix( rep(0,maxiter*dim), ncol=dim)  # for keeping track of chain values in 1st adaption, transient, and 2nd adaption phase
    
    numaccept = matrix(0,maxiter) # for keeping track of number of accepts (except for the 1st adaption phase)
    sqrdjump = matrix( rep(0,maxiter*dim), ncol=dim) # for keeping track of squared jumping distance
    avgsqrdjump = matrix(0,maxiter/batchwidth, dim) # for keeping track of average squared jumping distance
    transient = matrix(0,maxiter/batchwidth, dim) # for checking whether the chain values are trending
    tempacceptrate = matrix(0,maxiter/batchwidth.adp1,dim) # for keeping track of local acceptance rate (batchwidth=batchwidthadp1) for 1st adaption phase
    
    Y = X
    gX = g(X)
    if (gX==-Inf){stop("X0 has probability of 0 under the stationary distribution. Please choose some other initial value X0.", sep="")}
    
    propSigma = rep(exp(0),dim) # std.dev. of proposal distribution for each coordinate (1st adaption phase)
    unimult = rep(0,dim) # for keeping track of local log of std.dev. of proposal distribution (1st adaption phase)
    uninumaccept = matrix( rep(0,maxiter*dim), ncol=dim) # for keeping track of number of accepts (1st adaption phase)
    batchwidthadp1 = batchwidth.adp1
    
    # 1st adaption phase
    # Adaptive Metropois within Gibbs
    ###########################################################################################################################################
    cat("Running a MCMC for", maxiter, "iterations","\n");
    
    for (i in 1:maxiter) {
      
      if (count>=(dim*nimprob.X)){break} 
      
      # Output progress report.
      if ((i %% displayfreq) == 0)
        cat (" ...",i);
      
      k=i
      # adjust ste.dev. for each coordinate based on the local acceptace rate(batchwidth=batchwidthadp1)  
      if ((k>1)&((k %% batchwidthadp1) == 1)){
        newindex = ceiling((i-1)/batchwidthadp1)
        for (j in 1:dim) {
          tempacceptrate[newindex,j] = sum(uninumaccept[((i-batchwidthadp1):(i-1)),j])/(batchwidthadp1)   
        }
        if (((min(tempacceptrate[newindex,])>minaccpt)&(max(tempacceptrate[newindex,])<maxaccpt))){
          # if the acceptance rate for the last batchwidthadp1 iterations is between `minaccpt' and `maxaccpt' for every coordinate: stop the adaption
          k = batchwidthadp1+1
          batchwidthadp1 = 2*batchwidthadp1
          if (batchwidthadp1 == batchwidth.adp1*2^(endbatch.adp1+1)){
            adpstop1 = i-1 
            break
          }
        }
        else {
          for (j in 1:dim) {
            if (tempacceptrate[newindex,j] >0.44) {unimult[j] = unimult[j]+scale.adj}
            else {unimult[j] = unimult[j]-scale.adj}
            propSigma[j] = exp(unimult[j]) 
          }
        }
      }
      
      for (coord in 1:dim) {
        Y[coord] = X[coord] + rnorm(1,0,propSigma[coord])  # propose a value
        gY = g(Y)
        U = runif(1)  # for accept/reject
        if (gX==-Inf) {A=0; count=count+1} else {A = min(0,(gY-gX)); count=0}  # for accept/reject  
        if (log (U) < A ) {
          gX <- gY
          X[coord] <- Y[coord]
          uninumaccept[i,coord] <- uninumaccept[i,coord]+1 
        }  # accept proposal
        else {Y[coord] <- X[coord]}
      }
      
      Xveclist[i,] = X; #save X value
      
    }
    
    if (count>=(dim*nimprob.X)){stop("past ",dim*nimprob.X ," X values all have probabilities of 0 under the stationary distribution", sep="")}
    if (i==maxiter){stop("MCMC chain ran for ", maxiter, " iterations but didn't achieve convergence", sep="")}
    cat("...End of 1st adaption phase","\n")
    ###########################################################################################################################################
    if (plot==T){
      Xveclisttmp<-Xveclist[,1]
      Xveclisttmp <- Xveclisttmp[seq(m,adpstop1,by=m)]
      par(mfrow=c(1,1))
      plot((seq(m, adpstop1, by = m)), Xveclisttmp[(1:(adpstop1/m))], type="l", main=title, xlab = "iteration", ylab = " ", col="purple", xlim=c(1,(adpstop1)),ylim=c(min(Xveclisttmp[1:(adpstop1/m)]),max(Xveclisttmp[1:(adpstop1/m)])))
    }
    
    pvalue = rep(0,dim)
    # Transient phase 
    # Metropolis within Gibbs
    ###########################################################################################################################################
    for (i in (adpstop1+1):maxiter) {  
      
      if (count>=(dim*nimprob.X)){break} 
      
      for (coord in 1:dim) {
        Y[coord] = X[coord] + rnorm(1,0,propSigma[coord])  # propose a value
        gY = g(Y)
        U = runif(1)  # for accept/reject
        if (gX==-Inf) {A=0; count=count+1} else {A = min(0,(gY-gX)); count=0}  # for accept/reject
        if (log (U) < A ) {
          gX <-gY
          X[coord] <- Y[coord]
          uninumaccept[i,coord] <- uninumaccept[i,coord]+1
        }# accept proposal
        else {Y[coord] <- X[coord]}
      }
      
      Xveclist[i,] = X; #save X value
      
      # Output progress report.
      if ((i %% displayfreq) == 0)
        cat (" ...",i);
      
      if (((i-adpstop1) %% batchwidth) == 0){
        newindex = ceiling(i/batchwidth)
        for (j in 1:dim){
          transient[newindex,j]=mean(Xveclist[((i-batchwidth+1):i),j]) # calculate average X value
          # fit a simple linear regression to X values & check p-values for slope=0
          if ((ceiling((i-adpstop1)/batchwidth)) > (nreg-1)){
            response = transient[((newindex-(nreg-1)):newindex),j]
            explanatory = c(1:nreg)
            fit = lm(response~explanatory)
            pvalue[j] = summary(fit)$coefficients[2,4]
          }
        }    
        # if no rejection of slope=0 at 10% significance level for every coordinate of X
        # stop the transient phase
        if (min(pvalue)>0.1){(transientstop = i) & (break)}
      }  
      
    }
    
    if (count>=(dim*nimprob.X)){stop("past ",dim*nimprob.X ," X values all have probabilities of 0 under the stationary distribution", sep="")}
    if (i==maxiter){stop("MCMC chain ran for ", maxiter, " iterations but didn't achieve convergence", sep="")}
    cat("...End of transient phase","\n")
    ###########################################################################################################################################
    if (plot==T){
      Xveclisttmp <- Xveclist[,1]
      Xveclisttmp <- Xveclisttmp[seq(m,transientstop,by=m)]
      par(mfrow=c(1,1))
      plot((seq(m, adpstop1, by = m)), Xveclisttmp[(1:(adpstop1/m))], type="l", main=title, xlab = "iteration", ylab = " ", col="purple", xlim=c(1,(transientstop)),ylim=c(min(Xveclisttmp[1:(transientstop/m)]),max(Xveclisttmp[1:(transientstop/m)])))
      lines((seq((adpstop1), transientstop, by = m)),Xveclisttmp[(adpstop1/m):(transientstop/m)], type="l", col="orange")
    }
    
    pvalue = rep(0,dim)
    # 2nd adaption phase
    # Adaptive Metropolis 
    ###########################################################################################################################################
    for (i in (transientstop+1):maxiter){
      
      if (count>=(nimprob.X)){break} 
      
      if (dim==1){covsofar = var( Xveclist[((transientstop-nreg*batchwidth+1):(i-1)),])} else {covsofar = cov( Xveclist[((transientstop-nreg*batchwidth+1):(i-1)),])}
      propSigma = mult * covsofar 
      
      Temp = X
      
      if (dim==1){Y = X + rnorm(1, sd = sqrt(propSigma))} else{Y = X + rmvtnorm(1, mu = rep(0,dim), sigma = propSigma)}  # propose a value
      gY = g(Y)
      U = runif(1)  # for accept/reject
      if (gX==-Inf) {A=0; count=count+1} else {A = min(0,(gY-gX)); count=0}  # for accept/reject
      if (log(U) < A) {
        gX <-gY
        X<-Y
        numaccept[i] <- numaccept[i]+1
      } # accept proposal
      
      sqrdjump[(i-transientstop),] = (Temp - X)^2  #calculate squared jumping distance
      Xveclist[i,] = X; #save X value
      
      # Output progress report.
      if ((i %% displayfreq) == 0)
        cat (" ...",i);
      
      if (((i-transientstop) %% batchwidth) == 0){
        newindex = ceiling((i-transientstop)/batchwidth)
        for (j in 1:dim){
          avgsqrdjump[newindex,j]=mean(sqrdjump[((i-transientstop-batchwidth+1):(i-transientstop)),j]) # calculate average squared jumping distance
          # fit a simple linear regression to average jumping distances & check p-values for the slope=0
          if ((ceiling((i-transientstop)/batchwidth)) > (nreg-1)){
            response = avgsqrdjump[((newindex-(nreg-1)):newindex),j]
            explanatory = c(1:nreg)
            fit = lm(response~explanatory)
            pvalue[j] = summary(fit)$coefficients[2,4]
          }
        }    
        # if no rejection of slope=0 at 10% significance level for every coordinate of X
        # stop the adaption
        if (min(pvalue)>0.1){(adpstop2 = i) & (break)}
      }
      
      if ((i-transientstop)==batchwidth.adp2){
        if(sum(numaccept[(transientstop+1):(transientstop+batchwidth.adp2)])<(minaccpt.adp2*batchwidth.adp2)){
          X = Xveclist[transientstop,]
          mult=mult/max(2,dim)
          i = transientstop
          numaccept[(transientstop+1):(transientstop+batchwidth.adp2)] <-rep(0,length(numaccept[(transientstop+1):(transientstop+batchwidth.adp2)]))
        }
      }
      
    }
    
    if (count>=(nimprob.X)){stop("past ",nimprob.X ," X values all have probabilities of 0 under the stationary distribution", sep="")}
    if (i==maxiter){stop("MCMC chain ran for ", maxiter, " iterations but didn't achieve convergence", sep="")}
    cat("...End of 2nd adaption phase","\n")
    ###########################################################################################################################################
    if (plot==T){
      Xveclisttmp <- Xveclist[,1]
      Xveclisttmp <- Xveclisttmp[seq(m,adpstop2,by=m)]
      par(mfrow=c(1,1))
      plot((seq(m, adpstop1, by = m)), Xveclisttmp[(1:(adpstop1/m))], type="l", main=title, xlab = "iteration", ylab = " ", col="purple", xlim=c(1,(adpstop2)),ylim=c(min(Xveclisttmp[1:(adpstop2/m)]),max(Xveclisttmp[1:(adpstop2/m)])))
      lines((seq((adpstop1), transientstop, by = m)),Xveclisttmp[(adpstop1/m):(transientstop/m)], type="l", col="orange")
      lines((seq((transientstop), adpstop2, by = m)),Xveclisttmp[(transientstop/m):(adpstop2/m)], type="l", col="red")
    }
    
    # Find the over-disposed starting distribution : startdist * (max X -min X from the flat part of transient phase & 2nd adaption phase)
    ###########################################################################################################################################
    count=rep(0,nrep)
    a = matrix(0,dim,2)
    starttemp = matrix(0,dim,nrep-1)
    startdist = (startdist-1)/2
    
    for (l in 1:dim){
      a[l,1] = min (Xveclist[((transientstop-nreg*batchwidth+1):adpstop2),l])
      a[l,2] = max (Xveclist[((transientstop-nreg*batchwidth+1):adpstop2),l])
      dist = a[l,2]-a[l,1]
      starttemp[l,] = runif((nrep-1), max(support[l,1],(a[l,1]-startdist*dist)), min(support[l,2],(a[l,2]+startdist*dist)))
    }
    ###########################################################################################################################################
    
    X = cbind(Xveclist[adpstop2,],starttemp)  # starting values for `nrep' replication chains
    Y = matrix(0,dim,nrep)
    
    gX = rep(0,nrep)
    for (k in 1:nrep) {
      gX[k]<-g(X[,k])
    }
    gY = rep(0,nrep)
    
    qlowerwithin = qupperwithin = avglist = s2 = xbar = array(0,dim=c(ceiling((maxiter-adpstop2-B)/batchwidth),dim,nrep))
    qlowertotal = quppertotal = xbartotal = varwi = varbt = vhat = Rvar = vhatvar = d = Rvaradj = Rci = matrix(0,ceiling((maxiter-adpstop2-B)/batchwidth),dim)
    
    Xveclistbase<-as.matrix(Xveclist[(1:adpstop2),])
    
    Xveclistdim1=matrix(0,(maxiter-adpstop2),nrep)
    if (dim>1){
      for (j in 2:dim){
        dummy = matrix(0,(maxiter-adpstop2),nrep)
        assign( paste("Xveclistdim", j, sep=""), dummy )
      }
    }
    
    lowquantile = CI.alpha/2; highquantile = 1-CI.alpha/2 
    
    #Sampling phase
    #Metropolis
    ###########################################################################################################################################
    for (i in (adpstop2+1):maxiter) {  
      
      for (k in 1:nrep) {
        if (count[k]>=(nimprob.X)){break}
        
        if (dim==1){Y[,k] = X[,k] + rnorm(1, sd = sqrt(propSigma))} else{Y[,k] = X[,k] + t(rmvtnorm(1, mu = rep(0,dim), sigma = propSigma))}  # propose a value  
        gY[k]<-g(Y[,k])
        if  (gX[k]==-Inf) {A=0; count[k]=count[k]+1} else {A = min(0,(gY[k]-gX[k])); count[k]=0}  # for accept/reject
        U = runif(1)  # for accept/reject
        if (log(U) < A) {
          X[,k] <- Y[,k]
          numaccept[i] <- numaccept[i]+1
          gX[k] <-gY[k]
        } # accept proposal
        
      }
      
      if (count[k]>=(nimprob.X)){break}
      
      for (j in 1:dim){
        expression = paste("Xveclistdim",j, "[(i-adpstop2),]<-", "t(X[j,])", sep = "")
        eval(parse(text = expression ))
      }
      
      # Output progress report.
      if ((i %% displayfreq) == 0)
        cat (" ...",i);
      
      if ((i > (adpstop2+B))&(((i-adpstop2-B) %% batchwidth) == 0)){
        
        discard = adpstop2+floor((i-adpstop2)/2) # iteration number up to which chain values are discarded for Gelman-Rubin diagnostics   
        samplesize = i-discard # sample size for Gelman-Rubin diagnostics
        newindex = ceiling((i-adpstop2-B)/batchwidth) 
        
        for (j in 1:dim){
          
          for (k in 1:nrep) {
            qlowerwithin[newindex,j,k] = quantile(retrieve("Xveclistdim", j)[((discard+1-adpstop2):(i-adpstop2)),k], probs = lowquantile)
            qupperwithin[newindex,j,k] = quantile(retrieve("Xveclistdim", j)[((discard+1-adpstop2):(i-adpstop2)),k], probs = highquantile)
            avglist[newindex,j,k] = sum(retrieve("Xveclistdim", j)[((discard+1-adpstop2):(i-adpstop2)),k])/(samplesize)
          }            
          
          qlowertotal[newindex,j] = quantile(retrieve("Xveclistdim", j)[((discard+1-adpstop2):(i-adpstop2)),], probs=lowquantile) 
          quppertotal[newindex,j] = quantile(retrieve("Xveclistdim", j)[((discard+1-adpstop2):(i-adpstop2)),], probs=highquantile) 
          
          #Estimates of R with (1-CI.alpha)% confidence interval (Broooks-Gelman)
          Rci[newindex,j] = (quppertotal[newindex,j]-qlowertotal[newindex,j])/(mean(qupperwithin[newindex,j,]-qlowerwithin[newindex,j,]))
          
          xbartotal[newindex,j] = mean(retrieve("Xveclistdim", j)[((discard+1-adpstop2):(i-adpstop2)),])
          s2[newindex,j,] = colSums((retrieve("Xveclistdim", j)[((discard+1-adpstop2):(i-adpstop2)),] - avglist[newindex,j,])^2)/(samplesize-1)
          xbar[newindex,j,]=colMeans(retrieve("Xveclistdim", j)[((discard+1-adpstop2):(i-adpstop2)),])
          varwi[newindex,j] = sum((retrieve("Xveclistdim", j)[((discard+1-adpstop2):(i-adpstop2)),] - avglist[newindex,j,])^2)/(nrep*(samplesize-1))
          varbt[newindex,j] = sum((avglist[newindex,j,]-xbartotal[newindex,j])^2)/(nrep-1)    
          vhat[newindex,j]=varwi[newindex,j]*(samplesize-1)/(samplesize)+varbt[newindex,j]+varbt[newindex,j]/nrep
          
          #Estimates of R; Gelman-Rubin diagnostics
          Rvar[newindex,j] = vhat[newindex,j]/varwi[newindex,j]
          
          vhatvar[newindex,j]=(((samplesize-1)/samplesize)^2)*var(s2[newindex,j,])/nrep+(((nrep+1)/nrep)^2)*2*(varbt[newindex,j]^2)/(nrep-1)+
            2*(nrep+1)*(samplesize-1)/((nrep^2)*samplesize)*(cov(s2[newindex,j,],xbar[newindex,j,]^2)-2*xbartotal[newindex,j]*cov(s2[newindex,j,],xbar[newindex,j,]))
          d[newindex,j]= 2*vhat[newindex,j]^2/vhatvar[newindex,j]
          
          #Estimates of R; Gelman-Rubin diagnostics; adjusted for sampling variance
          Rvaradj[newindex,j]=Rvar[newindex,j]*(d[newindex,j]+3)/(d[newindex,j]+1)
          
        }
        # If `minR'< R <`maxR', stop the chain
        if (((max(Rci[newindex,],Rvaradj[newindex,]))<maxR)&((min(Rci[newindex,],Rvaradj[newindex,]))>minR)) {(chainstop = i) & (break)}    
        
      }  
      
    }
    
    if (count[k]>=(nimprob.X)){stop("past ",nimprob.X ," X values of ",k,"th replicative chain all have probabilities of 0 under the stationary distribution", sep="")}
    if (i==maxiter){stop("MCMC chain ran for ", maxiter, " iterations but didn't achieve convergence", sep="")}
    cat("...End of sampling phase...End of MCMC run","\n","\n")
    ###########################################################################################################################################
    if (plot==T){
      Xveclisttmp=Xveclistdim1[(1:(chainstop-adpstop2)),1]
      if (dim>1){
        for (j in 2:dim){
          expression =paste("Xveclisttmp=cbind(Xveclisttmp,Xveclistdim",j, "[(1:(chainstop-adpstop2)),1])", sep = "")
          eval(parse(text = expression))
        }
      }
      xveclist<-matrix(0,chainstop,1)
      xveclist[1:adpstop2]<-Xveclistbase[,1]
      if (dim==1){xveclist[(adpstop2+1):chainstop]<-Xveclisttmp} else {xveclist[(adpstop2+1):chainstop]<-Xveclisttmp[,1]}
      
      Xveclisttmp <- xveclist[seq(m,length(xveclist),by=m)]
      par(mfrow=c(1,1))
      plot((seq(m, adpstop1, by = m)), Xveclisttmp[(1:(adpstop1/m))], type="l", main=title, xlab = "iteration", ylab = " ", col="purple", xlim=c(1,(chainstop)),ylim=c(min(Xveclisttmp[1:(chainstop/m)]),max(Xveclisttmp[1:(chainstop/m)])))
      lines((seq((adpstop1), transientstop, by = m)),Xveclisttmp[(adpstop1/m):(transientstop/m)], type="l", col="orange")
      lines((seq((transientstop), adpstop2, by = m)),Xveclisttmp[(transientstop/m):(adpstop2/m)], type="l", col="red")
      lines((seq((adpstop2), discard, by = m)),Xveclisttmp[(adpstop2/m):(discard/m)], type="l", col="blue")
      lines((seq((discard), chainstop, by = m)),Xveclisttmp[(discard/m):(chainstop/m)], type="l", col="green")
    }
    
    runtime = proc.time() - ptm
    names(runtime) <- NULL 
    
    #summary statistics
    ###########################################################################################################################################
    #iteration numbers to show when each phase has ended
    sumchain<-c(adpstop1, transientstop, adpstop2, discard, chainstop)
    
    #estimates of the expected values of X and functional X 
    #uses all points from all replication chains in the 2nd half of sampling phase
    means = functionalmeans = rep(0,dim)
    totalmeans=rep(0,dim)
    for (j in 1:dim){
      means[j]<-mean(retrieve("Xveclistdim", j)[((discard+1-adpstop2):(chainstop-adpstop2)),])
      functionalmeans[j]<-mean(functional(retrieve("Xveclistdim", j)[((discard+1-adpstop2):(chainstop-adpstop2)),]))
    }
    
    #acceptance rate
    acceptrate = sum(numaccept[(discard+1):chainstop])/(nrep*(chainstop-discard))
    
    lst = list(Xveclistdim1 = Xveclistdim1)
    if (dim>1){
      for (j in 2:dim){
        expression =paste('lst = c(lst, list(Xveclistdim',j,'=Xveclistdim',j,'))', sep = '')
        eval(parse(text = expression))
      }
    }
    
    output = c(lst, list(Xveclistbase=Xveclistbase, dim=dim, batchwidth=batchwidth, means=means, functionalmeans=functionalmeans, sumchain=sumchain, acceptrate=acceptrate, runtime=runtime[3]))
    summary = list(estimates = output$means, estimates_of_functional = output$functionalmeans, acceptance_rate = output$acceptrate,
                   time_elapsed=output$runtime, phase_length=output$sumchain)
    print( summary )
    return( output )
    
  }
  
  else{
    
    ptm <- proc.time() 
    
    X0=as.matrix(X0)
    support=as.matrix(support)
    if (length(dim)!=1){stop("Argument dim should be scalar", sep="")}
    if ((dim(X0)[1]!=dim)|(dim(X0)[2]!=mrep)){stop("X0 should be ",dim," x ",mrep," matrix", sep="")}
    if ((dim(support)[1]!=dim)|(dim(support)[2]!=2)){stop("support should be ",dim," x ",2," matrix", sep="")}
    if (length(maxiter)!=1){stop("Argument maxiter should be scalar", sep="")}
    if (length(mult)!=1){stop("Argument mult should be scalar", sep="")}
    if (length(mrep)!=1){stop("Argument mrep should be scalar", sep="")}
    if (length(nrep)!=1){stop("Argument nrep should be scalar", sep="")}
    if (length(batchwidth)!=1){stop("Argument batchwidth should be scalar", sep="")}
    if (length(holdup)!=1){stop("Argument holdup should be scalar", sep="")}
    if (length(batchwidth.adp1)!=1){stop("Argument batchwidth.adp1 should be scalar", sep="")}
    if (length(scale.adj)!=1){stop("Argument scale.adj should be scalar", sep="")}
    if (length(endbatch.adp1)!=1){stop("Argument endbatch.adp1 should be scalar", sep="")}
    if (length(minaccpt)!=1){stop("Argument minaccpt should be scalar", sep="")}
    if (length(maxaccpt)!=1){stop("Argument maxaccpt should be scalar", sep="")}
    if (length(nreg)!=1){stop("Argument nreg should be scalar", sep="")}
    if (length(startdist)!=1){stop("Argument startdist should be scalar", sep="")}
    if (length(minR)!=1){stop("Argument minR should be scalar", sep="")}
    if (length(maxR)!=1){stop("Argument maxR should be scalar", sep="")}
    if (length(CI.alpha)!=1){stop("Argument CI.alpha should be scalar", sep="")}
    if (length(nimprob.X)!=1){stop("Argument nimprob.X should be scalar", sep="")}
    if (length(minaccpt.adp2)!=1){stop("Argument minaccpt.adp2 should be scalar", sep="")}
    if (length(batchwidth.adp2)!=1){stop("Argument batchwidth.adp2 should be scalar", sep="")}
    if (length(jumpprob)!=1){stop("Argument jumpprob should be scalar", sep="")}
    if (length(displayfreq)!=1){stop("Argument displayfreq should be scalar", sep="")}
    if ((plot!=F)&(plot!=T)) {stop("Argument plot should be either T or F", sep="")}
    if (length(m)!=1){stop("Argument m should be scalar", sep="")}
    if (((batchwidth/2) %% m)!=0){stop("m=",m, " has to be a factor of batchwidth/2=",batchwidth/2, sep="")}
    
    title <- bquote(bold(paste("Trace Plot coordinate 1/ mode 1")))
    
    X=X0    
    B = holdup*batchwidth  # hold-up before applying Gelman-Rubin diagnostics  for the sampling phase
    Xveclist = array(0,dim=c(maxiter,dim,mrep))  # for keeping track of chain values in 1st adaption, transient, and 2nd adaption phase
    
    transient = matrix(0,maxiter/batchwidth, dim) # for checking whether the chain values are trending
    tempacceptrate = array(0,dim=c(maxiter/batchwidth.adp1,dim,mrep)) # for keeping track of local acceptance rate (batchwidth=batchwidthadp1) for 1st adaption phase
    
    Y = X
    gX = gY = rep(0,mrep)
    for (k in 1:mrep) {
      gX[k]<-g(X[,k])
      if (gX[k]==-Inf){break}
    }
    if (gX[k]==-Inf){stop("X0 from ",k,"th column has probability of 0 under the stationary distribution. Please choose some other initial value for ",k,"th column of X0.", sep="")}
    
    adpstop1=transientstop=adpstop2=rep(0,mrep)
    propSigma = matrix(exp(0),dim,mrep) # std.dev. of proposal distribution for each coordinate (1st adaption phase)
    
    # 1st adaption phase
    # Adaptive Metropois within Gibbs
    ###########################################################################################################################################
    for (k in 1:mrep) {
      
      count=0  #count of consecutive pi(X)==0
      unimult = rep(0,dim) # for keeping track of local log of std.dev. of proposal distribution (1st adaption phase)
      uninumaccept = matrix( rep(0,maxiter*dim), ncol=dim) # for keeping track of number of accepts (1st adaption phase)
      batchwidthadp1 = batchwidth.adp1
      
      cat("Running a MCMC (chain No.",k,") for", maxiter, "iterations","\n");
      
      for (i in 1:maxiter) {
        
        if (count>=(dim*nimprob.X)){break} 
        
        # Output progress report.
        if ((i %% displayfreq) == 0)
          cat (" ...",i);
        
        l=i
        # adjust std.dev. for each coordinate based on the local acceptance rate(batchwidth=batchwidthadp1)  
        if ((l>1)&((l %% batchwidthadp1) == 1)){
          newindex = ceiling((i-1)/batchwidthadp1)
          for (j in 1:dim) {
            tempacceptrate[newindex,j,k] = sum(uninumaccept[((i-batchwidthadp1):(i-1)),j])/(batchwidthadp1)
          }
          if ((min(tempacceptrate[newindex,,k])>minaccpt)&(max(tempacceptrate[newindex,,k])<maxaccpt)){
            
            l = batchwidthadp1+1
            batchwidthadp1 = 2*batchwidthadp1
            if (batchwidthadp1 == batchwidth.adp1*2^2){
              adpstop1[k] = i-1 
              cat("...End of 1st adaption phase (chain No.",k,")","\n")
              break
            }
          }
          else {
            for (j in 1:dim) {
              if (tempacceptrate[newindex,j,k] >0.44) {unimult[j] = unimult[j]+scale.adj}
              else {unimult[j] = unimult[j]-scale.adj}
              propSigma[j,k] = exp(unimult[j])
              
            }
          }
        }
        
        for (coord in 1:dim) {
          Y[coord,k] = X[coord,k] + rnorm(1,0,propSigma[coord,k])  # propose a value
          gY[k] = g(Y[,k])
          U = runif(1)  # for accept/reject
          if (gX[k]==-Inf){A=0; count=count+1} else {A = min(0,(gY[k]-gX[k])); count=0}  # for accept/reject
          if (log (U) < A ) {
            gX[k] <- gY[k]
            X[coord,k] <- Y[coord,k]
            uninumaccept[i,coord] <- uninumaccept[i,coord]+1 
          }  # accept proposal
          else {Y[coord,k] <- X[coord,k]}  
        }
        
        Xveclist[i, ,k] = X[,k] #save X value
        
      }
      
      if (count>=(dim*nimprob.X)){break}   
      if (i==maxiter){break}
      if (plot==T){
        if (k == 1){
          Xveclisttmp <- Xveclist[,1,k]
          Xveclisttmp <- Xveclisttmp[seq(m,adpstop1[k],by=m)]
          par(mfrow=c(1,1))
          plot((seq(m, adpstop1[k], by = m)), Xveclisttmp[(1:(adpstop1[k]/m))], type="l", main=title, xlab = "iteration", ylab = " ", col="purple", xlim=c(1,(adpstop1[k])),ylim=c(min(Xveclisttmp[1:(adpstop1[k]/m)]),max(Xveclisttmp[1:(adpstop1[k]/m)])))
        }
      }
      ###########################################################################################################################################
      
      pvalue = rep(0,dim)
      # Transient phase 
      # Metropolis within Gibbs
      ###########################################################################################################################################
      for (i in (adpstop1[k]+1):maxiter) {  
        
        if (count>=(dim*nimprob.X)){break} 
        
        for (coord in 1:dim) {
          Y[coord,k] = X[coord,k] + rnorm(1,0,propSigma[coord,k])  # propose a value
          gY[k] = g(Y[,k])
          U = runif(1)  # for accept/reject
          if (gX[k]==-Inf){A=0; count=count+1} else {A = min(0,(gY[k]-gX[k])); count=0}  # for accept/reject
          if (log (U) < A ) {
            gX[k] <- gY[k]
            X[coord,k] <- Y[coord,k]
            uninumaccept[i,coord] <- uninumaccept[i,coord]+1 
          }  # accept proposal
          else {Y[coord,k] <- X[coord,k]}  
        }
        
        Xveclist[i, ,k] = X[,k]; #save X value
        
        # Output progress report.
        if ((i %% displayfreq) == 0)
          cat (" ...",i);
        
        if (((i-adpstop1[k]) %% batchwidth) == 0){
          newindex = ceiling(i/batchwidth)
          for (j in 1:dim){
            transient[newindex,j]=mean(Xveclist[((i-batchwidth+1):i),j,k]) # calculate average X value
            # fit a simple linear regression to X values & check p-values for slope=0
            if ((ceiling((i-adpstop1[k])/batchwidth)) > (nreg-1)){
              response = transient[((newindex-(nreg-1)):newindex),j]
              explanatory = c(1:nreg)
              fit = lm(response~explanatory)
              pvalue[j] = summary(fit)$coefficients[2,4]
            }
          }    
          # if no rejection of slope=0 at 10% significance level for every coordinate of X
          # stop the transient phase
          if (min(pvalue)>0.1){
            transientstop[k] = i
            cat("...End of transient phase (chain No.",k,")","\n")
            break
          }
        }
        
      }
      
      if (count>=(dim*nimprob.X)){break} 
      if (i==maxiter){break}
      if (plot==T){
        if (k == 1){
          Xveclisttmp <- Xveclist[,1,k]
          Xveclisttmp <- Xveclisttmp[seq(m,transientstop[k],by=m)]
          par(mfrow=c(1,1))
          plot((seq(m, adpstop1[k], by = m)), Xveclisttmp[(1:(adpstop1[k]/m))], type="l", main=title, xlab = "iteration", ylab = " ", col="purple", xlim=c(1,(transientstop[k])),ylim=c(min(Xveclisttmp[1:(transientstop[k]/m)]),max(Xveclisttmp[1:(transientstop[k]/m)])))
          lines((seq((adpstop1[k]), transientstop[k], by = m)),Xveclisttmp[(adpstop1[k]/m):(transientstop[k]/m)], type="l", col="orange")
        }
      }
    }
    
    if (count>=(dim*nimprob.X)){stop("past ",dim*nimprob.X ," X values of ",k,"th chain all have probabilities of 0 under the stationary distribution", sep="")}
    if (i==maxiter){stop(k, "th chain ran for ", maxiter, " iterations but didn't achieve convergence", sep="")}
    
    ###########################################################################################################################################
    
    #Leave only one chain for each unique local mode
    ###########################################################################################################################################
    test1=matrix(1,mrep,mrep)
    for (j in 1:mrep) {
      for (k in 2:mrep){
        if (k>j)
          for (l in 1:dim){
            if (abs(mean(Xveclist[((transientstop[j]-batchwidth*nreg+1):transientstop[j]),l,j])-mean(Xveclist[((transientstop[k]-batchwidth*nreg+1):transientstop[k]),l,k]))> min (sd(Xveclist[((transientstop[j]-batchwidth*nreg+1):transientstop[j]),l,j]), sd(Xveclist[((transientstop[k]-batchwidth*nreg+1):transientstop[k]),l,k])))
              test1[k,j]=test1[j,k]=0
          }
      }
    }
    
    test2=rep(1,mrep)
    for (j in 2:mrep){
      for (k in 1:(j-1)){
        if (test1[j,k]==1)
          test2[j]=0
      }
    }
    
    k=0
    for (j in 1:(mrep-1)) {
      l=mrep-(j-1)
      if(test2[l]==0){
        k <-k+1
        Xveclist<- array(Xveclist[,,-l], dim=c(maxiter,dim,(mrep-k)))
        propSigma<- array(propSigma[,-l], dim=c(dim,(mrep-k)))
        adpstop1<- array(adpstop1[-l], dim=(mrep-k))
        transientstop<- array(transientstop[-l], dim=(mrep-k))
        adpstop2<- array(adpstop2[-l], dim=(mrep-k))
      }    
    }
    
    nummodes_temp=length(adpstop1)
    if (nummodes_temp==1) {stop(mrep, " chains only found one mode. Please run again with different X0 and/or mrep, or run with multimodal=F.", sep="")}
    ###########################################################################################################################################
    
    numaccept = matrix(0,maxiter,nummodes_temp) # for keeping track of number of accepts (2nd adaption phase and sampling phase)
    mvpropSigma=array(0,dim=c(dim,dim,nummodes_temp))
    
    X = matrix(0,dim,nummodes_temp)
    for (k in (1:nummodes_temp)) {
      X[,k] = Xveclist[transientstop[k],,k]
    }
    gX = gY = rep(0,nummodes_temp)
    for (k in 1:nummodes_temp) {
      gX[k]<-g(X[,k])
    }
    
    # 2nd adaption phase
    # Adaptive Metropolis 
    ###########################################################################################################################################
    for (k in (1:nummodes_temp)) {
      
      count=0
      transientstop_temp=transientstop[k]
      sqrdjump = matrix( rep(0,(maxiter-transientstop_temp)*dim), ncol=dim) # for keeping track of squared jumping distance
      avgsqrdjump = matrix(0,(maxiter-transientstop_temp)/batchwidth, dim) # for keeping track of average squared jumping distance
      
      cat("Running a MCMC (chain No.",k,") for", maxiter-transientstop_temp, "iterations","\n");
      
      pvalue = rep(0,dim)
      
      for (i in (transientstop_temp+1):maxiter) {
        
        if (count>=(nimprob.X)){break}
        
        if (dim==1){covsofar = var( Xveclist[((transientstop_temp-nreg*batchwidth+1):(i-1)),,k] )} else {covsofar = cov( Xveclist[((transientstop_temp-nreg*batchwidth+1):(i-1)),,k] )}
        mvpropSigma[,,k] = mult * covsofar
        
        Temp = X[,k]
        
        if (dim==1){Y[,k] = X[,k] + rnorm(1, sd = sqrt(mvpropSigma[,,k]))} else{Y[,k] = X[,k] + t(rmvtnorm(1, mu = rep(0,dim), sigma = mvpropSigma[,,k]))}  # propose a value
        gY[k] = g(Y[,k])
        U = runif(1)  # for accept/reject
        if (gX[k]==-Inf){A=0; count=count+1} else {A = min(0,(gY[k]-gX[k])); count=0}  # for accept/reject
        if (log (U) < A ) {
          gX[k] <- gY[k]
          X[,k] <- Y[,k]
          numaccept[i,k] <- numaccept[i,k]+1 
        }	# accept proposal  
        
        sqrdjump[(i-transientstop_temp),] = (Temp - X[,k])^2  #calculate squared jumping distance
        Xveclist[i, ,k] = X[,k]; #save X value
        
        # Output progress report.
        if ((i %% displayfreq) == 0)
          cat (" ...",i);
        
        if (((i-transientstop_temp) %% batchwidth) == 0){
          newindex = ceiling((i-transientstop_temp)/batchwidth)
          for (j in 1:dim){
            avgsqrdjump[newindex,j]=mean(sqrdjump[((i-transientstop_temp-batchwidth+1):(i-transientstop_temp)),j]) # calculate average squared jumping distance
            # fit a simple linear regression to average jumping distances & check p-values for the slope=0
            if ((ceiling((i-transientstop_temp)/batchwidth)) > (nreg-1)){
              response = avgsqrdjump[((newindex-(nreg-1)):newindex),j]
              explanatory = c(1:nreg)
              fit = lm(response~explanatory)
              pvalue[j] = summary(fit)$coefficients[2,4]
            }
          }    
          # if no rejection of slope=0 at 10% significance level for every coordinate of X
          # stop the adaption
          if (min(pvalue)>0.1){
            adpstop2[k] = i
            cat("...End of 2nd adaption phase (chain No.",k,")","\n")
            break
          }
        }
        
        if ((i-transientstop_temp)==batchwidth.adp2){
          if(sum(numaccept[((transientstop_temp+1):(transientstop_temp+batchwidth.adp2)),k])<(minaccpt.adp2*batchwidth.adp2)){
            X[,k] = Xveclist[transientstop_temp,,k]
            mult=mult/max(2,dim)
            i = transientstop_temp
            numaccept[((transientstop_temp+1):(transientstop_temp+batchwidth.adp2)),k] <-rep(0,length(numaccept[((transientstop_temp+1):(transientstop_temp+batchwidth.adp2)),k]))
          }
        }
        
      } 
      
      if (count>=(nimprob.X)){break}
      if (i==maxiter){break}
      if (plot==T){
        if (k==1){
          Xveclisttmp <- Xveclist[,1,k]
          Xveclisttmp <- Xveclisttmp[seq(m,adpstop2[k],by=m)]
          par(mfrow=c(1,1))
          plot((seq(m, adpstop1[k], by = m)), Xveclisttmp[(1:(adpstop1[k]/m))], type="l", main=title, xlab = "iteration", ylab = " ", col="purple", xlim=c(1,(adpstop2[k])),ylim=c(min(Xveclisttmp[1:(adpstop2[k]/m)]),max(Xveclisttmp[1:(adpstop2[k]/m)])))
          lines((seq((adpstop1[k]), transientstop[k], by = m)),Xveclisttmp[(adpstop1[k]/m):(transientstop[k]/m)], type="l", col="orange")
          lines((seq((transientstop[k]), adpstop2[k], by = m)),Xveclisttmp[(transientstop[k]/m):(adpstop2[k]/m)], type="l", col="red")
        }
      }
    }
    
    if (count>=(nimprob.X)){stop("past ",nimprob.X ," X values of ",k,"th chain all have probabilities of 0 under the stationary distribution", sep="")}
    if (i==maxiter){stop(k, "th chain ran for ", maxiter, " iterations but didn't achieve convergence", sep="")}
    ###########################################################################################################################################
    
    #Check if each chain still has unique mode, and if not, then leave only one chain for each unique local mode 
    ###########################################################################################################################################
    test1_n = matrix(1,nummodes_temp,nummodes_temp)
    for (i in 1:nummodes_temp) {
      for (k in 2:nummodes_temp){
        if (k>i)
          for (l in 1:dim){
            if (abs(mean(Xveclist[((transientstop[i]+1):adpstop2[i]),l,i])-mean(Xveclist[((transientstop[i]+1):adpstop2[i]),l,k]))> min (sd(Xveclist[((transientstop[i]+1):adpstop2[i]),l,i]), sd(Xveclist[((transientstop[k]+1):adpstop2[k]),l,k])))
              test1_n[k,i]=test1_n[i,k]=0
          }
      }
    }
    
    test2_n=rep(1,nummodes_temp)
    for (j in 2:nummodes_temp){
      for (k in 1:(j-1)){
        if (test1_n[j,k]==1)
          test2_n[j]=0
      }
    }
    
    k=0
    for (j in 1:(nummodes_temp-1)) {
      l=nummodes_temp-(j-1)
      if(test2_n[l]==0){
        k <-k+1
        Xveclist<- array(Xveclist[,,-l], dim=c(maxiter,dim,(nummodes_temp-k)))
        propSigma<- array(propSigma[,-l], dim=c(dim,(nummodes_temp-k)))
        numaccept<- array(numaccept[,-l], dim=c(maxiter,(nummodes_temp-k)))
        adpstop1<- array(adpstop1[-l], dim=(nummodes_temp-k))
        transientstop<- array(transientstop[-l], dim=(nummodes_temp-k))
        adpstop2<- array(adpstop2[-l], dim=(nummodes_temp-k))
      }    
    }
    ###########################################################################################################################################
    
    nummodes = length(adpstop1)
    if (nummodes==1) {stop(mrep, " chains only found one mode. Please run again with different X0 and/or mrep, or run with multimodal=F.", sep="")}
    adpstop2_max = max(adpstop2)
    
    # Find the over-disposed starting distribution : startdist * (max X -min X from the 2nd adaption phase)
    ###########################################################################################################################################
    count=rep(0,nrep)
    a = matrix(0,dim,2)
    b = array(0,dim=c(dim,nummodes,2))
    starttemp = matrix(0,nrep,dim)
    startdist = (startdist-1)/2
    
    for (l in 1:nummodes){
      for (k in 1:dim){
        b[k,l,1] = min(Xveclist[((transientstop[l]+1):adpstop2[l]),k,l]) 
        b[k,l,2] = max(Xveclist[((transientstop[l]+1):adpstop2[l]),k,l])
      }
    }
    
    for (j in 1:nummodes){
      starttemp[j,] = Xveclist[adpstop2[j],,j]
    }
    
    for (j in (nummodes+1):nrep){
      l=sample(c(1:nummodes),1)
      for (k in 1:dim){
        dist = b[k,l,2]-b[k,l,1]  
        starttemp[j,k] = runif(1,  max(support[k,1],(b[k,l,1]-startdist*dist)), min(support[k,2],(b[k,l,2]+startdist*dist)))
      }
    }
    ###########################################################################################################################################
    
    X = t(starttemp)  # starting values for `nrep' replication chains
    Y = matrix(0,dim,nrep)
    
    gX = rep(0,nrep)
    for (k in 1:nrep) {
      gX[k]<-g(X[,k])
    }
    gY = rep(0,nrep)
    
    qlowerwithin = qupperwithin = avglist = s2 = xbar = array(0,dim=c(ceiling((maxiter-adpstop2_max-B)/batchwidth),dim,nrep))
    qlowertotal = quppertotal = xbartotal = varwi = varbt = vhat = Rvar = vhatvar = d = Rvaradj = Rci = matrix(0,ceiling((maxiter-adpstop2_max-B)/batchwidth),dim)
    
    Xveclistbase<-Xveclist[(1:adpstop2_max),,]
    
    Xveclistdim1=matrix(0,(maxiter-adpstop2),nrep)
    if (dim>1){
      for (j in 1:dim){
        dummy = matrix(0,(maxiter-adpstop2_max),nrep)
        assign( paste("Xveclistdim", j, sep=""), dummy )
      }
    }
    
    localm =locals= matrix(0,dim,nummodes)
    dt = matrix(0,dim,nummodes)
    D=rep(0,nummodes)
    for (i in 1:nummodes){
      for (l in 1:dim){
        localm[l,i] = mean(Xveclist[((transientstop[i]+1):adpstop2[i]),l,i])
        locals[l,i] = sd(Xveclist[((transientstop[i]+1):adpstop2[i]),l,i]) 
      }
    }
    
    Dstar = rep(0,nrep)
    for (k in 1:nrep) {   
      for (q in 1:nummodes){
        for (p in 1:dim){
          dt[p,q] = abs(X[p,k]-localm[p,q])/locals[p,q]
          D[q] = max(dt[,q])
          Dstar[k]= which.min(D)
        }
      }  
    }  
    
    lowquantile = CI.alpha/2; highquantile = 1-CI.alpha/2 
    
    #Sampling phase
    #Metropolis
    ###########################################################################################################################################
    cat("Running a MCMC for", maxiter-adpstop2_max, "iterations","\n");
    
    for (i in (adpstop2_max+1):maxiter) {  
      
      for (k in 1:nrep) { 
        
        if (count[k]>=(nimprob.X)){break}
        
        Ustar = runif(1)
        if (Ustar<jumpprob) {
          sp = sample(c(1:nummodes)[-Dstar[k]],1)
          Y[,k] = (X[,k]-localm[,Dstar[k]])*locals[,sp]/(locals[,Dstar[k]]) + localm[,sp]
          for (q in 1:nummodes){
            for (p in 1:dim){
              dt[p,q] = abs(Y[p,k]-localm[p,q])/locals[p,q]
              D[q] = max(dt[,q])
              Dstar_new= which.min(D)
            }
          } 
          if (Dstar_new ==sp){
            gY[k]<-g(Y[,k])
            if (gX[k]==-Inf){A=0; count[k]=count[k]+1} else {A = min(0,(gY[k]-gX[k])); count[k]=0}  # for accept/reject
            U = runif(1)
            if (log(U) < A) {
              X[,k] <- Y[,k]
              numaccept[i,1] <- numaccept[i,1]+1
              gX[k] <-gY[k]
              Dstar[k]=Dstar_new
            } # accept proposal
          }
        }
        else{
          if (dim==1){Y[,k] = X[,k] + rnorm(1, sd = mvpropSigma[,,Dstar[k]])} else{Y[,k] = X[,k] + t(rmvtnorm(1, mu = rep(0,dim), sigma = mvpropSigma[,,Dstar[k]]))}  # propose a value  
          for (q in 1:nummodes){
            for (p in 1:dim){
              dt[p,q] = abs(Y[p,k]-localm[p,q])/locals[p,q]
              D[q] = max(dt[,q])
              Dstar_new= which.min(D)
            }
          } 
          if (Dstar_new ==Dstar[k]){
            gY[k]<-g(Y[,k])
            if (gX[k]==-Inf){A=0; count[k]=count[k]+1} else {A = min(0,(gY[k]-gX[k])); count[k]=0}  # for accept/reject
            U = runif(1)  # for accept/reject
            if (log(U) < A) {
              X[,k] <- Y[,k]
              numaccept[i,1] <- numaccept[i,1]+1
              gX[k] <-gY[k]
              Dstar[k]=Dstar_new
            } # accept proposal 
          }
        }
        
      }
      
      if (count[k]>=(nimprob.X)){break}
      
      for (j in 1:dim){
        expression = paste("Xveclistdim",j, "[(i-adpstop2_max),]<-", "t(X[j,])", sep = "")
        eval(parse(text = expression ))
      }
      
      # Output progress report.
      if ((i %% displayfreq) == 0)
        cat (" ...",i);
      
      if ((i > (adpstop2_max+B))&(((i-adpstop2_max-B) %% batchwidth) == 0)){
        
        discard = adpstop2_max+floor((i-adpstop2_max)/2) # iteration number up to which chain values are discarded for Gelman-Rubin diagnostics     
        samplesize = i-discard # sample size for Gelman-Rubin diagnostics
        newindex = ceiling((i-adpstop2_max-B)/batchwidth) 
        
        for (j in 1:dim){
          
          for (k in 1:nrep) {
            qlowerwithin[newindex,j,k] = quantile(retrieve("Xveclistdim", j)[((discard+1-adpstop2_max):(i-adpstop2_max)),k], probs = lowquantile)
            qupperwithin[newindex,j,k] = quantile(retrieve("Xveclistdim", j)[((discard+1-adpstop2_max):(i-adpstop2_max)),k], probs = highquantile)
            avglist[newindex,j,k] = sum(retrieve("Xveclistdim", j)[((discard+1-adpstop2_max):(i-adpstop2_max)),k])/(samplesize)
          }            
          
          qlowertotal[newindex,j] = quantile(retrieve("Xveclistdim", j)[((discard+1-adpstop2_max):(i-adpstop2_max)),], probs=lowquantile) 
          quppertotal[newindex,j] = quantile(retrieve("Xveclistdim", j)[((discard+1-adpstop2_max):(i-adpstop2_max)),], probs=highquantile) 
          
          #Estimates of R with (1-CI.alpha)% confidence interval (Broooks-Gelman)
          Rci[newindex,j] = (quppertotal[newindex,j]-qlowertotal[newindex,j])/(mean(qupperwithin[newindex,j,]-qlowerwithin[newindex,j,]))
          
          xbartotal[newindex,j] = mean(retrieve("Xveclistdim", j)[((discard+1-adpstop2_max):(i-adpstop2_max)),])
          s2[newindex,j,] = colSums((retrieve("Xveclistdim", j)[((discard+1-adpstop2_max):(i-adpstop2_max)),] - avglist[newindex,j,])^2)/(samplesize-1)
          xbar[newindex,j,]=colMeans(retrieve("Xveclistdim", j)[((discard+1-adpstop2_max):(i-adpstop2_max)),])
          varwi[newindex,j] = sum((retrieve("Xveclistdim", j)[((discard+1-adpstop2_max):(i-adpstop2_max)),] - avglist[newindex,j,])^2)/(nrep*(samplesize-1))
          varbt[newindex,j] = sum((avglist[newindex,j,]-xbartotal[newindex,j])^2)/(nrep-1)    
          vhat[newindex,j]=varwi[newindex,j]*(samplesize-1)/(samplesize)+varbt[newindex,j]+varbt[newindex,j]/nrep
          
          #Estimates of R; Gelman-Rubin diagnostics
          Rvar[newindex,j] = vhat[newindex,j]/varwi[newindex,j]
          
          vhatvar[newindex,j]=(((samplesize-1)/samplesize)^2)*var(s2[newindex,j,])/nrep+(((nrep+1)/nrep)^2)*2*(varbt[newindex,j]^2)/(nrep-1)+
            2*(nrep+1)*(samplesize-1)/((nrep^2)*samplesize)*(cov(s2[newindex,j,],xbar[newindex,j,]^2)-2*xbartotal[newindex,j]*cov(s2[newindex,j,],xbar[newindex,j,]))
          d[newindex,j]= 2*vhat[newindex,j]^2/vhatvar[newindex,j]
          
          #Estimates of R; Gelman-Rubin diagnostics; adjusted for sampling variance
          Rvaradj[newindex,j]=Rvar[newindex,j]*(d[newindex,j]+3)/(d[newindex,j]+1)
          
        }
        # If `minR'< R <`maxR', stop the chain
        if (((max(Rci[newindex,],Rvaradj[newindex,]))<maxR)&((min(Rci[newindex,],Rvaradj[newindex,]))>minR)) {(chainstop = i) & (break)}    
        
      }  
      
    }
    
    if (count[k]>=(nimprob.X)){stop("past ",nimprob.X ," X values of ",k,"th replicative chain all have probabilities of 0 under the stationary distribution", sep="")}
    if (i==maxiter){stop("MCMC chain ran for ", maxiter, " iterations but didn't achieve convergence", sep="")}
    cat("...End of sampling phase...End of MCMC run","\n","\n")
    ###########################################################################################################################################
    if (plot==T){
      Xveclisttmp=Xveclistdim1[(1:(chainstop-adpstop2_max)),1]
      if (dim>1){
        for (j in 2:dim){
          expression =paste("Xveclisttmp=cbind(Xveclisttmp,Xveclistdim",j, "[(1:(chainstop-adpstop2_max)),1])", sep = "")
          eval(parse(text = expression))
        }
      }
      xveclist<-matrix(0,chainstop,1)
      if (dim==1){
        xveclist[1:adpstop2[1]]<-Xveclistbase[(1:adpstop2[1]),1]
        xveclist[(adpstop2_max+1):chainstop]<-Xveclisttmp
      } 
      else {
        xveclist[1:adpstop2[1]]<-Xveclistbase[(1:adpstop2[1]),1,1]
        xveclist[(adpstop2_max+1):chainstop]<-Xveclisttmp[,1]
      }
      Xveclisttmp <- xveclist[seq(m,length(xveclist),by=m)]
      ymax.simple = max(max(Xveclisttmp[1:(adpstop2[1]/m)]),max(Xveclisttmp[(adpstop2_max/m+1):(chainstop/m)]))
      ymin.simple = min(min(Xveclisttmp[1:(adpstop2[1]/m)]),min(Xveclisttmp[(adpstop2_max/m+1):(chainstop/m)]))
      par(mfrow=c(1,1))
      plot((seq(m, adpstop1[1], by = m)), Xveclisttmp[(1:(adpstop1[1]/m))], type="l", main=title, xlab = "iteration", ylab = " ", col="purple", xlim=c(1,(chainstop)),ylim=c(ymin.simple, ymax.simple))
      lines((seq((adpstop1[1]), transientstop[1], by = m)),Xveclisttmp[(adpstop1[1]/m):(transientstop[1]/m)], type="l", col="orange")
      lines((seq((transientstop[1]), adpstop2[1], by = m)),Xveclisttmp[(transientstop[1]/m):(adpstop2[1]/m)], type="l", col="red")
      lines((seq((adpstop2_max+m), discard, by = m)),Xveclisttmp[(adpstop2_max/m+1):(discard/m)], type="l", col="blue")
      lines((seq((discard), chainstop, by = m)),Xveclisttmp[(discard/m):(chainstop/m)], type="l", col="green")
    }
    
    runtime = proc.time() - ptm
    names(runtime) <- NULL 
    
    #summary statistics
    ###########################################################################################################################################
    #iteration numbers to show when each phase has ended
    sumchain = matrix(0,6,nummodes)
    for (k in 1:nummodes){
      sumchain[,k]<-c(adpstop1[k], transientstop[k], adpstop2[k], adpstop2_max,discard, chainstop)
    }
    
    #estimates of the expected values of X and functional X 
    #uses all points from all replication chains in the 2nd half of sampling phase
    means = functionalmeans = rep(0,dim)
    for (j in 1:dim){
      means[j]<-mean(retrieve("Xveclistdim", j)[((discard+1-adpstop2_max):(chainstop-adpstop2_max)),])
      functionalmeans[j]<-mean(functional(retrieve("Xveclistdim", j)[((discard+1-adpstop2_max):(chainstop-adpstop2_max)),]))
    }
    
    #acceptance rate
    acceptrate = sum(numaccept[((discard+1):chainstop),1])/(nrep*(chainstop-discard))
    
    lst = list(Xveclistdim1 = Xveclistdim1)
    if (dim>1){
      for (j in 2:dim){
        expression =paste('lst = c(lst, list(Xveclistdim',j,'=Xveclistdim',j,'))', sep = '')
        eval(parse(text = expression))
      }
    }
    
    output = c(lst, list(Xveclistbase=Xveclistbase, nummodes=nummodes, batchwidth=batchwidth, dim=dim, means=means, functionalmeans=functionalmeans, sumchain=sumchain, acceptrate=acceptrate, runtime=runtime[3]))
    summary = list(estimates = output$means, estimates_of_functional = output$functionalmeans, acceptance_rate = output$acceptrate,
                   time_elapsed=output$runtime, phase_length=output$sumchain)
    print( summary )
    return( output )
    
  }
  
}