R/rhierMnlRwMixtureParallel.R

Defines functions rhierMnlRwMixtureParallel

Documented in rhierMnlRwMixtureParallel

#' MCMC Algorithm for Hierarchical Multinomial Logit with Mixture-of-Normals Heterogeneity
#'
#' \code{rhierMnlRwMixtureParallel} implements a MCMC algorithm for a hierarchical multinomial logit model with a mixture of normals heterogeneity distribution.
#'
#' @usage rhierMnlRwMixtureParallel(Data, Prior, Mcmc, verbose = FALSE)
#'
#' @param Data A list containing: `p`- number of choice alternatives, `lgtdata` - A \eqn{nlgt} size list of multinomial logistic data, and optional `Z`- matrix of unit characteristics.
#' @param Prior A list with one required parameter: `ncomp`, and optional parameters: `mubar`, `Amu`, `nu`, `V`, `Ad`, `deltaBar`, and `a`.
#' @param Mcmc A list with one required parameter: 'R' - number of iterations, and optional parameters: `s`, `w`, `keep`, `nprint`, and `drawcomp`.
#' @param verbose If \code{TRUE}, enumerates model parameters and timing information.
#'
#' @details
#' \subsection{Model and Priors}{
#' \eqn{y_i} \eqn{\sim}{~} \eqn{MNL(X_i,\beta_i)} with \eqn{i = 1, \ldots,} length(lgtdata) 
#' and where \eqn{\beta_i} is \eqn{1 \times nvar}
#' 
#' \eqn{\beta_i} = \eqn{Z\Delta}[i,] + \eqn{u_i} \cr
#' Note:  Z\eqn{\Delta} is the matrix Z \eqn{ \times \Delta} and [i,] refers to \eqn{i}th row of this product \cr
#' Delta is an \eqn{nz \times nvar} array 
#' 
#' \eqn{u_i} \eqn{\sim}{~} \eqn{N(\mu_{ind},\Sigma_{ind})} with \eqn{ind} \eqn{\sim}{~} multinomial(pvec)
#' 
#' \eqn{pvec}                \eqn{\sim}{~} dirichlet(a) \cr
#' \eqn{delta = vec(\Delta)} \eqn{\sim}{~} \eqn{N(deltabar, A_d^{-1})} \cr
#' \eqn{\mu_j}               \eqn{\sim}{~} \eqn{N(mubar, \Sigma_j (x) Amu^{-1})} \cr
#' \eqn{\Sigma_j}            \eqn{\sim}{~} \eqn{IW(nu, V)}
#' 
#' Note: \eqn{Z} should NOT include an intercept and is centered for ease of interpretation. 
#' The mean of each of the \code{nlgt} \eqn{\beta}s is the mean of the normal mixture. 
#' Use \code{summary()} to compute this mean from the \code{compdraw} output.\cr
#' 
#' Be careful in assessing prior parameter \code{Amu}: 0.01 is too small for many applications. 
#' See chapter 5 of Rossi et al for full discussion.
#' }
#' \subsection{Argument Details}{
#' \emph{\code{Data  = list(lgtdata, Z, p)} [\code{Z} optional]}
#' \tabular{ll}{
#' \code{lgtdata:        } \tab A \eqn{nlgt} size list of multinominal lgtdata \cr
#' \code{lgtdata[[i]]$y: } \tab \eqn{n_i \times 1} vector of multinomial outcomes (1, \ldots, p) for \eqn{i}th unit\cr
#' \code{lgtdata[[i]]$X: } \tab \eqn{(n_i \times p) \times nvar} design matrix for \eqn{i}th unit \cr
#' \code{Z:              } \tab An \eqn{(nlgt) \times nz} matrix of unit characteristics\cr 
#' \code{p:              } \tab number of choice alternatives 
#' }
#' \emph{\code{Prior = list(a, deltabar, Ad, mubar, Amu, nu, V, a, ncomp)} [all but \code{ncomp} are optional]}
#' \tabular{ll}{
#' \code{a:              } \tab \eqn{ncomp \times 1} vector of Dirichlet prior parameters (def: \code{rep(5,ncomp)}) \cr
#' \code{deltabar:       } \tab \eqn{(nz \times nvar) \times 1} vector of prior means (def: 0) \cr
#' \code{Ad:             } \tab prior precision matrix for vec(D) (def: 0.01*I) \cr
#' \code{mubar:          } \tab \eqn{nvar \times 1} prior mean vector for normal component mean (def: 0 if unrestricted; 2 if restricted) \cr
#' \code{Amu:            } \tab prior precision for normal component mean (def: 0.01 if unrestricted; 0.1 if restricted) \cr
#' \code{nu:             } \tab d.f. parameter for IW prior on normal component Sigma (def: nvar+3 if unrestricted; nvar+15 if restricted) \cr
#' \code{V:              } \tab PDS location parameter for IW prior on normal component Sigma (def: nu*I if unrestricted; nu*D if restricted with d_pp = 4 if unrestricted and d_pp = 0.01 if restricted) \cr
#' \code{ncomp:          } \tab number of components used in normal mixture 
#' }
#' \emph{\code{Mcmc  = list(R, keep, nprint, s, w)} [only \code{R} required]}
#' \tabular{ll}{
#' \code{R:              } \tab number of MCMC draws \cr
#' \code{keep:           } \tab MCMC thinning parameter -- keep every \code{keep}th draw (def: 1) \cr
#' \code{nprint:         } \tab print the estimated time remaining for every \code{nprint}'th draw (def: 100, set to 0 for no print) \cr
#' \code{s:              } \tab scaling parameter for RW Metropolis (def: 2.93/\code{sqrt(nvar)}) \cr
#' \code{w:              } \tab fractional likelihood weighting parameter (def: 0.1)
#' }
#' }
#' 
#' @return A list of sharded partitions where each partition contains the following:
#' \item{compdraw}{A list (length: R/keep) where each list contains `mu` (vector, length: `ncomp`) and `rooti` (matrix, shape: ncomp \eqn{\times} ncomp)}
#' \item{probdraw}{A \eqn{(R/keep) \times (ncomp)} matrix that reports the probability that each draw came from a particular component}
#' \item{Deltadraw}{A \eqn{(R/keep) \times (nz \times nvar)} matrix of draws of Delta, first row is initial value. This could be null if Z is NULL}
#'
#' @references
#' Bumbaca, Federico (Rico), Sanjog Misra, and Peter E. Rossi (2020), "Scalable Target Marketing:
#' Distributed Markov Chain Monte Carlo for Bayesian Hierarchical Models", Journal of Marketing
#' Research, 57(6), 999-1018.
#'
#' Chapter 5, \emph{Bayesian Statistics and Marketing} by Rossi, Allenby, and McCulloch.
#'
#' @author
#' Federico Bumbaca, Leeds School of Business, University of Colorado Boulder, \email{federico.bumbaca@colorado.edu}
#'
#'
#' @seealso 
#' \code{\link{partition_data}}, 
#' \code{\link{drawPosteriorParallel}}, 
#' \code{\link{combine_draws}}, 
#' \code{\link{rheteroMnlIndepMetrop}}
#'
#' @examples
#' 
#' \donttest{
#' 
#' R=500
#' 
#' ######### Single Component with rhierMnlRwMixtureParallel########
#' ##parameters
#' p = 3 # num of choice alterns                            
#' ncoef = 3  
#' nlgt = 1000                          
#' nz = 2
#' Z = matrix(runif(nz*nlgt),ncol=nz)
#' Z = t(t(Z)-apply(Z,2,mean)) # demean Z
#' 
#' ncomp = 1 # no of mixture components
#' Delta = matrix(c(1,0,1,0,1,2),ncol=2)
#' comps = NULL
#' comps[[1]] = list(mu=c(0,2,1),rooti=diag(rep(1,3)))
#' pvec = c(1)
#' 
#' simmnlwX = function(n,X,beta){
#'   k=length(beta)
#'   Xbeta=X %*% beta
#'   j=nrow(Xbeta)/n
#'   Xbeta=matrix(Xbeta,byrow=TRUE,ncol=j)
#'   Prob=exp(Xbeta)
#'   iota=c(rep(1,j))
#'   denom=Prob %*% iota
#'   Prob=Prob/as.vector(denom)
#'   y=vector("double",n)
#'   ind=1:j
#'   for (i in 1:n) { 
#'     yvec = rmultinom(1, 1, Prob[i,])
#'     y[i] = ind%*%yvec
#'   }
#'   return(list(y=y,X=X,beta=beta,prob=Prob))
#' }
#' 
#' ## simulate data
#' simlgtdata=NULL
#' ni=rep(5,nlgt) 
#' for (i in 1:nlgt) 
#' {
#'   if (is.null(Z))
#'   {
#'     betai=as.vector(bayesm::rmixture(1,pvec,comps)$x)
#'   } else {
#'     betai=Delta %*% Z[i,]+as.vector(bayesm::rmixture(1,pvec,comps)$x)
#'   }
#'   Xa=matrix(runif(ni[i]*p,min=-1.5,max=0),ncol=p)
#'   X=bayesm::createX(p,na=1,nd=NULL,Xa=Xa,Xd=NULL,base=1)
#'   outa=simmnlwX(ni[i],X,betai)
#'   simlgtdata[[i]]=list(y=outa$y,X=X,beta=betai)
#' }
#' 
#' ## set MCMC parameters
#' Prior1=list(ncomp=ncomp) 
#' keep=1
#' Mcmc1=list(R=R,keep=keep) 
#' Data1=list(list(p=p,lgtdata=simlgtdata,Z=Z))
#' s = 1
#' Data2 = partition_data(Data1, s=s)
#' 
#' out2 = parallel::mclapply(Data2, FUN = rhierMnlRwMixtureParallel, Prior = Prior1,
#' Mcmc = Mcmc1, mc.cores = s, mc.set.seed = FALSE)
#' 
#' ######### Multiple Components with rhierMnlRwMixtureParallel########
#' ##parameters
#' R = 500
#' p=3 # num of choice alterns                            
#' ncoef=3  
#' nlgt=1000  # num of cross sectional units                         
#' nz=2
#' Z=matrix(runif(nz*nlgt),ncol=nz)
#' Z=t(t(Z)-apply(Z,2,mean)) # demean Z
#' 
#' ncomp=3
#' Delta=matrix(c(1,0,1,0,1,2),ncol=2) 
#' 
#' comps=NULL 
#' comps[[1]]=list(mu=c(0,2,1),rooti=diag(rep(1,3)))
#' comps[[2]]=list(mu=c(1,0,2),rooti=diag(rep(1,3)))
#' comps[[3]]=list(mu=c(2,1,0),rooti=diag(rep(1,3)))
#' pvec=c(.4,.2,.4)
#' 
#' simmnlwX= function(n,X,beta) {
#'   k=length(beta)
#'   Xbeta=X %*% beta
#'   j=nrow(Xbeta)/n
#'   Xbeta=matrix(Xbeta,byrow=TRUE,ncol=j)
#'   Prob=exp(Xbeta)
#'   iota=c(rep(1,j))
#'   denom=Prob %*% iota
#'   Prob=Prob/as.vector(denom)
#'   y=vector("double",n)
#'   ind=1:j
#'   for (i in 1:n) 
#'   {yvec=rmultinom(1,1,Prob[i,]); y[i]=ind %*% yvec}
#'   return(list(y=y,X=X,beta=beta,prob=Prob))
#' }
#' 
#' ## simulate data
#' simlgtdata=NULL
#' ni=rep(5,nlgt) 
#' for (i in 1:nlgt) 
#' {
#'   if (is.null(Z))
#'   {
#'     betai=as.vector(bayesm::rmixture(1,pvec,comps)$x)
#'   } else {
#'     betai=Delta %*% Z[i,]+as.vector(bayesm::rmixture(1,pvec,comps)$x)
#'   }
#'   Xa=matrix(runif(ni[i]*p,min=-1.5,max=0),ncol=p)
#'   X=bayesm::createX(p,na=1,nd=NULL,Xa=Xa,Xd=NULL,base=1)
#'   outa=simmnlwX(ni[i],X,betai) 
#'   simlgtdata[[i]]=list(y=outa$y,X=X,beta=betai)
#' }
#' 
#' ## set parameters for priors and Z
#' Prior1=list(ncomp=ncomp) 
#' keep=1
#' Mcmc1=list(R=R,keep=keep) 
#' Data1=list(list(p=p,lgtdata=simlgtdata,Z=Z))
#' s = 1
#' Data2 = partition_data(Data1,s)
#' 
#' out2 = parallel::mclapply(Data2, FUN = rhierMnlRwMixtureParallel, Prior = Prior1,
#' Mcmc = Mcmc1, mc.cores = s, mc.set.seed = FALSE)
#'}
#'
#'
#' @export
#'

rhierMnlRwMixtureParallel = function(Data,Prior,Mcmc, verbose = FALSE) {
  
  #check if all parameters present
  if(missing(Data) || !is.list(Data)) {stop("Requires Data argument -- list of p,lgtdata, and (possibly) z", .call=FALSE)}
  if(missing(Prior) || !is.list(Prior)) {stop("Requires Prior list argument (at least ncomp)", .call=FALSE)}
  if(missing(Mcmc) || !is.list(Mcmc)) {stop("Requires Mcmc list argument", .call=FALSE)}
  
  # check on data
  if(is.null(Data$p)) {stop("Requires Data element p (# choice alternatives)", .call=FALSE)}
  p = Data$p

  if(is.null(Data$lgtdata)) {stop("Requires Data element lgtdata (list of data for each unit)", .call=FALSE)}

  nlgt=length(Data$lgtdata)

  nvar=ncol(Data$lgtdata[[1]]$X)

  drawdelta=TRUE
  if(is.null(Data$Z)) {
    cat("Z not specified",fill=TRUE); #fsh() ;
    drawdelta=FALSE
  } else {
    if (nrow(Data$Z) != nlgt) {
      stop(paste("Nrow(Z) ",nrow(Z),"ne number logits ",nlgt),.call=FALSE)
    } else {
      z=Data$Z
    }
  }
  if(drawdelta) {
    nz=ncol(z)
#    colmeans=apply(z,2,mean)
#    if(sum(colmeans) > .00001)
#   {
#      stop(paste("z does not appear to be de-meaned: colmeans= ", colmeans), .call=FALSE)
#    }
  }
  
  ny = 0
  nX = 0
  for (i in 1:nlgt) {
    ny = ny + length(Data$lgtdata[[i]]$y)
    nX = nX + nrow(Data$lgtdata[[i]]$X)
  }
  indy = 1
  indX = 1
  ypooled = vector(length=ny)
  Xpooled = matrix(0, nrow=nX, ncol=nvar)
  if(!is.null(Data$lgtdata[[1]]$X)) {oldncol=ncol(Data$lgtdata[[1]]$X)}
  
  for (i in 1:nlgt) {
    if(is.null(Data$lgtdata[[i]]$y)) {stop(paste("Requires element y of lgtdata[[",i,"]]"), .call=FALSE)}
    if(is.null(Data$lgtdata[[i]]$X)) {stop(paste("Requires element X of lgtdata[[",i,"]]"), .call=FALSE)}
    ypooled[indy:(indy+length(Data$lgtdata[[i]]$y)-1)] = Data$lgtdata[[i]]$y
    indy = indy + length(Data$lgtdata[[i]]$y)
    nrowX=nrow(Data$lgtdata[[i]]$X)
    if((nrowX/p) !=length(Data$lgtdata[[i]]$y)) {stop(paste("nrow(X) ne p*length(yi); exception at unit",i), .call=FALSE)}
    newncol=ncol(Data$lgtdata[[i]]$X)
    if(newncol != oldncol) {stop(paste("All X elements must have same # of cols; exception at unit",i), .call=FALSE)}
    Xpooled[indX:(indX+nrow(Data$lgtdata[[i]]$X)-1),] = Data$lgtdata[[i]]$X
    indX = indX + nrow(Data$lgtdata[[i]]$X)
    oldncol=newncol
  }

  levely=as.numeric(levels(as.factor(ypooled)))
  if(length(levely) != p) {stop(paste("y takes on ",length(levely),
                                      " values -- must be = p"), .call=FALSE)}
  bady=FALSE
  for (i in 1:p ) {
    if(levely[i] != i) bady=TRUE
  }
  if(verbose){
  cat("Table of Y values pooled over all units",fill=TRUE)
  print(table(ypooled))
  }
  if (bady) {stop("Invalid Y", .call=FALSE)}
  
  # check on prior
  if(is.null(Prior$ncomp)) {
    stop("Requires Prior element ncomp (num of mixture components)", 
         .call=FALSE)
  } else {
    ncomp=Prior$ncomp
  }

  if(is.null(Prior$mubar)) {
    mubar=matrix(rep(0,nvar),nrow=1)
  } else {
    mubar=matrix(Prior$mubar,nrow=1)
  }
  if(ncol(mubar) != nvar) {stop(paste("mubar must have ncomp cols, ncol(mubar)= ",
                                      ncol(mubar)), .call=FALSE)}
  if(is.null(Prior$amu)) {
    amu=matrix(ScalableBayesmConstant.A,ncol=1)
  } else {
    amu=matrix(Prior$amu,ncol=1)
  }
  if(ncol(amu) != 1 | nrow(amu) != 1) {stop("Am must be a 1 x 1 array", 
                                            .call=FALSE)}
  if(is.null(Prior$nu)) {
    nu=nvar+ScalableBayesmConstant.nuInc
  } else {
    nu=Prior$nu
  }
   if(nu < 1) {stop("invalid nu value", .call=FALSE)}
 if(is.null(Prior$v)) {
    v=nu*diag(nvar)
  } else {
    v=Prior$v
  }
  if(sum(dim(v)==c(nvar,nvar)) !=2) {stop("Invalid v in Prior", .call=FALSE)}
  if(is.null(Prior$ad) & drawdelta) {
    ad=ScalableBayesmConstant.A*diag(nvar*nz)
  } else {
    ad=Prior$ad
  }
  if(drawdelta) {
    if(ncol(ad) != nvar*nz | nrow(ad) != nvar*nz) {
      stop("ad must be nvar*nz x nvar*nz",.call=FALSE)
      }
  }
  if(is.null(Prior$deltabar)& drawdelta) {
    deltabar=rep(0,nz*nvar)
  } else {
    deltabar=Prior$deltabar
  }
  if(drawdelta) {
    if(length(deltabar) != nz*nvar) {stop("deltabar must be of length nvar*nz", 
                                          .call=FALSE)}
  }
  if(is.null(Prior$a)) {
    a=rep(ScalableBayesmConstant.a,ncomp)
  } else {
    a=Prior$a
  }
  if(length(a) != ncomp) {stop("Requires dim(a)= ncomp (no of components)", 
                               .call=FALSE)}
  bada=FALSE
  for(i in 1:ncomp) {
    if(a[i] < 0) {bada=TRUE}
  }
  if(bada) {stop("invalid values in a vector", .call=FALSE)}
  
  # check on Mcmc
  if(is.null(Mcmc$s)) {
    s=ScalableBayesmConstant.RRScaling/sqrt(nvar)
  } else {
    s=Mcmc$s
  }
  if(is.null(Mcmc$w)) {
    w=ScalableBayesmConstant.w
  }  else {
    w=Mcmc$w
  }
  if(is.null(Mcmc$R)) {
    stop("Requires MCMC element 'R' - number of iterations", .call=FALSE)
  } else {
    r = Mcmc$R
  }
  if(is.null(Mcmc$keep)) {
    keep=ScalableBayesmConstant.keep
  } else {
    keep=Mcmc$keep
  }
  if(is.null(Mcmc$nprint)) {
    nprint=ScalableBayesmConstant.nprint
  } else {
    nprint=Mcmc$nprint
    if(nprint<0) {stop('nprint must be an integer greater than or equal to 0', 
                       .call=FALSE)}
  }
  if(is.null(Mcmc$drawcomp)) {Mcmc$drawcomp=FALSE}
  
  if(verbose){
  # print out problem
  cat(" ",fill=TRUE)
  cat("Starting MCMC Inference for Hierarchical Logit:",fill=TRUE)
  cat("   Normal Mixture with",ncomp,"components for first stage prior",
      fill=TRUE)
  cat(paste("  ",p," alternatives; ",nvar," variables in X"),fill=TRUE)
  cat(paste("   for ",nlgt," cross-sectional units"),fill=TRUE)
  cat(" ",fill=TRUE)
  cat("Prior Parameters: ",fill=TRUE)
  cat("nu =",nu,fill=TRUE)
  cat("v ",fill=TRUE)
  print(v)
  cat("mubar ",fill=TRUE)
  print(mubar)
  cat("amu ", fill=TRUE)
  print(amu)
  cat("a ",fill=TRUE)
  print(a)
  if(drawdelta)
  {
    cat("deltabar",fill=TRUE)
    print(deltabar)
    cat("ad",fill=TRUE)
    print(ad)
  }
  cat(" ",fill=TRUE)
  cat("MCMC Parameters: ",fill=TRUE)
  cat(paste("s=",round(s,3)," w= ",w," r= ",r," keep= ",keep," nprint= ",nprint),
      fill=TRUE)
  cat("",fill=TRUE)
  }
  
  oldbetas = matrix(double(nlgt * nvar), ncol = nvar)
  
  #-----------------------------------------------------------------------------
  #  create functions needed
  llmnlFract = function(beta,y,X,betapooled,rootH,w,wgt){
    z=as.vector(rootH%*%(beta-betapooled))
    return((1-w)*bayesm::llmnl(beta,y,X)+w*wgt*(-.5*(z%*%z)))
  }
  #-----------------------------------------------------------------------------
  # intialize compute quantities for Metropolis
  
  if(verbose){
  cat("initializing Metropolis candidate densities for ",nlgt," units ...",
      fill=TRUE)
  }  
  
  #  compute fraction likelihood estimates and hessians
  #  compute pooled optimum
  betainit=c(rep(0,nvar))
  out=optim(betainit,bayesm::llmnl,method="BFGS",control=list(fnscale=-1,
                                                              trace=0,
                                                              reltol=1e-6),
            X=Xpooled,y=ypooled)
  betapooled=out$par
  H=bayesm::mnlHess(betapooled,ypooled,Xpooled)

  rm(ypooled)
  rm(Xpooled)
  invisible(gc())
  rootH=chol(H)
  hess = vector(mode="list", length=nlgt)

  for (i in 1:nlgt) {
    wgt=length(Data$lgtdata[[i]]$y)/ny
    out=optim(betapooled,llmnlFract,method="BFGS",control=list( fnscale=-1,
                                                                trace=0,
                                                                reltol=1e-4),
              X=Data$lgtdata[[i]]$X,y=Data$lgtdata[[i]]$y,betapooled=betapooled,
              rootH=rootH,w=w,wgt=wgt)
    if(out$convergence == 0) {
 
      hess=bayesm::mnlHess(out$par,Data$lgtdata[[i]]$y,Data$lgtdata[[i]]$X)
      #print(hess)
      oldbetas[i,] = out$par

      Data$lgtdata[[i]]$hess=hess
    
    } else {
      
      hess=diag(nvar)
      print(hess)
      oldbetas[i,]=c(rep(0,nvar))
 
      Data$lgtdata[[i]]$hess=hess
    }
    #if(i%%50 ==0) {cat("  completed unit #",i,fill=TRUE)}
  }
  
  #  initialize values
  #
  # set initial values for the indicators
  #     ind is of length(nlgt) and indicates which mixture component this obs
  #     belongs to.
  
  ind=NULL
  ninc=floor(nlgt/ncomp)
  for (i in 1:(ncomp-1)) {ind=c(ind,rep(i,ninc))}
  if(ncomp != 1) {
    ind = c(ind,rep(ncomp,nlgt-length(ind)))
  } else {
    ind=rep(1,nlgt)
  }
  
  # initialize probs
  oldprob=rep(1/ncomp,ncomp)
  
  #initialize delta
  if (drawdelta){
    olddelta = rep(0,nz*nvar)
  } else { #send placeholders to the _loop function if there is no z matrix
    olddelta = 0
    z = matrix(0)
    deltabar = 0
    ad = matrix(0)
  }
  draws =  rhierMnlRwMixtureParallel_rcpp_loop(Data$lgtdata, z, deltabar, ad, 
                                               mubar, amu, nu, v, s, r, keep, 
                                               nprint, drawdelta, 
                                               as.matrix(olddelta), a, oldprob, 
                                               oldbetas, ind, verbose)
  return(draws)
}

Try the scalablebayesm package in your browser

Any scripts or data that you put into this service are public.

scalablebayesm documentation built on April 3, 2025, 7:55 p.m.