GibbsBernoulliMLB: Bernoulli LCM
In JonathanBradley28/CM: The Conjugate Multivariate Distribution

Description Usage Arguments Value Examples

View source: R/GibbsBernoulliMLB.R

This code implements a version of the collapsed Gibbs sampler from Bradley et al. (2020). The model assumes that data follow a Bernoulli distribution with a logit link to a mixed effects model. The priors and random effects are assumed to follow a multivariate logit-beta distribution.

1 2	GibbsBernoulliMLB(B, data, X, G, ini = NA, report = 100, itmax = 20, nslice = 2)

`B`	The number of iterations of the collapsed Gibbs sampler
`data`	An n dimensional vector consisting of totals associated with binomial count-valued observations.
`X`	An nxp matrix of covariates. Each column represents a covariate. Each row corresponds to one of the n replicates.
`G`	An nxr matrix of basis functions. Each column represents a basis function. Each row corresponds to one of the n replicates.
`ini`	Initial estimate of the proportions to aid in selecting hyperparameters. Default is the sample proportion.
`report`	Option to print the iteration number of the MCMC.
`itmax`	The values of rho and eps in "GibbsBinomialMLB" are chosen to minimize the Hellinger distance between the data and the estimated proportion. itmax defines the maximum number of iterations in "optim" to obtain these values for rho and eps.
`nslice`	The burnin for the slice sampler

beta A pxB matrix of MCMC values for beta. The vector beta corresponds to the covariates X.

eta A rxB matrix of MCMC values for eta. The vector eta corresponds to the basis functions G.

xi A nxB matrix of MCMC values for xi. The vector xi corresponds to uncorrelated random effects.

pi A nxB matrix of MCMC values for the proportions of the binomial data, modeled with, exp(Xbeta+Geta+xi)/(1+exp(Xbeta+Geta+xi)).

pi.smooth A nxB matrix of MCMC values for the smoothed proportions associated with binomial data, modeled with, exp(Xbeta+Geta)/(1+exp(Xbeta+Geta)).

alphab A 1xB matrix of MCMC values for the shape parameter associated with beta.

alphae A 1xB matrix of MCMC values for the shape parameter associated with eta.

alphax A 1xB matrix of MCMC values for the shape parameter associated with xi.

kappab A 1xB matrix of MCMC values for the second shape parameter associated with beta.

kappae A 1xB matrix of MCMC values for the second shape parameter associated with eta.

kappax A 1xB matrix of MCMC values for the second shape parameter associated with xi

hypparams The selected hyperparameters.

library(CM)

set.seed(4000)

#simulate pseudo data
n = 600
#define a test function
#A non-linear test function
lambda <- function(t) (0.1 + sin(2*pi*t))

#define some 1-d locations
points = seq(0,1,length.out=n+2)
points=points[1:n+2]

#get the logit true proportion at these locations
truemean<-matrix(0,n,1)
for (j in 1:length(points)){
 truemean[j,1] = lambda(points[j])
}

#simulate logit proportion
plot(truemean)

#inverse logit
pi1 = exp(truemean)/(1+exp(truemean))

plot(pi1)

#generate sample sizes
m = matrix(1,n,1)
data = matrix(0,n,1)
for (j in 1:n){
 data[j]=rmultinom(1, m[j], prob = c(pi1[j],1-pi1[j]))[1]
}

plot(data)

#covariate intercept-only
X = matrix(1,n,1)

##compute the basis function matrix
#compute thin-plate splines
r = 10
knots = seq(0,1,length.out=r)

# basis matrix
Bisquare<-function(locs,knots,tol,smoother){
 r = dim(knots)[1]
 n = dim(locs)[1]
 disKnots = matrix(0,r,r)
 for (j in 1:r){
   disKnots[r,]=sqrt((knots[j,]-knots)^2)
 }
 disKnots<-disKnots[disKnots!=0]
 r_l<-smoother*max(disKnots)

 psi = matrix(0,n,r)
 for (i in 1:n){
   for (j in 1:r){
     if (sum(sqrt((locs[i,]-knots[j,])^2))<tol){
       psi[i,j]= (1-(sum(sqrt((locs[i,]-knots[j,])^2))/r_l)^2)^2
     }
   }
 }
 for (j in 1:r){
   psi[,j] = psi[,j]/max(psi[,j])
 }
 return(psi)
}
G = Bisquare(as.matrix(points,n,1),as.matrix(knots,r,1),tol=0.5,smoother=0.5)

#number of MCMC reps
B = 2000
burnin=1000

#Pseudo-Code 2 with updates for shape parameters
ind=sample(n,n)
holdout=data[ind[1:floor(0.1*n)]]
train=data[ind[(floor(0.1*n)+1):n]]
Xtrain=as.matrix(X[ind[(floor(0.1*n)+1):n],])
Gtrain=G[ind[(floor(0.1*n)+1):n],]
Xhold=as.matrix(X[ind[1:floor(0.1*n)],])
Ghold=G[ind[1:floor(0.1*n)],]

output<-GibbsBernoulliMLB(B,train,Xtrain,Gtrain,loess.smooth(points, data)$y,itmax = 10000)

#check some trace plots
plot((as.mcmc(output$beta[1,burnin:B])))
plot(as.mcmc(output$alphab[1,burnin:B]))
plot(as.mcmc(output$alphae[1,burnin:B]))
plot(as.mcmc(output$alphax[1,burnin:B]))
plot(as.mcmc(output$kappab[1,burnin:B]))
plot(as.mcmc(output$kappae[1,burnin:B]))
plot(as.mcmc(output$kappax[1,burnin:B]))
plot((as.mcmc(output$eta[5,burnin:B])))
plot((as.mcmc(output$xi[10,burnin:B])))

nu.gibbsf=X%*%(output$beta)+G%*%(output$eta)
#xi may have a posterior mean not equal to zero
ximean=mean(output$xi[,burnin:B])
for (b in 1:B){
  nu.gibbsf[,b]=nu.gibbsf[,b]+ximean
}
pi.gibbsf= exp(nu.gibbsf)/(1+exp(nu.gibbsf))

#hows our estimate of the proportions?
pihatf = apply(pi.gibbsf[,burnin:B], 1, mean)
piboundsf = apply(pi.gibbsf[,burnin:B], 1, quantile, probs = c(0.025,0.975))

#recenter predicted probabilities around the mean of the data
pi.gibbsf=  pi.gibbsf + mean(data)-mean(pihatf)

#hows our estimate of the proportions?
pihatf = apply(pi.gibbsf[,burnin:B], 1, mean)
piboundsf = apply(pi.gibbsf[,burnin:B], 1, quantile, probs = c(0.025,0.975))
plot(points,pi1,ylim=c(0,1))
lines(sort(points),pihatf,col="red")
lines(sort(points),piboundsf[1,],col="blue")
lines(sort(points),piboundsf[2,],col="blue")