# lmeNBBayes: Generate posterior samples from a flexible mixed effect... In lmeNBBayes: Compute the Personalized Activity Index Based on a Flexible Bayesian Negative Binomial Model

## Description

Let Y_{ij} be the response count at jth repeated measure from the ith patient (i=1,\cdots,N and j=1,\cdots,n_i). The negative binomial mixed-effect independent model assumes that given the random effect G[i]=g[i], the count response from the same subjects i.e., Y_{ij} and Y_{ij'} are conditionally independent and follow the negative binomial distribution:

Y[ij] | G[i]=g[i], beta i.i.d.~ NB(Y[ij]; size=exp(X[ij]*beta),prob=g[i])

where \boldsymbol{X}_{ij} is the covariates for mean counts. This formulation results in \log E(Y_{i,j}) = \log(μ_{\frac{1}{G}} - 1 ) + \boldsymbol{X}_{ij}^T\boldsymbol{β}. To allow flexible form of a random effect distribution, we assume that the patient-specific random effect is assumed to be from Dirichlet process mixture of beta distributions. This essentially means that random effect G[i] is from an infinite mixture of Beta distributions:

G[i]| { a[G[h]], r[G[h]], pi[h] }[h=1]^{infty} ~ sum pi[h] Beta(G[i]; shape1=a[G[h]],shape2=r[G[h]]) ,

where pi[h] is modelled with the stick-breaking prior. Introducing latent variable V[h],h=1,2,..., this prior is defined as pi=V and pi[h]=V[h]prod[l < h] (1-V[h]) for h>1 V[h] i.i.d. ~ Beta(1,D).

The rest of priors are specified as: beta ~ N(mu,Sigma) ,

(a[G],r[G]) ~ Unif(a[G];\textrm{min}=0.5,\textrm{max}=max[a[G]])Unif(r[G];\textrm{min}=0.5, \textrm{max}=max[r[G]]) ,

D ~ Unif(v;min=a_D,max=ib_D) .

The default values of the hyperparameters are mu[beta] = rep(0,p), Sigma[beta] = diag(5,p), max[a[G]] = 30, a[D] = 0.01 and ib[D] = 3. These selections of hyperparameters could be used as uninformative ones.

The function lmeNBBayes also allows generating posterior samples from the parametric version of the model above which simply assumes that the random effect is from the single beta distribution. (The rest of the prior specifications are the same).

## Usage

 1 2 3 4 5 6 lmeNBBayes(formula,data, ID, B = 105000, burnin = 5000, printFreq = B, M = NULL, probIndex = FALSE, thin =1,labelnp=NULL, epsilonM = 1e-4, para = list(mu_beta = NULL,Sigma_beta = NULL, max_aG=30,mu_lnD=NULL,sd_lnD=NULL), DP=TRUE,thinned.sample=FALSE, proposalSD = NULL) 

## Arguments

 formula An object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted. The formula must contain an intercept term. data A data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. The each row must contains the data corresponding to the repeated measure j of subjects and the rows (i,j)s must be ordered in a way that measurements from a subject is clustered together as (1,1),...,(1,n),(2,1),...,(2,n),...,(N,n[N]). ID A vector of length ∑[i=1]^N n[i] , containing the patient IDs that corresponds to data. i.e., c(rep(ID_1,n_1),rep(ID_2,n_2),...,rep(ID_N,n_N)). The length must be the same as the number of rows of data. Missing ID values are NOT accepted. B A scalar, the number of McMC iterations. burnin A scalar for a burn-in period. The proposal variance of the Metoropolice-Hasting rate is adjusted during the burn-in preiod. printFreq An integer value, indicating the frequency of iterations to print during the McMC run. M Necessary only if DP=1. Our Gibbs sampler approximates the infinite mixture of beta distributions by truncating it with M components by setting V[M]=1 so that pi_M = 1 - ∑_{h=1}^{M-1} π_h. If M is NULL, M is selected so that the amount of probability assigned to the final mass point is expected to be epsilonM. i.e., E(π_M)=E\{ (π_M | D )\} = E (\{ 1- 1/(D+1)\}^{M-1}) < ε. probIndex Logical, if it is TRUE then the conditional probability index is computed for each patient at every thinning after discarding burn-in. thin Thinning frequency. Necessary if probIndex is TRUE. labelnp A vector of length ∑[i=1]^N n[i] , containing 0 or 1. Zero indicates that the corresponding repeated measure should be treated as pre-scan and 1 indicates that it is a new scan. labelnp is necessary only if probIndex is TRUE. epsilonM A scalar. See the description of M. para A list containing hyperparameter values. If DP=0 then the followings must be specified: mu_beta (a vector of length p), Sigma_beta (a p by p covariance matrix) and max_aG (a positive scaler). If DP=1 then in addition to the above parameters,mu_lnD (positive scaler) sd_lnD (positive scaler) must be specified. If some of these are not specified then the default values discussed in description are used. DP If DP=1 then the flexible mixed effect negative binomial regression is fit to the dataset. If DP=0 then the random effect distribution is assumed to be a single beta distribution. thinned.sample Logical. If true then return only the thinned samples, else returns the entire MCMC sample of size B. proposalSD List object containing two list objects min and max, which contain minimum and maximum values of the proposal standard deviations. If DP=0 then a list object min (max) must contains 3 elements corresponding to minimum (maximum) values of the proposal standard deviation of aG, rG and beta. See details for beta. If DP=1 then a list object min (max) must contains 4 elements corresponding to minimum (maximum) values of the proposal standard deviation of aG, rG, beta and ln D. See details for beta.

## Details

For the parameters with non-conjugate priors beta,D,a[G],b[G], the Metropolis Hasting (MH) algorithm is employed to sample from their full conditional distributions. For D,a[G],b[G], the MH algorithm is performed separately with a normal proposal distribution, where its proposal variance is tuned during the burn-in to have the acceptance rates range between 0.2 and 0.6. One can adjust the minimum and maximum of the proposal sd via the proposalSD arguments. For each element of beta, we found that updating each regression coefficient with separate MH algorithm resulted in poor mixing in the Markov chain when high correlation is assumed in some of beta in the prior. Therefore, the MH algorithm is performed simultaneously for all beta and a MVN proposal distribution is employed with aSigma as its proposal covariance matrix, where Sigma is the covariance of a prior for beta and a is a tuning scaler adjusted during the burn-in period to have the acceptance rates range between 0.2 and 0.6.

Kondo, Y.

## References

Kondo, Y., Zhao, Y. and Petkau, A.J., "A flexible mixed effect negative binomial regression model for detecting unusual increases in MRI lesion counts in individual multiple sclerosis patients".

getDIC dqmix index.batch.Bayes
  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 ## Not run: ## generate samples from DSMSB review 2 d <- getS.StatInMed(rev=2,iseed=1,dist="YZ",Scenario="full") formula.fit <- Y ~ timeInt1:trtAss + timeInt2:trtAss B <- 10000 burnin <- 1000 thin <- 2 fit <- lmeNBBayes(formula=formula.fit,data=d, ID=d$ID, B = B, burnin = burnin, thin=thin) ## The output can be printed out: fit ## Now, compute the conditional probability index using the mean function of placebo patients. ## We need to modify two things in output of lmeNBBayes. ## 1st, change the formula so that it does not distinguish between treatment and placebo fit$para$formula <- Y ~ timeInt1 + timeInt2 ## 2nd, disregard the coefficient that corresponds to the treated patients fit$beta <- fit$beta[,-c(3,5)] cpi <- index.batch.Bayes(data=d,labelnp=d$labelnp,ID=d$ID, olmeNBB=fit,printFreq=10^7) cpi ## finally access the accuracy of the CPI estimates in terms of RMSE Npat <- length(unique(d$ID)) est <- cpi$condProbSummary[,1] true <- d$probIndex[1:Npat] sqrt( mean( ( est - true )^2 ,na.rm=TRUE) ) ## End(Not run)