Description Usage Arguments Details Value Author(s) References See Also Examples
Let Y[ij] be the response count at jth repeated measure from ith subject. The negative binomial mixed-effect independent model assumes that given the random effect G[i]=g[i], the count response Y[ij] follows the negative binomial distribution:
Y[ij] | G[i] =g[i] ~ NB(r[ij],p[i]),
where p[i], the failure probability of subject i at each time point j is parametrized as p[i]= 1/(g[i]*α+1) and α>0. The model assumes E(G[i]) = 1 so that E(Y[ij]|G[i]=g[i])=r[ij]*g[i]*α and E(Y[ij])= r[ij]*g[i]. This assumption allows the interpretation of the latent random variable G[i] as the subject i's activity rate relative to the overall cohort. The marginal mean μ[ij] = E(Y[ij]) is modeled with fixed effect coefficients, β: μ[ij] = exp(X[ij]^T β). Furthermore, let Var(G[i])=theta, then Var(Y[ij])=mu[ij]^2theta+mu[ij](1+(theta + 1)alpha).
——————————————————————————————————————
Regarding the dependence structures of Y[ij] and Y[ij'] conditional on the random effect G[i], we consider two models, namely independent and AR(1) models.
[1]: Independent model Y[ij] and Y[ij'] are independent conditionally on G[i]. This assumption leads to Cov(Y[ij],Y[ij']|G[i]=g[i])=0 and Cov(Y[ij],Y[ij'])=mu[ij] mu[ij']theta
[2]: AR(1) model Autoregressive (1) structures for Y[ij] and Y[ij'] conditionally on G[i]. That is given the random effect G[i]=g[i], Y[ij] depends on Y[i(j-1)] through the beta binomial thinning and is conditionally independent on Y[ij'] given Y[i(j-1)] for all j' < j-1 . The beta binomial thinning operator depends on a parameter δ and models to have Cov(Y[ij],Y[ij']|G[i]=g[i])=δ^{j-j'} μ[ij](α*g[i]^2+g[i]) . This means that δ measures the strength of the positive AR(1) association between repeated measures of a subject: the larger δ, the stronger the positive correlations between the repeated measures of the same subject are.
—————————————————————————————————————–
Regarding the random effect G[i] distribution, lmeNB
allows three models, namely log-normal, gamma and semiparametric models.
(All models assume E(G[i])=1 and Var(G[i])=θ.)
(1) The log-normal model
That is regular log-normal parameters are restricted as
meanlog=-log(theta+1)/2
,
sdlog = sqrt(log(theta+1))
.
(2) The gamma model
That is regular gamma parameters are restricted as shape=1/theta
, scale=theta
.
(3) The semiparametric model No distributional assumption and the random effect distribution is approximated by the estimated values of the quantity: γ[i] = w[i] (y[i+]/ μ[i+]) + (1 - w[i]) , where: y[i+] = ∑[j=1]^{n[i]} y[ij] , μ[i+] = ∑[j=1]^{n[i]} μ[ij] and, w[i] = sqrt( Var(G[i])/Var(Y[i+]/μ[i+]) ) . See Zhao et al. (2013) for more details. The missing count responses, if assumed to be missing at random, can be igored. Other types of missing data are currently not accepted.
1 2 3 4 |
formula |
An object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted. The formula must contain an intercept term. |
data |
A data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model.
The each row must contains the data corresponding to the repeated measure j of subjects and the rows (i,j)s must be ordered in a way that measurements from a subject is clustered together as (1,1),...,(1,n[1]),(2,1),...,(2,n[2]),...,(N,n[N]). Missing values are accepted of the response variables are treated as missing at random and simply removed from the data when |
ID |
A vector of length ∑[i=1]^N n[i] , containing the patient IDs that corresponds to |
p.ini |
The initial values of the parameters: If If
|
IPRT |
A logical, passed to Iprt of function |
AR |
A logical, if |
RE |
The distribution of random effects G[i].
If |
deps |
In the semiparametric models, the algorithms are terminated when the maximum difference of fixed effect coefficients between the current and previous iterations is less than |
Vcode |
Necessary only if the AR(1) model is fit.
A vector of length the total number of repeated measures, containing the indices of time point.
For example, there are three subjects and the first two subjects do not have missing visits and completed five visits while the last subject missed the third visit and have four visits in total, then
|
i.tol |
The absolute tolerance of |
o.tol |
The relative tolerance for |
labelnp |
See |
maxit |
The maximum number of iterations. Necessary only for semiparametric random effect model. |
C |
If |
semi.boot |
The number of bootstrap samples to construct the bootstrap empirical confidence intervals for the fixed effect coefficients. If the value is less than 1 then the bootstrap confidence intervals are not computed. Necessary only for semiparametric random effect model. |
u.low |
For the semiparametric procedures,
we observed that the algorithm could behave very unstable when factor covariates are employed in the dataset,
and data contains "few" repeated measures of one of the corresponding factor groups:
The algorithm could yield unacceptably small estimate of mu[ij].
As
Var(Y[ij]) and
Cov(Y_{ij},Y_{ij'}) are both multiples of mu[ij] (see description above),
small mu[ij] leads to singular Var(Y[i]) matrix.
As
a result, the algorithm breakdown when computing the weighted matrix of the weighted least square,
W[i]=Var(Y[i])^{-1}.
To prevent this issue, the current algorithm takes add-hoc treatment, and replaces small mu[ij] i.e.
those with mu[ij]< |
fitParaIND
calls optim
to minimize the negative log-likelihood of
the negative binomial model with respect to the model parameters:
\log(α), \log(θ),β[0], β[1],...
.
The Nelder-Mead algorithm is employed.
The log-likelihood is obtained by marginalizing out the random effects.
The numerical integration is carried out using adaptive quadrature.
The missing count responses, if assumed to be missing at random, can be igored.
Other types of missing data are currently not accepted.
When the estimated over-dispersion parameter (α) is close to zero, the negative binomial model reduces to the poisson model, suggesting that the negative binomial mixed-effect model might not be appropriate.
When AR
=1 and the estimated auto-correlation parameter (δ) is close to zero, the model is suggesting that there is no AR(1) structure among the sequentially collected responses. Hence user might use AR
=0 setting which assumes no AR(1) structure.
We note that the results could be sensitive to initial values.
call |
The input of the function. |
opt |
If If |
nlk |
The value of the negative log-likelihood corresponding to |
V |
If If |
est |
The matrix of the number of fixed-effect parameters (i.e. |
Zhao, Y. and Kondo, Y.
Detection of unusual increases in MRI lesion counts in individual multiple sclerosis patients. (2013) Zhao, Y., Li, D.K.B., Petkau, A.J., Riddehough, A., Traboulsee, A., Journal of the American Statistical Association.
A flexible mixed effect negative binomial regression model for detecting unusual increases in MRI lesion counts in individual multiple sclerosis patients. Kondo, Y., Zhao, Y., Petkau, A.J.
The subroutines of this function is:
fitParaIND
,
fitParaAR1
,
fitSemiIND
,
fitSemiAR1
,
The subroutines of index.batch
to compute the conditional probability index:
jCP.ar1
,
CP1.ar1
,
MCCP.ar1
,
CP.ar1.se
,
CP.se
,
jCP
,
The functions to generate simulated datasets:
rNBME.R
.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 | ## Not run:
## ================================================================================ ##
## generate a simulated dataset from the negative binomial mixed-effect
## independent model:
## Y_ij | G_i = g_i ~ NB(r_ij,p_i) where r_ij = exp(X^T beta)/a , p_i =1/(a*g_i+1)
## with G_i ~ log-normal(E(G_i)=1,var(G_i)=th)
set.seed(1)
sn <- 5 # the number of repeated measures of each patient
n <- 80 ## the number of patients
loga <- - 0.5
a <- exp(loga) ## dispersion parameter
logtheta <- 1.3
th <- exp(logtheta) ## the variance of the gamma distributed random effect g_i
## No difference between the means of groups
## The model only has an intercept term beta0 <- 0.5
b0 <- 0.5
u1 <- rep(exp(b0),sn) ## the mean vector for group 1 at time point 1,...,sn
u2 <- rep(exp(b0),sn) ## the mean vector for group 2 at time point 1,...,sn
## DT.ind is generated from the IND model
DT.ind<- rNBME.R(gdist="N", n=n, a=a, th=th, u1=u1, u2=u2, sn=sn)
## DT.ar is generated from AR(1) model with delta=0.4
DT.ar<- rNBME.R(gdist="N", n=n, a=a, th=th, u1=u1, u2=u2, sn=sn, d=0.4)
dt.ind<-data.frame(CEL=DT.ind$y,Vcode=DT.ind$vn-2,ID=DT.ind$id)
dt.ar<-data.frame(CEL=DT.ar$y,Vcode=DT.ar$vn-2,ID=DT.ar$id)
## ================================================================================ ##
#### ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ####
#### Fit various parametric independent models ####
#### ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ####
tst1 <- lmeNB(CEL~1, data=dt.ind, ID=dt.ind$ID, IPRT=TRUE) ## gamma Gi
tst2 <- lmeNB(CEL~1, data=dt.ar, ID=dt.ar$ID, IPRT=TRUE) ## gamma Gi
tst3 <- lmeNB(CEL~1, data=dt.ind, ID=dt.ind$ID, IPRT=TRUE, RE="N") ## log-normal Gi
tst4 <- lmeNB(CEL~1, data=dt.ar, ID=dt.ar$ID, IPRT=TRUE, RE="N") ## log-normal Gi
tst5 <- lmeNB(CEL~Vcode, data=dt.ind, ID=dt.ind$ID, IPRT=TRUE, RE="N")## log-normal Gi
## conditional probability index with the fitted results of tst5
Psum5 <- index.batch(olmeNB=tst5, labelnp=dt.ind$Vcode >= 1, data=dt.ind, ID=dt.ind$ID)
#### ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ####
#### Fit the semi-parametric independent model ####
#### ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ####
tst6 <- lmeNB(CEL~1, data=dt.ind, ID=dt.ind$ID, IPRT=TRUE, RE="NoN")
tst7 <- lmeNB(CEL~Vcode, data=dt.ind, ID=dt.ind$ID, IPRT=TRUE, RE="NoN",
semi.boot=100,labelnp=dt.ind$Vcode >= 1)
## semi.boot = 100 option computes bootstrap SE and 95
## conditional probability index with fitting result of tst7
Psum7 <- index.batch(olmeNB=tst7,labelnp=dt.ind$Vcode >= 1, data=dt.ind,
ID=dt.ind$ID, Vcode=Vcode)
#### ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ####
#### Fit the parametric AR1 models ####
#### ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ####
tst8 <- lmeNB(CEL~1, data=dt.ind, ID=dt.ind$ID, IPRT=TRUE,AR=TRUE,Vcode=dt.ind$Vcode)
tst9 <- lmeNB(CEL~1, data=dt.ar, ID=dt.ar$ID, IPRT=TRUE,AR=TRUE,Vcode=dt.ar$Vcode)
## conditional probability index
Psum9 <- index.batch(olmeNB=tst9, labelnp=dt.ind$Vcode >= 1, data=dt.ind,
ID=dt.ind$ID,Vcode=dt.ind$Vcode)
## End(Not run)
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.