Function to fit first-order Probit-Normal Marginalized Transition Random Effects Models, PNMTREM(1)

Description

Fits PNMTREM(1) via maximum likelihood estimation with Fisher-Scoring Algorithm.

Usage

1
2
3
4
pnmtrem1(covmat1, covmat2, respmat1, respmat2, z, nsubj, nresp, param01, 
param02, beta0, alpha0, tol1 = 1e-04, tol2 = 1e-04, maxiter1 = 50, 
maxiter2 = 50, tun1 = 1,tun2 = 1, x01 = 0, eps1 = 10^-10, x02 = 0, 
eps2 = 10^-10, silent = TRUE, delta.print = FALSE, deltastar.print = FALSE)

Arguments

covmat1

a (p_1+1) \times N \times k matrix or data frame, which has the design matrix form, for the baseline time point (t=1). Here, p_1 is the number of independent variables included in the baseline model, N is the number of subjects and k is the number of multiple responses.

covmat2

a (p_2+1) \times N \times k \times (T-1) matrix or data frame, which has the design matrix form, for t ≥q 2. Here, p_2 is the number of independent variables included in the t ≥q 2 model, N is the number of subjects, k is the number of multiple responses and T is the number of repeated measurements per subject.

respmat1

an (N * k) \times 1 matrix or data frame for the multiple responses at baseline. The general form of it can be depicted as \mbox{Respmat}_{1}=(Y_{.11}, …, Y_{.1k})^T where Y_{.1j}=(Y_{11j}, …, Y_{N1j}).

respmat2

an (N * k * T) \times 1 matrix or data frame for the multiple responses for t ≥q 1. The general form of it can be illustrated as

\mbox{Respmat}_{2}=(Y_{.11}, …, Y_{.1k}, …, Y_{.T1}, …, Y_{.Tk})^T where Y_{.tj}=(Y_{1tj}, …, Y_{Ntj}).

z

a (p_3+1) \times N \times k \times (T-1) matrix or data frame to be included in the second level of the t ≥q 2 model. z typically includes a subset of covariates.

nsubj

an integer which defines the number of subjects in the study.

nresp

an integer which defines the number of multiple binary responses.

param01

a length of [(p_1+1)+(k-1)+1] vector where p_1 is the number of covariates included in the baseline model and k is number of multiple responses. param01 is used to start the Fisher-Scoring (FS) algorithm for the baseline model. The general form of it can be given as \mbox{param01}=(β^*, λ_j^*, c_1), where j=2,…,k and c_1=log(σ_1).

param02

a length of [(p_2+1)+(p_3+1)*(T-1)+(k-1)+(T-1)] vector where p_2 is the number of covariates included in the first level of t ≥q 2 model, p_3 is the number of covariates included in the second level of it, k is the number of multiple responses and T is the number of repeated measurements per subject. param02 is used to start the FS algorithm for the t ≥q 2 model. The general form of it can be given as \mbox{param02}=(β,α_{t,1},λ_j, c_t), where α_{t,1}=(α_{21,1}, …, α_{2p3,1}, …, α_{T1,1}, …, α_{Tp3,1}) and j=2,…, k and t=2, …, T and c_t=log(σ_t).

beta0

a (p_2+1) \times 1 matrix for which all the elements are set to 0. It corresponds to the β_0 component of the Implicit Function Theorem (IFT) point, P_0.

alpha0

a (p_3+1) \times (T-1) matrix for which all the elements are set to 0. It corresponds to the α_{t,10} component of the P_0.

tol1

the amount of tolerance for the convergence of the FS algorithm for baseline model. The default is set to 0.0001.

tol2

the amount of tolerance for the convergence of the FS algorithm for t ≥q 2 model. The default is set to 0.0001.

maxiter1

the maximum number of iterations expected to be consumed by the FS algorithm for baseline model. The default is set to 50.

maxiter2

the maximum number of iterations expected to be consumed by the FS algorithm for t ≥q 2 model. The default is set to 50.

tun1

the tuning parameter for baseline model need to be chosen preferably as integer to decrease the FS steps in each iteration in cases where the algorithm might miss the convergence of the parameters. The default is set to 1.

tun2

the tuning parameter for t ≥q 2 model to decrease the FS steps in each iteration as in the case of tun1. The default is set to 1.

x01

an integer defined for the initial values of the Newton-Raphson (N-R) algorithm to obtain Δ_{i2j0}. The default is set to 0.

eps1

the amount of tolerance for the convergence of N-R algorithm to obtain Δ_{i2j0}. The default is set to 10^{-10}.

x02

an integer defined for the initial values of the Newton-Raphson (N-R) algorithm to obtain the empirical Bayesian estimates of the individual characteristics, \hat{z}_i. The default is set to 0.

eps2

the tolerance defined for the convergence of N-R algorithm to obtain \hat{z}_i. The default is set to 10^{-10}.

silent

a logical statement to decide whether the details of the FS algorithm details for both the baseline and t ≥q 2 models to be printed. The default is set to TRUE which means not printing these details.

delta.print

a logical statement to decide the print of the estimates of Δ_{itj} where t=2, …, T together with the modeling outputs. The default is set to FALSE which means not printing these estimates.

deltastar.print

a logical statement to decide the print of the estimates of Δ_{itj}^* where t=1, …, T together with the modeling outputs. The default is set to FALSE which means not printing these estimates.

Details

The modeling framework assumes two different models: 1) a model for baseline time point (baseline model), a two-level one, and 2) a model for later time points (t ≥q 2 model), a three-level one. These two models are linked to each other via a marginal constraint equation. Both of them are marginalized models and capture marginal effect of independent variables on the mean responses in their first levels. While the former captures the multivariate response dependence in its second level, the latter captures this dependence in its third level. Furthermore, the t ≥q 2 model captures the serial dependence in its second level. Implicit function theorem, specifically first-order implicit differentiation was used to explicitly link first and second level of the t ≥q 2 model. All the integrals are approximated via 20-point Gauss-Hermite Qudratures. Logarithm of the standard deviation parameters of random effects distributions are modeled. A detailed example in terms of data preparation, initial obtaining and setting is provided below.

Value

pnmtrem1 returns the modeling output of baseline and t ≥q 2 models and the associated maximized log-likelihood values. Additionally, it automatically prints the empirical Bayesian estimates of the individual characteristics, \hat{z}_i. The order of these estimates are in subject order. The estimates of Δ_{itj} (for t=2,..., T) and Δ_{itj}^* (for t=1,...,T) are in the same order of the responses and covariates.

Note

Version 1.3.

Author(s)

Ozgur Asar, Ozlem Ilk

References

Asar, O., Ilk, O., Sezer, A. D. (2013). A marginalized multilevel model for analyzing multivariate longitudinal binary data. Submitted.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
## Not run: 
## loading a simulated bivariate longitudinal binary data with 500 subjects
## and 4 time points
data(pnmtrem1.sim.data1)
data(pnmtrem1.sim.data2)

## number of subjects, multiple responses and time points
nsubj<-500
nresp<-2
ntime<-4

## sepearating the portion of data which pnmtrem1 function will use
covmat1<-as.matrix(pnmtrem1.sim.data1[,5:6])
covmat<-as.matrix(pnmtrem1.sim.data2[,5:7])
mresp1<-as.matrix(pnmtrem1.sim.data1[,4])
mresp<-as.matrix(c(pnmtrem1.sim.data1[,4],pnmtrem1.sim.data2[,4]))

## obtaining initials for \beta^*
glm1<-glm(mresp1~-1+covmat1,family=binomial(link=probit))
bsinit<-glm1$coef;names(bsinit)<-NULL

## initials for parameters in the baseline model, i.e. \beta^*, \lambda^*, c_1
param01<-c(bsinit,1,log(0.5))

## obtaining initials of \beta
# preparing data to be analyzed by mmm2
mresp.mmm<-as.matrix(pnmtrem1.sim.data2[,4])
id<-as.matrix(rep(seq(1:nsubj),((ntime-1)*nresp)))
time<-as.matrix(c(rep(2,nsubj*nresp),rep(3,nsubj*nresp),rep(4,nsubj*nresp)))
data<-cbind(id,time,mresp.mmm,covmat)

# ordering data by subject ID
data2<-NULL
for (i in 1:nsubj){
data.id<-data[data[,1]==i,]
data2<-rbind(data2,data.id)
}
# subsetting data by response type (6th column of data2)
data.resp1<-data2[data2[,6]==1,]
data.resp2<-data2[data2[,6]==0,]
data.mmm<-cbind(data.resp1[,1],data.resp1[,3],data.resp2[,3],data.resp1[,5])
library(mmm2)
mmm2.fit<-mmm2(data=data.mmm,nresp=2,family=binomial(link=probit),
corstr = "exchangeable")
binit<-coef(mmm2.fit)

## obtaining initials of \alpha
glm3<-glm(mresp[(nsubj*nresp+1):(2*nsubj*nresp),]~-1+mresp1,family=binomial(link=probit))
glm4<-glm(mresp[(2*nsubj*nresp+1):(3*nsubj*nresp),]~-1+
mresp[(nsubj*nresp+1):(2*nsubj*nresp),],family=binomial(link=probit))
glm5<-glm(mresp[(3*nsubj*nresp+1):(4*nsubj*nresp),]~-1+
mresp[(2*nsubj*nresp+1):(3*nsubj*nresp),],family=binomial(link=probit))
alpinit<-c(glm3$coef[1],glm4$coef[1],glm5$coef[1]);names(alpinit)<-NULL

## initials for parameters in the t \geq 2 model, i.e. \beta, \alpha, \lambda, c_2, c_3, c_4
param02<-c(binit,alpinit,1,log(0.5),log(0.5),log(0.5))

## implicit function initials, \beta_0 and \alpha_0
beta0<-matrix(c(0,0,0),ncol=1)
alpha0<-matrix(c(0,0,0),ncol=1)

## covariate set to be interacted with the response history
z<-matrix(rep(1,3*nsubj*nresp),ncol=1)

fit<-pnmtrem1(covmat1=covmat1,covmat2=covmat,respmat1=mresp1,respmat2=mresp,z=z,
nsubj=500,nresp=2,param01=param01,param02=param02,beta0=beta0,alpha0=alpha0,
tol1=0.0001,tol2=0.0001,maxiter1=50,maxiter2=50,tun1=1,tun2=1,x01=0,eps1=10^-10,
x02=0,eps2=10^-10,silent=FALSE,delta.print=TRUE,deltastar.print=TRUE)

## manipulation of the output
fit
fit$output1
fit$maxloglik1
fit$output2
fit$maxloglik2
fit$delta
fit$delstar
fit$empbayes
## End(Not run)