LDTFPdensity: Density Regression using Linear Dependent Tailfree Processes

Description Usage Arguments Details Value Author(s) References See Also Examples

Description

This function generates a posterior density sample for a Linear Dependent Tailfree Process model for conditional density estimation.

Usage

1
2
3
4
5
6
7
LDTFPdensity(y,x,xtf,prediction,prior,mcmc,
             state,status,ngrid=100,
             grid=NULL,compute.band=FALSE,
             type.band="PD",
             data=sys.frame(sys.parent()),
             na.action=na.fail,
             work.dir=NULL)

Arguments

y

a vector giving the response variables.

x

a matrix giving the design matrix for the median function.

xtf

a matrix giving the design matrix for the conditional probabilities.

prediction

a list giving the information used to obtain conditional inferences. The list includes the following elements: xdenpred and xtfdenpred giving the design matrices for the median and conditional probabilities, respectively, used to obtain inferences about the conditional densities and survival functions, xmedpred and xtfmedpred giving the design matrices for the median and conditional probabilities, respectively, used to obtain inferences about quantiles, and quans a double precision vector giving THREE quantiles for which inferences are obtained. If quans is not specified, the default is quans=c(0.03,0.50,0.97).

prior

a list giving the prior information. The list includes the following parameter: maxn an integer giving the truncation of the tailfree process, a0 and b0 giving the hyperparameters for prior distribution of the precision parameter of the linear dependent tailfree prior, alpha giving the value of the precision parameter (it must be specified if a0 is missing), mub giving the mean of the normal prior of the median regression coefficients, Sb giving the (co)variance of the normal prior distribution for the median regression coefficents, and tau1 and tau2 giving th hyperparameters of the inv-gamma distribution for the centering variance.

mcmc

a list giving the MCMC parameters. The list must include the following elements: nburn an integer giving the number of burn-in scans, nskip an integer giving the thinning interval, nsave an integer giving the total number of scans to be saved, ndisplay an integer giving the number of saved scans to be displayed on screen (the function reports on the screen when every ndisplay iterations have been carried out).

state

a list giving the current value of the parameters. This list is used if the current analysis is the continuation of a previous analysis.

status

a logical variable indicating whether this run is new (TRUE) or the continuation of a previous analysis (FALSE). In the latter case the current value of the parameters must be specified in the object state.

ngrid

integer giving the number of grid points where the conditional density estimate is evaluated. The default is 100.

grid

vector of grid points where the conditional densities are evaluated. The default value is NULL and the grid is chosen according to the range of the data.

compute.band

logical variable indicating whether the credible band for the conditional density and mean function must be computed.

type.band

string indication the type of credible band to be computed; if equal to "HPD" or "PD" then the 95 percent pointwise HPD or PD band is computed, respectively.

data

data frame.

na.action

a function that indicates what should happen when the data contain NAs. The default action (na.fail) causes LDTFPdensity to print an error message and terminate if there are any incomplete observations.

work.dir

working directory.

Details

This generic function fits a Linear Dependent Tailfree process (Jara and Hanson, 2011), given by:

yi = xi' beta + vi, i=1,…,n

vi | Gxtfi ~ Gxtfi

{Gxtf: xtf in X} | maxm, alpha, sigma2 ~ LDTFP^maxm(h,Pi^{sigma2},\textit{A}^{alpha,rhi})

where, h is the logistic CDF, and Gxtf is median-zero and centered around an N(0,sigma2) distribution. To complete the model specification, independent hyperpriors are assumed,

alpha | a0, b0 ~ Gamma(a0,b0)

sigma^-2 | tau1, tau2 ~ Gamma(tau1/2,tau2/2)

The precision parameter, alpha, of the LDTFP prior can be considered as random, having a gamma distribution, Gamma(a0,b0), or fixed at some particular value. To let alpha to be fixed at a particular value, set a0 to NULL in the prior specification.

The computational implementation of the model is based on Slice sampling (Neal, 2003).

Value

An object of class LDTFPdensity representing the LDTFP model fit. Generic functions such as print, plot, and summary have methods to show the results of the fit. The results include beta, alpha and sigma^2.

The list state in the output object contains the current value of the parameters necessary to restart the analysis. If you want to specify different starting values to run multiple chains set status=TRUE and create the list state based on this starting values. In this case the list state must include the following objects:

alpha

a double precision giving the value of the precision parameter.

betace

a vector giving the value of the median regression coefficient.

sigma^2

a double precision giving the value of the centering variance.

betatf

a matrix giving the regression coefficients for each conditional pribability.

Author(s)

Alejandro Jara <atjara@uc.cl>

References

Jara, A., Hanson, T. (2011). A class of mixtures of dependent tail-free processes. Biometrika, 98(3): 553 - 566.

Neal, R. (2003) Slice sampling. Anals of Statistics, 31: 705 - 767.

See Also

DPcdensity, LDDPdensity

Examples

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
## Not run: 

    ######################## 
    # IgG data
    ########################
      data(igg)
      z <- igg$age
      y <- log(igg$igg)

    # Design matrices
      ages1 <- z^2
      ages2 <- 1/ages1

      x <- cbind(rep(1,length(y)),ages1,ages2)
      xtf <- cbind(rep(1,length(y)),ages1,ages2)
       
      colnames(x) <- c("(Intercept)","age^2","age^{-2}")
      colnames(xtf) <- c("(Intercept)","age^2","age^{-2}")

    # Prediction
      xdpred <- c(11/12,25/12,38/12,52/12,65/12)
      agesp1 <- xdpred^2
      agesp2 <- 1/agesp1
      xdenpred <- cbind(rep(1,length(xdpred)),agesp1,agesp2)
      xtfdenpred <- xdenpred

      xmpred <- seq(0.5,6,len=50)
      agesp1 <- xmpred^2
      agesp2 <- 1/agesp1
      xmedpred <- cbind(rep(1,length(xmpred)),agesp1,agesp2)
      xtfmedpred <- xmedpred
 
      prediction <- list(xdenpred=xdenpred,
			 xtfdenpred=xtfdenpred,
                         xmedpred=xmedpred,
                         xtfmedpred=xtfmedpred,
                         quans=c(0.03,0.50,0.97))

    # Prior information
      Sb <- diag(1000,3)
      mub <- rep(0,3)
 
      prior <- list(maxm=4,
                    a0=1,
                    b0=1,
                    mub=mub,
                    Sb=Sb,
                    tau1=2.02,
                    tau2=2.02)

    # Initial state
      state <- NULL

    # MCMC parameters

      mcmc <- list(nburn=5000,
                   nsave=5000,
                   nskip=4,
                   ndisplay=200)

    # Fitting the model
 
      fit1 <- LDTFPdensity(y=y,
			   x=x,
			   xtf=xtf,
			   prediction=prediction,
			   prior=prior, 
			   mcmc=mcmc,
			   state=state, 
			   status=TRUE,
			   compute.band=TRUE)

      fit1
      summary(fit1)
      plot(fit1)

    # Plots predictions
    # (conditional densities and quantile functions)

      par(mfrow=c(3,2))      
 
      for(i in 1:5)
      {
         plot(fit1$grid,fit1$densm[i,],lwd=2,
              type="l",lty=1,col="black",xlab="log IgG",ylab="density",
              ylim=c(0,2))
         lines(fit1$grid,fit1$densl[i,],lwd=1,lty=2,col="black")    
         lines(fit1$grid,fit1$densu[i,],lwd=1,lty=2,col="black")    
      }
 
      plot(z,y,ylab="log IgG",xlab="Age (years)")
      lines(xmpred,fit1$qmm[,2],lwd=2)
      lines(xmpred,fit1$qml[,2],lwd=1,lty=2)
      lines(xmpred,fit1$qmu[,2],lwd=1,lty=2)

      lines(xmpred,fit1$qmm[,1],lwd=2)
      lines(xmpred,fit1$qml[,1],lwd=1,lty=2)
      lines(xmpred,fit1$qmu[,1],lwd=1,lty=2)

      lines(xmpred,fit1$qmm[,3],lwd=2)
      lines(xmpred,fit1$qml[,3],lwd=1,lty=2)
      lines(xmpred,fit1$qmu[,3],lwd=1,lty=2)


    #######################################
    # A simulated data using "perfect"
    # simulation (mixture of two normals 
    # and normal true models).
    #######################################

    # Functions needed to simulate data
    # and to evaluate true models

      findq <- function(true.cdf,target,low,
                        upp,epsilon=0.0000001)
      {
          plow <- true.cdf(low)
          pupp <- true.cdf(upp)
          pcenter <- true.cdf((upp+low)/2) 
          err <- abs(pcenter-target)
          i <- 0 
          while(err > epsilon)
          {
                i <- i + 1
                if(target< pcenter)
                {  
                   upp <- (upp+low)/2
                   pupp <- pcenter
                   pcenter <- true.cdf((upp+low)/2) 
                   err <- abs(pcenter-target)
                } 
                if(target>= pcenter)
                {  
                   low <- (upp+low)/2
                   plow <- pcenter
                   pcenter <- true.cdf((upp+low)/2) 
                   err <- abs(pcenter-target)
                } 
            }
            return((upp+low)/2)	
      }	

      true.dens1 <- function(x)
      {
            0.5*dnorm(x,2.5,sqrt(0.005))+
            0.5*dnorm(x,2.85,sqrt(0.005))
      }

      true.dens2 <- function(x)
      {
            dnorm(x,2.1,sqrt(0.0324))
      }

      true.cdf1 <- function(x)
      {
            0.5*pnorm(x,2.50,sqrt(0.005))+
            0.5*pnorm(x,2.85,sqrt(0.005))
      }

      true.cdf2 <- function(x)
      {
            pnorm(x,2.1,sqrt(0.0324))
      }

    # Simulation
      nsim <- 500 
      qq <- seq(1,nsim)/(nsim+1)

      y1 <- rep(0,nsim)
      for(i in 1:nsim)
      {
          aa <- findq(true.cdf1,qq[i],low=-6,upp=6)
          y1[i] <- aa 
      }	

      y2 <- rep(0,nsim)
      for(i in 1:nsim)
      {
          aa <- findq(true.cdf2,qq[i],low=-6,upp=6)
          y2[i] <- aa 
      }	

      trt <- c(rep(0,nsim),rep(1,nsim))
      y <- c(y1,y2)

    # Design matrices
      W1 <- cbind(rep(1,2*nsim),trt)
      W2 <- cbind(rep(1,2*nsim),trt)
      colnames(W1) <- c("(Intercept)","trt")
      colnames(W2) <- c("(Intercept)","trt")

    # Design matrix for prediction 
      intp <- rep(1,2)
      trtp <- c(0,1)
      zpred <- cbind(intp,trtp) 

      prediction <- list(xdenpred=zpred,
                         xtfdenpred=zpred,
                         xmedpred=zpred,
                         xtfmedpred=zpred,
                         quans=c(0.03,0.50,0.97))

    # Prior information
      prior <- list(maxm=5,
                    a0=1,
                    b0=1,
                    mub=rep(0,2),
                    Sb=diag(1000,2),
                    tau1=2.002,
                    tau2=2.002)

    # Initial state
      state <- NULL

    # MCMC parameters
      nburn <- 5000
      nsave <- 5000
      nskip <- 4
      ndisplay <- 200
      mcmc <- list(nburn=nburn,
                   nsave=nsave,
                   nskip=nskip,
                   ndisplay=ndisplay)

    # Fitting the model
      fit1 <- LDTFPdensity(y=y,
                           x=W1,
                           xtf=W2,
                           grid=seq(1.2,3.2,len=200),
                           prediction=prediction,
                           prior=prior, 
                           mcmc=mcmc,     
                           state=state,
                           status=TRUE,
                           compute.band=TRUE)

    # Plotting density estimates and true models

      par(cex=1.7,mar=c(4.1, 4.1, 1, 1))
      plot(fit1$grid,fit1$densu[1,],type="l",xlab="y",
           ylab="f(y|x)",lty=2,lwd=3,main="trt=0")
      lines(fit1$grid,fit1$densl[1,],lty=2,lwd=3)
      lines(fit1$grid,fit1$densm[1,],lty=1,lwd=3)
      tmp1 <- true.dens1(fit1$grid)
      lines(fit1$grid,tmp1,lty=1,lwd=3,col="red") 

      par(cex=1.7,mar=c(4.1, 4.1, 1, 1))
      plot(fit1$grid,fit1$densu[2,],type="l",xlab="y",
           ylab="f(y|x)",lty=2,lwd=3,main="trt=1")
      lines(fit1$grid,fit1$densl[2,],lty=2,lwd=3)
      lines(fit1$grid,fit1$densm[2,],lty=1,lwd=3)
      tmp1 <- true.dens2(fit1$grid)
      lines(fit1$grid,tmp1,lty=1,lwd=3,col="red") 


## End(Not run)

DPpackage documentation built on May 1, 2019, 10:23 p.m.