FPTbinary: Bayesian analysis for a Finite Polya Tree Bernoulli...

Description Usage Arguments Details Value Author(s) References Examples

Description

This function generates a posterior density sample for a binary regression model using a Finite Polya tree prior for the link function.

Usage

1
2
3
FPTbinary(formula,baseline="logistic",prior,mcmc,state,
          status,misc=NULL,data=sys.frame(sys.parent()),
          na.action=na.fail)

Arguments

formula

a two-sided linear formula object describing the model fit, with the response on the left of a ~ operator and the terms, separated by + operators, on the right.

baseline

a description of the baseline error distribution to be used in the model. The baseline distributions considered by FPTbinary so far is logistic.

prior

a list giving the prior information. The list includes the following parameters: a0 and b0 giving the hyperparameters for prior distribution of the precision parameter of the Finite Polya Tree prior, alpha giving the value of the precision parameter (it must be specified if a0 and b0 are missing), beta0 and Sbeta0 giving the hyperparameters of the normal prior distribution for the regression coefficients, M giving the finite level to be considered for the Finite Polya tree.

mcmc

a list giving the MCMC parameters. The list must include the following integers: nburn giving the number of burn-in scans, nskip giving the thinning interval, nsave giving the total number of scans to be saved, ndisplay giving the number of saved scans to be displayed on the screen (the function reports on the screen when every ndisplay iterations have been carried out), and tune1 and tune2 giving the Metropolis tuning parameters for the regression coefficients and precision parameter, respectively (the default value is 1.1).

state

a list giving the current value of the parameters. This list is used if the current analysis is the continuation of a previous analysis.

status

a logical variable indicating whether this run is new (TRUE) or the continuation of a previous analysis (FALSE). In the latter case the current value of the parameters must be specified in the object state.

misc

misclassification information. When used, this list must include two objects, sens and spec, giving the sensitivity and specificity, respectively. Both can be a vector or a scalar. This information is used to correct for misclassification in the conditional bernoulli model.

data

data frame.

na.action

a function that indicates what should happen when the data contain NAs. The default action (na.fail) causes FPTbinary to print an error message and terminate if there are any incomplete observations.

Details

This generic function fits a semiparametric binary regression model using a Finite Polya tree prior (FPT) for the link function (see, Hanson, 2006; Jara, Garcia-Zattera and Lesaffre, 2006):

yi = I(Vi <= Xi β),\ i=1,…,n

V1,…,Vn | G ~ G

G | alpha ~ FPT^M(Pi,A)

where, the FPT is centered around a Logistic(0,1) distribution if the baseline is logistic, by taking each m level of the partition Pi to coincide with the k/2^m, k=0,…,2^m quantile of the Logistic(0,1) distribution. The family A={alphae: e \in E^{*}}, where E^{*}=\bigcup_{m=1}^{M} E^m and E^m is the m-fold product of E=\{0,1\}, is specified as alpha{e1 … em}=α m^2. To complete the model specification, independent hyperpriors are assumed,

alpha | a0, b0 ~ Gamma(a0,b0)

β | beta0, Sbeta0 ~ N(beta0,Sbeta0)

The precision parameter, alpha, of the FPT prior can be considered as random, having a gamma distribution, Gamma(a0,b0), or fixed at some particular value. To let alpha to be fixed at a particular value, set a0 to NULL in the prior specification.

In the computational implementation of the model, Metropolis-Hastings steps are used to sample the posterior distribution of the regression coefficients and the precision parameter, as described in Hanson (2006), and Jara, Garcia-Zattera and Lesaffre (2006).

Value

An object of class FPTbinary representing the semiparametric logistic regression model fit. Generic functions such as print, plot, predict, summary, and anova have methods to show the results of the fit. The results include beta, the precision parameter (alpha), and the link function.

The MCMC samples of the parameters and the errors in the model are stored in the object thetasave and randsave, respectively. Both objects are included in the list save.state and are matrices which can be analyzed directly by functions provided by the coda package.

The list state in the output object contains the current value of the parameters necessary to restart the analysis. If you want to specify different starting values to run multiple chains set status=TRUE and create the list state based on this starting values. In this case the list state must include the following objects:

beta

giving the value of the regression coefficients.

v

giving the value of the errors (it must be consistent with yi = I(Vi < xi beta).

,

y

giving the value of the true response binary variable (only if the model considers correction for misclassification).

alpha

giving the value of the precision parameter.

Author(s)

Alejandro Jara <atjara@uc.cl>

Tim Hanson <hansont@stat.sc.edu>

References

Hanson, T. (2006) Inference for Mixtures of Finite Polya tree models. Journal of the American Statistical Association, 101: 1548-1565.

Jara, A., Garcia-Zattera, M.J., Lesaffre, E. (2006) Semiparametric Bayesian Analysis of Misclassified Binary Data. XXIII International Biometric Conference, July 16-21, Montreal, Canada.

Lavine, M. (1992) Some aspects of Polya tree distributions for statistical modelling. The Annals of Statistics, 20: 1222-11235.

Lavine, M. (1994) More aspects of Polya tree distributions for statistical modelling. The Annals of Statistics, 22: 1161-1176.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
## Not run: 

    # Prostate cancer data example
      data(nodal)
      attach(nodal)
      lacid<-log(acid)

    # Initial state
      state <- NULL

    # MCMC parameters
      nburn<-20000
      nsave<-10000
      nskip<-10
      ndisplay<-100
      mcmc <- list(nburn=nburn,nsave=nsave,
                   nskip=nskip,ndisplay=ndisplay,
                   tune1=1.1,tune2=1.1)

    # Prior distribution
      prior <- list(alpha=1, beta0=c(0,rep(0.75,5)),
                    Sbeta0=diag(c(100,rep(25,5)),6),M=5)


    # Fitting the Finite Polya tree model
      fit1 <- FPTbinary(ssln~age+lacid+xray+size+grade,
                        prior=prior,mcmc=mcmc,
                        state=state,status=TRUE) 
      fit1

    # Summary with HPD and Credibility intervals
      summary(fit1)
      summary(fit1,hpd=FALSE)

    # Plot model parameters (to see the plots gradually set ask=TRUE)
      plot(fit1)
      plot(fit1,nfigr=2,nfigc=2)	

    # Plot an specific model parameter (to see the plots gradually 
    # set ask=TRUE)
      plot(fit1,ask=FALSE,nfigr=1,nfigc=2,param="xray")	
      plot(fit1,ask=FALSE,param="link",nfigc=1,nfigr=1)

    # Table of Pseudo Contour Probabilities
      anova(fit1)


    # Fitting parametric models

      nburn<-20000
      nsave<-10000
      nskip<-10
      ndisplay<-100
      mcmc <- list(nburn=nburn,nsave=nsave,
                   nskip=nskip,ndisplay=ndisplay,
                   tune=1.1)

      fit2 <- Pbinary(ssln~age+lacid+xray+size+grade,link="probit",
                      prior=prior,mcmc=mcmc,state=state,status=TRUE)  
                    
      fit3 <- Pbinary(ssln~age+lacid+xray+size+grade,link="logit",
                      prior=prior,mcmc=mcmc,state=state,status=TRUE)  

    
    # Model comparison

      DPpsBF(fit1,fit2,fit3)
      

## End(Not run)      

DPpackage documentation built on May 1, 2019, 10:23 p.m.