PTlm: Bayesian analysis for a semiparametric linear regression...

Description Usage Arguments Details Value Author(s) References Examples

Description

This function generates a posterior density sample from a semiparametric linear regression model using a Mixture of Polya Trees prior for the distribution of the errors.

Usage

1
2
PTlm(formula,ngrid=200,grid=NULL,prior,mcmc,state,status,
     data=sys.frame(sys.parent()),na.action=na.fail)

Arguments

formula

a two-sided linear formula object describing the model fit, with the response on the left of a ~ operator and the terms, separated by + operators, on the right.

ngrid

number of grid points where the error density estimate is evaluated. The default value is 200.

grid

grid points where the density estimate is evaluated. The default is NULL.

prior

a list giving the prior information. The list includes the following parameter: a0 and b0 giving the hyperparameters for prior distribution of the precision parameter of the Polya Tree prior, alpha giving the value of the precision parameter (it must be specified if a0 and b0 are missing, see details below), tau1 and tau2 giving the hyperparameters for the prior distribution of the variance of the normal baseline distribution, beta0 and Sbeta0 giving the hyperparameters of the normal prior distribution for the regression coefficients, optionally M giving the finite level to be considered, and frstlprob a logical variable indicating whether the first level probabilities of the PT are fixed defining a median regression model (the default is TRUE). Note that if M is specified, a Partially Specified Mixture of Polya trees is fitted.

mcmc

a list giving the MCMC parameters. The list must include the following integers: nburn giving the number of burn-in scans, nskip giving the thinning interval, nsave giving the total number of scans to be saved, and ndisplay giving the number of saved scans to be displayed on the screen (the function reports on the screen when every ndisplay iterations have been carried out).

state

a list giving the current value of the parameters. This list is used if the current analysis is the continuation of a previous analysis.

status

a logical variable indicating whether this run is new (TRUE) or the continuation of a previous analysis (FALSE). In the latter case the current value of the parameters must be specified in the object state.

data

data frame.

na.action

a function that indicates what should happen when the data contain NAs. The default action (na.fail) causes PTlm to print an error message and terminate if there are any incomplete observations.

Details

By default, this generic function fits a median regression model using a Scale Mixture of Polya Trees prior for the distribution of the errors (see, e.g., Lavine, 1992 and 1994, Hanson and Johnson, 2004):

yi = Xi beta + Vi, i=1,…,n

Vi | G ~ G

G | alpha,sigma2 ~ PT(Pi^{sigma2},\textit{A})

where, the PT is centered around a N(0,sigma2) distribution, by taking each m level of the partition Pi^{sigma2} to coincide with the k/2^m, k=0,…,2^m quantile of the N(0,sigma2) distribution. The family \textit{A}={alphae: e \in E^{*}}, where E^{*}=\bigcup_{m=1}^{+infty} E^m and E^m is the m-fold product of E=\{0,1\}, was specified as alpha{e1 … em}=alpha m^2. To complete the model specification, independent hyperpriors are assumed,

alpha | a0, b0 ~ Gamma(a0,b0)

sigma^-2 | tau1, tau2 ~ Gamma(tau1/2,tau2/2)

Optionally, if frstlprob=FALSE (the default value is TRUE) is specified, a mean regression model is considered. In this case, the following PT prior is considered:

G | alpha,mu,sigma2 ~ PT(Pi^{mu,sigma2},\textit{A})

where, the PT is centered around a N(mu,sigma2) distribution. In this case, the intercept term is automatically excluded from the model and the hyperparameters for the normal prior for mu must be specified. The normal prior is given by,

mu | mub, Sb ~ N(mub,Sb)

The precision parameter, alpha, of the PT prior can be considered as random, having a gamma distribution, Gamma(a0,b0), or fixed at some particular value. To let alpha to be fixed at a particular value, set a0 to NULL in the prior specification.

In the computational implementation of the model, Metropolis-Hastings steps are used to sample the posterior distribution of the regression coefficients and hyperparameters.

Value

An object of class PTlm representing the semiparametric median regression model fit. Generic functions such as print, plot, summary, and anova have methods to show the results of the fit. The results include beta, mu, sigma2, and the precision parameter alpha.

The list state in the output object contains the current value of the parameters necessary to restart the analysis. If you want to specify different starting values to run multiple chains set status=TRUE and create the list state based on this starting values. In this case the list state must include the following objects:

alpha

giving the value of the precision parameter.

beta

giving the value of the regression coefficients.

mu

giving the mean of the normal baseline distribution (If needed).

sigma2

giving the variance of the normal baseline distribution.

v

giving the value of the errors (it must be consistent with the data.

Author(s)

Alejandro Jara <atjara@uc.cl>

References

Hanson, T., and Johnson, W. (2002) Modeling regression error with a Mixture of Polya Trees. Journal of the American Statistical Association, 97: 1020 - 1033.

Lavine, M. (1992) Some aspects of Polya tree distributions for statistical modelling. The Annals of Statistics, 20: 1222-11235.

Lavine, M. (1994) More aspects of Polya tree distributions for statistical modelling. The Annals of Statistics, 22: 1161-1176.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
## Not run: 
    ####################################
    # A simulated Data Set
    # (Mixture of Normals)
    ####################################

      ind<-rbinom(100,1,0.5)
      vsim<-ind*rnorm(100,1,0.15)+(1-ind)*rnorm(100,3,0.15)

      x1<-rep(c(0,1),50)
      x2<-rnorm(100,0,1)

      etasim<-x1+-1*x2
      y<-etasim+vsim


    # Initial state
      state <- NULL

    # MCMC parameters
      nburn<-5000
      nsave<-10000
      nskip<-20
      ndisplay<-100
      mcmc <- list(nburn=nburn,nsave=nsave,nskip=nskip,
                   ndisplay=ndisplay)

    # Prior information
      prior <- list(alpha=1,beta0=rep(0,3),Sbeta0=diag(1000,3),
                    tau1=0.01,tau2=0.01,M=6)

    # Fit the model

      fit1 <- PTlm(formula=y~x1+x2,prior=prior,mcmc=mcmc,state=state,
                   status=TRUE) 

    # Summary with HPD and Credibility intervals
      summary(fit1)
      summary(fit1,hpd=FALSE)

    # Plot model parameters (to see the plots gradually set ask=TRUE)
      plot(fit1)
      plot(fit1,nfigr=2,nfigc=2)

    # Table of Pseudo Contour Probabilities
      anova(fit1)
      

    ############################################
    # The Australian Institute of Sport's data
    # (Skew data example)
    ############################################
      data(sports)
      attach(sports)

    # Initial state
      state <- NULL

    # MCMC parameters

      nburn<-5000
      nsave<-10000
      nskip<-20
      ndisplay<-100
      mcmc <- list(nburn=nburn,nsave=nsave,nskip=nskip,
                   ndisplay=ndisplay)

    # Prior information
      prior <- list(alpha=1,beta0=rep(0,3),Sbeta0=diag(1000,3),
                    tau1=0.01,tau2=0.01,M=8)

    # Fit the model

      fit2 <- PTlm(formula=bmi~lbm+gender,prior=prior,mcmc=mcmc,
                   state=state,status=TRUE) 

    # Summary with HPD and Credibility intervals
      summary(fit2)
      summary(fit2,hpd=FALSE)

    # Plot model parameters (to see the plots gradually set ask=TRUE)
      plot(fit2)
      plot(fit2,nfigr=2,nfigc=2)

    # Table of Pseudo Contour Probabilities
      anova(fit2)


## End(Not run)      

DPpackage documentation built on May 1, 2019, 10:23 p.m.

Related to PTlm in DPpackage...