PSgam: Bayesian analysis for a semiparametric generalized additive...

Description Usage Arguments Details Value Author(s) References Examples

Description

This function generates a posterior density sample for a semiparametric generalized additive model, using a B-Splines and penalties.

Usage

1
2

Arguments

formula

a two-sided linear formula object describing the linear predictor of the model, with the response on the left of a ~ operator and the terms, separated by + operators, on the right. Built-in nonparametric smoothing terms are indicated by ps for smoothing splines terms. See the documentation for ps for its arguments.

family

a description of the error distribution and link function to be used in the model. This can be a character string naming a family function, a family function or the result of a call to a family function. The families(links) considered by PSgam so far are gaussian(identity), binomial(logit), binomial(probit), Gamma(log), and poisson(log).

offset

this can be used to specify an a priori known component to be included in the linear predictor during the fitting (only for poisson and gamma models).

n

this can be used to indicate the total number of cases in a binomial model (only implemented for the logistic link). If it is not specified the response variable must be binary.

prior

a list giving the prior information. The list include the following parameters: beta0 and Sbeta0 giving the hyperparameters of the normal prior distribution for the parametric part of the model, taub1 and taub2 giving the hyperparameters for the prior distribution of the inverse of the penalty parameters (the same for all), and tau1 and tau2 giving the hyperparameters for the prior distribution of the inverse of the dispersion parameter (only gaussian and Gamma models).

mcmc

a list giving the MCMC parameters. The list must include the following integers: nburn giving the number of burn-in scans, nskip giving the thinning interval, nsave giving the total number of scans to be saved, and ndisplay giving the number of saved scans to be displayed on screen (the function reports on the screen when every ndisplay iterations have been carried out).

state

a list giving the current value of the parameters. This list is used if the current analysis is the continuation of a previous analysis.

status

a logical variable indicating whether this run is new (TRUE) or the continuation of a previous analysis (FALSE). In the latter case the current value of the parameters must be specified in the object state.

ngrid

number of grid points where the smoothers are evaluated. The default value is 20.

data

data frame.

na.action

a function that indicates what should happen when the data contain NAs. The default action (na.fail) causes PSgam to print an error message and terminate if there are any incomplete observations.

Details

This generic function fits a generalized additive model (see, e.g., Hastie and Tibshirani, 1990) using Penalized splines (see, e.g., Eilers and Marx, 1996; Lang and Brezger, 2004). The linear predictor is modeled as follows:

etai = Xi beta + f1(x1i)+...+fp(xpi), i=1,…,n

where the effect f of the a covariate x is approximated by a polinomial spline with equally spaced knots, written in terms of a linear combination of B-spline basis functions. Specifically, the function f is aproximated by a spline of degree l with r equally spaced knots within the domain of x. It is well known that this spline can be written in terms of a linear combination of q=l+r B-spline basis,

f(x)=sum bj Bj(x).

The computational implementation of the model is model-specific.

For the poisson, Gamma, and binomial(logit), the full conditional distributions for fixed and random effects are generated through the Metropolis-Hastings algorithm with a IWLS proposal (see, West, 1985 and Gamerman, 1997).

For the binomial(probit) case the following latent variable representation is used:

yij = I(wij > 0), j=1,…,ni.

Value

An object of class PSgam representing the generalized additive model fit. Generic functions such as anova, print, plot, and summary have methods to show the results of the fit. The results include the parametric component of the linear predictor beta, the dispersion parameter of the Gamma or gaussian model, and the penalty parameters sigmab.

The list state in the output object contains the current value of the parameters necessary to restart the analysis. If you want to specify different starting values to run multiple chains set status=TRUE and create the list state based on this starting values. In this case the list state must include the following objects:

b

a vector of dimension q giving the value of the B-spline coefficients.

beta

giving the value of the parametric components of the linear predictor.

sigmab

giving the penalty parameters.

phi

giving the dispersion parameter for the Gamma or gaussian model (if needed).

Author(s)

Alejandro Jara <atjara@uc.cl>

References

Eilers, P.H.C. and Marx, B.D. (1996) Flexible Smoothing with B-splines and penalties. Statistical Science, 11(2): 89-121.

Gamerman, D. (1997) Sampling from the posterior distribution in generalized linear mixed models. Statistics and Computing, 7: 57-68.

Hastie, T. and Tibshirani, R. (1990) Generalized Additive Models. London: Chapman and Hall.

Lang, S., Brezger, A. (2004) Bayesian P-Splines Journal of Computational and Graphical Statistics, 13: 183-212.

West, M. (1985) Generalized linear models: outlier accomodation, scale parameter and prior distributions. In Bayesian Statistics 2 (eds Bernardo et al.), 531-558, Amsterdam: North Holland.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
## Not run: 

 # Normal simulated data
   set.seed(0)
   n <- 400
   sig <- 2
   x0 <- runif(n, 0, 1)
   x1 <- runif(n, 0, 1)
   x2 <- runif(n, 0, 1)
   x3 <- runif(n, 0, 1)
   f0 <- function(x) 2 * sin(pi * x)
   f1 <- function(x) exp(2 * x)
   f2 <- function(x) 0.2*x^11*(10*(1-x))^6+10*(10*x)^3*(1-x)^10
   f3 <- function(x) 0*x
   f <- f0(x0) + f1(x1) + f2(x2)
   e <- rnorm(n, 0, sig)
   y <- f + e

 # prior
   prior <- list(taub1=2.02,
                 taub2=0.02,
                 beta0=rep(0,1),
                 Sbeta0=diag(100,1),
                 tau1=6.01,
                 tau2=2.01)

  # Initial state
    state <- NULL

  # MCMC parameters
    nburn <- 5000
    nsave <- 5000
    nskip <- 0
    ndisplay <- 100
    mcmc <- list(nburn=nburn,
                 nsave=nsave,
                 nskip=nskip,
                 ndisplay=ndisplay)


  # fitting the model
    fit1 <- PSgam(formula=y~ps(x0,x1,x2,x3,k=20,degree=3,pord=1),
                  family=gaussian(),prior=prior,
                  mcmc=mcmc,ngrid=30,
                  state=state,status=TRUE)


  # A binary example 
    g <- (f-5)/3
    g <- binomial()$linkinv(g)
    y <- rbinom(n,1,g)

  # fitting the model
    fit2 <- PSgam(formula=y~ps(x0,x1,x2,x3,k=20,degree=3,pord=1),
                  family=binomial(logit),prior=prior,
                  mcmc=mcmc,ngrid=30,
                  state=state,status=TRUE)

  # Poisson data
    g <- exp(f/4)
    y <- rpois(n,g)

  # fitting the model
    fit3 <- PSgam(formula=y~ps(x0,x1,x2,x3,k=20,degree=3,pord=1),
                  family=poisson(log),prior=prior,
                  mcmc=mcmc,ngrid=30,
                  state=state,status=TRUE)

## End(Not run)

DPpackage documentation built on May 1, 2019, 10:23 p.m.

Related to PSgam in DPpackage...