ssllrm: Fitting Smoothing Spline Log-Linear Regression Models
In gss: General Smoothing Splines

ssllrm

R Documentation

Fitting Smoothing Spline Log-Linear Regression Models

Description

Fit smoothing spline log-linear regression models. The symbolic model specification via formula follows the same rules as in lm.

Usage

ssllrm(formula, response, type=NULL, data=list(), weights, subset,
       na.action=na.omit, alpha=1, id.basis=NULL, nbasis=NULL,
       seed=NULL, random=NULL, prec=1e-7, maxiter=30, skip.iter=FALSE)

Arguments

`formula`	Symbolic description of the model to be fit.
`response`	Formula listing response variables.
`type`	List specifying the type of spline for each variable. See `mkterm` for details.
`data`	Optional data frame containing the variables in the model.
`weights`	Optional vector of weights to be used in the fitting process.
`subset`	Optional vector specifying a subset of observations to be used in the fitting process.
`na.action`	Function which indicates what should happen when the data contain NAs.
`alpha`	Parameter modifying GCV or Mallows' CL; larger absolute values yield smoother fits; negative value invokes a stable and more accurate GCV/CL evaluation algorithm but may take two to five times as long. Ignored when `method="m"` are specified.
`id.basis`	Index designating selected "knots".
`nbasis`	Number of "knots" to be selected. Ignored when `id.basis` is supplied.
`seed`	Seed to be used for the random generation of "knots". Ignored when `id.basis` is supplied.
`random`	Input for parametric random effects in nonparametric mixed-effect models. See `mkran` for details.
`prec`	Precision requirement for internal iterations.
`maxiter`	Maximum number of iterations allowed for internal iterations.
`skip.iter`	Flag indicating whether to use initial values of theta and skip theta iteration. See `ssanova` for notes on skipping theta iteration.

Details

The model is specified via formula and response, where response lists the response variables. For example, ssllrm(~y1*y2*x,~y1+y2) prescribe a model of the form

log f(y1,y2|x) = g_{1}(y1) + g_{2}(y2) + g_{12}(y1,y2) + g_{x1}(x,y1) + g_{x2}(x,y2) + g_{x12}(x,y1,y2) + C(x)

with the terms denoted by "y1", "y2", "y1:y2", "y1:x", "y2:x", and "y1:y2:x"; the term(s) not involving response(s) are removed and the constant C(x) is determined by the fact that a conditional density integrates (adds) to one on the y axis.

The model terms are sums of unpenalized and penalized terms. Attached to every penalized term there is a smoothing parameter, and the model complexity is largely determined by the number of smoothing parameters.

A subset of the observations are selected as "knots." Unless specified via id.basis or nbasis, the number of "knots" q is determined by max(30,10n^{2/9}), which is appropriate for the default cubic splines for numerical vectors.

Value

ssllrm returns a list object of class "ssllrm".

The method predict.ssllrm can be used to evaluate f(y|x) at arbitrary x, or contrasts of log{f(y|x)} such as the odds ratio along with standard errors. The method project.ssllrm can be used to calculate the Kullback-Leibler projection for model selection.

Note

The responses, or y-variables, must be factors, and there must be at least one numerical x's. For response, there is no difference between ~y1+y2 and ~y1*y2.

The results may vary from run to run. For consistency, specify id.basis or set seed.

References

Gu, C. and Ma, P. (2011), Nonparametric regression with cross-classified responses. The Canadian Journal of Statistics, 39, 591–609.

Gu, C. (2014), Smoothing Spline ANOVA Models: R Package gss. Journal of Statistical Software, 58(5), 1-25. URL http://www.jstatsoft.org/v58/i05/.

Examples

## Simulate data
test <- function(x)
        {.3*(1e6*(x^11*(1-x)^6)+1e4*(x^3*(1-x)^10))-2}
x <- (0:100)/100
p <- 1-1/(1+exp(test(x)))
y <- rbinom(x,3,p)
y1 <- as.ordered(y)
y2 <- as.factor(rbinom(x,1,p))
## Fit model
fit <- ssllrm(~y1*y2*x,~y1+y2)

## Evaluate f(y|x)
est <- predict(fit,data.frame(x=x),
               data.frame(y1=as.factor(0:3),y2=as.factor(rep(0,4))))
## f(y|x) at all y values (fit$qd.pt)
est <- predict(fit,data.frame(x=x))

## Evaluate contrast of log f(y|x)
est <- predict(fit,data.frame(x=x),odds=c(-1,.5,.5,0),
               data.frame(y1=as.factor(0:3),y2=as.factor(rep(0,4))),se=TRUE)
## Odds ratio log{f(0,0|x)/f(3,0|x)}
est <- predict(fit,data.frame(x=x),odds=c(1,-1),
               data.frame(y1=as.factor(c(0,3)),y2=as.factor(c(0,1))),se=TRUE)

## KL projection
kl <- project(fit,include=c("y2:x","y1:y2","y1:x","y2:x"))

## Clean up
## Not run: rm(test,x,p,y,y1,y2,fit,est,kl)
dev.off()
## End(Not run)

gss documentation built on April 14, 2025, 1:09 a.m.