Estimate conditional probability densities using smoothing spline
ANOVA models. The symbolic model specification via formula
follows the same rules as in lm.
sscden(formula, response, type=NULL, data=list(), weights, subset,
       na.action=na.omit, alpha=1.4, id.basis=NULL, nbasis=NULL,
       seed=NULL, ydomain=as.list(NULL), yquad=NULL, prec=1e7,
       maxiter=30, skip.iter=FALSE)
sscden1(formula, response, type=NULL, data=list(), weights, subset,
        na.action=na.omit, alpha=1.4, id.basis=NULL, nbasis=NULL,
        seed=NULL, rho=list("xy"), ydomain=as.list(NULL), yquad=NULL,
        prec=1e7, maxiter=30, skip.iter=FALSE)

formula 
Symbolic description of the model to be fit. 
response 
Formula listing response variables. 
type 
List specifying the type of spline for each variable.
See 
data 
Optional data frame containing the variables in the model. 
weights 
Optional vector of counts for duplicated data. 
subset 
Optional vector specifying a subset of observations to be used in the fitting process. 
na.action 
Function which indicates what should happen when the data contain NAs. 
alpha 
Parameter defining crossvalidation scores for smoothing parameter selection. 
id.basis 
Index of observations to be used as "knots." 
nbasis 
Number of "knots" to be used. Ignored when

seed 
Seed to be used for the random generation of "knots."
Ignored when 
ydomain 
Data frame specifying marginal support of conditional density. 
yquad 
Quadrature for calculating integral on Y domain. Mandatory if response variables other than factors or numerical vectors are involved. 
prec 
Precision requirement for internal iterations. 
maxiter 
Maximum number of iterations allowed for internal iterations. 
skip.iter 
Flag indicating whether to use initial values of
theta and skip theta iteration. See 
rho 
rho function needed for sscden1. 
The model is specified via formula and response, where response
lists the response variables. For example, sscden(~y*x,~y)
prescribes a model of the form
log f(y|x) = g_{y}(y) + g_{xy}(x,y) + C(x)
with the terms denoted by "y" and "y:x"; the term(s) not
involving response(s) are removed and the constant C(x) is
determined by the fact that a conditional density integrates to one
on the y axis. sscden1 does keep terms not involving
response(s) during estimation, although those terms cancel out when
one evaluates the estimated conditional density.
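To make the role of C(x) concrete, here is a minimal base-R sketch of the normalization step. The functions g_y and g_xy below are made-up stand-ins for fitted spline terms, and the y domain c(-6, 6) is an arbitrary choice for illustration; none of this is the package's internal code.

```r
# Hypothetical log-density terms; in a real fit these would be estimated splines.
g_y  <- function(y) -0.5 * y^2
g_xy <- function(x, y) 0.8 * x * y

# Unnormalized conditional density exp(g_y(y) + g_xy(x, y)).
unnorm <- function(y, x) exp(g_y(y) + g_xy(x, y))

# C(x) is minus the log of the integral over the y domain,
# so that f(y|x) integrates to one for every x.
C <- function(x, ydomain = c(-6, 6)) {
  -log(integrate(unnorm, ydomain[1], ydomain[2], x = x)$value)
}

cond_density <- function(y, x, ydomain = c(-6, 6)) {
  exp(g_y(y) + g_xy(x, y) + C(x, ydomain))
}

# Check: the conditional density at x = 0.5 integrates to (numerically) one.
total <- integrate(cond_density, -6, 6, x = 0.5)$value
```

Any term depending on x alone would multiply numerator and normalizing integral by the same factor, which is why such terms cancel when the conditional density is evaluated.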
The model terms are sums of unpenalized and penalized terms. Attached to every penalized term there is a smoothing parameter, and the model complexity is largely determined by the number of smoothing parameters.
A subset of the observations are selected as "knots." Unless
specified via id.basis or nbasis, the number of
"knots" q is determined by max(30,10n^{2/9}), which is
appropriate for the default cubic splines for numerical vectors.
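As a quick illustration of how slowly the default number of "knots" grows with the sample size n, the rule can be sketched in base R. The helper name default_nbasis is hypothetical, and rounding up to an integer is an assumption of this sketch rather than something stated above.

```r
# Hypothetical helper: default number of "knots" for sample size n.
# Rounding up with ceiling() is an assumption made for this sketch.
default_nbasis <- function(n) max(30, ceiling(10 * n^(2/9)))

# Number of "knots" for a few sample sizes.
q <- sapply(c(90, 1000, 10000), default_nbasis)
```

For small samples the floor of 30 dominates; the n^{2/9} term only takes over for n in the hundreds.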
sscden returns a list object of class "sscden".
sscden1 returns a list object of class c("sscden1","sscden").
dsscden and cdsscden can be used to
evaluate the estimated conditional density f(y|x) and
f(y1|x,y2); psscden, qsscden, cpsscden, and cqsscden can be used to
evaluate conditional cdf and quantiles. The methods
project.sscden or project.sscden1 can
be used to calculate the Kullback-Leibler or square-error
projections for model selection.
Default quadrature on the Y domain will be constructed for numerical
vectors on a hyper cube, then outer product with factor levels will
be taken if factors are involved. The sides of the hyper cube are
specified by ydomain; for ydomain$y missing, the default
is c(min(y),max(y))+c(-1,1)*(max(y)-min(y))*.05.
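The default side of the hyper cube can be sketched in base R, assuming the default extends the observed range of y by 5% of its width on each side. The helper name default_ydomain is hypothetical.

```r
# Hypothetical helper: default y domain for a numerical response y,
# i.e. the observed range widened by 5% of its width on both ends.
default_ydomain <- function(y) {
  r <- range(y)
  r + c(-1, 1) * diff(r) * 0.05
}

dom <- default_ydomain(c(0, 2.5, 10))
```

Supplying ydomain explicitly, as a data frame with one column per response, overrides this default when the support is known in advance.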
On a 1-D interval, the quadrature is the 200-point Gauss-Legendre
formula returned from gauss.quad. For multiple
numerical vectors, delayed Smolyak cubatures from
smolyak.quad are used on cubes with the marginals
properly transformed; see Gu and Wang (2003) for the marginal
transformations.
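For readers who want to see what a Gauss-Legendre rule on an interval looks like, the following base-R sketch builds one with the Golub-Welsch eigenvalue method. This is a stand-in technique chosen for illustration, not necessarily how gauss.quad is implemented; the function name gauss_legendre and the component names pt and wt are choices made for this sketch.

```r
# Sketch: n-point Gauss-Legendre nodes and weights on a given interval,
# computed by the Golub-Welsch method (eigen-decomposition of the Jacobi matrix).
gauss_legendre <- function(n, interval = c(-1, 1)) {
  k <- seq_len(n - 1)
  off <- k / sqrt(4 * k^2 - 1)        # off-diagonal entries for Legendre polynomials
  J <- matrix(0, n, n)
  J[cbind(k, k + 1)] <- off
  J[cbind(k + 1, k)] <- off
  e <- eigen(J, symmetric = TRUE)
  ord <- order(e$values)
  nodes <- e$values[ord]              # nodes on [-1, 1]
  wts <- 2 * e$vectors[1, ord]^2      # weights on [-1, 1]
  a <- interval[1]; b <- interval[2]
  # Affine map from [-1, 1] to [a, b].
  list(pt = (a + b) / 2 + (b - a) / 2 * nodes,
       wt = (b - a) / 2 * wts)
}

# A 200-point rule on [49, 61]; integrals become weighted sums over the nodes.
quad <- gauss_legendre(200, c(49, 61))
int_y2 <- sum(quad$wt * quad$pt^2)    # approximates the integral of y^2 on [49, 61]
```

With nodes and weights in hand, the normalizing integral of a conditional density reduces to a weighted sum of the unnormalized density at the nodes.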
The results may vary from run to run. For consistency, specify
id.basis or set seed.
For reasonable execution time in high dimensions, set
skip.iter=TRUE.
Chong Gu
Gu, C. (1995), Smoothing spline density estimation: Conditional distribution. Statistica Sinica, 5, 709–726. Springer-Verlag.
Gu, C. (2014), Smoothing Spline ANOVA Models: R Package gss. Journal of Statistical Software, 58(5), 1-25. URL http://www.jstatsoft.org/v58/i05/.
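library(gss)
## Penny thickness data; fix the seed so the random "knots" are reproducible
data(penny); set.seed(5732)
## Conditional density of thickness (mil) given year
fit <- sscden1(~year*mil,~mil,data=penny,
               ydomain=data.frame(mil=c(49,61)))
## Conditional quantiles on a grid of years
yy <- 1944+(0:92)/2
quan <- qsscden(fit,c(.05,.25,.5,.75,.95),
                data.frame(year=yy))
## Jittered scatter plot with the five quantile curves overlaid
plot(penny$year+.1*rnorm(90),penny$mil,ylim=c(49,61))
for (i in 1:5) lines(yy,quan[i,])
## Clean up
## Not run: rm(penny,yy,quan)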
