NGeDS constructs a Geometrically Designed variable knots spline regression model, referred to as a GeDS model, for a response having a Normal distribution.
formula |
a description of the structure of the model to be fitted, including the
dependent and independent variables. See |
data |
an optional data frame, list or environment containing the variables of the model.
If not found in |
weights |
an optional vector of ‘prior weights’ to be put on
the observations in the fitting
process in case the user requires weighted GeDS fitting.
It should be |
beta |
numeric parameter in the interval [0,1] tuning the knot placement in stage A of GeDS. See details. |
phi |
numeric parameter in the interval [0,1] specifying the threshold for
the stopping rule (model selector) in stage A of GeDS. See also |
min.intknots |
optional parameter allowing the user to set a minimum number of internal knots required. By default equal to zero. |
max.intknots |
optional parameter allowing the user to set a maximum number of internal knots to be added by the GeDS estimation algorithm. By default equal to the number of knots for the saturated GeDS model. |
q |
numeric parameter that fine-tunes the stopping rule of stage A of GeDS, by default equal to 2. See details. |
Xextr |
numeric vector of 2 elements representing the left-most and right-most limits of the interval embedding the observations of the first independent variable. See details. |
Yextr |
numeric vector of 2 elements representing the left-most and right-most limits of the interval embedding the observations of the second independent variable (if the bivariate GeDS is run). See details. |
show.iters |
logical variable indicating whether or not to print information at each step. |
stoptype |
a character string indicating the type of GeDS stopping rule to be used. It should be one of "RD", "SR" or "LR". |
The NGeDS function implements the GeDS methodology developed by Kaishev et al. (2016) and extended in the GGeDS function to the more general GNM (GLM) context, which allows the response to have any distribution from the Exponential Family.
Under the GeDS approach the (non-)linear predictor is viewed as a spline with variable knots, which are estimated along with the regression coefficients and the order of the spline using a two-stage algorithm.
In stage A, a linear variable-knot spline is fitted to the data by iteratively applying least squares regression (see the lm function). In stage B, a Schoenberg variation diminishing spline approximation to the fit from stage A is constructed, thus simultaneously producing spline fits of order 2, 3 and 4, all of which are included in the output, a GeDS-Class object.
As noted in formula, the argument formula allows the user to specify models with two components: a spline regression (non-parametric) component involving part of the independent variables, identified through the function f, and an optional parametric component involving the remaining independent variables.
For NGeDS, one or two independent variables are allowed for the spline component and arbitrarily many independent variables for the parametric component.
Failure to specify the independent variable for the spline regression component through the function f will return an error.
See formula.
Within the argument formula, as in other R functions, it is possible to specify one or more offset variables, i.e. known terms with fixed regression coefficients equal to 1.
These terms should be identified via the function offset.
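The three roles a variable can play inside formula can be made concrete with a small base R sketch. It only inspects a formula symbolically; it does not call NGeDS and is not a description of how the package parses formulas internally. The variable names Y, X, Z and W are hypothetical.

```r
# Hypothetical formula: spline component f(X), parametric term Z, offset W
fml <- Y ~ f(X) + Z + offset(W)

# terms() works symbolically, so neither data nor a definition of f() is needed
tt <- terms(fml, specials = "f")

# Positions refer to the variable list (Y, f(X), Z, offset(W))
spline_pos <- attr(tt, "specials")$f   # position of the f() component
offset_pos <- attr(tt, "offset")       # position of the offset term

# Offsets have a fixed coefficient, so they are excluded from the term labels
term_labels <- attr(tt, "term.labels")
print(term_labels)
```

Here f(X) is the spline component, Z is the parametric component and W enters with a coefficient fixed at 1.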
The parameter beta tunes the placement of a new knot in stage A of the algorithm.
Once a current second-order spline is fitted to the data, the regression residuals are computed and grouped by their sign.
A new knot is placed at a location defined by the group for which a certain measure attains its maximum.
This measure is a weighted linear combination of the mean of the absolute residuals within each group and the range of the group's x-values, with weights beta and 1 - beta, respectively.
The higher beta is, the more weight is put on the mean of the residuals and the less on the range of their corresponding x-values. The default value of beta is 0.5.
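The effect of beta can be illustrated with a minimal base R sketch. It is not the package's implementation: the helper pick_group is hypothetical, and scaling each component by its maximum is an assumption made here so that the two quantities are comparable.

```r
# Illustrative only: choose the residual group where a new knot would go.
# x must be sorted; res are the residuals of the current linear spline fit.
pick_group <- function(x, res, beta = 0.5) {
  # Consecutive residuals with the same sign form one group
  runs <- rle(sign(res))
  group <- rep(seq_along(runs$lengths), runs$lengths)

  # Within-group mean absolute residual and range of the x-values
  mean_abs <- tapply(abs(res), group, mean)
  x_range  <- tapply(x, group, function(v) diff(range(v)))

  # Assumed normalisation: scale each component by its maximum
  m <- mean_abs / max(mean_abs)
  r <- if (max(x_range) > 0) x_range / max(x_range) else x_range

  # beta weights the mean residual, 1 - beta weights the x-range
  measure <- beta * m + (1 - beta) * r
  list(group = group, measure = measure, best = unname(which.max(measure)))
}

x   <- 1:8
res <- c(0.1, 0.2, -3, -2.5, 0.3, 0.2, 0.1, 0.4)
pick_group(x, res, beta = 0.9)$best   # group with the large residuals wins
pick_group(x, res, beta = 0.1)$best   # group with the widest x-range wins
```

With beta close to 1 the knot goes where the fit is worst on average; with beta close to 0 it goes where residuals of the same sign persist over the longest stretch of x.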
The argument stoptype allows the user to choose between three alternative stopping rules for the knot selection in stage A of GeDS: "RD", which stands for Ratio of Deviances; "SR", which stands for Smoothed Ratio of deviances; and "LR", which stands for Likelihood Ratio.
The latter is based on the difference of deviances rather than on their ratio, as in the case of "RD" and "SR". Therefore "LR" can be viewed as a log likelihood ratio test performed at each iteration of the knot placement.
In each of these cases the corresponding stopping criterion is compared with a threshold value phi (see below).
The argument phi provides the threshold value required for the stopping rule to exit the knot placement in stage A of GeDS.
Under the "RD" and "SR" stopping rules, the higher the value of phi, the more knots are added. Under the "LR" stopping rule the opposite holds: the lower phi is, the more knots are included in the spline regression. Further details on each of the three alternative stopping rules can be found in Dimitrova et al. (2017).
The argument q is an input parameter that fine-tunes the stopping rule in stage A.
It identifies the number of consecutive iterations over which the deviance should exhibit stable convergence before the knot placement in stage A is terminated.
More precisely, under any of the rules "RD", "SR" or "LR", the deviance at the current iteration is compared to the deviance computed q iterations before, i.e. before selecting the last q knots. Setting a higher q will lead to more knots being added before exiting stage A of GeDS.
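The interplay between phi and q can be sketched in base R for a ratio-type rule. This is a simplified illustration, not the package's code: the helper first_stop and the deviance sequence are hypothetical, and the exit condition (ratio of the current deviance to the one q iterations earlier reaching phi) is an assumption; the exact definitions are given in Dimitrova et al. (2017).

```r
# Illustrative only: ratio-of-deviances check over a window of q iterations.
# dev[k] is assumed to hold the deviance after the k-th stage A iteration.
first_stop <- function(dev, phi = 0.99, q = 2) {
  for (k in seq_along(dev)) {
    if (k <= q) next                  # not enough history yet
    ratio <- dev[k] / dev[k - q]      # current vs. q iterations earlier
    if (ratio >= phi) return(k)       # deviance has stabilised: stop
  }
  NA_integer_                         # rule never triggered
}

# Hypothetical deviances that drop quickly and then level off
dev <- c(100, 60, 40, 39.9, 39.5, 39.45, 39.44)
first_stop(dev, phi = 0.9,   q = 2)
first_stop(dev, phi = 0.995, q = 2)
first_stop(dev, phi = 0.9,   q = 3)
```

In this sketch, raising phi or q delays the exit, so more knots would be added before stage A ends.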
A GeDS-Class object, i.e. a list of items that summarizes the main details of the fitted GeDS regression. See GeDS-Class for details.
Some S3 methods are available to make these objects tractable, such as coef, deviance, knots, predict and print, as well as S4 methods for lines and plot.
Kaishev, V.K., Dimitrova, D.S., Haberman, S. and Verrall, R.J. (2016).
Geometrically designed, variable knot regression splines.
Computational Statistics, 31, 1079–1105.
DOI: doi.org/10.1007/s00180-015-0621-7
Dimitrova, D.S., Kaishev, V.K., Lattuada, A. and Verrall, R.J. (2017). Geometrically designed, variable knot splines in Generalized (Non-)Linear Models.
GGeDS; GeDS-Class; S3 methods such as coef.GeDS, deviance.GeDS, knots.GeDS, print.GeDS and predict.GeDS; Integrate and Derive; PPolyRep.
# Load the package that provides NGeDS
library(GeDS)
###################################################
# Generate a data sample for the response variable
# Y and the single covariate X
set.seed(123)
N <- 500
f_1 <- function(x) (10*x/(1+100*x^2))*4+4
X <- sort(runif(N, min = -2, max = 2))
# Specify a model for the mean of Y to include only a component
# non-linear in X, defined by the function f_1
means <- f_1(X)
# Add (Normal) noise to the mean of Y
Y <- rnorm(N, means, sd = 0.1)
# Fit a Normal GeDS regression using NGeDS
(Gmod <- NGeDS(Y ~ f(X), beta = 0.6, phi = 0.995, Xextr = c(-2,2)))
# Apply some of the available methods, e.g.
# coefficients, knots and deviance extractions for the
# quadratic GeDS fit
# Note that the first call to the function knots returns
# also the left and right limits of the interval containing
# the data
coef(Gmod, n = 3)
knots(Gmod, n = 3)
knots(Gmod, n = 3, options = "internal")
deviance(Gmod, n = 3)
# Add a covariate, Z, that enters linearly
Z <- runif(N)
Y2 <- Y + 2*Z + 1
# Re-fit the data using NGeDS
(Gmod2 <- NGeDS(Y2 ~ f(X) + Z, beta = 0.6, phi = 0.995, Xextr = c(-2,2)))
coef(Gmod2, n = 3)
coef(Gmod2, onlySpline = FALSE, n = 3)
## Not run:
##########################################
# Real data example
# See Kaishev et al. (2016), section 4.2
data('BaFe2As2')
(Gmod2 <- NGeDS(intensity ~ f(angle), data = BaFe2As2, beta = 0.6, phi = 0.99, q = 3))
plot(Gmod2)
## End(Not run)
#########################################
# bivariate example
# See Dimitrova et al. (2017), section 5
# Generate a data sample for the response variable
# Z and the covariates X and Y assuming Normal noise
set.seed(123)
doublesin <- function(x){
  sin(2*x[,1])*sin(2*x[,2])
}
x <- (round(runif(400, min = 0, max = 3),2))
y <- (round(runif(400, min = 0, max = 3),2))
z <- doublesin(cbind(x,y))
z <- z+rnorm(400, 0, sd = 0.1)
# Fit a two dimensional GeDS model using NGeDS
(BivGeDS <- NGeDS(z ~ f(x, y), phi = 0.9, beta = 0.3,
                  Xextr = c(0, 3), Yextr = c(0, 3)))