predict.sm | R Documentation |
predict
method for class "sm".
## S3 method for class 'sm'
predict(object, newdata = NULL, se.fit = FALSE,
interval = c("none", "confidence", "prediction"),
level = 0.95, type = c("response", "terms"),
terms = NULL, na.action = na.pass,
intercept = NULL, combine = TRUE, design = FALSE,
check.newdata = TRUE, ...)
object |
a fit from |
newdata |
an optional list or data frame in which to look for variables with which to predict. If omitted, the original data are used. |
se.fit |
a switch indicating if standard errors are required. |
interval |
type of interval calculation. Can be abbreviated. |
level |
tolerance/confidence level. |
type |
type of prediction (response or model term). Can be abbreviated. |
terms |
which terms to include in the fit. The default of |
na.action |
function determining what should be done with missing values in |
intercept |
a switch indicating if the intercept should be included in the prediction. If |
combine |
a switch indicating if the parametric and smooth components of the prediction should be combined (default) or returned separately. |
design |
a switch indicating if the model (design) matrix for the prediction should be returned. |
check.newdata |
a switch indicating if the |
... |
additional arguments affecting the prediction produced (currently ignored). |
Inspired by the predict.lm
function in R's stats package.
Produces predicted values, obtained by evaluating the regression function in the frame newdata
(which defaults to model.frame(object)
). If the logical se.fit
is TRUE
, standard errors of the predictions are calculated. Setting interval
s specifies computation of confidence or prediction (tolerance) intervals at the specified level, sometimes referred to as narrow vs. wide intervals.
If newdata
is omitted the predictions are based on the data used for the fit. Regardless of the newdata
argument, how cases with missing values are handled is determined by the na.action
argument. If na.action = na.omit
omitted cases will not appear in the predictions, whereas if na.action = na.exclude
they will appear (in predictions, standard errors or interval limits), with value NA
.
Similar to the lm
function, setting type = "terms"
returns a matrix giving the predictions for each of the requested model terms
. Unlike the lm
function, this function allows for predictions using any subset of the model terms. Specifically, when type = "response"
the predictions will only include the requested terms
, which makes it possible to obtain estimates (and standard errors and intervals) for subsets of model terms. In this case, the newdata
only needs to contain data for the subset of variables that are requested in terms
.
Default use returns a vector of predictions. Otherwise the form of the output will depend on the combination of argumments: se.fit
, interval
, type
, combine
, and design
.
type = "response"
:
When se.fit = FALSE
and design = FALSE
, the output will be the predictions (possibly with lwr
and upr
interval bounds). When se.fit = TRUE
or design = TRUE
, the output is a list with components fit
, se.fit
(if requested), and X
(if requested).
type = "terms"
:
When se.fit = FALSE
and design = FALSE
, the output will be the predictions for each term (possibly with lwr
and upr
interval bounds). When se.fit = TRUE
or design = TRUE
, the output is a list with components fit
, se.fit
(if requested), and X
(if requested).
Regardless of the type
, setting combine = FALSE
decomposes the requested result(s) into the parametric and smooth contributions.
Nathaniel E. Helwig <helwig@umn.edu>
https://stat.ethz.ch/R-manual/R-devel/library/stats/html/predict.lm.html
Craven, P. and Wahba, G. (1979). Smoothing noisy data with spline functions: Estimating the correct degree of smoothing by the method of generalized cross-validation. Numerische Mathematik, 31, 377-403. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1007/BF01404567")}
Gu, C. (2013). Smoothing spline ANOVA models, 2nd edition. New York: Springer. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1007/978-1-4614-5369-7")}
Helwig, N. E. (2020). Multiple and Generalized Nonparametric Regression. In P. Atkinson, S. Delamont, A. Cernat, J. W. Sakshaug, & R. A. Williams (Eds.), SAGE Research Methods Foundations. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.4135/9781526421036885885")}
sm
# generate data
set.seed(1)
n <- 100
x <- seq(0, 1, length.out = n)
z <- factor(sample(letters[1:3], size = n, replace = TRUE))
fun <- function(x, z){
mu <- c(-2, 0, 2)
zi <- as.integer(z)
fx <- mu[zi] + 3 * x + sin(2 * pi * x + mu[zi]*pi/4)
}
fx <- fun(x, z)
y <- fx + rnorm(n, sd = 0.5)
# define marginal knots
probs <- seq(0, 0.9, by = 0.1)
knots <- list(x = quantile(x, probs = probs),
z = letters[1:3])
# fit sm with specified knots
smod <- sm(y ~ x * z, knots = knots)
# get model "response" predictions
fit <- predict(smod)
mean((smod$fitted.values - fit)^2)
# get model "terms" predictions
trm <- predict(smod, type = "terms")
attr(trm, "constant")
head(trm)
mean((smod$fitted.values - rowSums(trm) - attr(trm, "constant"))^2)
# get predictions with "newdata" (= the original data)
fit <- predict(smod, newdata = data.frame(x = x, z = z))
mean((fit - smod$fitted.values)^2)
# get predictions and standard errors
fit <- predict(smod, se.fit = TRUE)
mean((fit$fit - smod$fitted.values)^2)
mean((fit$se.fit - smod$se.fit)^2)
# get 99% confidence interval
fit <- predict(smod, interval = "c", level = 0.99)
head(fit)
# get 99% prediction interval
fit <- predict(smod, interval = "p", level = 0.99)
head(fit)
# get predictions only for x main effect
fit <- predict(smod, newdata = data.frame(x = x),
se.fit = TRUE, terms = "x")
plotci(x, fit$fit, fit$se.fit)
# get predictions only for each group
fit.a <- predict(smod, newdata = data.frame(x = x, z = "a"), se.fit = TRUE)
fit.b <- predict(smod, newdata = data.frame(x = x, z = "b"), se.fit = TRUE)
fit.c <- predict(smod, newdata = data.frame(x = x, z = "c"), se.fit = TRUE)
# plot results (truth as dashed line)
plotci(x = x, y = fit.a$fit, se = fit.a$se.fit,
col = "red", col.ci = "pink", ylim = c(-6, 6))
lines(x, fun(x, rep(1, n)), lty = 2, col = "red")
plotci(x = x, y = fit.b$fit, se = fit.b$se.fit,
col = "blue", col.ci = "cyan", add = TRUE)
lines(x, fun(x, rep(2, n)), lty = 2, col = "blue")
plotci(x = x, y = fit.c$fit, se = fit.c$se.fit,
col = "darkgreen", col.ci = "lightgreen", add = TRUE)
lines(x, fun(x, rep(3, n)), lty = 2, col = "darkgreen")
# add legends
legend("bottomleft", legend = c("Truth", "Estimate", "CI"),
lty = c(2, 1, NA), lwd = c(1, 2, NA),
col = c("black", "black","gray80"),
pch = c(NA, NA, 15), pt.cex = 2, bty = "n")
legend("bottomright", legend = letters[1:3],
lwd = 2, col = c("red", "blue", "darkgreen"), bty = "n")
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.