predict.sop: Prediction from a fitted SOP model

View source: R/predict.sop.R

predict.sopR Documentation

Prediction from a fitted SOP model

Description

The function takes a fitted sop object and produces predictions for the original data if the argument newdata is not set or predictions for new data if newdata is specified. Predictions can be accompanied by standard errors, based on the Bayesian posterior distribution of the model coefficients.

Usage

## S3 method for class 'sop'
predict(object, newdata, type = c("response", "link", "terms"), 
      se.fit = FALSE, ...)

Arguments

object

a fitted sop object as produced by sop().

newdata

a data frame containing the values of the model covariates at which predictions are required. If this is not provided then predictions corresponding to the original data are returned. If the data frame newdata is provided then it should contain all the variables needed for prediction: a warning is generated if not. If newdata contains a variable offset, it is included into the predictions when type = "link" and type = "response".

type

When this has the value "link" the linear predictor fitted values or predictions (possibly with associated standard errors) are returned. When type = "terms" each component of the linear predictor is returned separately (possibly with approximate standard errors): this includes parametric model components, followed by each smooth component, but excludes any offset and any intercept. When type = "response" (default) fitted values or predictions on the scale of the response are returned (possibly with approximate standard errors).

se.fit

when this is TRUE (not default) standard error estimates are returned for each prediction.

...

other arguments. Not yet implemented.

Value

A vector/matrix (or list, with elements fit and se.fit, is se = TRUE) equal to:

"link"

a vector of linear predictor values.

"response"

a vector of linear predictor values on the scale of the response.

"terms"

a matrix with a column per term, and may have an attribute "constant".

See Also

sop, plot.sop

Examples

library(SOP)
## Example training/set
# Simulate the data
set.seed(123)
n <- 1000
sigma <- 0.5
x <- runif(n)
f0 <- function(x)2*sin(pi*x)
f <- f0(x)
y <- f + rnorm(n, 0, sigma)
da <- data.frame(x = x, y = y)# all data
rand <-  sample(2, 610, replace=TRUE, prob=c(0.6,0.4))
traindata <- da[rand==1,] # training data
valdata <- da[rand==2,] # validation data  
plot(y ~ x, data = traindata, pch = 20, col = gray(.7))
points(y ~ x, data = valdata, pch = 20, col = gray(.2))

# Fit the model in the training data
m0 <- sop(formula = y ~ f(x, nseg = 10), data = traindata)
lines(fitted(m0)[order(traindata$x)]~traindata$x[order(traindata$x)], 
       col="red", lwd=2)

# Predict and plot in the data used for the fit
po <- predict(m0)
plot(y ~ x, data = traindata, pch = 20, col = gray(.7))
lines(po[order(traindata$x)] ~ traindata$x[order(traindata$x)], 
      col="red", lwd=2)

# Predict and plot in new data
pn <- predict(m0, newdata = valdata)
plot(y ~ x, data = traindata, pch = 20, col = gray(.7))
lines(pn[order(valdata$x)] ~ valdata$x[order(valdata$x)], col = "yellow", lwd = 2)

# Example Gamma distribution
# Simulate the data
set.seed(123)
n <- 1000
alpha <- 0.75
x0 <- runif(n)
x1 <- x0*alpha + (1-alpha)*runif(n)
x2 <- runif(n)
x3 <- x2*alpha + (1-alpha)*runif(n)
x4 <- runif(n)
x5 <- runif(n)

f0 <- function(x)2*sin(pi*x)
f1 <- function(x)exp(2*x)
f2 <- function(x) 0.2*x^11*(10*(1-x))^6+10*(10*x)^3*(1-x)^10

f <- f0(x0) + f1(x1) + f2(x2)
y <- rgamma(f,exp(f/4),scale=1.2)

df <- data.frame(y = y, x0 = x0, x1 = x1, x2 = x2, x3 = x3, x4 = x4, x5 = x5)

# Fit the model
m1 <- sop(formula = y ~ f(x0, nseg = 17) + f(x1, nseg = 17) + 
      f(x2, nseg = 17) + f(x3, nseg = 17) + 
      f(x4, nseg = 17) + f(x5, nseg = 17), 
      family = Gamma(link = log), data = df)
summary(m1)

# Predict in a new dataframe
x <- seq(max(c(min(x1),min(x3))), min(c(max(x1),max(x3))), l = 100)
df.p <- data.frame(x0 = x, x1 = x, x2 = x, x3 = x, x4 = x, x5 = x)
p <- predict(m1, type = "terms", newdata = df.p)
colnames(p)

# Plot the different smooth terms
op <- par(mfrow = c(2,3))
plot(m1, select = 1)
lines(x, p[,1], col = "red")
plot(m1, select = 2)
lines(x, p[,2], col = "red")
plot(m1, select = 3)
lines(x, p[,3], col = "red")
plot(m1, select = 4)
lines(x, p[,4], col = "red")
plot(m1, select = 5)
lines(x, p[,5], col = "red")
plot(m1, select = 6)
lines(x, p[,6], col = "red")
par(op)

SOP documentation built on Sept. 16, 2023, 1:07 a.m.