cubespline: Smooth cubic spline estimation



Description

Estimates a smooth cubic spline for a model of the form y = f(z) + Xβ + u. The function divides the range of z into knots+1 equal intervals. The regression is

y = cons + λ_1 (z-z_0) + λ_2 (z-z_0)^2 + λ_3 (z-z_0)^3 + ∑_{k=1}^{K} γ_k (z-z_k)^3 D_k + Xβ + u

where z_0 = min(z), z_1, ..., z_K are the knots, and D_k = 1 if z >= z_k and 0 otherwise. Estimation can be carried out for a fixed value of K or for a range of K. In the latter case, the function reports the value of K that produces the lowest value of one of the following criteria: the Akaike information criterion (AIC), the Schwarz information criterion, or the generalized cross-validation (GCV) criterion.
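The regression described above can be sketched in base R without the package. This is a minimal illustration on simulated data, not the package's source code: the variable names (poly_part, knot_part) and the exact knot placement are assumptions based on the "knots+1 equal intervals" description.

```r
# Simulated data; all object names here are illustrative, not part of McSpatial.
set.seed(1)
n <- 200
z <- runif(n, 0, 10)
y <- sin(z) + rnorm(n, sd = 0.3)

K  <- 3                                   # number of knots
z0 <- min(z)
# Splitting the range of z into K+1 equal intervals gives K interior knots
# (assumed placement).
knots <- z0 + (1:K) * (max(z) - z0) / (K + 1)

# Polynomial part: (z - z0), (z - z0)^2, (z - z0)^3
poly_part <- cbind(z - z0, (z - z0)^2, (z - z0)^3)
# Knot part: (z - z_k)^3 * D_k, where D_k = 1 when z >= z_k and 0 otherwise
knot_part <- sapply(knots, function(zk) (z - zk)^3 * (z >= zk))

# Ordinary least squares on the constructed regressors
fit <- lm(y ~ poly_part + knot_part)
length(coef(fit))   # intercept + 3 polynomial terms + K knot terms
```

Because every regressor is an ordinary column, additional X variables could simply be appended to the right-hand side of the lm() formula.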

Usage

cubespline(form,knots=1,mink=1,maxk=1,crit="gcv",data=NULL)

Arguments

form

Model formula. The spline is used with the first explanatory variable.

knots

If knots is specified, fits a cubic spline with K = knots. The defaults are knots = 1, mink = 1, and maxk = 1, which imply a cubic spline with a single knot.

mink

The lower bound to search for the value of K that minimizes crit. mink can take any value greater than zero. The default is mink = 1.

maxk

The upper bound to search for the value of K that minimizes crit. maxk must be greater than or equal to mink. The default is maxk = 1.

crit

The selection criterion. Must be in quotes. The default is the generalized cross-validation criterion, "gcv". The other options are the Akaike information criterion, "aic", and the Schwarz criterion, "sc". Let n be the number of observations, nreg the number of explanatory variables in the regression, and sig2 the estimated variance. The formulas for the available crit options are
gcv = n*(n*sig2)/((n-nreg)^2)
aic = log(sig2) + 2*nreg/n
sc = log(sig2) + log(n)*nreg/n

data

A data frame containing the data. If NULL (the default), the variables named in form are taken from the current workspace.
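The crit formulas and the search over mink to maxk can be sketched as follows. This is a hedged illustration in base R on simulated data. The helper spline_crit is hypothetical, and two details are assumptions rather than documented behavior: nreg is taken to count the intercept, and sig2 is taken to be rss/n.

```r
set.seed(2)
n <- 300
z <- runif(n, 0, 10)
y <- cos(z) + 0.1 * z + rnorm(n, sd = 0.4)

# Hypothetical helper: fit a K-knot cubic spline and return the three criteria.
spline_crit <- function(K) {
  z0 <- min(z)
  knots <- z0 + (1:K) * (max(z) - z0) / (K + 1)
  basis <- cbind(z - z0, (z - z0)^2, (z - z0)^3,
                 sapply(knots, function(zk) (z - zk)^3 * (z >= zk)))
  fit  <- lm(y ~ basis)
  nreg <- length(coef(fit))           # assumption: includes the intercept
  sig2 <- mean(residuals(fit)^2)      # assumption: rss / n
  c(K   = K,
    gcv = n * (n * sig2) / (n - nreg)^2,
    aic = log(sig2) + 2 * nreg / n,
    sc  = log(sig2) + log(n) * nreg / n)
}

# Evaluate K = 1, ..., 10 (the analogue of mink = 1, maxk = 10)
tab <- t(sapply(1:10, spline_crit))
best_K <- tab[which.min(tab[, "gcv"]), "K"]   # K with the lowest gcv
tab
best_K
```

Replacing "gcv" with "aic" or "sc" in the which.min() call selects K under the other criteria. The Schwarz criterion penalizes extra knots more heavily than the AIC whenever log(n) > 2.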

Value

yhat

The predicted values of the dependent variable at the original data points

rss

The residual sum of squares

sig2

The estimated error variance

aic

The value for AIC

sc

The value for sc

gcv

The value for gcv

coef

The estimated coefficient vector, B

splinehat

The predicted values for z alone, normalized to have the same mean as the dependent variable. If no X variables are included in the regression, splinehat = yhat.

knots

The vector of knots
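The relationship between yhat and splinehat can be illustrated with a short base R sketch. It assumes that "normalized to have the same mean as the dependent variable" means shifting the z-only contribution by a constant; the package may compute it differently. All object names are illustrative.

```r
set.seed(3)
n <- 250
z <- runif(n, 0, 10)
x <- rnorm(n)                                  # one additional X variable
y <- sin(z) + 0.5 * x + rnorm(n, sd = 0.3)

z0 <- min(z)
zk <- z0 + (max(z) - z0) / 2                   # single knot (K = 1)
S  <- cbind(z - z0, (z - z0)^2, (z - z0)^3, (z - zk)^3 * (z >= zk))
fit <- lm(y ~ S + x)

yhat <- fitted(fit)                            # predictions using z and x
b <- coef(fit)                                 # (Intercept), S1..S4, x
spline_part <- drop(S %*% b[2:5])              # contribution of z alone
# Assumed normalization: shift so the mean matches mean(y)
splinehat <- spline_part - mean(spline_part) + mean(y)

c(mean(y), mean(yhat), mean(splinehat))        # all three means agree
```

Plotting splinehat against z then shows the estimated f(z) on the same vertical scale as the dependent variable, with the effect of x removed.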

References

McMillen, Daniel P., "Testing for Monocentricity," in Richard J. Arnott and Daniel P. McMillen, eds., A Companion to Urban Economics, Blackwell, Malden MA (2006), 128-140.

McMillen, Daniel P., "Issues in Spatial Data Analysis," Journal of Regional Science 50 (2010), 119-141.

Suits, Daniel B., Andrew Mason, and Louis Chan, "Spline Functions Fitted by Standard Regression Methods," Review of Economics and Statistics 60 (1978), 132-139.

See Also

cparlwr

fourier

lwr

lwrgrid

semip

Examples

data(cookdata)
fardata <- cookdata[!is.na(cookdata$LNFAR),]
par(ask=TRUE)

# single variable
o <- order(fardata$DCBD)
fit1 <- cubespline(LNFAR~DCBD, mink=1, maxk=10,data=fardata)
c(fit1$rss, fit1$sig2, fit1$aic, fit1$sc, fit1$gcv, fit1$knots)
plot(fardata$DCBD[o], fardata$LNFAR[o], xlab="Distance from CBD", ylab="Log FAR")
lines(fardata$DCBD[o], fit1$splinehat[o], col="red")

# multiple explanatory variables
fit2 <- cubespline(fardata$LNFAR~fardata$DCBD+fardata$AGE,  mink=1, maxk=10)
c(fit2$rss, fit2$sig2, fit2$aic, fit2$sc, fit2$gcv, fit2$knots)
plot(fardata$DCBD[o], fardata$LNFAR[o], xlab="Distance from CBD", ylab="Log FAR")
lines(fardata$DCBD[o], fit2$splinehat[o], col="red")

# pre-specified number of knots
fit3 <- cubespline(LNFAR~DCBD+AGE,  knots=4, data=fardata)

McSpatial documentation built on May 2, 2019, 9:32 a.m.