rk.model.matrix: Construct Design Matrices via Reproducing Kernels

View source: R/rk.model.matrix.R

rk.model.matrixR Documentation

Construct Design Matrices via Reproducing Kernels

Description

Creates a design (or model) matrix using the rk function to expand variables via a reproducing kernel basis.

Usage

rk.model.matrix(object, data = environment(object), ...)

Arguments

object

a formula or terms object describing the fit model

data

a data frame containing the variables referenced in object

...

additional arguments passed to the rk function, e.g., df, knots, m, etc. Arguments must be passed as a named list, see Examples.

Details

Designed to be a more flexible alternative to the model.matrix function. The rk function is used to construct a marginal basis for each variable that appears in the input object. Tensor product interactions are formed by taking a row.kronecker product of marginal basis matrices. Interactions of any order are supported using standard formulaic conventions, see Note.

Value

The design matrix corresponding to the input formula and data, which has the following attributes:

assign

an integer vector with an entry for each column in the matrix giving the term in the formula which gave rise to the column

term.labels

a character vector containing the labels for each of the terms in the model

knots

a named list giving the knots used for each variable in the formula

m

a named list giving the penalty order used for each variable in the formula

periodic

a named list giving the periodicity used for each variable in the formula

xlev

a named list giving the factor levels used for each variable in the formula

Note

For formulas of the form y ~ x + z, the constructed model matrix has the form cbind(rk(x), rk(z)), which simply concatenates the two marginal basis matrices. For formulas of the form y ~ x : z, the constructed model matrix has the form row.kronecker(rk(x), rk(z)), where row.kronecker denotes the row-wise kronecker product. The formula y ~ x * z is a shorthand for y ~ x + z + x : z, which concatenates the two previous results. Unless it is suppressed (using 0+), the first column of the basis will be a column of ones named (Intercept).

Author(s)

Nathaniel E. Helwig <helwig@umn.edu>

References

Helwig, N. E. (2017). Regression with ordered predictors via ordinal smoothing splines. Frontiers in Applied Mathematics and Statistics, 3(15), 1-13. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.3389/fams.2017.00015")}

Helwig, N. E. (2021). Spectrally sparse nonparametric regression via elastic net regularized smoothers. Journal of Computational and Graphical Statistics, 30(1), 182-191. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1080/10618600.2020.1806855")}

Helwig, N. E. (2024). Precise tensor product smoothing via spectral splines. Stats, 7(1), 34-53. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.3390/stats7010003")}

See Also

See rk for details on the reproducing kernel basis

Examples

# load auto data
data(auto)

# additive effects
x <- rk.model.matrix(mpg ~ ., data = auto)
dim(x)                      # check dimensions
attr(x, "assign")           # check group assignments
attr(x, "term.labels")      # check term labels

# two-way interactions
x <- rk.model.matrix(mpg ~ . * ., data = auto)
dim(x)                      # check dimensions
attr(x, "assign")           # check group assignments
attr(x, "term.labels")      # check term labels

# specify df for horsepower, weight, and acceleration
# note: default df = 5 is used for displacement and model.year
df <- list(horsepower = 6, weight = 7, acceleration = 8)
x <- rk.model.matrix(mpg ~ ., data = auto, df = df)
sapply(attr(x, "knots"), length)   # check df

# specify knots for model.year
# note: default knots are selected for other variables
knots <- list(model.year = c(1970, 1974, 1978, 1982))
x <- rk.model.matrix(mpg ~ ., data = auto, knots = knots)
sapply(attr(x, "knots"), length)   # check df


grpnet documentation built on Oct. 12, 2024, 1:07 a.m.