sim_vcpsr: Varying-coefficient single-index signal regression using...

View source: R/sim_vcpsr.R

sim_vcpsr    R Documentation

Varying-coefficient single-index signal regression using tensor P-splines.

Description

sim_vcpsr is a varying-coefficient single-index signal regression approach that allows both the signal coefficients and the unknown link function to vary with an indexing variable t, e.g. temperature. Two surfaces are estimated, one for the coefficients and one for the link, and each can be sliced at an arbitrary t. Both surfaces are penalized anisotropically using P-splines.
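
To make the anisotropic penalization concrete, the following base-R sketch builds a tensor-product difference penalty with a separate tuning parameter for each direction. It is an illustration under stated assumptions, not the code used inside sim_vcpsr: the helper name aniso_penalty is hypothetical, and the coefficient matrix is assumed to be unfolded column by column.

```r
# Hypothetical helper (not part of JOPS): anisotropic tensor-product
# difference penalty for an n1 x n2 coefficient matrix, assumed to be
# unfolded column-major (the row index varies fastest).
aniso_penalty <- function(n1, n2, lambdas, pords) {
  D1 <- diff(diag(n1), differences = pords[1])  # differences over the row index
  D2 <- diff(diag(n2), differences = pords[2])  # differences over the column index
  P1 <- kronecker(diag(n2), crossprod(D1))      # penalty in the row direction
  P2 <- kronecker(crossprod(D2), diag(n1))      # penalty in the column direction
  lambdas[1] * P1 + lambdas[2] * P2
}

# 13 x 13 coefficients, with a much heavier penalty in the second direction
P <- aniso_penalty(n1 = 13, n2 = 13, lambdas = c(1, 100), pords = c(2, 2))
dim(P)
```

Because each direction has its own lambda and difference order, the surface can be smooth along one axis while remaining flexible along the other.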

Usage

sim_vcpsr(
  y,
  X,
  t_var,
  x_index = c(1:ncol(X)),
  nsegs = rep(10, 4),
  bdegs = rep(3, 4),
  lambdas = rep(1, 4),
  pords = rep(2, 4),
  max_iter = 100,
  mins = c(min(x_index), min(t_var)),
  maxs = c(max(x_index), max(t_var))
)

Arguments

y

a response vector of length m, usually continuous.

X

the signal regressors with dimension m by p1.

t_var

the varying-coefficient indexing variable of length m.

x_index

an index of length p1 for the columns of the signal matrix X; the default is the simple sequence 1:ncol(X).

nsegs

a vector of length 4 containing the number of evenly spaced segments between min and max for the coefficient surface (row and col) and the link surface (row and col), resp. (default rep(10, 4)).

bdegs

a vector of length 4 containing the degree of the B-splines for the coefficient surface (row and col) and the link surface (row and col), resp. (default cubic, rep(3, 4)).

lambdas

a vector of length 4 containing the positive tuning parameters for the coefficient surface (row and col) and the link surface (row and col), resp. (default rep(1, 4)).

pords

a vector of length 4 containing the difference penalty order for the coefficient surface (row and col) and the link surface (row and col), resp. (default rep(2, 4)).

max_iter

a scalar giving the maximum number of iterations (default 100).

mins

a vector of length 2 containing the minimums for the signal index and t_var; the default is the respective minimums of x_index and t_var.

maxs

a vector of length 2 containing the maximums for the signal index and t_var; the default is the respective maximums of x_index and t_var.
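
The roles of nsegs, bdegs, mins and maxs can be illustrated with a small B-spline basis built on evenly spaced segments. This is a minimal sketch using splineDesign from the splines package; the helper bbase_sketch is hypothetical and is not claimed to match the basis routine used internally by sim_vcpsr.

```r
library(splines)

# Hypothetical helper (not the JOPS implementation): B-spline basis of
# degree bdeg on nseg evenly spaced segments between xl and xr.
bbase_sketch <- function(x, xl, xr, nseg, bdeg) {
  dx <- (xr - xl) / nseg
  # Knots on an even grid, with bdeg extra knots beyond each boundary
  knots <- xl + dx * seq(-bdeg, nseg + bdeg)
  splineDesign(knots, x, ord = bdeg + 1, outer.ok = TRUE)
}

x <- seq(0.05, 0.95, by = 0.05)
B <- bbase_sketch(x, xl = 0, xr = 1, nseg = 10, bdeg = 3)
ncol(B)            # nseg + bdeg basis functions
range(rowSums(B))  # the basis functions sum to 1 inside [xl, xr]
```

Here xl and xr play the part of one entry of mins and maxs, and nseg and bdeg the part of one entry of nsegs and bdegs. Each of the four directions (two per surface) would get its own basis of this kind.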

Value

y

the response vector of length m.

alpha

the P-spline coefficient vector (unfolded) of length (nsegs[1]+bdegs[1])*(nsegs[2]+bdegs[2]).

iter

the number of iterations used for the single-index fit.

yint

the estimated y-intercept for the single-index model.

Bx

the B-spline matrix built along the signal index, using nsegs[1], used for the coefficient surface.

By

the B-spline matrix built along the t_var index, using nsegs[2], used for the coefficient surface.

Q

the effective regressors from the psVCSignal portion of the single-index fit with dimension m by length(alpha).

t_var

the VC indexing variable of length m.

nsegs

a vector of length 4 containing the number of evenly spaced segments between min and max for the coefficient surface (row and col) and the link surface (row and col).

bdegs

a vector of length 4 containing the degree of the B-splines for the coefficient surface (row and col) and the link surface (row and col).

lambdas

a vector of length 4 containing the positive tuning parameters for the coefficient surface (row and col) and the link surface (row and col).

pords

a vector of length 4 containing the difference penalty order for the coefficient surface (row and col) and the link surface (row and col).

mins

a vector of length 2 containing the minimums for the signal index and t_var.

maxs

a vector of length 2 containing the maximums for the signal index and t_var.

eta

the estimated linear predictor for the single-index fit.

Pars

a matrix of 2 rows holding the design parameters of the signal coefficient surface, each row: c(min, max, nseg, bdeg, lambda, pord), for x_index and t_var, resp.

pPars

a matrix of 2 rows holding the design parameters of the link function surface, each row: c(min, max, nseg, bdeg, lambda, pord), for the linear predictor eta and t_var, resp.

cv

the leave-one-out cross-validation statistic or the standard error of prediction for the single-index fit.

delta_alpha

a measure of the change in the signal-coefficient parameters at convergence.

fit2D

a ps2DNormal object, fitting f(eta, t_var).
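
As a quick check on the sizes involved, the sketch below evaluates the length formula given for alpha and lays out a matrix in the row format described for Pars. The helper n_alpha is hypothetical, and the way Pars is assembled here is only an assumption based on the description above, not the internal code.

```r
# Hypothetical helper: number of tensor-product coefficients in alpha,
# following the length formula stated for alpha.
n_alpha <- function(nsegs, bdegs) {
  (nsegs[1] + bdegs[1]) * (nsegs[2] + bdegs[2])
}

n_alpha(rep(10, 4), rep(3, 4))      # defaults: 13 * 13
n_alpha(c(12, 5, 5, 5), rep(3, 4))  # 15 * 8

# Assumed layout of Pars: one row per direction of the coefficient
# surface, each row c(min, max, nseg, bdeg, lambda, pord).
mins <- c(700, 30)
maxs <- c(1100, 70)
nsegs <- c(12, 5, 5, 5)
bdegs <- rep(3, 4)
lambdas <- c(1e-11, 100, 0.5, 1)
pords <- rep(2, 4)
Pars <- rbind(
  c(mins[1], maxs[1], nsegs[1], bdegs[1], lambdas[1], pords[1]),  # x_index
  c(mins[2], maxs[2], nsegs[2], bdegs[2], lambdas[2], pords[2])   # t_var
)
dim(Pars)
```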

Author(s)

Paul Eilers and Brian Marx

References

Marx, B. D. (2015). Varying-coefficient single-index signal regression. Chemometrics and Intelligent Laboratory Systems, 143, 111–121.

Eilers, P.H.C. and Marx, B.D. (2021). Practical Smoothing, The Joys of P-splines. Cambridge University Press.

Examples

# Load libraries
library(JOPS)   # Provides sim_vcpsr and the Mixture data
library(fields) # Needed for plotting

# Get the data
Dat <- Mixture

# Dimensions: observations, temperature index, signal
m <- 34
p1 <- 401
p2 <- 12

# Stacking mixture data, each mixture has 12 signals stacked
# The first differenced spectra are also computed.
mixture_data <- matrix(0, nrow = p2 * m, ncol = p1)
for (ii in 1:m) {
  mixture_data[((ii - 1) * p2 + 1):(ii * p2), 1:p1] <-
    t(as.matrix(Dat$xspectra[ii, , ]))
}
# First differences along the wavelength axis, computed once after stacking
d_mixture_data <- t(diff(t(mixture_data)))

# Response (typo fixed) and index for signal
y_mixture <- Dat$fractions
y_mixture[17, 3] <- 0.1501
index_mixture <- Dat$wl

# Select response and replicated for the 12 temps
# Column 1: water; 2: ethanediol; 3: amino-1-propanol
y <- as.vector(y_mixture[, 2])
y <- rep(y, each = p2)

bdegs <- c(3, 3, 3, 3)
pords <- c(2, 2, 2, 2)
nsegs <- c(12, 5, 5, 5) # Set to c(27, 7, 7, 7) for given lambdas
mins <- c(700, 30)
maxs <- c(1100, 70)
lambdas <- c(1e-11, 100, 0.5, 1) # based on svcm search
x_index <- seq(from = 701, to = 1100, by = 1) # for dX
t_var_sub <- c(30, 35, 37.5, 40, 45, 47.5, 50, 55, 60, 62.5, 65, 70)
t_var <- rep(t_var_sub, m)
max_iter <- 2 # Set higher in practice, e.g. 100
int <- TRUE

# Define x as the first-differenced spectra (p1 - 1 = 400 channels)
x <- d_mixture_data


# Single-index VC model using optimal tuning
fit <- sim_vcpsr(y, x, t_var, x_index, nsegs, bdegs, lambdas, pords,
             max_iter = max_iter, mins = mins, maxs = maxs)

plot(fit, xlab = "Wavelength (nm)", ylab = "Temp C")

JOPS documentation built on Sept. 8, 2023, 5:42 p.m.
