fosr_select: Decoupling shrinkage and selection for function-on-scalars...


View source: R/variable_selection_functions.R

Description

For a functional response and scalar predictors, construct a posterior summary that balances predictive accuracy and sparsity. Given posterior draws of the regression coefficients (or coefficient functions) from a function-on-scalars regression (FOSR) model, use a suitably defined loss function to select the variables that are important for prediction.

Usage

fosr_select(X, post_alpha, post_trace_sigma_2, weighted = TRUE,
  alpha_level = 0.1, remove_int = TRUE, include_plot = TRUE)

Arguments

X

n x p matrix of predictors

post_alpha

Nsims x p x K array of Nsims posterior draws of the regression coefficients for the p predictors on each of the K factors

post_trace_sigma_2

Nsims x 1 vector of posterior draws of the trace of the (marginal) covariance (see below for details)

weighted

logical; if TRUE, use weighted group lasso (recommended)

alpha_level

coverage for the credible interval on the proportion of variance explained

remove_int

logical; if TRUE, remove the intercept term from model comparisons

include_plot

logical; if TRUE, include a plot showing proportion of variability explained against model size

Value

alpha_dss: a p x K matrix of (sparse) regression coefficients

Note

This function is valid for the regression functions (m-dimensional) as well as the regression factors (K-dimensional). Since K << m, the latter is much faster.

The matrix of predictors, X, may be different from the given matrix in the data; i.e., we may have a different set of design points for prediction.

post_trace_sigma_2 contains posterior samples of the trace of the error covariance matrix, taken jointly across subjects i=1,...,n and observations j=1,...,m, after marginalizing out the random effects gamma_ik. Each draw is given by n*m*sigma_e^2 + sum_ik sigma_gamma_ik^2, where the second term is necessary only when random effects are included in the model AND integrated over in the predictive distribution.
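The trace formula above can be sketched in base R as follows. This is a minimal illustration using simulated stand-ins for MCMC output; the shapes assumed here (sigma_e as a vector of length Nsims, sigma_g as an Nsims x n x K array) are assumptions inferred from the example on this page, not a statement about what fosr() returns.

```r
# Toy stand-ins for MCMC output; the shapes are assumptions for illustration only.
set.seed(1)
Nsims <- 50; n <- 10; m <- 20; K <- 3

# Nsims draws of the observation error standard deviation
sigma_e <- abs(rnorm(Nsims, mean = 1, sd = 0.1))

# Nsims x n x K array of random-effect standard deviations
sigma_g <- array(abs(rnorm(Nsims * n * K, sd = 0.5)), dim = c(Nsims, n, K))

# First term: the error variance appears n*m times on the diagonal
trace_error <- n * m * sigma_e^2

# Second term: sum of random-effect variances over subjects i and factors k,
# computed separately for each posterior draw
trace_random <- apply(sigma_g^2, 1, sum)

# One trace value per posterior draw
post_trace_sigma_2 <- trace_error + trace_random
length(post_trace_sigma_2)
```

If random effects are not integrated over in the predictive distribution, the second term would be dropped and only trace_error passed on.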

Examples

# Load the package:
library(dfosr)

# Simulate some data:
sim_data = simulate_fosr(n = 100, m = 20, p_0 = 100, p_1 = 5)

# Data:
Y = sim_data$Y; X = sim_data$X; tau = sim_data$tau

# Dimensions:
n = nrow(Y); m = ncol(Y); p = ncol(X)

# Run the FOSR:
out = fosr(Y = Y, tau = tau, X = X, K = 6,
           mcmc_params = list("fk", "alpha", "Yhat", "sigma_e", "sigma_g"))

# Run the DSS:
alpha_dss = fosr_select(X = X,
                       post_alpha = out$alpha,
                       post_trace_sigma_2 = n*m*out$sigma_e^2 + apply(out$sigma_g^2, 1, sum))
# Variables selected:
(select_dss = which(apply(alpha_dss, 1, function(x) any(x != 0))))
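Once the selected rows are identified, it can be useful to rank them by the size of their coefficients across the K factors. The sketch below does this on a small hand-made matrix; alpha_dss_toy is a hypothetical stand-in for the p x K output of fosr_select, and the row norm is only one possible summary, not something the package prescribes.

```r
# Hypothetical 6 x 3 sparse coefficient matrix standing in for fosr_select() output
alpha_dss_toy <- matrix(0, nrow = 6, ncol = 3)
alpha_dss_toy[2, ] <- c(0.8, -0.1, 0.0)
alpha_dss_toy[5, ] <- c(0.0, 0.3, -0.4)

# Rows with any nonzero entry correspond to selected predictors
selected <- which(apply(alpha_dss_toy, 1, function(x) any(x != 0)))

# Euclidean norm of each selected row as a rough measure of effect size
row_norms <- sqrt(rowSums(alpha_dss_toy[selected, , drop = FALSE]^2))

# Selected predictor indices alongside their row norms
data.frame(predictor = selected, norm = row_norms)
```

Using drop = FALSE keeps the subset a matrix even when only one predictor is selected, so rowSums still works.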

drkowal/dfosr documentation built on May 7, 2020, 3:09 p.m.