link_viol_sim: Simulate violations of link function


View source: R/simulation.R

Description

Simulate violations of the link function. The true model is E(Y_i | X_i) = α_0 + Φ(β_0^T X_i), where Φ is the standard normal CDF, but we fit the misspecified linear model E(Y_i | X_i) = α + β^T X_i using adaptive LASSO.

Usage

link_viol_sim(nsims, betas, x_simulator, n, error_simulator = rnorm,
  testsize = 5000, cv = FALSE)

sim_data(betas, x_simulator, error_simulator, n)

Arguments

nsims

Number of simulations to run.

betas

True effects, with first element equal to the intercept. Should be a length-(p+1) vector.

x_simulator

Function whose first argument is n. Generates n replicates of X. The return value of this function should be an n x p matrix, or an n x 1 vector.

n

Number of samples in each simulation.

error_simulator

Function whose first argument is n. Generates n replicates of epsilon. The return value of this function should be an n x 1 vector.

testsize

Sample size for the simulated validation dataset.

cv

Whether to use cross-validation to select the optimal bandwidth for the nonparametric smoothing step.

true_link

True link function. Should be an R function that takes one argument.
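
To make the contracts for x_simulator and error_simulator concrete, here is a minimal sketch of two functions that satisfy them. The names x_sim_normal and err_sim_t, the choice of p = 3, and the t-distributed errors are illustrative assumptions, not part of the package.

```r
# Hypothetical simulators; these names are illustrative only.

# x_simulator: first argument is n; returns an n x p matrix of predictors.
x_sim_normal <- function(n, p = 3) {
  matrix(rnorm(n * p), nrow = n, ncol = p)
}

# error_simulator: first argument is n; returns n draws of epsilon.
# Heavier-tailed alternative to rnorm: t errors with 5 degrees of freedom.
err_sim_t <- function(n, df = 5) {
  rt(n, df = df)
}

set.seed(1)
xs <- x_sim_normal(10)
eps <- err_sim_t(10)
dim(xs)      # 10 3
length(eps)  # 10
```

Because both functions take n as their first argument, they could presumably be passed as x_simulator = x_sim_normal and error_simulator = err_sim_t, with betas of length p + 1 = 4.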

Details

We calculate the errors using a large validation dataset. The errors considered are

  1. E((Y_i - \hat{α} - \hat{β}^T X_i)^2), where \hat{α} and \hat{β} are from adaptive LASSO.

  2. E((Y_i - \tilde{m}(X_i))^2), where \tilde{m}(X_i) corresponds to the conditional mean from fitting the true model.

  3. E((Y_i - \hat{m}(X_i))^2), where \hat{m} is a nonparametrically calibrated version of the conditional mean. It uses the fitted values from adaptive LASSO for the old data and new data, and uses a kernel to smooth over them.
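
The three errors above can be sketched in base R as follows. This is not the package's implementation: ordinary least squares stands in for adaptive LASSO, the true model is fitted by nonlinear least squares with optim, the smoother is a hand-written Nadaraya-Watson estimator with a Gaussian kernel, and the bandwidth is a rule-of-thumb value rather than a cross-validated one. All object names and parameter values are assumptions made for illustration.

```r
set.seed(235)
n <- 500          # training sample size (assumed)
testsize <- 2000  # validation sample size (assumed)
alpha0 <- 0.5
beta0 <- c(1, -1, 0)
p <- length(beta0)

# Draw from the true model: Y = alpha_0 + Phi(beta_0' X) + epsilon.
gen <- function(m) {
  x <- matrix(rnorm(m * p), m, p)
  y <- alpha0 + pnorm(drop(x %*% beta0)) + rnorm(m, sd = 0.1)
  list(x = x, y = y)
}
train <- gen(n)
test <- gen(testsize)

# Error 1: misspecified linear fit (OLS used here in place of adaptive LASSO).
lin <- lm.fit(cbind(1, train$x), train$y)
idx_train <- drop(cbind(1, train$x) %*% lin$coefficients)
idx_test <- drop(cbind(1, test$x) %*% lin$coefficients)
err_linear <- mean((test$y - idx_test)^2)

# Error 2: fit the true model alpha + Phi(beta' x) by least squares.
sse <- function(par) {
  sum((train$y - par[1] - pnorm(drop(train$x %*% par[-1])))^2)
}
fit <- optim(rep(0, p + 1), sse, method = "BFGS")
m_tilde <- fit$par[1] + pnorm(drop(test$x %*% fit$par[-1]))
err_true <- mean((test$y - m_tilde)^2)

# Error 3: kernel-smooth the training outcomes over the linear fitted values,
# then evaluate the smoother at the fitted values of the validation data.
nw_smooth <- function(x_train, y_train, x_new, h) {
  sapply(x_new, function(x0) {
    w <- dnorm((x_train - x0) / h)
    sum(w * y_train) / sum(w)
  })
}
h <- 1.06 * sd(idx_train) * n^(-1 / 5)  # rule-of-thumb bandwidth (assumed)
m_hat <- nw_smooth(idx_train, train$y, idx_test, h)
err_calibrated <- mean((test$y - m_hat)^2)

c(linear = err_linear, true_model = err_true, calibrated = err_calibrated)
```

In this setup the calibrated error should fall between the other two, since the smoother can recover the curvature of Φ that the linear fit misses.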

link_viol_sim runs the simulation.

Value

link_viol_sim returns a list with two named elements:

betas

nsims x p matrix of estimated coefficients for each iteration of the simulation.

errors

nsims x 3 matrix of the three estimated errors described in Details, one row per iteration of the simulation.

sim_data returns a list with two named elements:

xs

n x p matrix of predictors

ys

length-n vector of outcomes
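
As a rough illustration of the data-generating step, the sketch below mirrors the documented signature and return value of sim_data. The function sim_data_sketch is a hypothetical stand-in written from the description above, with Φ taken to be pnorm; the package's actual code may differ.

```r
# Hypothetical stand-in for sim_data(), written from its documentation.
sim_data_sketch <- function(betas, x_simulator, error_simulator, n) {
  xs <- x_simulator(n)
  if (is.null(dim(xs))) xs <- matrix(xs, ncol = 1)  # accept a length-n vector
  stopifnot(ncol(xs) == length(betas) - 1)
  alpha0 <- betas[1]   # first element is the intercept
  beta0 <- betas[-1]   # remaining p elements are the slopes
  # True conditional mean: alpha_0 + Phi(beta_0' x), with Phi = pnorm.
  mean_y <- alpha0 + pnorm(drop(xs %*% beta0))
  ys <- mean_y + as.numeric(error_simulator(n))
  list(xs = xs, ys = ys)
}

set.seed(42)
dat <- sim_data_sketch(
  betas = c(0.5, 1, -1, 0),
  x_simulator = function(n) matrix(rnorm(n * 3), n, 3),
  error_simulator = rnorm,
  n = 200
)
str(dat)
```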

shiandy/bst235project documentation built on May 14, 2019, 2:01 a.m.