link_viol_sim: Simulate violations of link function
In shiandy/bst235project:


View source: R/simulation.R

Simulate violations of the link function. The true model is E(Y_i | X_i) = α_0 + Φ(β_0^T X_i), where Φ is the standard Normal CDF, but we fit the misspecified linear model E(Y_i | X_i) = α + β^T X_i using adaptive LASSO.
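To make the data-generating model concrete, the following is a minimal sketch of drawing data from E(Y_i | X_i) = α_0 + Φ(β_0^T X_i). It is not the package's `sim_data`: the helper name `sim_probit_link_data` is hypothetical, and it assumes that the first element of `betas` is α_0 and sits outside the link, which the formula above suggests but the source does not state.

```r
# Hypothetical helper (not the package's sim_data()): draws (X, Y) from
# Y_i = alpha_0 + pnorm(beta_0^T X_i) + epsilon_i.
sim_probit_link_data <- function(betas, x_simulator, error_simulator, n) {
  x <- as.matrix(x_simulator(n))            # n x p design matrix
  stopifnot(ncol(x) == length(betas) - 1)   # betas has length p + 1
  alpha0 <- betas[1]                        # assumption: intercept is outside the link
  beta0 <- betas[-1]
  mean_y <- alpha0 + pnorm(drop(x %*% beta0))  # true conditional mean
  y <- mean_y + as.numeric(error_simulator(n)) # add simulated noise
  list(x = x, y = y, mean_y = mean_y)
}

set.seed(1)
x_sim <- function(n) matrix(rnorm(n * 3), nrow = n, ncol = 3)
dat <- sim_probit_link_data(c(0.5, 1, -1, 0), x_sim, rnorm, n = 200)
```

Because Φ is bounded between 0 and 1, the true conditional mean stays within α_0 and α_0 + 1, which a linear fit cannot reproduce over the whole range of X.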

link_viol_sim(nsims, betas, x_simulator, n, error_simulator = rnorm,
  testsize = 5000, cv = FALSE)

sim_data(betas, x_simulator, error_simulator, n)

`nsims`	Number of simulations to run.
`betas`	True effects, with first element equal to the intercept. Should be a length-(p+1) vector.
`x_simulator`	Function whose first argument is n. Generates n replicates of X. The return value of this function should be an n x p matrix, or an n x 1 vector.
`n`	Number of samples in each simulation.
`error_simulator`	Function whose first argument is n. Generates n replicates of epsilon. The return value of this function should be an n x 1 vector.
`testsize`	Sample size for the simulated validation dataset.
`cv`	Whether to use cross-validation to select the optimal bandwidth for the nonparametric smoothing step.
`true_link`	True link function. Should be an R function that takes one argument.
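As an illustration of the contract for `x_simulator` and `error_simulator` (first argument `n`, returning an n x p matrix and a length-n vector respectively), here is a sketch of two such functions. The names `x_sim_ar1` and `err_sim_t`, the AR(1) correlation structure, and the t-distributed errors are all illustrative assumptions, not part of the package.

```r
# Hypothetical x_simulator: p = 4 covariates with AR(1) correlation rho = 0.5.
x_sim_ar1 <- function(n, p = 4, rho = 0.5) {
  sigma <- rho^abs(outer(seq_len(p), seq_len(p), "-"))  # p x p correlation matrix
  z <- matrix(rnorm(n * p), nrow = n, ncol = p)         # independent N(0, 1) draws
  z %*% chol(sigma)                                     # rows now have covariance sigma
}

# Hypothetical error_simulator: heavy-tailed t errors with 5 degrees of freedom.
err_sim_t <- function(n) rt(n, df = 5)

set.seed(2)
x <- x_sim_ar1(50)
eps <- err_sim_t(50)
```

Functions of this shape could then be passed as the `x_simulator` and `error_simulator` arguments, with `betas` of length p + 1 = 5 in this case.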

We calculate the errors using a large validation dataset of size `testsize`. The errors considered are:

E((Y_i - \hat{α} - \hat{β}^T X_i)^2), where \hat{α} and \hat{β} are from adaptive LASSO.
E((Y_i - \tilde{m}(X_i))^2), where \tilde{m}(X_i) corresponds to the conditional mean from fitting the true model.
E((Y_i - \hat{m}(X_i))^2), where \hat{m} is a nonparametrically calibrated version of the conditional mean. It uses the fitted values from adaptive LASSO for the old data and new data, and uses a kernel to smooth over them.
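The three errors above can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the package's implementation: ordinary least squares stands in for adaptive LASSO, the known true conditional mean stands in for a fitted true model, the kernel step is a hand-written Nadaraya-Watson smoother with a Gaussian kernel, and the bandwidth `h = 0.1` is fixed rather than chosen by cross-validation. The helper names `gen`, `true_mean`, and `nw_smooth` are hypothetical.

```r
set.seed(235)
n <- 500; testsize <- 2000
alpha0 <- 0.5; beta0 <- c(1, -1)
true_mean <- function(x) alpha0 + pnorm(drop(x %*% beta0))

# Simulate one dataset from the true (probit-link) model.
gen <- function(n) {
  x <- matrix(rnorm(n * 2), nrow = n, ncol = 2)
  list(x = x, y = true_mean(x) + rnorm(n, sd = 0.2))
}
train <- gen(n)
test <- gen(testsize)

# Stand-in for adaptive LASSO: OLS fit of the misspecified linear model.
coefs <- lm.fit(cbind(1, train$x), train$y)$coefficients
lin_train <- drop(cbind(1, train$x) %*% coefs)  # fitted values, old data
lin_test <- drop(cbind(1, test$x) %*% coefs)    # fitted values, new data

# Nadaraya-Watson smoother of y on the linear fitted values (Gaussian kernel).
nw_smooth <- function(u_train, y_train, u_new, h) {
  vapply(u_new, function(u) {
    w <- dnorm((u_train - u) / h)
    sum(w * y_train) / sum(w)
  }, numeric(1))
}
m_hat <- nw_smooth(lin_train, train$y, lin_test, h = 0.1)

err_linear <- mean((test$y - lin_test)^2)           # misspecified linear fit
err_oracle <- mean((test$y - true_mean(test$x))^2)  # true conditional mean
err_calibrated <- mean((test$y - m_hat)^2)          # kernel-calibrated fit
```

Comparing `err_linear` with `err_calibrated` indicates how much of the loss from the wrong link is recovered by smoothing over the linear fitted values, while `err_oracle` serves as the benchmark.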