
Runs cross-validation on the LEGIT model. Note that the implementation is written entirely in R, so it is not very fast.


`data`
data.frame of the dataset to be used.

`genes`
data.frame of the variables inside the genetic score.

`env`
data.frame of the variables inside the environmental score.

`formula`
Model formula. Use

`cv_iter`
Number of cross-validation iterations (Default = 5).

`cv_folds`
Number of cross-validation folds (Default = 10). Using

`folds`
Optional list of vectors containing the fold number for each observation. Bypasses `cv_iter` and `cv_folds`. Setting your own folds can be important for certain data types, such as time series or longitudinal data.

`Huber_p`
Parameter controlling the Huber cross-validation error (Default = 1.345).

`classification`
Set to TRUE if you are doing classification (binary outcome).

`start_genes`
Optional starting points for genetic score (must be the same length as the number of columns of

`start_env`
Optional starting points for environmental score (must be the same length as the number of columns of

`eps`
Threshold for convergence (.01 for quick batch simulations, .0001 for accurate results).

`maxiter`
Maximum number of iterations.

`family`
Outcome distribution and link function (Default = gaussian).

`ylim`
Optional vector containing the known min and max of the outcome variable. Even if your outcome is known to be in [a,b], predict() could return values outside this range if you assume a Gaussian distribution. This parameter ensures that this never happens. It is not necessary with a distribution that already assumes the proper range (e.g., [0,1] with a binomial distribution).

`seed`
Seed for cross-validation folds.

`id`
Optional id of observations; can be a vector or data.frame (only used when returning the list of possible outliers).

`crossover`
If not NULL, estimates the crossover point of

`crossover_fixed`
If TRUE, instead of estimating the crossover point of E, we force/fix it to the value of `crossover`. (Used when creating a diathesis-stress model.) (Default = FALSE).
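For the `folds` argument, the following is a minimal sketch of how subject-level folds might be built for longitudinal data, so that all repeated measures of one subject stay in the same fold. It uses base R only. The `subject` vector and the helper `make_subject_folds` are hypothetical, and the layout (one integer vector per cross-validation iteration) is an assumption based on the description "list of vectors containing the fold number for each observation".

```r
# Hypothetical longitudinal layout: 6 subjects, 4 repeated measures each.
subject <- rep(1:6, each = 4)

# Hypothetical helper: assigns whole subjects (not single rows) to folds.
# Assumed output format: one fold number per observation.
make_subject_folds <- function(subject, n_folds, seed) {
  set.seed(seed)
  ids <- unique(subject)
  # Balanced random assignment of each subject to one of n_folds folds.
  fold_of_id <- sample(rep_len(seq_len(n_folds), length(ids)))
  names(fold_of_id) <- ids
  # Look up the fold of each row through its subject id.
  unname(fold_of_id[as.character(subject)])
}

# One vector per cross-validation iteration (here, 2 iterations of 3 folds).
folds <- list(
  make_subject_folds(subject, n_folds = 3, seed = 1),
  make_subject_folds(subject, n_folds = 3, seed = 2)
)

# Each subject's rows fall into exactly one fold within an iteration.
table(subject, fold = folds[[1]])
```

A list built this way could then be supplied through `folds = folds`, in which case `cv_iter` and `cv_folds` are bypassed as described above.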

If `classification` = FALSE, returns a list containing, in the following order: a vector of the cross-validated *R^2* at each iteration, a vector of the Huber cross-validation error at each iteration, a vector of the L1-norm cross-validation error at each iteration, a matrix of the possible outliers (standardized residuals > 2.5 or < -2.5) and their corresponding standardized residuals and standardized Pearson residuals.

If `classification` = TRUE, returns a list containing, in the following order: a vector of the cross-validated *R^2* at each iteration, a vector of the Huber cross-validation error at each iteration, a vector of the L1-norm cross-validation error at each iteration, a vector of the AUC at each iteration, a matrix of the best choice of threshold (based on Youden index) and the corresponding specificity and sensitivity at each iteration, and a list of objects of class "roc" (to be able to make ROC curve plots) at each iteration.

The Huber and L1-norm cross-validation errors are alternatives to the usual L2-norm cross-validation error (on which the *R^2* is based) that are more resistant to outliers; lower values are better.
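To illustrate why the Huber and L1-norm errors are more resistant to outliers than the L2-norm error, here is a small base-R sketch on simulated residuals. It uses the textbook Huber loss with threshold 1.345 (the default of `Huber_p`); whether LEGIT computes its Huber cross-validation error in exactly this form is an assumption, and the names `huber_loss` and `summarise_errors` are hypothetical.

```r
# Textbook Huber loss: quadratic near zero, linear beyond the threshold p.
# (Assumed form; the package's exact definition may differ.)
huber_loss <- function(r, p = 1.345) {
  ifelse(abs(r) <= p, 0.5 * r^2, p * (abs(r) - 0.5 * p))
}

# Simulated out-of-fold residuals, with and without one gross outlier.
set.seed(42)
res_clean   <- rnorm(100)
res_outlier <- c(res_clean, 15)

# Mean L2, Huber, and L1 errors for a vector of residuals.
summarise_errors <- function(r) {
  c(L2 = mean(r^2), Huber = mean(huber_loss(r)), L1 = mean(abs(r)))
}

errors <- rbind(clean        = summarise_errors(res_clean),
                with_outlier = summarise_errors(res_outlier))
round(errors, 3)

# Relative inflation caused by the single outlier, per error measure.
round(errors["with_outlier", ] / errors["clean", ], 2)
```

In this sketch the single outlier inflates the L2 error by a much larger factor than the Huber or L1 errors, which is the behavior the paragraph above describes.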

Denis Heng-Yan Leung. *Cross-validation in nonparametric regression with outliers.* Annals of Statistics (2005): 2291-2310.

```
## Not run:
train = example_3way(250, 2.5, seed=777)
# Cross-validation 4 times with 5 folds
cv_5folds = LEGIT_cv(train$data, train$G, train$E, y ~ G*E*z, cv_iter=4, cv_folds=5)
cv_5folds
# Leave-one-out cross-validation (Note: very slow)
cv_loo = LEGIT_cv(train$data, train$G, train$E, y ~ G*E*z, cv_iter=1, cv_folds=250)
cv_loo
# Cross-validation 4 times with 5 folds (binary outcome)
train_bin = example_2way(500, 2.5, logit=TRUE, seed=777)
cv_5folds_bin = LEGIT_cv(train_bin$data, train_bin$G, train_bin$E, y ~ G*E,
                         cv_iter=4, cv_folds=5, classification=TRUE, family=binomial)
cv_5folds_bin
# Plot the ROC curve of each of the 4 iterations in a 2 x 2 grid
par(mfrow=c(2,2))
pROC::plot.roc(cv_5folds_bin$roc_curve[[1]])
pROC::plot.roc(cv_5folds_bin$roc_curve[[2]])
pROC::plot.roc(cv_5folds_bin$roc_curve[[3]])
pROC::plot.roc(cv_5folds_bin$roc_curve[[4]])
## End(Not run)
```
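The "best choice of threshold (based on Youden index)" returned for classification can be illustrated by hand. The sketch below uses base R and simulated data only; it is not the package's code (which relies on "roc" objects), and the variables `y`, `score`, and `youden` are hypothetical. The Youden index is J = sensitivity + specificity - 1, and the chosen threshold is the one that maximizes J.

```r
# Simulated binary outcome and a predicted probability that is higher,
# on average, when y == 1.
set.seed(1)
y     <- rbinom(200, 1, 0.5)
score <- plogis(1.5 * y - 0.75 + rnorm(200))

# Evaluate every observed score as a candidate threshold.
thresholds <- sort(unique(score))
youden <- sapply(thresholds, function(t) {
  pred <- as.integer(score >= t)
  sens <- sum(pred == 1 & y == 1) / sum(y == 1)  # true positive rate
  spec <- sum(pred == 0 & y == 0) / sum(y == 0)  # true negative rate
  c(threshold = t, sensitivity = sens, specificity = spec,
    J = sens + spec - 1)
})

# Column with the largest Youden index J.
best <- youden[, which.max(youden["J", ])]
best
```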
