The function wrapper_loss estimates the deviance loss in a multinomial regression model by leave-one-out cross validation using fast_multinom and deviance_loss. This wrapper was used in the analysis in Bertl et al. (2017) (see References). The function wrapper_loss_binom uses a binomial model instead.
wrapper_loss(cv, cv_index, mi_index, model_index, modelfile, datafolder,
  resultsfolder, per_obs, nested_samples = T)
wrapper_loss_binom(cv, cv_index, mi_index, model_index, modelfile, datafolder,
  resultsfolder, per_obs, nested_samples = T)
cv | integer. Number of pieces the dataset has been divided into for cross validation.
cv_index | integer. Index of the cross-validation slice that is currently held out.
mi_index | integer. Number of the multiple imputation replicate.
model_index | integer. Number of the model in the model matrix.
modelfile | character. File that contains the models in the form of a matrix (see examples).
datafolder | character. Folder that contains the dataset at the location
resultsfolder | character. Where to save the estimated loss and the estimated regression model. Note that the VC matrix is not saved.
per_obs | logical. If per_obs==T, the loss is normalized by the total number of observations (sum of all counts), so it is the mean loss.
nested_samples | logical. Default=T. Are the samples nested in the cancer types?
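To make the effect of per_obs concrete, here is a minimal base R sketch. It assumes that the deviance loss is -2 times the multinomial log-likelihood of the observed counts under the predicted probabilities; the package's own deviance_loss may be defined differently. The helper deviance_loss_sketch and the toy matrices are hypothetical and not part of the package.

```r
# Hypothetical helper (not the package's deviance_loss): multinomial deviance
# loss for a matrix of observed counts and a matrix of predicted probabilities.
deviance_loss_sketch <- function(counts, probs, per_obs = TRUE) {
  stopifnot(all(dim(counts) == dim(probs)))
  # Cells with zero counts contribute nothing; skip them to avoid 0 * log(0).
  nonzero <- counts > 0
  loss <- -2 * sum(counts[nonzero] * log(probs[nonzero]))
  # per_obs = TRUE divides by the sum of all counts, giving the mean loss.
  if (per_obs) loss / sum(counts) else loss
}

# Toy data: two covariate patterns (rows), three outcome categories (columns).
counts <- matrix(c(8, 1, 1,
                   2, 5, 3), nrow = 2, byrow = TRUE)
probs  <- matrix(c(0.7, 0.2, 0.1,
                   0.3, 0.4, 0.3), nrow = 2, byrow = TRUE)

total_loss <- deviance_loss_sketch(counts, probs, per_obs = FALSE)
mean_loss  <- deviance_loss_sketch(counts, probs, per_obs = TRUE)
```

Normalizing by the total count makes losses comparable across cross-validation slices of different sizes.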
This function estimates a multinomial regression model on the union of all cross-validation pieces that the dataset has been divided into, except piece cv_index. The deviance loss is then estimated on the held-out piece cv_index. In a further step, the function wrapper_average_loss should be used to average over the loss estimates.
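The procedure described above can be sketched in base R as follows. This is an illustration under simplifying assumptions, not the package's implementation: the regression is replaced by smoothed category proportions per group (a stand-in for a multinomial model with one categorical covariate), the data are simulated, and the final mean only mimics the averaging step that wrapper_average_loss is meant for. All object names are hypothetical.

```r
set.seed(1)
cv <- 5  # number of cross-validation pieces

# Simulated toy data: one categorical covariate, a 3-category response,
# and a slice label assigning each observation to one of the cv pieces.
n <- 200
dat <- data.frame(
  group = sample(c("a", "b"), n, replace = TRUE),
  slice = sample(rep(seq_len(cv), length.out = n)),
  stringsAsFactors = FALSE
)
true_p <- list(a = c(0.6, 0.3, 0.1), b = c(0.2, 0.3, 0.5))
dat$y <- vapply(dat$group,
                function(g) sample(3L, 1, prob = true_p[[g]]),
                integer(1), USE.NAMES = FALSE)

fold_loss <- numeric(cv)
for (cv_index in seq_len(cv)) {
  train <- dat[dat$slice != cv_index, ]  # all pieces except cv_index
  test  <- dat[dat$slice == cv_index, ]  # the held-out piece

  # Stand-in for the regression fit: category proportions within each group.
  # The 0.5 pseudo-count is an assumption of this sketch to avoid log(0).
  tab <- table(factor(train$group, levels = c("a", "b")),
               factor(train$y, levels = 1:3)) + 0.5
  p_hat <- tab / rowSums(tab)

  # Deviance loss on the held-out piece, normalized per observation.
  p_obs <- p_hat[cbind(test$group, as.character(test$y))]
  fold_loss[cv_index] <- -2 * sum(log(p_obs)) / nrow(test)
}

# Averaging over the held-out pieces.
cv_loss <- mean(fold_loss)
```

In the package, each call to wrapper_loss handles a single value of cv_index, so the loop above corresponds to one call per slice.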
The data is prepared and the regression is estimated as in wrapper_fast_multinom. As the contrasts are irrelevant for prediction, they cannot be set here. By default, nested contrasts are used for the sample to avoid overspecifying the model (overspecification is not handled correctly by the function glm4; see fast_multinom for details). The option nested_samples allows the nesting to be removed if cancer_type is not part of the model.
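The overspecification that nesting avoids can be illustrated with a small base R design matrix. This sketch only shows the general idea with model.matrix; the way the package builds its nested contrasts may differ, and the toy variables cancer_type, sample and within are hypothetical.

```r
# Toy design: four samples, each belonging to exactly one of two cancer types.
d <- data.frame(
  cancer_type = factor(rep(c("A", "B"), each = 4)),
  sample      = factor(rep(c("s1", "s2", "s3", "s4"), each = 2)),
  # Index of the sample within its cancer type (first or second sample).
  within      = factor(rep(rep(c("1", "2"), each = 2), times = 2))
)

# Crossed coding: the cancer type dummy is a sum of sample dummies,
# so the design matrix has more columns than its rank.
X_crossed <- model.matrix(~ cancer_type + sample, data = d)
rank_crossed <- qr(X_crossed)$rank

# Nested coding: sample effects are expressed within each cancer type,
# which yields a full-rank design matrix.
X_nested <- model.matrix(~ cancer_type + cancer_type:within, data = d)
rank_nested <- qr(X_nested)$rank

c(ncol(X_crossed), rank_crossed, ncol(X_nested), rank_nested)
```

When cancer_type is not in the model, the sample factor alone is not redundant with anything, which is presumably why the nesting can then be dropped.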
The scripts that were used to run this function, and that show all settings used in Bertl et al. (2017), are available in this package in the folder inst/Bertl_et_al_2017. The pre-processed data can be downloaded from figshare.
There is no output. The regression coefficients and the loss estimate are saved.
Johanna Bertl
Bertl, J.; Guo, Q.; Rasmussen, M. J.; Besenbacher, S.; Nielsen, M. M.; Hornshøj, H.; Pedersen, J. S. & Hobolth, A. A Site Specific Model And Analysis Of The Neutral Somatic Mutation Rate In Whole-Genome Cancer Data. bioRxiv, 2017. doi: https://doi.org/10.1101/122879 http://www.biorxiv.org/content/early/2017/06/21/122879
fast_multinom, deviance_loss, wrapper_fast_multinom