View source: R/dss_sex_estimation.R
dss_sex_estimation | R Documentation |
Estimate the sex of a target individual using an (imputed, complete) reference dataset of individuals of known sex. This function is essentially a wrapper for various supervised learning methods.
dss_sex_estimation(ref, target, conf = 0.95,
method = c("lda", "glmnet", "linda", "rf"),
lda_selvar = c("none", "backward", "forward"),
rf_ntrees = 200, rf_downsampling = FALSE,
glmnet_type = 0,
glmnet_measure = c("deviance", "class"),
linda_alpha = 0.9)
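Several arguments in the usage above (method, lda_selvar, glmnet_measure) list a vector of character choices as their default. In R this is the usual match.arg() idiom, where the first element acts as the default when the argument is omitted. The sketch below illustrates that idiom with a hypothetical helper, pick_method(); it assumes the function resolves its arguments this way, which this page does not confirm.

```r
# Hypothetical helper (not part of the package) illustrating the
# match.arg() idiom suggested by the usage signature.
pick_method <- function(method = c("lda", "glmnet", "linda", "rf")) {
  # With no argument supplied, match.arg() returns the first choice;
  # otherwise it validates (and partially matches) the supplied string.
  match.arg(method)
}

default_method <- pick_method()      # first choice in the vector
forest_method  <- pick_method("rf")  # explicit choice

# An unknown value raises an error rather than being silently accepted.
invalid <- tryCatch(pick_method("svm"), error = function(e) "error")
```

If the function does follow this idiom, calling it without a method argument would select "lda".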
ref |
dataframe (previously imputed if necessary) of reference individuals. No missing values allowed. |
target |
1-row dataframe, target individual. |
conf |
numeric value lying in [0.5, 1[; confidence level for sex estimation (i.e., posterior probability threshold). |
method |
character string; supervised learning method to be used for sex estimation. See Details below. |
lda_selvar |
character string. Only parsed if |
rf_ntrees |
numeric value. Only parsed if |
rf_downsampling |
boolean. Only parsed if |
glmnet_type |
numeric value. Only parsed if |
glmnet_measure |
Only parsed if |
linda_alpha |
numeric value. Only parsed if |
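The conf argument is described as a posterior probability threshold. A minimal sketch of how such a threshold can turn posterior probabilities into a sex estimate is given below. The helper classify_with_threshold(), the "indeterminate" label, and the exact decision rule are assumptions made for illustration; the function's actual rule may differ.

```r
# Hypothetical helper: returns the most probable class if its posterior
# probability reaches the threshold, and "indeterminate" otherwise.
classify_with_threshold <- function(post, conf = 0.95) {
  stopifnot(is.numeric(post), !is.null(names(post)))
  stopifnot(conf >= 0.5, conf < 1)  # same range as documented for conf
  best <- which.max(post)
  if (post[best] >= conf) names(post)[best] else "indeterminate"
}

# Posterior probabilities for two made-up target individuals
clear_case     <- classify_with_threshold(c(F = 0.97, M = 0.03))
ambiguous_case <- classify_with_threshold(c(F = 0.60, M = 0.40))
```

Under this rule, raising conf trades a higher rate of indeterminate individuals for fewer misclassifications among those that are classified.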
The argument method leaves the choice between four methods of supervised learning: classical linear discriminant analysis ("lda") performed with lda; robust discriminant analysis ("linda") performed with Linda; random forests ("rf") performed with randomForest; penalized logistic regression ("glmnet") performed with glmnet. See their respective help pages for more details.
Classification accuracy is automatically assessed using leave-one-out cross-validation (or out-of-bag error for random forests). The confusion matrix that is returned thus corresponds to cross-validated results.
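To make the cross-validation step concrete, here is a small sketch of a leave-one-out confusion matrix for a linear discriminant analysis, using lda() from MASS directly on simulated data. The data frame, its column names (Sex, x1, x2) and the simulated values are invented for illustration, and this is not necessarily how the function computes its own table.

```r
library(MASS)  # provides lda()

set.seed(1)
n <- 40  # simulated individuals per sex

# Made-up reference sample: two measurements and a known sex
ref <- data.frame(
  Sex = factor(rep(c("F", "M"), each = n)),
  x1  = c(rnorm(n, mean = 40, sd = 3), rnorm(n, mean = 45, sd = 3)),
  x2  = c(rnorm(n, mean = 20, sd = 2), rnorm(n, mean = 23, sd = 2))
)

# CV = TRUE makes lda() return leave-one-out predictions:
# each individual is classified by a model fitted without it.
loo <- lda(Sex ~ x1 + x2, data = ref, CV = TRUE)

# Cross-validated confusion matrix (rows: true sex, columns: predicted sex)
table_loocv <- table(True = ref$Sex, Predicted = loo$class)

# Overall cross-validated accuracy
accuracy <- sum(diag(table_loocv)) / sum(table_loocv)
```

The element loo$posterior holds the leave-one-out posterior probabilities, which is the kind of quantity a threshold such as conf would be compared against.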
A list of three components:
res_dss |
A dataframe of results for the target individual, with all necessary details about the model used for sex estimation. |
table_loocv |
A confusion matrix obtained by leave-one-out cross-validation on the reference sample |
details |
Additional method-specific details, such as coefficient
values or variable importance, depending on the value of the
|
Frédéric Santos.
lda, Linda, randomForest, glmnet, cv.glmnet