In frederic-santos/rdss: An R-Shiny Application for a Semi-Automatized Approach of Sex Estimation in Biological Anthropology

dss_check_data

R Documentation

Test whether your data file has the required format for use in `rdss`.

Description

This is the mandatory first step when using rdss. This function performs several checks for possible formatting mistakes, and returns a dataframe with “normalized” reformatted contents.

Usage

dss_check_data(dtf, sex, females, males,
               tbd, rm_empty_rows = FALSE,
               mode = "console")

Arguments

`dtf`	previously imported dataframe. Warning: at this stage, individual IDs must be given as a character vector in the first column of the dataframe, and not as custom row names (see the Note below, and the package vignette).
`sex`	character string; name of the column filled with the sex of individuals in the dataframe `dtf`.
`females`	character string; abbreviation used for female individuals in the sex column.
`males`	character string; abbreviation used for male individuals in the sex column.
`tbd`	character string; abbreviation used for target individuals in the sex column.
`rm_empty_rows`	boolean. Should individuals with no value at all be removed from the dataframe?
`mode`	for internal use in the shiny app only; end users calling this function from R scripts should keep the default value, `"console"`.
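
As an illustration of how these arguments fit together, here is a minimal sketch of a call from an R script. It assumes the rdss package is installed. The toy dataframe, its column names (`ID`, `sex_code`, `FHD`) and the abbreviations `"f"`, `"m"` and `"u"` are invented for this example and are not part of the package.

```r
# Toy data: IDs as a character vector in the first column, no custom row names.
# All names and values below are hypothetical.
toy <- data.frame(
  ID = c("ind1", "ind2", "ind3", "ind4"),
  sex_code = c("f", "m", "u", "f"),
  FHD = c(41.2, 47.8, 44.1, 42.5),
  stringsAsFactors = FALSE
)

# Only run the check if rdss is available on this machine.
if (requireNamespace("rdss", quietly = TRUE)) {
  checked <- rdss::dss_check_data(
    dtf = toy,
    sex = "sex_code",   # name of the sex column in `toy`
    females = "f",      # abbreviation used for females
    males = "m",        # abbreviation used for males
    tbd = "u",          # abbreviation used for target individuals
    rm_empty_rows = FALSE
  )
  str(checked)
}
```

The `mode` argument is left at its default here, as recommended for script use.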

Details

This function performs a series of six checks on the dataframe dtf, and displays explicit error messages when formatting mistakes are found (duplicates in row names, typos in the Sex column, etc.).
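
To give an idea of what such checks can look like, the sketch below reproduces two of the mistakes named above (duplicated IDs and unexpected values in the sex column) in base R. The helper `find_format_problems()` is hypothetical and is not the implementation used by rdss; the other checks are not described here and are not covered.

```r
# Hypothetical helper, NOT the rdss implementation: collects messages for
# duplicated IDs (first column) and unexpected values in the sex column.
find_format_problems <- function(dtf, sex, females, males, tbd) {
  problems <- character(0)

  # Check 1: IDs in the first column must be unique.
  ids <- as.character(dtf[[1]])
  dup <- unique(ids[duplicated(ids)])
  if (length(dup) > 0) {
    problems <- c(problems,
                  paste0("Duplicated IDs: ", paste(dup, collapse = ", ")))
  }

  # Check 2: the sex column may only contain the three declared abbreviations.
  codes <- as.character(dtf[[sex]])
  bad <- setdiff(unique(codes), c(females, males, tbd))
  if (length(bad) > 0) {
    problems <- c(problems,
                  paste0("Unexpected values in column '", sex, "': ",
                         paste(bad, collapse = ", ")))
  }
  problems
}

# Toy data with one duplicated ID ("ind2") and one typo ("mm").
toy_bad <- data.frame(
  ID = c("ind1", "ind2", "ind2", "ind3"),
  sex_code = c("f", "m", "mm", "u"),
  stringsAsFactors = FALSE
)
problems <- find_format_problems(toy_bad, "sex_code", "f", "m", "u")
print(problems)
```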

It also returns a dataframe whose contents are “standardized”:

the sex column is automatically renamed as Sex
the sex factor is then releveled: females now match the level F, males now match the level M, target individuals now match the level TBD. This will facilitate and standardize the presentation of classification results for all users.
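
The two standardization steps above can be sketched in base R as follows. This is an illustration under stated assumptions, not the code used by rdss: the helper `standardize_sex()` and the toy column names are hypothetical.

```r
# Hypothetical helper, NOT the rdss implementation: renames the sex column
# to "Sex" and recodes its values to the levels F, M and TBD.
standardize_sex <- function(dtf, sex, females, males, tbd) {
  codes <- as.character(dtf[[sex]])
  recoded <- rep(NA_character_, length(codes))
  recoded[codes %in% females] <- "F"
  recoded[codes %in% males] <- "M"
  recoded[codes %in% tbd] <- "TBD"
  dtf[[sex]] <- factor(recoded, levels = c("F", "M", "TBD"))
  names(dtf)[names(dtf) == sex] <- "Sex"
  dtf
}

# Toy data with user-specific abbreviations.
toy <- data.frame(
  ID = c("ind1", "ind2", "ind3"),
  sex_code = c("f", "m", "u"),
  stringsAsFactors = FALSE
)
out <- standardize_sex(toy, sex = "sex_code",
                       females = "f", males = "m", tbd = "u")
print(levels(out$Sex))
```

Whatever abbreviations the original file used, downstream code can then rely on a column named `Sex` with a fixed set of levels.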

Value

A dataframe with the same contents as dtf, but whose sex factor has possibly been renamed and releveled (see Details).

Note

Please note that the input dataframe dtf must not have custom row names, i.e. it must not have been imported using the argument row.names = 1 of read.csv(), for instance. Instead, its first column must be a character vector filled with individual IDs. This function converts that character vector to row names (after several checks). See the package vignette for additional details.
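
A minimal sketch of an import that satisfies this requirement is given below. The temporary CSV file and its contents are invented for the example; only `read.csv()` and other base R functions are used.

```r
# Write a small CSV to a temporary file (hypothetical contents).
csv_path <- tempfile(fileext = ".csv")
writeLines(
  c("ID,sex_code,FHD",
    "ind1,f,41.2",
    "ind2,m,47.8",
    "ind3,u,44.1"),
  csv_path
)

# Import WITHOUT row.names = 1, so that the IDs stay in the first column.
dtf <- read.csv(csv_path, stringsAsFactors = FALSE)

# The first column holds the IDs as a character vector...
ids_ok <- is.character(dtf[[1]])
# ...and the row names are still the default 1, 2, 3, ...
rownames_ok <- identical(rownames(dtf), as.character(seq_len(nrow(dtf))))
print(c(ids_ok = ids_ok, rownames_ok = rownames_ok))
```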