View source: R/investigate_metadata.R
Investigates the metadata in terms of missingness and variance. The function outputs a missingness plot that gives a general sense of how much of the metadata is missing. It also returns the percentage of missingness for each variable and each row, along with the variables that lack variance. Based on this information, it reports which variables and rows should be dropped.
investigate_metadata(
  metadata,
  first_column_as_id = TRUE,
  missing_value_lst = NULL,
  missing_threshold = 0.1
)
metadata - The corresponding metadata for a gene count matrix.
first_column_as_id - Boolean value specifying whether the first column in the metadata is the identifier/key. If not, the row names are assumed to be the identifiers.
missing_value_lst - A named character list specifying the missing value(s), if any, in each variable.
missing_threshold - A value between 0 and 1 inclusive giving the cutoff for the acceptable proportion of missingness in each variable.
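The way missing_value_lst and missing_threshold could interact can be sketched in base R. This is not the package's implementation: the toy data, the column names, the sentinel values, the shape of the list (variable name mapped to its sentinel values), the use of fractions rather than 0-100 percentages, and the strict ">" comparison against the threshold are all assumptions made for illustration.

```r
# Toy metadata; column names and sentinel values are hypothetical.
metadata <- data.frame(
  age   = c("34", "unknown", "51", "47"),
  stage = c("II", "I", "not reported", "not reported"),
  site  = c("lung", "lung", "lung", "lung"),
  stringsAsFactors = FALSE
)
rownames(metadata) <- c("s1", "s2", "s3", "s4")

# Assumed shape of missing_value_lst: variable name -> sentinel value(s).
missing_value_lst <- list(age = "unknown", stage = "not reported")
missing_threshold <- 0.3

# Recode each listed sentinel to NA within its own column.
recoded <- metadata
for (var in names(missing_value_lst)) {
  is_sentinel <- recoded[[var]] %in% missing_value_lst[[var]]
  recoded[[var]][is_sentinel] <- NA
}

# Fraction of missing entries per variable and per row.
missing_percent_col <- colMeans(is.na(recoded))
missing_percent_row <- rowMeans(is.na(recoded))

# Variables whose missingness exceeds the threshold
# (the strict ">" comparison is an assumption).
missing_drop <- names(missing_percent_col)[missing_percent_col > missing_threshold]

# Variables with at most one distinct non-missing value carry no variance.
n_distinct <- vapply(
  recoded,
  function(x) length(unique(x[!is.na(x)])),
  integer(1)
)
variance_drop <- names(n_distinct)[n_distinct <= 1]
```

In this toy data, half of the stage values are sentinels, so stage is flagged for missingness, while site holds a single value and is flagged for lack of variance.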
A list containing four objects with relevant metadata information:
missing_percent_col - Named vector consisting of the missingness percentages in each variable.
missing_percent_row - Named vector consisting of the missingness percentages in each row.
missing_drop - Vector of column names that should be dropped based on missingness.
variance_drop - Vector of column names that should be dropped due to lack of variance.
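One way the returned list might be used downstream is to drop the flagged columns before further analysis. The sketch below builds a stand-in for the return value by hand so that it runs without the package; the toy data, the object names, and the row cutoff of 0.5 are hypothetical, and missingness is expressed as a fraction by assumption.

```r
# Toy metadata; values and column names are hypothetical.
metadata <- data.frame(
  age   = c(34, NA, 51, 47),
  stage = c("II", NA, NA, "I"),
  site  = c("lung", "lung", "lung", "lung"),
  stringsAsFactors = FALSE
)

# Hand-built stand-in with the four documented elements.
meta_info <- list(
  missing_percent_col = colMeans(is.na(metadata)),
  missing_percent_row = rowMeans(is.na(metadata)),
  missing_drop        = "stage",
  variance_drop       = "site"
)

# Drop every column flagged for missingness or lack of variance.
drop_cols <- union(meta_info$missing_drop, meta_info$variance_drop)
cleaned <- metadata[, !(names(metadata) %in% drop_cols), drop = FALSE]

# Optionally drop rows above a chosen row-level cutoff (hypothetical value).
row_cutoff <- 0.5
cleaned <- cleaned[meta_info$missing_percent_row <= row_cutoff, , drop = FALSE]
```

Using drop = FALSE keeps the result a data frame even when only one column survives.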
# Using the tcga_meta_original dataset from the package.
library(MetaConIdentifier)
meta_info <- investigate_metadata(tcga_meta_original,
                                  first_column_as_id = FALSE,
                                  missing_value_lst = NULL,
                                  missing_threshold = 0.1)
# Obtain missingness percentages for each variable.
meta_info$missing_percent_col