View source: R/standardize_metadata.R
Standardizes metadata by subsetting it to clinically relevant variables, specifying the type of each variable, and converting missing values into a single format.
standardize_metadata(
  metadata,
  first_column_as_id = TRUE,
  variable_subset,
  variable_type_vec,
  missing_value_lst = NULL
)
metadata | The corresponding metadata for a gene count matrix.
first_column_as_id | Logical value specifying whether the first column of the metadata is the identifier/key. If FALSE, the row names are assumed to be the identifiers.
variable_subset | A character vector of the metadata variables to keep. These should be the most clinically relevant and population-relevant variables, such as age, sex, and race.
variable_type_vec | A named character vector specifying the type of each variable. There are 3 types: categorical, numeric, and ordinal.
missing_value_lst | A named list of character vectors specifying the missing value(s), if any, in each variable.
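As a rough illustration of the argument shapes described above, the following sketch builds the three inputs for a small made-up table. The column names, the values, and the convention that the names of `variable_type_vec` and `missing_value_lst` are variable names are assumptions made for illustration; the page does not spell out the naming convention.

```r
# Toy metadata; every column name and value here is hypothetical.
toy_meta <- data.frame(
  sample_id = c("S1", "S2", "S3"),
  age       = c("34", "unknown", "58"),
  sex       = c("F", "M", "not reported"),
  stage     = c("I", "III", "II"),
  stringsAsFactors = FALSE
)

# Variables to keep.
variable_subset <- c("age", "sex", "stage")

# Assumed convention: names are variables, values are one of the 3 types.
variable_type_vec <- c(age = "numeric", sex = "categorical", stage = "ordinal")

# Assumed convention: one entry per variable that has missing-value codes.
missing_value_lst <- list(age = "unknown", sex = "not reported")
```

Objects shaped like these could then be passed as `variable_subset`, `variable_type_vec`, and `missing_value_lst`, with `first_column_as_id = TRUE` because the identifier sits in the first column of the toy table.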
A data.frame object of the cleaned metadata, with classes of each column specifying the variable type and all missing values converted to NA.
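To make the returned structure more concrete, here is a minimal base R sketch of the same idea: replacing per-variable missing-value codes with NA and giving each column a class that reflects its type. It is not the package's implementation. The toy data and the mapping of categorical, numeric, and ordinal onto `factor`, `numeric`, and ordered `factor` are assumptions.

```r
# Toy metadata; all names and values are hypothetical.
toy_meta <- data.frame(
  age   = c("34", "unknown", "58"),
  sex   = c("F", "M", "not reported"),
  stage = c("I", "III", "II"),
  stringsAsFactors = FALSE
)
missing_values <- list(age = "unknown", sex = "not reported")

# Replace each variable's missing-value codes with NA.
for (v in names(missing_values)) {
  toy_meta[[v]][toy_meta[[v]] %in% missing_values[[v]]] <- NA
}

# One possible mapping of the three variable types onto base R classes.
toy_meta$age   <- as.numeric(toy_meta$age)                     # numeric
toy_meta$sex   <- factor(toy_meta$sex)                         # categorical
toy_meta$stage <- factor(toy_meta$stage,
                         levels = c("I", "II", "III"),
                         ordered = TRUE)                       # ordinal

# Inspect the primary class of each column.
sapply(toy_meta, function(col) class(col)[1])
```

Converting the codes to NA before calling `as.numeric()` avoids coercion warnings, since the remaining strings all parse as numbers.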
# Using tcga_metadata from package.
library(MetaConIdentifier)
tcga_meta_new <- standardize_metadata(tcga_meta_original,
  first_column_as_id = FALSE, variable_subset = tcga_variable_subset,
  variable_type_vec = tcga_variable_type_vec, missing_value_lst = NULL)
# The clean metadata should contain 2 classes: data.frame and metaStandard.
class(tcga_meta_new)