View source: R/functions_pheno_data.R
harmonize_pheno_data | R Documentation |
Extends the original phenotype data with new variables that have harmonized names and values, including sample and subject identifiers. Values of numeric variables are converted to numeric. Values of character variables are mapped to pre-specified values. For time variables, numbers are extracted and converted to days. If a study contains only patients with a specific disease, the disease can be set globally for all samples.
harmonize_pheno_data(
  project,
  pheno,
  info.var,
  col.id,
  ind.use.id = NULL,
  cols.use,
  disease = NULL
)
project |
[character(1)] name of project (used as prefix for all harmonized variables) |
pheno |
[data.frame] original phenotype data (e.g. as returned by
|
info.var |
[list] project level information about variables that should be harmonized (see Details) |
col.id |
[character(1)] column with information about subject identifiers |
ind.use.id |
[numeric(1)] part of subject identifier that should be kept after splitting using " ", "_" or "-" |
cols.use |
[vector(n)] vector of columns used for harmonization, named by variable names given in info.var |
disease |
[character(1)] disease that should be set for all samples |
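The effect of ind.use.id can be illustrated with a minimal sketch in base R. The helper keep_id_part below is hypothetical and not part of the package. It assumes that identifiers are split at a space, underscore or hyphen, as described for ind.use.id, and that NULL leaves the identifier unchanged (an assumption, since the default behaviour is not documented here).

```r
# Hypothetical helper, not the package's own code: keep one part of each
# identifier after splitting at " ", "_" or "-".
keep_id_part <- function(id, ind.use.id = NULL) {
  # Assumption: NULL means the identifier is used as is
  if (is.null(ind.use.id)) return(id)
  # "-" is placed last in the character class so it is taken literally
  parts <- strsplit(id, "[ _-]")
  # Pick the requested part; NA results if an identifier has too few parts
  vapply(parts, function(p) p[ind.use.id], character(1))
}

ids <- c("patient_01", "patient-02", "patient 03")
kept <- keep_id_part(ids, ind.use.id = 2)
print(kept)
```

With ind.use.id = 2, only the second part of each identifier is kept in this sketch.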
info.var contains information about all variables that should be harmonized within a project, i.e. across studies. The list needs to be named by the names of the harmonized variables and each element contains:
type: either "character", "numeric" or "time"
values: list named by the final (harmonized) values; each element gives the original values that should be mapped to that final value (can include regular expressions)
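How a values list with regular expressions might be applied can be sketched in base R as follows. The helper map_values is hypothetical and only illustrates the idea. The actual matching rules of harmonize_pheno_data (for example case sensitivity or the handling of unmatched values) are not documented here, so the sketch assumes case-sensitive matching with grepl and NA for unmatched values.

```r
# Hypothetical helper, not the package's own code: map original values to
# final values using the patterns stored in a `values` list.
map_values <- function(x, values) {
  # Assumption: values that match no pattern become NA
  out <- rep(NA_character_, length(x))
  for (final in names(values)) {
    # A final value may be associated with several patterns
    for (pattern in values[[final]]) {
      out[grepl(pattern, x)] <- final
    }
  }
  out
}

# "^male" is anchored so that it does not also match "female"
values <- list(female = "female", male = "^male")
original <- c("male", "female", "Male", "unknown")
mapped <- map_values(original, values)
print(mapped)
```

The anchored pattern "^male" shows why regular expressions are useful here: an unanchored "male" would also match "female" and the two categories would be confused.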
[data.frame] harmonized phenotype data
# example study
study.id = "GSE67785"

# extract phenotype data from GEO
pheno.original = extract_pheno_data(study.id = study.id)

# prepare information about variables to be harmonized
info.var = list(
  lesional = list(
    type = "character",
    values = list(lesional = "PP", nonlesional = "PN")),
  sex = list(
    type = "character",
    values = list(female = "female", male = "^male")),
  tissue = list(
    type = "character",
    values = list(skin = "skin")))

# define columns that should be harmonized
cols.use = c(
  lesional = "group:ch1",
  tissue = "source_name_ch1",
  sex = "gender:ch1")

pheno = harmonize_pheno_data(
  project = "project",
  pheno = pheno.original,
  info.var = info.var,
  col.id = "patient:ch1",
  cols.use = cols.use)

head(pheno[, 1:6])