convert2biodata: Format biological data

View source: R/convert2biodata.R

convert2biodata    R Documentation

Format biological data

Description

Merges gene and cell datasets that share the same TCGA sample identifiers, splits the samples into two categories according to the expression level of a selected gene (below or above the statistic chosen with stat, the mean by default), and formats the result into a 3-column data frame: gene expression category, cell types, and cell type abundance estimates.
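The merge, split and reshape steps described above can be sketched in base R on toy data. This is only an illustration under stated assumptions, not the package's implementation: the data frames genes and cells, their column names, and the output column above are hypothetical.

```r
# Toy stand-ins for the gene and cell datasets (hypothetical values and names)
genes <- data.frame(
  sample = c("S1", "S2", "S3", "S4"),
  ICOS = c(1.0, 2.0, 3.0, 6.0)
)
cells <- data.frame(
  sample = c("S1", "S2", "S3", "S4"),
  B_cells = c(0.10, 0.20, 0.30, 0.40),
  T_cells = c(0.50, 0.40, 0.30, 0.20)
)

# 1. Merge the two datasets on the shared sample identifier
merged <- merge(genes, cells, by = "sample")

# 2. Flag samples whose expression of the selected gene is above the mean
above <- merged$ICOS > mean(merged$ICOS)

# 3. Reshape to long format: one row per sample and cell type
cell_cols <- c("B_cells", "T_cells")
long <- data.frame(
  above = rep(above, times = length(cell_cols)),
  cells = factor(rep(cell_cols, each = nrow(merged))),
  value = unlist(merged[cell_cols], use.names = FALSE)
)

print(long)
```

The long format puts one abundance estimate per row, which is the shape that grouped comparisons and plots usually expect.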

Usage

convert2biodata(algorithm, disease, tissue, gene_x, stat = "mean", path = ".")

Arguments

algorithm

character for the algorithm used to estimate the distribution of cell type abundance, one of: 'Cibersort', 'Cibersort_ABS', 'EPIC', 'MCP_counter', 'Quantiseq', 'Timer', 'Xcell', 'Xcell (2)' and 'Xcell64'.

disease

character for the type of TCGA cancer (see the list in extdata/disease_names.csv).

tissue

character for the type of TCGA tissue, one of: 'Additional - New Primary', 'Additional Metastatic', 'Metastatic', 'Primary Blood Derived Cancer - Peripheral Blood', 'Primary Tumor', 'Recurrent Tumor', 'Solid Tissue Normal'.

gene_x

character for the gene selected in the differential analysis (see the list in extdata/gene_names.csv).

stat

character for the statistic used to split the samples, one of "mean", "median" or "quantile".

path

character for the path name of the TCGA dataset.
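The choice of stat changes where the samples are split. A minimal sketch on hypothetical expression values shows how the mean and the median can give different groupings; the "quantile" option is left out because its cut points are not stated on this page.

```r
# Hypothetical expression values for one gene across four samples
expr <- c(1, 2, 3, 10)

# Candidate thresholds
threshold_mean <- mean(expr)
threshold_median <- median(expr)

# Number of samples above each threshold
n_above_mean <- sum(expr > threshold_mean)
n_above_median <- sum(expr > threshold_median)

print(c(mean = n_above_mean, median = n_above_median))
```

With skewed values such as these, the mean is pulled toward the outlier and fewer samples end up above it than above the median.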

Value

data frame with the following columns:

  • high (logical): the expression category of the selected gene, TRUE for below or FALSE for above the chosen statistic (the mean by default).

  • cells (factor): cell types.

  • value (numeric): the abundance estimation of the cell types.
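A data frame with these three columns can be summarised per cell type and expression category with base R. The sketch below builds a small hypothetical data frame of the same shape rather than calling convert2biodata, so the values are illustrative only.

```r
# Hypothetical data frame with the documented columns
df <- data.frame(
  high = c(TRUE, TRUE, FALSE, FALSE, TRUE, FALSE),
  cells = factor(c("B_cells", "B_cells", "B_cells",
                   "T_cells", "T_cells", "T_cells")),
  value = c(0.1, 0.3, 0.6, 0.2, 0.4, 0.8)
)

# Inspect the column types
str(df)

# Mean abundance for each combination of cell type and expression category
means <- aggregate(value ~ cells + high, data = df, FUN = mean)
print(means)
```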

Examples

data(tcga)
(convert2biodata(
    algorithm = "Cibersort_ABS",
    disease = "breast invasive carcinoma",
    tissue = "Primary Tumor",
    gene_x = "ICOS"
))

tcgaViz documentation built on April 4, 2023, 5:14 p.m.