ge_single: ge_single

View source: R/ge_single.R

ge_single    R Documentation

ge_single

Description

Performs Gene Set Variation Analysis.

Usage

ge_single(
  counts,
  metadata,
  genes_id,
  response,
  design,
  biomart,
  gsva_gmt = "hallmark",
  method = "gsva",
  kcdf = "Gaussian",
  colors = c("orange", "black"),
  row.names = TRUE,
  col.names = TRUE
)

Arguments

counts

Data frame that contains gene expression data as raw counts.

metadata

Data frame that contains supporting variables to the data.
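
As a rough illustration of the two inputs described above, the sketch below builds toy data frames. The column names (Samples, S1 to S4), the gene identifiers, and the values are all invented, and the assumption that sample names in the metadata must match the sample columns of the counts is not stated on this page.

```r
# Hypothetical toy inputs; identifiers, sample names and values are invented.
sample_ids <- c("S1", "S2", "S3", "S4")

# Raw counts: one gene identifier column plus one integer column per sample.
toy_counts <- data.frame(
  ensembl_gene_id = c("ENSG00000000001", "ENSG00000000002", "ENSG00000000003"),
  S1 = c(120L, 35L, 0L),
  S2 = c(98L, 41L, 3L),
  S3 = c(210L, 12L, 7L),
  S4 = c(175L, 9L, 1L),
  stringsAsFactors = FALSE
)

# Metadata: one row per sample, including the grouping variable
# that would be passed as `response`.
toy_metadata <- data.frame(
  Samples = sample_ids,
  MSI_status = c("MSI", "MSI", "MSS", "MSS"),
  stringsAsFactors = FALSE
)

# Sanity check (an assumption about what the function expects):
# every sample column in the counts has a matching metadata row.
sample_cols <- setdiff(names(toy_counts), "ensembl_gene_id")
samples_match <- all(sample_cols %in% toy_metadata$Samples)
```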

genes_id

Name of the column that contains gene identifiers. Should be one of the following: 'entrezgene_id', 'ensembl_gene_id' or 'hgnc_symbol'.

response

Unquoted name of the variable indicating the groups to analyse.

design

Variables of the design formula, given as a character string of the form 'Var1 + Var2 + ... + Var_n'.

biomart

Data frame containing a biomaRt query with the following attributes: ensembl_gene_id, hgnc_symbol, entrezgene_id, transcript_length, refseq_mrna. For Mus musculus data, retrieve external_gene_name instead and rename that column to hgnc_symbol. biomaRt queries bundled with GEGVIC: 'ensembl_biomartGRCh37', 'ensembl_biomartGRCh38_p13', 'ensembl_biomartGRCm38_p6' and 'ensembl_biomartGRCm39'.
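
The column renaming for mouse data can be sketched in base R as follows. The data frame mouse_query is a hypothetical stand-in for a real biomaRt query result, and its values are invented. Only the renaming step and the attribute check are illustrated.

```r
# Hypothetical stand-in for a mouse biomaRt query result; values are invented.
mouse_query <- data.frame(
  ensembl_gene_id = c("ENSMUSG00000000001", "ENSMUSG00000000002"),
  external_gene_name = c("GeneA", "GeneB"),
  entrezgene_id = c(1L, 2L),
  transcript_length = c(3262L, 902L),
  refseq_mrna = c("NM_000001", "NM_000002"),
  stringsAsFactors = FALSE
)

# Rename external_gene_name to hgnc_symbol, as the argument description requires.
names(mouse_query)[names(mouse_query) == "external_gene_name"] <- "hgnc_symbol"

# Check that all attributes listed for `biomart` are present.
required <- c("ensembl_gene_id", "hgnc_symbol", "entrezgene_id",
              "transcript_length", "refseq_mrna")
has_required <- all(required %in% names(mouse_query))
```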

gsva_gmt

Path to the gmt file that contains the gene sets of interest. By default the parameter is set to 'hallmark', which provides all HALLMARK gene sets from MSigDB (version 7.5.1).
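
For a custom gene set file, the sketch below writes a minimal file in the GMT layout (one set per line: set name, description, then member genes, separated by tabs) and reads it back. The set names, descriptions, and members are invented. Gene symbols are used here only as an assumption, since this page does not state which identifier type the file should contain.

```r
# Hypothetical gene sets; names and members are invented for illustration.
gene_sets <- list(
  MY_SET_A = c("TP53", "BRCA1", "EGFR"),
  MY_SET_B = c("MYC", "KRAS")
)

# One line per set: name, description, then members, separated by tabs.
gmt_lines <- vapply(
  names(gene_sets),
  function(set_name) {
    paste(c(set_name, "custom set", gene_sets[[set_name]]), collapse = "\t")
  },
  character(1)
)

# Write to a temporary file; a real analysis would use a permanent path.
gmt_path <- tempfile(fileext = ".gmt")
writeLines(gmt_lines, gmt_path)

# Read the file back to confirm the layout: drop the name and description
# fields, and use the first field as the set name.
fields <- strsplit(readLines(gmt_path), "\t", fixed = TRUE)
parsed <- setNames(
  lapply(fields, function(x) x[-(1:2)]),
  vapply(fields, function(x) x[1], character(1))
)
```

The resulting path could then be supplied as gsva_gmt = gmt_path.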

method

Name of the method used to perform gene set variation analysis. The options are: 'gsva', 'ssgsea' or 'zscore'. Default value is 'gsva'.

kcdf

Character string denoting the kernel to use during the non-parametric estimation of the cumulative distribution function of expression levels across samples when method="gsva". The default is "Gaussian" because GEGVIC transforms raw counts using the vst transformation. Other options are 'Poisson' or 'none'.

colors

Character vector indicating the colors of the different groups to compare. The default is two colors: orange and black.

row.names

Logical value that determines whether row names are shown in the heatmap.

col.names

Logical value that determines whether column names are shown in the heatmap.

Value

Returns a heatmap and the expression values in the form of a matrix.

Examples

gsva.res <- ge_single(counts = sample_counts,
                      metadata = sample_metadata,
                      genes_id = 'ensembl_gene_id',
                      response = MSI_status,
                      design = 'MSI_status',
                      biomart = ensembl_biomart_GRCh38_p13,
                      gsva_gmt = 'hallmark',
                      method = 'gsva',
                      kcdf = 'Gaussian',
                      colors = c('orange', 'black'),
                      row.names = TRUE,
                      col.names = TRUE)


oriolarques/GEGVIC documentation built on Oct. 30, 2024, 10:44 p.m.