find_variable_genes: Identify Variable Genes in Expression Data

find_variable_genes    R Documentation

Description

Identifies variable genes from a gene expression dataset using specified selection criteria. Supports multiple methods, including expression thresholding and variability estimation via median absolute deviation (MAD).
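The MAD-based idea can be sketched in base R as follows. This is a minimal illustration and not the package's implementation: the toy matrix and the names 'toy', 'gene_mad', 'cutoff', and 'toy_var' are made up here, and the use of a strict "greater than" comparison against the 0.75 quantile is an assumption.

```r
# Toy matrix: 6 genes x 5 samples (values are simulated for illustration only)
set.seed(1)
toy <- matrix(
  rnorm(6 * 5),
  nrow = 6,
  dimnames = list(paste0("Gene", 1:6), paste0("Sample", 1:5))
)

# Per-gene MAD across samples (stats::mad scales by 1.4826 by default)
gene_mad <- apply(toy, 1, stats::mad)

# Assumed rule: keep genes whose MAD exceeds the 0.75 quantile of all MADs
cutoff <- stats::quantile(gene_mad, probs = 0.75)
toy_var <- toy[gene_mad > cutoff, , drop = FALSE]

dim(toy_var)
```

Using 'drop = FALSE' keeps the result a matrix even if only one gene passes the cutoff.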

Usage

find_variable_genes(
  eset,
  data_type = c("count", "normalized"),
  methods = c("low", "mad"),
  prop = 0.7,
  quantile = 0.75,
  min.mad = 0.1,
  feas = NULL
)

Arguments

eset

Numeric matrix. Gene expression data (genes as rows, samples as columns).

data_type

Character. Type of data: '"count"' or '"normalized"'. Default is '"count"'.

methods

Character vector. Methods for gene selection: '"low"', '"mad"'. Default is 'c("low", "mad")'.

prop

Numeric. Minimum proportion of samples (between 0 and 1) in which a gene must be expressed to be retained. Default is 0.7.

quantile

Numeric. Quantile of the per-gene MAD distribution used as the minimum MAD cutoff; one of 0.25, 0.5, or 0.75. Default is 0.75.

min.mad

Numeric. Minimum allowable MAD value. Default is 0.1.

feas

Character vector or 'NULL'. Additional features to include. Default is 'NULL'.
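One plausible way these arguments could interact is sketched below. The helper 'sketch_select' is hypothetical and is not part of IOBR. Three points are assumptions rather than documented behavior: that "expressed" means a value greater than 0, that the MAD cutoff is the larger of the quantile value and 'min.mad', and that 'feas' are added back after filtering.

```r
# Hypothetical helper, NOT the IOBR implementation: illustrates one possible
# reading of how prop, quantile, min.mad and feas combine.
sketch_select <- function(eset, prop = 0.7, quantile = 0.75,
                          min.mad = 0.1, feas = NULL) {
  # Assumed "low" filter: value > 0 in at least `prop` of the samples
  expressed <- rowSums(eset > 0) >= round(prop * ncol(eset))

  # Assumed "mad" filter: MAD above the larger of the quantile cutoff and min.mad
  gene_mad <- apply(eset, 1, stats::mad)
  cutoff <- max(stats::quantile(gene_mad, probs = quantile), min.mad)

  keep <- rownames(eset)[expressed & gene_mad > cutoff]

  # Assumed handling of `feas`: re-add requested genes that exist in the matrix
  keep <- union(keep, intersect(feas, rownames(eset)))
  eset[keep, , drop = FALSE]
}

# Small hand-built matrix (4 genes x 4 samples)
toy <- matrix(
  c(0, 0, 0, 5,     # GeneA: non-zero in only 1 of 4 samples
    1, 2, 8, 9,     # GeneB: non-zero everywhere, widely spread
    3, 3, 3, 3.1,   # GeneC: non-zero everywhere, nearly constant
    2, 6, 1, 7),    # GeneD: non-zero everywhere, widely spread
  nrow = 4, byrow = TRUE,
  dimnames = list(c("GeneA", "GeneB", "GeneC", "GeneD"), paste0("S", 1:4))
)

res <- sketch_select(toy, prop = 0.7, quantile = 0.5, min.mad = 0.1,
                     feas = "GeneC")
rownames(res)
```

In this sketch, GeneC fails the MAD filter and is retained only because it is listed in 'feas'.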

Value

Matrix. Subset of 'eset' containing only the rows (genes) selected as variable.

Author(s)

Dongqiang Zeng

Examples

# Simulate data
set.seed(123)
sim_eset <- matrix(rnorm(100 * 20), 100, 20)
rownames(sim_eset) <- paste0("Gene", 1:100)
colnames(sim_eset) <- paste0("Sample", 1:20)

# Identify variable genes
eset_var <- find_variable_genes(
  eset = sim_eset,
  data_type = "normalized",
  methods = "mad",
  quantile = 0.25
)
head(eset_var)

IOBR documentation built on May 30, 2026, 5:07 p.m.