find_variable_genes: find_variable_genes

View source: R/find_variable_genes2.R

find_variable_genes    R Documentation

find_variable_genes

Description

Identifies variable genes in a gene expression dataset by filtering on expression level and/or on variability, measured as the median absolute deviation (MAD).

Usage

find_variable_genes(
  eset,
  data_type = c("count", "normalized"),
  methods = c("low", "mad"),
  prop = 0.7,
  quantile = c(0.75, 0.5, 0.25),
  min.mad = 0.1,
  feas = NULL
)

Arguments

eset

The gene expression dataset as a matrix, with genes in rows and samples in columns.

data_type

(character, optional): The type of data in the dataset. Default is "count". Possible values: "count", "normalized".

methods

(character vector, optional): The methods to be used for gene selection. Default is c("low", "mad"). Possible values: "low" (expression thresholding), "mad" (variability estimated by median absolute deviation).

prop

(numeric, optional): The minimum proportion of samples in which a gene must be expressed for it to be retained. Default is 0.7.

quantile

(numeric vector, optional): The quantiles of the median absolute deviation (MAD) distribution used to calculate the minimum allowable MAD value. Default is c(0.75, 0.5, 0.25).

min.mad

(numeric, optional): The minimum allowable MAD value. Default is 0.1.

feas

(character vector, optional): Additional features to include in the variable gene selection. Default is NULL.

Details

Several selection methods can be combined: expression thresholding ("low") and variability estimation via median absolute deviation ("mad"). The function accepts both count and normalized data; set 'data_type' to match the input.
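
The two selection methods can be pictured with a small base-R sketch. This is an illustration under stated assumptions, not the IOBR implementation: the helper names filter_low() and filter_mad() are hypothetical, "expressed" is assumed to mean a value greater than 0, and the MAD cutoff is assumed to be the larger of the chosen MAD quantile and 'min.mad'.

```r
# Illustrative sketch only; helper names and cutoff rules are assumptions,
# not taken from the IOBR source.

# Toy matrix: 4 genes (rows) x 10 samples (columns)
eset_toy <- rbind(
  flat     = rep(5, 10),              # no variation at all
  rare     = c(3, 8, rep(0, 8)),      # expressed in only 2 of 10 samples
  variable = 1:10,                    # clearly variable
  modest   = rep(c(5, 6), each = 5)   # mildly variable
)
colnames(eset_toy) <- paste0("S", 1:10)

# "low"-style filter: keep genes expressed (> 0) in at least `prop` of samples
filter_low <- function(x, prop = 0.7) {
  expressed <- rowMeans(x > 0) >= prop
  x[expressed, , drop = FALSE]
}

# "mad"-style filter: keep genes whose MAD exceeds the larger of
# the q-th quantile of all MADs and the absolute floor `min.mad`
filter_mad <- function(x, q = 0.25, min.mad = 0.1) {
  m <- apply(x, 1, stats::mad)
  cutoff <- max(stats::quantile(m, probs = q), min.mad)
  x[m > cutoff, , drop = FALSE]
}

selected <- filter_mad(filter_low(eset_toy, prop = 0.7), q = 0.25)
rownames(selected)
# "variable" "modest"
```

In this toy case "rare" is dropped by the expression threshold and "flat" by the MAD cutoff. Applying the expression filter first means the MAD quantile is computed only over genes that passed it; whether the package orders the steps this way is an assumption of the sketch.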

Value

A matrix subset of 'eset' containing only the genes identified as variable according to the specified criteria.

Author(s)

Dongqiang Zeng

Examples

# Load the example expression data
data("eset_tme_stad", package = "IOBR")
# Select variable genes from normalized data using the MAD criterion
eset <- find_variable_genes(
  eset = eset_tme_stad,
  data_type = "normalized",
  methods = "mad",
  quantile = 0.25
)


IOBR/IOBR documentation built on April 3, 2025, 2:19 p.m.