filterPanelGenes: Filter genes for spatial transcriptomics panel

View source: R/markerGenesAndMapping.r

filterPanelGenes    R Documentation

Filter genes for spatial transcriptomics panel

Description

Returns a set of genes for inclusion in a spatial transcriptomics panel, based on a series of hard-coded and user-defined constraints.

Usage

filterPanelGenes(
  summaryExpr,
  propExpr = summaryExpr,
  onClusters = 1:dim(summaryExpr)[2],
  offClusters = NULL,
  geneLengths = NULL,
  startingGenes = c("GAD1", "SLC17A7"),
  numBinaryGenes = 500,
  minOn = 10,
  maxOn = 250,
  maxOff = 50,
  minLength = 960,
  fractionOnClusters = 0.5,
  onThreshold = 0.5,
  excludeGenes = NULL,
  excludeFamilies = c("LOC", "LINC", "FAM", "ORF", "KIAA", "FLJ", "DKFZ", "RIK", "RPS",
    "RPL", "\\-")
)

Arguments

summaryExpr

Matrix of summarized expression levels for each cluster. Typically the median or mean should be used. Rows are genes and columns are clusters. ROW NAMES MUST BE GENE SYMBOLS!

propExpr

Matrix giving the proportion of cells in each cluster that express each gene, for use with the binary score calculation (default = summaryExpr, which is not recommended)

onClusters

Vector indicating which clusters should be included in the gene panel. Can be logical or numeric, or a character vector of cluster names (default is all clusters)

offClusters

Vector indicating the clusters in which expression should be avoided (default is NULL)

numBinaryGenes

Number of genes to include in the final panel. Genes are sorted by binary score using 'getBetaScore' and the top numBinaryGenes genes are chosen (default = 500)

minOn

Minimum summary expression level in the most highly expressed "on" cluster (default = 10)

maxOn

Maximum summary expression level in the most highly expressed "on" cluster (default = 250)

maxOff

Maximum summary expression level in the most highly expressed "off" cluster (default = 50)

minLength

Minimum gene length for marker gene selection. Ignored if geneLengths is not provided (default = 960)

fractionOnClusters

Maximum fraction of clusters in which a gene can be expressed, where "expressed" means propExpr > onThreshold (default = 0.5). This prevents nearly ubiquitous genes from being selected

onThreshold

Fraction of cells in a cluster that must express a gene for the gene to be considered expressed in that cluster (default = 0.5)

excludeGenes

Which genes should be excluded from the analysis (default is none)

excludeFamilies

Which gene classes or families should be excluded from the analysis. More specifically, any gene that contains one of these strings of characters anywhere in its symbol will be excluded (default is "LOC", "LINC", "FAM", "ORF", "KIAA", "FLJ", "DKFZ", "RIK", "RPS", "RPL", "\\-")

geneLengths

Optional vector of gene lengths, in the same order as the rows of summaryExpr (default is NULL)
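
The numeric constraints described above can be illustrated with a minimal base-R sketch. This is not the package's implementation: the toy matrices, the gene names LOWGENE and OFFGENE, and the handling of values that fall exactly on a threshold are all assumptions made for illustration, and the binary-score ranking done with 'getBetaScore' is not reproduced here.

```r
# Toy data (hypothetical values): 5 genes x 5 clusters.
genes    <- c("GAD1", "SLC17A7", "ACTB", "LOWGENE", "OFFGENE")
clusters <- c("Inh1", "Inh2", "Exc1", "Exc2", "Glia")

summaryExpr <- matrix(
  c( 80,  60,   1,   0,   0,   # GAD1
      0,   2, 120,  90,   1,   # SLC17A7
    900, 850, 950, 800, 870,   # ACTB: too high and ubiquitous
      3,   2,   4,   1,   0,   # LOWGENE: too low in every "on" cluster
     40,  30,  20,  10, 200),  # OFFGENE: too high in the "off" cluster
  nrow = 5, byrow = TRUE, dimnames = list(genes, clusters)
)
propExpr <- matrix(
  c(0.9, 0.8, 0.0, 0.0, 0.0,
    0.0, 0.1, 0.9, 0.8, 0.0,
    1.0, 1.0, 1.0, 1.0, 1.0,
    0.2, 0.1, 0.2, 0.1, 0.0,
    0.7, 0.2, 0.1, 0.1, 0.9),
  nrow = 5, byrow = TRUE, dimnames = list(genes, clusters)
)
geneLengths <- c(4000, 3000, 1800, 500, 2500)

onClusters  <- c("Inh1", "Inh2", "Exc1", "Exc2")
offClusters <- "Glia"

# Thresholds set to the documented defaults.
minOn <- 10; maxOn <- 250; maxOff <- 50
minLength <- 960; fractionOnClusters <- 0.5; onThreshold <- 0.5

# Highest summary expression across the "on" and "off" clusters.
maxExprOn  <- apply(summaryExpr[, onClusters,  drop = FALSE], 1, max)
maxExprOff <- apply(summaryExpr[, offClusters, drop = FALSE], 1, max)

# Fraction of "on" clusters in which each gene counts as expressed.
fracOn <- rowMeans(propExpr[, onClusters, drop = FALSE] > onThreshold)

# The length filter is skipped when no lengths are supplied.
lengthOK <- if (is.null(geneLengths)) {
  rep(TRUE, nrow(summaryExpr))
} else {
  geneLengths >= minLength
}

# Boundary handling (>= versus >) is an assumption of this sketch.
keep <- maxExprOn >= minOn & maxExprOn <= maxOn &
  maxExprOff <= maxOff & fracOn <= fractionOnClusters & lengthOK

passing <- rownames(summaryExpr)[keep]
print(passing)
```

In this toy example only GAD1 and SLC17A7 pass every check; each of the other three genes was constructed to fail at least one constraint.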

Value

A character vector of genes meeting all constraints
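
Because the result is a vector of gene symbols, the name-based exclusions (excludeGenes and excludeFamilies) act directly on the row names of summaryExpr. The sketch below shows one way such substring matching could be done in base R; the gene list and the use of a single combined regular expression are assumptions for illustration, not the package's own code.

```r
# Hypothetical gene symbols; several contain an excluded substring.
genes <- c("GAD1", "LOC101927", "LINC00123", "FAM19A1", "C1ORF61",
           "RPS6", "RPL13", "HLA-A", "SLC17A7", "PVALB")

# Documented default families, plus one explicitly excluded gene.
excludeFamilies <- c("LOC", "LINC", "FAM", "ORF", "KIAA", "FLJ",
                     "DKFZ", "RIK", "RPS", "RPL", "\\-")
excludeGenes <- "PVALB"

# Combine the family strings into one pattern; a match anywhere
# in the symbol marks the gene for exclusion.
pattern  <- paste(excludeFamilies, collapse = "|")
inFamily <- grepl(pattern, genes)

# Keep genes that match no family string and are not listed explicitly.
kept <- genes[!inFamily & !(genes %in% excludeGenes)]
print(kept)
```

Note that matching anywhere in the symbol is deliberately broad: C1ORF61 is removed because it contains "ORF", and HLA-A is removed because it contains a hyphen.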


AllenInstitute/mfishtools documentation built on July 5, 2023, 4:20 p.m.