chooseCandGenes: Select candidate genes


View source: R/chooseCandGenes.R

Description

This function can be used to independently select candidate genes from given real RNA-seq data (bulk or single-cell) for the SPsimSeq simulation. It chooses genes with various characteristics, such as a log-fold-change above a certain threshold.

Usage

chooseCandGenes(
  cpm.data,
  group,
  lfc.thrld,
  llStat.thrld,
  t.thrld,
  w = w,
  max.frac.zeror.diff = Inf,
  pDE,
  n.genes,
  prior.count
)

Arguments

cpm.data

logCPM transformed matrix (if log.CPM.transform=FALSE, then it is the source gene expression data)

group

a grouping factor

lfc.thrld

a positive numeric value for the minimum absolute log-fold-change for selecting candidate DE genes in the source data (when group is not NULL and pDE>0)
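
To illustrate how such a threshold might act, the sketch below computes the per-gene difference in mean log-CPM between two groups and keeps the genes whose absolute difference reaches the cutoff. The toy matrix, the two-level group and the names lfc and cand.de are assumptions made for this example; this is not the package's internal code.

```r
# Toy log-CPM matrix: 4 genes x 6 samples (hypothetical values)
cpm.data <- rbind(
  g1 = c(1, 1, 1, 3, 3, 3),
  g2 = c(2, 2, 2, 2, 2, 2),
  g3 = c(5, 5, 5, 4, 4, 4),
  g4 = c(0, 0, 0, 0.2, 0.2, 0.2)
)
group <- factor(c("A", "A", "A", "B", "B", "B"))
lfc.thrld <- 0.5

# Per-gene difference in mean log-CPM between the two groups
lfc <- rowMeans(cpm.data[, group == "B", drop = FALSE]) -
  rowMeans(cpm.data[, group == "A", drop = FALSE])

# Genes whose absolute log-fold-change reaches the threshold
cand.de <- names(lfc)[abs(lfc) >= lfc.thrld]
```

Here g1 and g3 pass the cutoff, while g2 (no difference) and g4 (difference of 0.2) do not.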

llStat.thrld

a positive numeric value for the minimum squared test statistic from the log-linear model containing X as a covariate, used to select candidate DE genes in the source data (when group is not NULL and pDE>0)

t.thrld

a positive numeric value for the minimum absolute t-test statistic for the log-fold-changes of genes for selecting candidate DE genes in the source data (when group is not NULL and pDE>0)
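
A minimal sketch of a t-statistic filter of this kind is given below. It assumes a two-group design and uses Welch's t-test from base R (stats::t.test); whether the package uses the same variant is not stated here, and the data and the names t.stat and cand.de are hypothetical.

```r
# Toy log-CPM matrix: 2 genes x 10 samples (hypothetical values)
cpm.data <- rbind(
  g1 = c(1, 2, 3, 4, 5, 11, 12, 13, 14, 15),  # clear shift between groups
  g2 = c(1, 2, 3, 4, 5, 5, 4, 3, 2, 1)        # same mean in both groups
)
group <- factor(rep(c("A", "B"), each = 5))
t.thrld <- 2.5

# Per-gene Welch t-statistic comparing the two groups
t.stat <- apply(cpm.data, 1, function(y) {
  unname(t.test(y ~ group)$statistic)
})

# Genes whose absolute t-statistic reaches the threshold
cand.de <- names(t.stat)[abs(t.stat) >= t.thrld]
```

Only g1 is retained in this toy example; g2 has identical group means and therefore a t-statistic of zero.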

w

a numeric value between 0 and 1. The number of classes to construct the probability distribution will be round(w*n), where n is the total number of samples/cells in a particular batch of the source data
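
As a small worked example of the round(w*n) rule, the sketch below derives the number of classes for one batch and bins some values into that many classes. The equal-width binning and the values in y are assumptions for illustration only; the text above specifies only how the number of classes is obtained.

```r
n <- 40    # number of samples/cells in the batch (hypothetical)
w <- 0.5

# Number of classes used to construct the probability distribution
n.classes <- round(w * n)

# Hypothetical expression values for one gene, binned into n.classes
# equal-width classes (the binning scheme is an assumption)
y <- seq(0, 10, length.out = n)
breaks <- seq(min(y), max(y), length.out = n.classes + 1)
counts <- table(cut(y, breaks = breaks, include.lowest = TRUE))
```

A larger w gives more, narrower classes; a smaller w gives fewer, wider ones.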

max.frac.zeror.diff

a numeric value >=0 indicating the maximum absolute difference in the fraction of zero counts between the groups for DE genes.
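
The quantity this argument bounds can be sketched as follows. The example assumes a raw count matrix (named counts here, which is hypothetical) and a two-level group; it shows one plausible way to compute the per-gene difference in zero fractions, not the package's own implementation.

```r
# Toy count matrix: 2 genes x 8 samples (hypothetical values)
counts <- rbind(
  g1 = c(0, 0, 0, 5, 5, 8, 9, 7),  # zeros concentrated in group A
  g2 = c(0, 3, 4, 6, 0, 2, 5, 7)   # one zero in each group
)
group <- factor(rep(c("A", "B"), each = 4))
max.frac.zeror.diff <- 0.5

# Fraction of zero counts per gene within each group
frac.zero <- sapply(levels(group), function(g) {
  rowMeans(counts[, group == g, drop = FALSE] == 0)
})

# Absolute between-group difference in the fraction of zeros
zero.diff <- abs(frac.zero[, "A"] - frac.zero[, "B"])

# Genes that respect the bound
keep <- names(zero.diff)[zero.diff <= max.frac.zeror.diff]
```

With the default of Inf, no gene is excluded on this criterion.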

pDE

fraction of DE genes

n.genes

total number of genes

prior.count

a positive constant to be added to the CPM before log transformation, to avoid log(0). The default is 1.
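
The role of the prior count can be seen in a short sketch of a log-CPM transformation. The base-2 logarithm and the exact formula are assumptions made for this example (the package may differ); the point is only that adding a positive constant keeps zero counts finite after the log.

```r
# Toy count matrix: 3 genes x 2 samples (hypothetical values)
counts <- cbind(s1 = c(0, 10, 90), s2 = c(5, 0, 45))
prior.count <- 1

# Counts per million: scale each sample by its library size
lib.size <- colSums(counts)
cpm <- sweep(counts, 2, lib.size, "/") * 1e6

# Adding prior.count before the log avoids log(0) for zero counts
log.cpm <- log2(cpm + prior.count)
```

With prior.count = 1, a zero count maps to log2(1) = 0 rather than -Inf.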

Value

a list object containing a set of candidate null and non-null genes and additional results


CenterForStatistics-UGent/SPsimSeq documentation built on Jan. 31, 2022, 3:32 a.m.