estimate.exp.prob.param: Estimate gene expression probability based on experimental...

View source: R/expression_fit.R

estimate.exp.prob.param    R Documentation

Estimate gene expression probability based on experimental parameters

Description

This function estimates, for each gene, the probability that it is expressed in the pseudobulk, where a gene counts as expressed when it has more than min.counts UMI counts. The estimate is based on experimental parameters, which determine the mean and dispersion values of each gene.

Usage

estimate.exp.prob.param(
  nSamples,
  readDepth,
  nCellsCt,
  read.umi.fit,
  gamma.mixed.fits,
  ct,
  disp.fun.param,
  min.counts = 3,
  perc.indiv.expr = 0.5,
  cutoffVersion = "absolute",
  nGenes = 21000,
  samplingMethod = "quantiles"
)

Arguments

nSamples

Sample size

readDepth

Target read depth per cell

nCellsCt

Mean number of cells per individual and cell type

read.umi.fit

Data frame for fitting the mean UMI counts per cell depending on the mean reads per cell (required columns: intercept, reads (slope))
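
To illustrate how such a fit could be applied, the sketch below maps a target read depth to an expected mean UMI count per cell. It assumes a logarithmic relationship, meanUMI = intercept + reads * log(readDepth), which this page does not state. The helper name predict_mean_umi and the coefficient values are hypothetical.

```r
# Hypothetical one-row fit with the two required columns.
# The numbers are made up for illustration, not taken from scPower.
read.umi.fit <- data.frame(intercept = -3000, reads = 450)

# Assumed functional form (not confirmed by this page):
# mean UMI per cell = intercept + slope * log(read depth per cell).
predict_mean_umi <- function(readDepth, fit) {
  fit$intercept + fit$reads * log(readDepth)
}

depths <- c(10000, 25000, 50000)
mean.umi <- predict_mean_umi(depths, read.umi.fit)
print(data.frame(readDepth = depths, meanUMI = round(mean.umi)))
```

Under this assumed form, doubling the read depth adds a constant number of UMIs (reads * log(2)), so the gain from deeper sequencing levels off.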

gamma.mixed.fits

Data frame with gamma mixed fit parameters for each cell type (required columns: parameter, ct (cell type), intercept, meanUMI (slope))

ct

Cell type of interest (name from the gamma mixed models)

disp.fun.param

Parameters of the function that fits the dispersion parameter dependent on the mean (required columns: ct (cell type), asymptDisp, extraPois (both taken from DESeq))
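
The column names asymptDisp and extraPois match the coefficients of the parametric dispersion fit in DESeq, where dispersion = asymptDisp + extraPois / mean. The following is a minimal sketch of evaluating such a fit for one cell type, assuming that form is the one used here. The helper name dispersion_from_mean and all parameter values are hypothetical.

```r
# Hypothetical parameter table with the three required columns.
# Values are invented for illustration only.
disp.fun.param <- data.frame(
  ct = c("CD4 T cells", "CD14+ Monocytes"),
  asymptDisp = c(0.25, 0.30),
  extraPois = c(1.5, 1.2),
  stringsAsFactors = FALSE
)

# Evaluate dispersion = asymptDisp + extraPois / mean for one cell type.
dispersion_from_mean <- function(mean.counts, ct, param) {
  p <- param[param$ct == ct, ]
  stopifnot(nrow(p) == 1)  # exactly one parameter row per cell type expected
  p$asymptDisp + p$extraPois / mean.counts
}

gene.means <- c(0.1, 1, 10, 100)
disp <- dispersion_from_mean(gene.means, "CD14+ Monocytes", disp.fun.param)
print(data.frame(mean = gene.means, dispersion = disp))
```

Lowly expressed genes get a large dispersion from the extraPois / mean term, while highly expressed genes approach the asymptotic value asymptDisp.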

min.counts

Expression cutoff in one individual: if cutoffVersion = "absolute", more than this number of UMI counts for each gene per individual and cell type is required; if cutoffVersion = "percentage", more than this percentage of cells need to have a count value larger than 0

perc.indiv.expr

Expression cutoff on the population level: if the value is < 1, the fraction of individuals that need to have this gene expressed for it to be defined as globally expressed; if the value is >= 1, the absolute number of individuals that need to have this gene expressed
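
The interplay of the individual-level cutoff (min.counts, absolute version) and the population-level cutoff (perc.indiv.expr) can be sketched as a two-step calculation. This is an illustrative sketch, not necessarily the package implementation. It assumes negative binomial counts per cell, independent individuals, and rounding the required fraction up with ceiling(). The function name expr_prob_sketch and the example values are hypothetical.

```r
# Probability that a gene is called expressed across the cohort.
# Assumptions (not confirmed by this page): per-cell counts are negative
# binomial with the given mean and dispersion; individuals are independent.
expr_prob_sketch <- function(nSamples, nCellsCt, mean.cell, disp.cell,
                             min.counts = 3, perc.indiv.expr = 0.5) {
  # The sum of nCellsCt i.i.d. NB(mu, size) counts is
  # NB(nCellsCt * mu, nCellsCt * size), with size = 1 / dispersion.
  p.indiv <- pnbinom(min.counts,
                     mu = nCellsCt * mean.cell,
                     size = nCellsCt / disp.cell,
                     lower.tail = FALSE)  # P(pseudobulk count > min.counts)

  # Translate the population cutoff into a required number of individuals
  # (rounding up is an assumption of this sketch).
  n.required <- if (perc.indiv.expr < 1) {
    ceiling(perc.indiv.expr * nSamples)
  } else {
    perc.indiv.expr
  }

  # P(at least n.required of nSamples individuals express the gene).
  pbinom(n.required - 1, size = nSamples, prob = p.indiv, lower.tail = FALSE)
}

# Three hypothetical genes with low, medium and high mean expression per cell.
gene.means <- c(0.0005, 0.01, 0.5)
gene.disp <- c(5, 2, 0.5)

probs <- expr_prob_sketch(nSamples = 50, nCellsCt = 200, gene.means, gene.disp)
probs.abs <- expr_prob_sketch(nSamples = 50, nCellsCt = 200, gene.means,
                              gene.disp, perc.indiv.expr = 10)
print(rbind(fraction.0.5 = probs, absolute.10 = probs.abs))
```

With 50 individuals, perc.indiv.expr = 0.5 requires 25 expressing individuals, whereas perc.indiv.expr = 10 requires only 10, so the second setting can never give a lower probability than the first.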

cutoffVersion

Either "absolute" or "percentage", leading to different interpretations of min.counts (see the description of min.counts above)

nGenes

Number of genes to simulate (should match the number of genes used for the fitting)

samplingMethod

Approach to sample the gene mean values (either taking quantiles or random sampling)
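
The difference between the two sampling approaches can be illustrated on a plain gamma distribution, used here only as a stand-in for the fitted gamma mixed model. The shape and rate values are hypothetical, and the quantile grid shown is one common choice, not necessarily the one used by the package.

```r
set.seed(1)  # only affects the random draw below

nGenes <- 1000
shape <- 0.5  # hypothetical gamma parameters
rate <- 2

# Quantile-based: evaluate the quantile function on an evenly spaced
# probability grid. Deterministic, the same values on every call.
probs <- (seq_len(nGenes) - 0.5) / nGenes
means.quantiles <- qgamma(probs, shape = shape, rate = rate)

# Random sampling: draw nGenes values. Varies between runs unless seeded.
means.random <- rgamma(nGenes, shape = shape, rate = rate)

print(summary(means.quantiles))
print(summary(means.random))
```

Quantile-based values are reproducible and cover the distribution evenly, while random draws add sampling noise, which matters most when nGenes is small.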

Value

Vector with expression probabilities for each gene


heiniglab/scPower documentation built on Jan. 9, 2025, 12:13 p.m.