run.CSIDE.general: Runs CSIDE on a 'RCTD' object with a general design matrix

View source: R/CSIDE.R

run.CSIDE.general    R Documentation

Runs CSIDE on an RCTD object with a general design matrix

Description

Identifies differential expression (DE) across a general design matrix of covariates. DE parameters can be cell type-specific or shared across all cell types. Uses maximum likelihood estimation to estimate DE and standard errors for each gene and each cell type. Selects genes with significant nonzero DE. The type of test is determined by test_mode, and the parameters tested are determined by params_to_test.
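The selection step described above can be illustrated with a minimal base R sketch. It is not CSIDE's internal code: it assumes a two-sided z-test on estimate / standard error, uses made-up estimates (log_fc, se), and applies the documented defaults fdr = 0.01 and log_fc_thresh = 0.4 with the Benjamini-Hochberg adjustment from stats::p.adjust.

```r
# Hypothetical per-gene estimates for one cell type (not real CSIDE output):
# 180 genes with small effects and 20 genes with a large effect.
log_fc <- c(seq(-0.3, 0.3, length.out = 180), rep(1.5, 20))
se     <- rep(0.2, length(log_fc))

# Assumption: a two-sided z-test on estimate / standard error.
z <- log_fc / se
p <- 2 * pnorm(-abs(z))

# Benjamini-Hochberg adjustment, corresponding to fdr_method = "BH".
q <- p.adjust(p, method = "BH")

fdr           <- 0.01  # documented default
log_fc_thresh <- 0.4   # natural log scale; exp(0.4) is roughly a 1.49-fold change

# A gene is kept only if it passes both the FDR and the effect-size cutoff.
significant <- q <= fdr & abs(log_fc) >= log_fc_thresh
sum(significant)
```

Because log_fc_thresh is on the natural log scale, a threshold of 0.4 corresponds to about 0.58 on the log2 scale (0.4 / log(2)).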

Usage

run.CSIDE.general(
  myRCTD,
  X1,
  X2,
  barcodes,
  cell_types = NULL,
  gene_threshold = 5e-05,
  cell_type_threshold = 125,
  doublet_mode = T,
  test_mode = "individual",
  weight_threshold = NULL,
  sigma_gene = T,
  PRECISION.THRESHOLD = 0.05,
  cell_types_present = NULL,
  test_genes_sig = T,
  fdr = 0.01,
  params_to_test = NULL,
  normalize_expr = F,
  logs = F,
  cell_type_filter = NULL,
  log_fc_thresh = 0.4,
  test_error = FALSE,
  fdr_method = "BH"
)

Arguments

myRCTD

an RCTD object with annotated cell types, e.g. from the run.RCTD function.

X1

a matrix containing the covariates shared across all cell types. The rownames represent pixel names and should be a subset of the pixels in the SpatialRNA object. The columns each represent a covariate for explaining differential expression and need to be linearly independent.

X2

a matrix containing the cell type-specific covariates. The rownames represent pixel names and should be a subset of the pixels in the SpatialRNA object. The columns each represent a covariate for explaining differential expression and need to be linearly independent.

barcodes

the barcodes, or pixel names, of the SpatialRNA object to be used when fitting the model.

cell_types

the cell types used for CSIDE. If NULL, cell types will be chosen with aggregate occurrences of at least 'cell_type_threshold', as aggregated by aggregate_cell_types.

gene_threshold

(default 5e-5) minimum average normalized expression required for selecting genes

cell_type_threshold

(default 125) minimum number of occurrences of each cell type required for it to be used, as aggregated by aggregate_cell_types

doublet_mode

(default TRUE) if TRUE, uses RCTD doublet mode weights. Otherwise, uses RCTD full mode weights

test_mode

(default 'individual') if 'individual', tests for DE individually for each parameter. If 'categorical', then tests for differences across multiple categorical parameters

weight_threshold

(default NULL) the minimum total normalized weight, summed across all cell types in cell_types, that a pixel must have to be included in the model. If NULL, the threshold is 0.99 in doublet mode or 0.8 in full mode.

sigma_gene

(default TRUE) if TRUE, fits a gene-specific overdispersion parameter. If FALSE, the overdispersion parameter is the same across all genes.

PRECISION.THRESHOLD

(default 0.05) for checking for convergence, the maximum parameter change per algorithm step

cell_types_present

cell types (a superset of 'cell_types') to be considered as occurring often enough to consider for gene expression contamination during the step filtering out marker genes of other cell types.

test_genes_sig

(default TRUE) logical controlling whether genes will be tested for significance

fdr

(default 0.01) false discovery rate for hypothesis testing

normalize_expr

(default FALSE) if TRUE, constrains total gene expression to sum to 1 in each condition. Setting normalize_expr = TRUE is only valid for testing single parameters with test_mode = 'individual'.

logs

(default FALSE) if TRUE, writes progress to logs/de_logs.txt

log_fc_thresh

(default 0.4) the natural log fold change cutoff for differential expression

test_error

(default FALSE) if TRUE, exits after testing for error messages without running CSIDE. If set to TRUE, this can be used to quickly evaluate if CSIDE will run without error.

fdr_method

(default 'BH') if 'BH', uses the Benjamini-Hochberg method. Otherwise, uses local fdr with an empirical null.

params_to_test

(default 2 for test_mode = 'individual', all parameters for test_mode = 'categorical'). An integer vector of parameter indices to test. For example c(1,4,5) would test only parameters corresponding to columns 1, 4, and 5 of the design matrix X2.
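As a minimal sketch of how the design matrices X1 and X2 described above might be assembled in base R: the pixel names, covariates, and the helper is_full_rank are all hypothetical, and placing an intercept in column 1 of X2 is an assumption made here so that the default params_to_test = 2 points at the covariate of interest.

```r
# Hypothetical pixel names; in practice these would be a subset of the
# pixels in the SpatialRNA object.
barcodes <- paste0("pixel_", 1:6)

# Hypothetical covariates.
region  <- c(0, 0, 0, 1, 1, 1)               # binary region indicator
x_coord <- c(0.1, 0.4, 0.2, 0.8, 0.9, 0.7)   # continuous spatial covariate
batch   <- c(0, 1, 0, 1, 0, 1)               # covariate shared across cell types

# X2: cell type-specific covariates (intercept assumed in column 1).
X2 <- cbind(intercept = 1, region = region, x_coord = x_coord)
rownames(X2) <- barcodes

# X1: covariates shared across all cell types.
X1 <- cbind(batch = batch)
rownames(X1) <- barcodes

# Columns need to be linearly independent; a QR rank check is one way to
# verify this (is_full_rank is a helper defined here, not a package function).
is_full_rank <- function(X) qr(X)$rank == ncol(X)
stopifnot(is_full_rank(X1), is_full_rank(X2))

# A matrix with a redundant column fails the check.
X_bad <- cbind(X2, region_doubled = 2 * X2[, "region"])
is_full_rank(X_bad)

# params_to_test indexes columns of X2.
params_to_test <- 2
colnames(X2)[params_to_test]
```

With this layout, params_to_test = c(2, 3) would select both the region and x_coord columns.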

Value

an RCTD object containing the results of the CSIDE algorithm. The results are stored in de_results, which includes 'gene_fits', the results of fits on individual genes; 'sig_gene_list', a list, for each cell type, of significant genes detected by CSIDE; and 'all_gene_list', the analogous list for all genes (including nonsignificant ones). The object also contains 'internal_vars_de', a list of variables used internally by CSIDE.
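A hedged sketch of a call, wrapped in a hypothetical helper (cside_sketch) so that it can be defined without the package loaded. It assumes the package is installed under the name spacexr, that myRCTD is an RCTD object produced by run.RCTD, and that X1 and X2 share the same row names; only arguments listed under Usage are passed.

```r
# Hypothetical wrapper around run.CSIDE.general; not part of the package.
cside_sketch <- function(myRCTD, X1, X2, dry_run = FALSE) {
  # Assumption: the package namespace is "spacexr".
  if (!requireNamespace("spacexr", quietly = TRUE)) {
    stop("The spacexr package is required for this sketch.")
  }
  spacexr::run.CSIDE.general(
    myRCTD, X1, X2,
    barcodes       = rownames(X2),  # assumes X1 and X2 share these row names
    test_mode      = "individual",
    params_to_test = 2,             # second column of X2
    fdr            = 0.01,
    fdr_method     = "BH",
    test_error     = dry_run        # TRUE: only check for errors, do not fit
  )
}
```

Calling the helper with dry_run = TRUE first is one way to use test_error to confirm the inputs are accepted before committing to a full fit. After a full run, the significant genes described above would presumably be reachable by slot access, e.g. fitted@de_results$sig_gene_list, assuming de_results is a slot of the returned object.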


dmcable/RCTD documentation built on Feb. 24, 2024, 11:03 p.m.