fit_Gaussian_copula: Fit a Gaussian copula model for a count matrix of a single...

View source: R/model_fitting.R

fit_Gaussian_copula    R Documentation

Fit a Gaussian copula model for a count matrix of a single cell type

Description

Fit a Gaussian copula model for a count matrix of a single cell type

Usage

fit_Gaussian_copula(
  x,
  marginal = c("auto_choose", "zinb", "nb", "poisson"),
  jitter = TRUE,
  zp_cutoff = 0.8,
  min_nonzero_num = 2
)

Arguments

x

A matrix of shape p by n that contains count values.

marginal

Specification of the type of marginal distribution. The default, 'auto_choose', chooses between ZINB, NB, ZIP and Poisson using a likelihood ratio test (LRT) and a check for underdispersion. 'zinb' fits the ZINB model, with fallbacks: if there is underdispersion, it chooses between ZIP and Poisson by an LRT; otherwise it tries to fit the ZINB model, and if the gene has no zeros at all or an error occurs, it fits an NB model instead. 'nb' chooses between NB and Poisson depending on whether there is underdispersion. 'poisson' simply fits the Poisson model.

jitter

Logical, whether a random projection should be performed in the distributional transform.

zp_cutoff

The maximum proportion of zeros allowed for a gene to be included in the joint copula model.

min_nonzero_num

The minimum number of non-zero values a gene must have for a marginal model to be fitted to it.
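
The role of jitter can be illustrated with a small sketch of a distributional transform. This is not the package's code: dist_transform_pois is a hypothetical helper, the Poisson marginal is chosen only for simplicity, and the use of the interval midpoint when jitter = FALSE is an assumption. With jitter = TRUE, each count is mapped to a uniformly drawn point between F(x - 1) and F(x), and the result is then converted to a Gaussian score.

```r
# Hypothetical helper, not part of scDesign2: distributional transform of
# counts under an assumed Poisson marginal with mean `lambda`.
dist_transform_pois <- function(counts, lambda, jitter = TRUE) {
  upper <- ppois(counts, lambda)       # F(x)
  lower <- ppois(counts - 1, lambda)   # F(x - 1), which is 0 when x = 0
  # With jitter, draw a uniform position inside [F(x - 1), F(x)];
  # without it, take the midpoint of that interval (an assumption here).
  v <- if (jitter) runif(length(counts)) else rep(0.5, length(counts))
  lower + v * (upper - lower)
}

set.seed(1)
counts <- rpois(200, lambda = 3)
u <- dist_transform_pois(counts, lambda = 3, jitter = TRUE)
z <- qnorm(u)   # latent Gaussian scores, as used by a Gaussian copula
```

Applying the same transform to every gene and taking the correlation of the resulting scores would be one way to estimate a copula correlation matrix; whether the package does exactly this is not stated here.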

Value

The genes (rows) of x are partitioned into three groups. The first group contains genes whose zero proportion is less than zp_cutoff. The second group contains genes whose zero proportion is greater than zp_cutoff but which still have at least min_nonzero_num non-zero values. The third group contains the remaining genes. For the first group, a joint Gaussian copula model is fitted. For the second group, only the marginal distribution of each gene is fitted. For the third group, no model is fitted and only the indices of these genes are recorded. The function returns a list holding the fitted model, with the following components.

cov_mat

The fitted covariance (or equivalently in this case, correlation) matrix of the Gaussian copula model.

marginal_param1

A matrix of the parameters for the marginal distributions of genes in group one.

marginal_param2

A matrix of the parameters for the marginal distributions of genes in group two.

gene_sel1

A numeric vector of the row indices of the genes in group one.

gene_sel2

A numeric vector of the row indices of the genes in group two.

gene_sel3

A numeric vector of the row indices of the genes in group three.

zp_cutoff

Same as the input.

min_non_zero_num

Same as the input.

sim_method

The character string 'copula', which distinguishes this fit from the model without a copula (w/o copula).
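
The three-way partition of genes behind gene_sel1, gene_sel2 and gene_sel3 can be sketched in base R. This is an illustration, not the package's implementation: partition_genes is a hypothetical helper, the threshold names follow the Usage block, and placing a gene whose zero proportion exactly equals zp_cutoff in group two is an assumption, since the description above does not cover that case.

```r
# Hypothetical helper, not part of scDesign2: split the rows (genes) of a
# p-by-n count matrix into the three groups described above.
partition_genes <- function(x, zp_cutoff = 0.8, min_nonzero_num = 2) {
  zero_prop <- rowMeans(x == 0)   # proportion of zeros per gene
  n_nonzero <- rowSums(x != 0)    # number of non-zero values per gene
  in_copula <- zero_prop < zp_cutoff
  # Assumption: a gene sitting exactly at zp_cutoff falls through to here.
  in_marginal <- !in_copula & n_nonzero >= min_nonzero_num
  list(
    gene_sel1 = which(in_copula),
    gene_sel2 = which(in_marginal),
    gene_sel3 = which(!in_copula & !in_marginal)
  )
}

# Toy matrix with 3 genes and 20 cells.
x <- rbind(
  rep(c(3, 0, 5, 2), 5),    # 25% zeros: joint copula group
  c(rep(0, 18), 2, 1),      # 90% zeros, two non-zero values: marginal only
  c(rep(0, 19), 1)          # 95% zeros, one non-zero value: index only
)
groups <- partition_genes(x)
```

In this toy matrix each gene lands in a different group, so groups$gene_sel1, groups$gene_sel2 and groups$gene_sel3 each hold a single row index.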


JSB-UCLA/scDesign2 documentation built on Nov. 2, 2024, 4:26 a.m.