fit_model: fit a CARseq negative binomial model using iteratively...

Description Usage Arguments Details Examples

View source: R/loglik.R

Description

fit_model fits a negative binomial distribution whose mean is a sum of nonnegative terms with covariates. The overdispersion parameter is estimated by maximizing the adjusted profile log-likelihood.

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
fit_model(
  cell_type_specific_variables,
  other_variables,
  read_depth,
  cellular_proportions,
  counts,
  verbose = TRUE,
  init = NULL,
  fix_overdispersion = FALSE,
  number_of_resample = 1,
  use_log_scale_algorithm = FALSE,
  lambda = 1e-06,
  return_coefficients_only = FALSE,
  allow_negative_expression = FALSE
)

Arguments

cell_type_specific_variables

an array of n_B x H x K of cell type-independent variables.

other_variables

a design matrix of n_B x M of cell type-specific variables. Can be NULL.

read_depth

a vector of n_B sample-specific read depths. It it used as an offset term in the regression model. Alternatively, it can be 1, NA or NULL so that we don't have an offset term in the model, and log(read_depth) can be included as one of the cell type-independent variables.

cellular_proportions

a matrix of n_B x H of cellular proportions.

counts

A vector of n_B total read counts observed.

verbose

logical. If TRUE (default), display information of negative log-likelihood of each iteration.

init

("expert" argument) a numeric vector of (K + H x M + 1) corresponding to the initial value of c(beta, gamma, overdispersion). Can be NULL. For gamma, log scale is assumed when use_log_scale_algorithm is TRUE, and non-log scale is assumed when use_log_scale_algorithm is FALSE.

fix_overdispersion

("expert" argument) logical (or numerical for a fixed overdispersion). If FALSE (default), the overdispersion parameter will be estimated within the function. Otherwise, the overdispersion parameter can be

number_of_resample

("expert" argument) numeric. The number of initial values we use in optimization. The optimization algorithm generally is not sensitive to initial values, so we advise to leave it as the default value 1.

use_log_scale_algorithm

("expert" argument) logical. The default is FALSE, where the cell type-specific effects are fitted on a non-log scale and non-negative least squares is used (implemented in bvls). This is generally advisable than fitting them on a log scale, especially when the cell type-specific variables are binary (for example group indicators).

lambda

("expert" argument) numerical. 1/σ^2 in Gaussian prior. Only takes effect when use_log_scale_algorithm is TRUE. Defaults to 0. It can also be a vector of length K + H * (M + 1).

return_coefficients_only

("expert" argument) logical. If FALSE (default), lfc, log-likelihood, Cook's distance and fitted values will be provided together with coefficient estimates. Otherwise, only coefficient estimates are returned.

allow_negative_expression

("expert" argument) logical. If FALSE (default), the cell type-specific expression cannot be negative and will be returned in a log scale, along with LFCs. If TRUE, the cell type-specific expression will be on a non-log scale and could possibly be negative, while LFCs will not be returned.

Details

Please use the wrapper run_CARseq. This function is not intended for general use.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
library(CARseq)
set.seed(1234)
H = 4      # number of cell types
n_B = 60   # total number of subjects
K = 1      # number of cell type-independent covariates
M = 3      # number of cell type-specific covariates
cell_type_specific_variables = array(0, dim=c(n_B, H, M))
cell_type_specific_variables[1:(n_B/3), , 1] = 1
cell_type_specific_variables[(n_B/3+1):(n_B*2/3), , 2] = 1
cell_type_specific_variables[(n_B*2/3+1):n_B, , 3] = 1
other_variables = matrix(runif(n_B * K, min = -1, max = 1), nrow=n_B, ncol=K)
read_depth = round(runif(n_B, min = 50, max = 100))
cellular_proportions = matrix(runif(n_B * H), nrow = n_B, ncol = H)
cellular_proportions = cellular_proportions / rowSums(cellular_proportions)
counts = round(runif(n_B, min = 100, max = 200))
res = fit_model(cell_type_specific_variables, other_variables, read_depth, cellular_proportions, counts)
str(res)

Sun-lab/CARseq documentation built on Oct. 7, 2021, 1:52 p.m.