betaBinomXI: Fit mixture model


View source: R/betabin.R

Description

Fit a mixture model to estimate mosaicism and XCI-escape.

Usage

betaBinomXI(
  genic_dt,
  model = "AUTO",
  plot = FALSE,
  hist = FALSE,
  flag = 0,
  xciGenes = NULL,
  a0 = NULL,
  optimizer = c("nlminb", "optim"),
  method = NULL,
  limits = TRUE,
  keep_params = FALSE,
  debug = FALSE
)

Arguments

genic_dt

A data.table. The table returned by getGenicDP.

model

A character indicating which model to use to estimate the mosaicism. Valid choices are "AUTO", "M0", "M1", "M2", "MF". See details.

plot

A logical. If set to TRUE, information about the training set and the skewing estimate will be plotted.

hist

A logical. If set to TRUE, a histogram of the skewing estimates will be displayed.

flag

A numeric. Specify how to handle convergence issues. See details.

xciGenes

A character or NULL. To be passed to readXCI to select the training set of inactivated genes.

a0

A numeric or NULL. Starting values for the optimization. This should not be used with more than one model as different models have different parameters. Leave NULL unless you know what you're doing.

optimizer

A character. The optimization function used to minimize the negative log-likelihood. Should be one of "nlminb" or "optim".

method

A character. The method to be passed to optim when it is the selected optimizer.

limits

A logical. If set to TRUE, the optimization will be constrained. Using upper bounds on the probability of sequencing error and escape in the training set ensures that the dominant mixture represents the skewing for inactivated genes.

keep_params

A logical. If set to TRUE, all parameters will be reported instead of just the alpha and beta estimates. Can be useful for model-specific analysis but clutters the table.

debug

A logical. If set to TRUE, information about each iteration will be printed. Useful for identifying problematic samples.

Details

The model argument determines the number of components used in the mixture model. By default, "AUTO" tries all combinations of mixture components and keeps the best estimate using backward selection based on AIC. M0 is a simple beta-binomial. M1 adds a binomial component to model sequencing errors. M2 jointly models the probability of misclassification in the training set. MF includes all three components.
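To make the M0 case concrete, the following is a minimal sketch of fitting a plain beta-binomial by maximum likelihood with nlminb and computing its AIC. It is not XCIR's internal code: the function name nll_bb, the log-scale parameterization, and the simulated counts are all assumptions made for illustration.

```r
# Illustrative only: a stand-alone beta-binomial fit, not XCIR's implementation.
# nll_bb is a hypothetical helper; k = reads from one allele, n = total reads.
nll_bb <- function(par, k, n) {
  a <- exp(par[1])  # alpha, kept positive via the log scale
  b <- exp(par[2])  # beta, kept positive via the log scale
  -sum(lchoose(n, k) + lbeta(k + a, n - k + b) - lbeta(a, b))
}

# Simulated allele-specific counts for 200 genes (made-up data)
set.seed(1)
n <- rep(50L, 200)
p <- rbeta(200, 4, 12)
k <- rbinom(200, n, p)

# Minimize the negative log-likelihood, starting from alpha = beta = 1
fit <- nlminb(c(0, 0), nll_bb, k = k, n = n)
est <- exp(fit$par)                               # estimated alpha and beta
skew <- est[1] / sum(est)                         # mean of the fitted beta
aic <- 2 * length(fit$par) + 2 * fit$objective    # AIC = 2p - 2 logL
```

Under these assumptions, skew plays the role of the skewing estimate, and the AIC is the quantity that would be compared between candidate models with more components.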

Flags in the output report convergence issues. If flag is set to 0, nothing is done. If set to 1, model selection avoids flagged models (which favors parsimonious models). If set to 2, calls for which the selected model had convergence issues are removed.
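The three flag settings can be sketched on a toy table of candidate fits. This is a simplified illustration, not the package's logic: the fits table, its AIC values, and the pick_model helper are hypothetical, and the sketch picks the lowest AIC directly rather than reproducing the backward selection described above.

```r
# Hypothetical per-model summaries for a single sample (values are made up)
fits <- data.frame(
  model = c("M0", "M1", "M2", "MF"),
  AIC = c(412.3, 405.1, 407.8, 403.9),
  flagged = c(FALSE, FALSE, FALSE, TRUE),  # TRUE = convergence issue
  stringsAsFactors = FALSE
)

# pick_model is a hypothetical helper mimicking the documented flag behavior
pick_model <- function(fits, flag = 0) {
  best <- fits[which.min(fits$AIC), ]
  if (flag == 1) {
    # Prefer the best model among those without convergence issues
    ok <- fits[!fits$flagged, ]
    if (nrow(ok) > 0) best <- ok[which.min(ok$AIC), ]
  } else if (flag == 2 && best$flagged) {
    # Drop the call entirely when the selected model is flagged
    return(NULL)
  }
  best$model
}

pick_model(fits, flag = 0)
pick_model(fits, flag = 1)
pick_model(fits, flag = 2)
```

In this toy table the lowest-AIC model is flagged, so flag = 0 keeps it, flag = 1 falls back to the best unflagged model, and flag = 2 returns nothing for the sample.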

Value

A data.table with an entry per sample and per gene.

See Also

getGenicDP readXCI

Examples

library(data.table)
# Simulated data
dtf <- system.file("extdata/data2_vignette.tsv", package = "XCIR")
dt <- fread(dtf)
xcigf <- system.file("extdata/xcig_vignette.txt", package = "XCIR")
xcig <- readLines(xcigf)
# Run all models on the data
all <- betaBinomXI(dt, xciGenes = xcig)
# Simple beta-binomial model, with a histogram of the skewing estimates
bb <- betaBinomXI(dt, xciGenes = xcig, model = "M0", hist = TRUE)

# Plotting fits
stoshow <- paste0("sample", c(31, 33, 35, 40)) # interesting samples
plotQC(all[sample %in% stoshow], xcig = xcig)

# Summarizing results
# Sample information
samps <- sample_clean(all)
# Gene-level predictions
xcistates <- getXCIstate(all)

SRenan/XCIR documentation built on Oct. 8, 2021, 3:11 a.m.