normalize_scope_group: Group-wise normalization of read depth with latent factors...

View source: R/normalize_scope_group.R

normalize_scope_group    R Documentation

Group-wise normalization of read depth with latent factors using Expectation-Maximization algorithm and shared clonal memberships

Description

Fit a Poisson generalized linear model to normalize the raw read depth data from single-cell DNA sequencing, with latent factors and shared clonal memberships. GC content bias is modeled with an expectation-maximization algorithm that accounts for clone-specific copy number states.
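To make the idea of normalizing read depth with a Poisson GLM concrete, here is a minimal base-R sketch on simulated data for a single cell. It is not SCOPE's implementation: it has no latent factors, no EM step, and no clonal structure, and every object name and simulation setting below is an assumption made for illustration.

```r
# Toy illustration only: a single-cell Poisson GLM with a smooth GC term.
# This is NOT SCOPE's implementation; all names and settings are hypothetical.
library(splines)  # ships with base R

set.seed(1)
n_bins <- 500
gc <- runif(n_bins, min = 30, max = 65)        # GC content (%) per bin
f_gc <- exp(-((gc - 45) / 15)^2)               # assumed unimodal GC bias curve
beta <- rgamma(n_bins, shape = 20, rate = 20)  # assumed bin-specific bias
N <- 200                                       # assumed cell-specific depth factor
y <- rpois(n_bins, lambda = N * beta * f_gc)   # simulated raw read depth

# Poisson regression of read depth on a natural spline of GC content
fit <- glm(y ~ ns(gc, df = 4), family = poisson())
y_hat <- fitted(fit)                           # expected depth given GC content

# Observed depth relative to expectation; values near 1 suggest no copy number change
ratio <- y / y_hat
summary(ratio)
```

In this simplified setting the fitted values play the role of a normalized expectation; the function documented here additionally estimates bin-specific biases, latent factors, and copy number states.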

Usage

normalize_scope_group(Y_qc, gc_qc, K, norm_index, groups, T, 
                        ploidyInt, beta0, minCountQC = 20)

Arguments

Y_qc

read depth matrix after quality control

gc_qc

vector of GC content for each bin after quality control

K

Number of latent Poisson factors

norm_index

indices of normal/diploid cells using group/clone labels

groups

clonal membership labels for each cell

T

a vector of integers giving the candidate numbers of CNV groups. Use BIC to select the optimal number of CNV groups. If T = 1, all reads are assumed to come from normal regions and the EM algorithm is not run. Otherwise, the model always includes one CNV group for heterozygous deletion and one group for null regions; the remaining groups represent different duplication states.

ploidyInt

a vector of group-wise initialized ploidies returned from initialize_ploidy_group. Users may also supply ploidies based on prior knowledge, or manually tune the values for a few cells/clones that fit poorly

beta0

a vector of initialized bin-specific biases returned from CODEX2 without latent factors

minCountQC

the minimum read coverage required for normalization and EM fitting. Default is 20
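As a rough guide to how these arguments relate to one another, the following base-R sketch builds toy inputs and checks that their shapes agree. It assumes that rows of Y_qc are bins and columns are cells, which this page does not state explicitly; the dimensions and values are invented for illustration.

```r
# Hypothetical toy inputs; sizes and values are chosen only for illustration.
# Assumption: rows of Y_qc are bins, columns are cells.
set.seed(2)
n_bins <- 6
n_cells <- 5
Y_qc <- matrix(rpois(n_bins * n_cells, lambda = 50),
               nrow = n_bins, ncol = n_cells)
gc_qc <- runif(n_bins, min = 35, max = 60)

# One clonal label per cell; normal cells are located from the labels
groups <- c("normal", "tumor1", "normal", "tumor1", "tumor1")
norm_index <- which(groups == "normal")

# Basic consistency checks before calling a normalization routine
stopifnot(
  nrow(Y_qc) == length(gc_qc),   # one GC value per bin
  ncol(Y_qc) == length(groups),  # one clonal label per cell
  length(norm_index) > 0         # at least one normal cell
)
norm_index
```

Deriving norm_index from groups, as in the Examples section, keeps the two arguments consistent when clone labels change.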

Value

A list with components

Yhat

A list of normalized read depth matrices with EM

alpha.hat

A list of absolute copy number matrices

fGC.hat

A list of EM-estimated GC content bias matrices

beta.hat

A list of EM-estimated bin-specific bias vectors

g.hat

A list of estimated Poisson latent factors

h.hat

A list of estimated Poisson latent factors

AIC

AIC for model selection

BIC

BIC for model selection

RSS

RSS for model selection

K

Number of latent Poisson factors
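Because most components are returned as lists alongside model selection criteria, a typical next step is to pick one entry. The sketch below uses a mock object, not real output, and assumes each list has one entry per fitted model with BIC of matching length. It applies the same which.max(BIC) rule used in the Examples section; the matrices and BIC values are invented.

```r
# Mock return value reusing the documented component names.
# The matrices and BIC values are invented purely for illustration.
mock_fit <- list(
  Yhat    = list(matrix(1, 2, 2), matrix(2, 2, 2), matrix(3, 2, 2)),
  fGC.hat = list(matrix(0.9, 2, 2), matrix(1.0, 2, 2), matrix(1.1, 2, 2)),
  BIC     = c(-120.5, -98.2, -101.7)
)

# Same rule as in the Examples section: keep the entry with the largest BIC
best <- which.max(mock_fit$BIC)
Yhat_best <- mock_fit$Yhat[[best]]
fGC_best <- mock_fit$fGC.hat[[best]]
best
```

Computing the index once and reusing it keeps the selected Yhat and fGC.hat entries in agreement.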

Author(s)

Rujin Wang rujin@email.unc.edu

Examples

library(SCOPE)

# Gini coefficient per cell, used below to identify normal cells
Gini <- get_gini(Y_sim)

# first-pass CODEX2 run with no latent factors
normObj.sim <- normalize_codex2_ns_noK(Y_qc = Y_sim,
                                        gc_qc = ref_sim$gc,
                                        norm_index = which(Gini<=0.12))
Yhat.noK.sim <- normObj.sim$Yhat
beta.hat.noK.sim <- normObj.sim$beta.hat
fGC.hat.noK.sim <- normObj.sim$fGC.hat
N.sim <- normObj.sim$N

# Group-wise ploidy initialization
clones <- c("normal", "tumor1", "normal", "tumor1", "tumor1")
ploidy.sim.group <- initialize_ploidy_group(Y = Y_sim, Yhat = Yhat.noK.sim, 
                                ref = ref_sim, groups = clones)
ploidy.sim.group

normObj.scope.sim.group <- normalize_scope_group(Y_qc = Y_sim, 
                                    gc_qc = ref_sim$gc,
                                    K = 1, ploidyInt = ploidy.sim.group,
                                    norm_index = which(clones=="normal"), 
                                    groups = clones, 
                                    T = 1:5,
                                    beta0 = beta.hat.noK.sim)
# Keep the fit with the largest BIC
best <- which.max(normObj.scope.sim.group$BIC)
Yhat.sim.group <- normObj.scope.sim.group$Yhat[[best]]
fGC.hat.sim.group <- normObj.scope.sim.group$fGC.hat[[best]]


rujinwang/SCOPE documentation built on Jan. 1, 2023, 5:40 a.m.