Impute: Dropout imputation using different methods


View source: R/Wrap.R

Description

Impute performs dropout imputation on normalized data, using the imputation method(s) selected in do.

Usage

Impute(data, sce = NULL, do = 'Ensemble', write = FALSE,
outdir = getwd(), method.choice = NULL, scale = 1, pseudo.count = 1,
labels = NULL, cell.clusters = 2, drop_thre = NULL, type = 'count',
tr.length = ADImpute::transcript_length,
cores = BiocParallel::bpworkers(BPPARAM),
BPPARAM = BiocParallel::SnowParam(type = "SOCK"),
net.coef = ADImpute::network.coefficients, net.implementation = 'iteration',
bulk = NULL, true.zero.thr = NULL, prob.mat = NULL, ...)

Arguments

data

matrix; raw counts (genes as rows and samples as columns)

sce

SingleCellExperiment; normalized counts and associated metadata.

do

character; choice of methods to be used for imputation. Currently supported methods are 'Baseline', 'DrImpute', 'Network', and 'Ensemble'. Defaults to 'Ensemble'. Not case-sensitive. Can include one or more methods. Non-supported methods will be ignored.

write

logical; write intermediary and imputed objects to files?

outdir

character; path to directory where output files are written. Defaults to the working directory.

method.choice

character; best performing method in training data for each gene

scale

integer; scaling factor to divide all expression levels by (defaults to 1)

pseudo.count

integer; pseudo-count to be added to expression levels to avoid log(0) (defaults to 1)

labels

character; vector specifying the cell type of each column of data

cell.clusters

integer; number of cell subpopulations

drop_thre

numeric; between 0 and 1 specifying the threshold to determine dropout values

type

A character specifying the type of values in the expression matrix. Can be 'count' or 'TPM'

tr.length

matrix with at least 2 columns: 'hgnc_symbol' and 'transcript_length'

cores

integer; number of cores used for parallel computation

BPPARAM

parallel back-end to be used during parallel computation. See BiocParallelParam-class.

net.coef

matrix; network coefficients. Please provide if you don't want to use ADImpute's network model. Must contain a first column 'O' accounting for the intercept of the model, and otherwise be an adjacency matrix with hgnc_symbols in rows and columns. Doesn't have to be square. See ADImpute::demo_net for a small example.

net.implementation

character; either 'iteration', for an iterative solution, or 'pseudoinv', to use Moore-Penrose pseudo-inversion as a solution. 'pseudoinv' is not advised for big data.

bulk

vector of reference bulk RNA-seq, if available (average across samples)

true.zero.thr

if set to NULL (default), no true zero estimation is performed. Set to numeric value between 0 and 1 for estimation. Value corresponds to the threshold used to determine true zeros: if the probability of dropout is lower than true.zero.thr, the imputed entries are set to zero.

prob.mat

matrix of the same size as data, filled with the dropout probabilities for each gene in each cell

...

additional parameters to pass to network-based imputation
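
The layout described for net.coef can be illustrated with a small hand-built matrix. This is only a sketch of the shape: the gene names and coefficient values are hypothetical, reading rows as imputed genes and columns as predictors is an assumption, and a plain base R matrix is used for illustration. ADImpute::demo_net remains the reference example.

```r
# Toy network coefficient matrix (hypothetical genes and values).
# Assumption: rows are the genes being imputed, columns are the intercept
# 'O' followed by the predictor genes.
targets    <- c("GENE_A", "GENE_B")
predictors <- c("GENE_A", "GENE_B", "GENE_C")

toy_net <- matrix(0,
                  nrow = length(targets),
                  ncol = length(predictors) + 1,
                  dimnames = list(targets, c("O", predictors)))

# Fill in an intercept and a few predictor coefficients per target gene
toy_net["GENE_A", c("O", "GENE_B", "GENE_C")] <- c(0.5, 1.2, -0.3)
toy_net["GENE_B", c("O", "GENE_C")] <- c(0.1, 0.8)

# The matrix is not square: 2 target genes, 1 intercept + 3 predictors
print(dim(toy_net))
print(toy_net)
```

The first column holds the intercept, and the remaining columns are indexed by gene symbol, so the matrix can have more predictor columns than rows.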

Details

Values that are 0 in data are imputed according to the best-performing methods indicated in method.choice. Currently supported methods are 'Baseline', 'DrImpute', 'Network', and 'Ensemble' (see the do argument).

If 'Ensemble' is included in do, method.choice has to be provided (use output from EvaluateMethods()). Impute can create a directory imputation containing the imputation results of all methods in do.

If true.zero.thr is set, dropout probabilities are computed using scImpute's framework. Expression values with dropout probabilities below true.zero.thr will be set back to 0 if imputed, as they likely correspond to true biological zeros (genes not expressed in the cell) rather than technical dropouts (genes expressed but not captured).

If sce is set, the values imputed by the different methods are added as new assays to sce. Each assay corresponds to one imputation method. If true.zero.thr is set, only the values after filtering for biological zeros will be added. This differs from the output when sce is not set, where the original values before filtering and the dropout probability matrix are returned.
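
The per-gene method selection and the true-zero filtering described above can be sketched in base R on toy matrices. This is an illustration of the logic only, not ADImpute's internal code: all values are made up, the names observed, imputed, method_choice, prob_mat and true_zero_thr are hypothetical, and the dropout probabilities are written by hand rather than estimated with scImpute's framework.

```r
# Toy data: genes in rows, cells in columns (hypothetical values)
genes <- c("G1", "G2", "G3")
cells <- c("c1", "c2")
dn <- list(genes, cells)

observed <- matrix(c(0, 5, 0,   2, 0, 0), nrow = 3, dimnames = dn)

# Hypothetical results of two imputation methods on the same data
imputed <- list(
  Baseline = matrix(c(1.0, 5, 3.0,   2, 4.0, 3.0), nrow = 3, dimnames = dn),
  Network  = matrix(c(1.5, 5, 0.5,   2, 2.5, 0.7), nrow = 3, dimnames = dn)
)

# Best-performing method per gene (assumed form: named character vector)
method_choice <- c(G1 = "Network", G2 = "Baseline", G3 = "Network")

# Ensemble: for each gene, fill only the zero entries with the values
# from the method chosen for that gene
ensemble <- observed
for (g in genes) {
  zero <- observed[g, ] == 0
  ensemble[g, zero] <- imputed[[method_choice[[g]]]][g, zero]
}

# Hand-written dropout probabilities, same dimensions as the data
prob_mat <- matrix(c(0.9, 0.0, 0.1,   0.0, 0.8, 0.6), nrow = 3, dimnames = dn)
true_zero_thr <- 0.3

# Imputed entries with a dropout probability below the threshold are
# treated as true biological zeros and set back to 0
filtered <- ensemble
true_zero <- observed == 0 & prob_mat < true_zero_thr
filtered[true_zero] <- 0

print(ensemble)
print(filtered)
```

In this toy case only the G3 entry in cell c1 is reset to 0, because it is the only imputed entry whose dropout probability falls below the threshold.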

Value

See Also

EvaluateMethods, ImputeBaseline, ImputeDrImpute, ImputeNetwork, ImputeSAVER

Examples

library(ADImpute)

# Normalize demo data
norm_data <- NormalizeRPM(demo_data)
# Impute with particular method(s), here the network-based method
imputed_data <- Impute(do = 'Network', data = norm_data[,1:10],
    net.coef = ADImpute::demo_net)
# Same call, solved by Moore-Penrose pseudo-inversion instead of iteration
imputed_data <- Impute(do = 'Network', data = norm_data[,1:10],
    net.implementation = 'pseudoinv', net.coef = ADImpute::demo_net)

ADImpute documentation built on Nov. 8, 2020, 5:30 p.m.