View source: R/cnv_inference.R
This function infers CNAs (chromosomal copy-number aberrations) from single-cell expression data. CNA inference is the main method the scandal framework uses to classify cells as malignant or non-malignant.
scandal_cna_infer(
  object,
  reference_cells,
  genome = "hg19",
  max_genes = 5000,
  expression_limits = c(-3, 3),
  window = 100,
  scaling_factor = 0.2,
  initial_centering = "col",
  base_metric = "median",
  verbose = FALSE
)
object | a ScandalDataSet object.
reference_cells | a named vector of the cluster assignments of the reference cells. The names should correspond to the cell IDs of the reference (non-malignant) cells. The CNA matrix can be computed without a reference (with
genome | a string indicating the genome to be used for CNA inference. Must be one of the genomes available in the infercna package. Default is "hg19".
max_genes | maximum number of genes to use for computing the CNA matrix. Default is 5000.
expression_limits | a numeric vector with two elements giving the lower and upper values with which to bound the centered expression matrix before the CNA matrix is calculated. This blunts the effect of noisy genes. Default is c(-3, 3).
window | number of genes to consider when calculating the running mean. Default is a window of 100 genes.
scaling_factor | a small constant by which to widen the calculated (-BM, +BM) interval to compensate for possible noise. Default is 0.2.
initial_centering | direction in which to center the expression matrix (row-wise or column-wise) before computing the CNA matrix. Accepts either "row" or "col". Default is "col".
base_metric | the metric to use for calculating the (-BM, +BM) interval. Accepts either "mean" or "median". Default is "median".
verbose | a logical controlling whether messages are printed; when FALSE, all messages from this function are suppressed. Default is FALSE.
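As an illustration of the reference_cells argument, here is a minimal sketch of a named vector in the shape described above. The cell IDs and cluster labels are invented for the example, and `sds` is a hypothetical name for an existing ScandalDataSet object, so the final call is left commented out.

```r
# Hypothetical cell IDs (names) and cluster assignments (values),
# invented purely for illustration.
reference_cells <- c(
  cell_001 = "macrophage",
  cell_002 = "macrophage",
  cell_003 = "oligodendrocyte"
)

# The names are the IDs of the reference (non-malignant) cells.
reference_ids <- names(reference_cells)

# The values are the cluster assignments; count cells per cluster.
cells_per_cluster <- table(reference_cells)

# With a real ScandalDataSet (called `sds` here, a hypothetical name),
# the call would use only arguments listed in the usage section:
# sds <- scandal_cna_infer(sds, reference_cells = reference_cells, verbose = TRUE)
```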
The CNA algorithm is as follows:
Preprocessing steps:
Compute the mean expression of each gene (log2[mean(TPM) + 1])
Keep the max_genes most highly expressed genes
Order the rows (genes) of the expression matrix according to chromosomal position
Log-transform the expression matrix
Mean-center the expression matrix in the initial_centering direction
Bound the expression matrix according to the expression_limits
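The preprocessing steps above can be sketched in base R on a toy matrix. This is not the scandal implementation, only a rough illustration under several assumptions: the input is a genes-by-cells TPM matrix, `gene_rank` is a hypothetical stand-in for chromosomal position, the log transform is taken to be log2(TPM + 1), and "col" centering is read as subtracting each column's mean. The function name `preprocess_sketch` is likewise made up for this example.

```r
# Toy data: 20 genes x 6 cells of simulated TPM values (illustrative only).
set.seed(1)
n_genes <- 20
n_cells <- 6
tpm <- matrix(
  rexp(n_genes * n_cells, rate = 0.05),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)),
                  paste0("cell", seq_len(n_cells)))
)

# Hypothetical genomic order: one rank per gene. Real data would be
# ordered by chromosome and start position instead.
gene_rank <- setNames(sample(n_genes), rownames(tpm))

preprocess_sketch <- function(tpm, gene_rank, max_genes = 10,
                              expression_limits = c(-3, 3),
                              initial_centering = c("col", "row")) {
  initial_centering <- match.arg(initial_centering)

  # 1. Mean expression per gene: log2(mean(TPM) + 1)
  mean_expr <- log2(rowMeans(tpm) + 1)

  # 2. Keep the max_genes most highly expressed genes
  n_keep <- min(max_genes, nrow(tpm))
  keep <- names(sort(mean_expr, decreasing = TRUE))[seq_len(n_keep)]

  # 3. Order the retained genes by (hypothetical) chromosomal position
  keep <- keep[order(gene_rank[keep])]
  x <- tpm[keep, , drop = FALSE]

  # 4. Log-transform (assumed here to be log2(TPM + 1))
  x <- log2(x + 1)

  # 5. Mean-center in the chosen direction
  if (initial_centering == "col") {
    x <- sweep(x, 2, colMeans(x), "-")
  } else {
    x <- sweep(x, 1, rowMeans(x), "-")
  }

  # 6. Bound the values to the expression limits
  x[x < expression_limits[1]] <- expression_limits[1]
  x[x > expression_limits[2]] <- expression_limits[2]
  x
}

centered <- preprocess_sketch(tpm, gene_rank)
dim(centered)
```

Bounding after centering means extreme values are clipped rather than removed, so the matrix keeps its shape while very noisy genes contribute less to later steps.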
Returns the ScandalDataSet object with CNA matrix in the "cna" element of the reducedDim slot (accessible by reducedDim(object, "cna")). Note that the matrix is stored with cell IDs as row names and gene IDs as column names.
Avishay Spitzer
The CNA inference method was defined and developed by **Dr. Itay Tirosh** during his time at the *Broad Institute* and published in several high-impact papers including the following paper from *Cell*: https://www.cell.com/cell/fulltext/S0092-8674(17)31270-9.