View source: R/auto_cna_call.R
auto_cna_call | R Documentation |
Automated pipeline to call CNA using metacells.
auto_cna_call( ge_df, comm_df, nb_metacells = 10, metacell_size = 3, multisamps = TRUE, trans_prob = 0.1, baseline_cells = NULL, baseline_communities = NULL, prefix = "scCNAutils_out", nb_cores = 1, chrs = c(1:22, "X", "Y"), bin_mean_exp = 3, z_wins_th = 3, smooth_wsize = 3, rcpp = TRUE )
ge_df |
normalized gene expression of all cells (e.g. output from
|
comm_df |
a data.frame with community information, output from
|
nb_metacells |
the number of metacells per community. |
metacell_size |
the number of cells in a metacell. |
multisamps |
use the multi-sample version of the HMM segmentation? Default is TRUE. See details. |
trans_prob |
the transition probability for the HMM. |
baseline_cells |
cells to use as baseline. |
baseline_communities |
communities to use as baseline. Used if baseline_cells is NULL. |
prefix |
the prefix to use for the files created by this function (e.g. graphs). |
nb_cores |
the number of processors to use. |
chrs |
the chromosome names to keep. NULL to include all the chromosomes. |
bin_mean_exp |
the desired minimum mean expression in the bin. |
z_wins_th |
the threshold used to winsorize Z-scores. Default is 3.
smooth_wsize |
the window size for smoothing. Default is 3. |
rcpp |
use Rcpp function. Default is TRUE. More memory-efficient and faster when running on one core. |
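As a rough illustration of what z_wins_th and smooth_wsize control, the base R sketch below winsorizes a vector of Z-scores at a threshold of 3 and then smooths a signal with a window of 3 bins. The helper winsorize_z is hypothetical, and the use of a running median for the smoothing is an assumption made for this example; the package's internal implementation may differ.

```r
# Hypothetical helper: clamp Z-scores to the interval [-th, th].
winsorize_z <- function(z, th = 3) {
  pmin(pmax(z, -th), th)
}

z <- c(-5, -1, 0, 2, 7)
z_wins <- winsorize_z(z, th = 3)

# Assumed smoothing: running median over a window of 3 bins.
# endrule = "keep" leaves the first and last values unchanged.
signal <- c(0, 0, 5, 0, 0)
signal_smooth <- as.numeric(stats::runmed(signal, k = 3, endrule = "keep"))
```

Winsorizing limits the influence of extreme bins, and the smoothing removes isolated single-bin spikes before segmentation.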
Once the metacells are created, there are two ways to call CNAs. First, if
multisamps=FALSE, CNAs are called on each metacell and the results are merged
per community, keeping track of how many metacells support each CNA. Second,
if multisamps=TRUE (default), the HMM is run on all the metacells of a
community at once. The multi-sample approach should be more robust.
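The idea of merging per-metacell calls while recording their support can be sketched with a toy table in base R. The data and all column names below (community, metacell, chr, start, end, CN, nb_metacells_support) are made up for illustration and are not the package's actual output format. Real segments from different metacells would rarely share identical boundaries, so this only shows the counting step.

```r
# Toy per-metacell CNA calls (hypothetical columns and values).
calls <- data.frame(
  community = c("c1", "c1", "c1", "c2"),
  metacell  = c("mc1", "mc2", "mc3", "mc1"),
  chr       = c("1", "1", "7", "1"),
  start     = c(1e6, 1e6, 5e6, 1e6),
  end       = c(5e6, 5e6, 9e6, 5e6),
  CN        = c("gain", "gain", "loss", "gain"),
  stringsAsFactors = FALSE
)

# Count how many distinct metacells support each CNA within a community.
support <- aggregate(
  metacell ~ community + chr + start + end + CN,
  data = calls,
  FUN = function(x) length(unique(x))
)
names(support)[names(support) == "metacell"] <- "nb_metacells_support"
```

A CNA supported by several metacells of the same community is more credible than one seen in a single metacell, which is the information the non-multi-sample mode is described as keeping.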
The transition probability (trans_prob) affects the HMM segmentation: smaller
values create longer segments. One approach, often advocated by HMM
aficionados, is to try different values and use the one that gives the best
results, for example based on the QC graphs (TODO). Another approach is to use
a loose transition probability and then filter short segments ('length' column
or 'pass.filter' column).
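The effect of the transition probability can be illustrated with a toy two-state Gaussian HMM decoded with the Viterbi algorithm. This is a minimal base R sketch and not the package's HMM: the function viterbi_states, the state means, the standard deviation and the toy signal are all assumptions made for this example, and the 'length' column here counts bins, which may not match the units of the package's output.

```r
# Toy Viterbi decoder for a Gaussian HMM with a symmetric transition matrix.
viterbi_states <- function(x, means, sd, trans_prob) {
  k <- length(means)
  n <- length(x)
  # Leave a state with probability trans_prob, split evenly among other states.
  log_trans <- matrix(log(trans_prob / (k - 1)), k, k)
  diag(log_trans) <- log(1 - trans_prob)
  # Log emission densities: one row per observation, one column per state.
  log_emit <- sapply(means, function(m) dnorm(x, mean = m, sd = sd, log = TRUE))
  score <- matrix(-Inf, n, k)
  back <- matrix(0L, n, k)
  score[1, ] <- log(1 / k) + log_emit[1, ]
  for (i in 2:n) {
    for (s in seq_len(k)) {
      cand <- score[i - 1, ] + log_trans[, s]
      back[i, s] <- which.max(cand)
      score[i, s] <- max(cand) + log_emit[i, s]
    }
  }
  # Trace back the most likely state path.
  path <- integer(n)
  path[n] <- which.max(score[n, ])
  for (i in (n - 1):1) path[i] <- back[i + 1, path[i + 1]]
  path
}

# Toy signal: a single-bin spike, then a 6-bin gain. State 1 = neutral, 2 = gain.
x <- c(rep(0, 8), 1.2, rep(0, 8), rep(1, 6), rep(0, 5))
loose  <- viterbi_states(x, means = c(0, 1), sd = 0.5, trans_prob = 0.3)
strict <- viterbi_states(x, means = c(0, 1), sd = 0.5, trans_prob = 0.01)

# Summarize each path as segments, then filter short gains from the loose call.
loose_runs  <- rle(loose)
strict_runs <- rle(strict)
segs <- data.frame(state = loose_runs$values, length = loose_runs$lengths)
segs_filtered <- segs[segs$state == 2 & segs$length >= 3, ]
```

In this toy setting the looser transition probability also labels the single-bin spike as a gain, while the stricter one only keeps the longer segment. Filtering the loose call on segment length recovers the same long gain, which mirrors the second approach described above.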
a data.frame with CNAs
Jean Monlong