View source: R/auto_cna_signal.R

auto_cna_signal    R Documentation

Automated pipeline to compute CNA signal from scRNA expression

Description

Runs the full pipeline, from reading raw gene counts to computing the CNA-level signal, tSNE and community detection.

Usage

auto_cna_signal(
  data,
  genes_coord,
  prefix = "scCNAutils_out",
  nb_cores = 1,
  pause_after_qc = FALSE,
  use_cache = TRUE,
  sample_names = NULL,
  info_df = NULL,
  max_mito_prop = 0.2,
  min_total_exp = 0,
  cells_sel = NULL,
  chrs = c(1:22, "X", "Y"),
  cell_cycle = NULL,
  bin_mean_exp = 3,
  rm_cv_quant = NULL,
  z_wins_th = 3,
  smooth_wsize = 3,
  cc_sd_th = 3,
  nb_pcs = 10,
  comm_k = 100,
  viz = c("tsne", "umap", "both"),
  tsne.seed = 999,
  rcpp = TRUE
)
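
A minimal call might look like the sketch below, assuming two sample folders in the 'matrix.mtx' / 'genes.tsv' / 'barcodes.tsv' layout. The folder paths, the sample names, the gene coordinate file name and the prefix are hypothetical placeholders. The call is only attempted when the package is installed and the inputs exist.

```r
# Hypothetical input locations: replace with real folders that each contain
# 'matrix.mtx', 'genes.tsv' and 'barcodes.tsv'.
samples <- list(
  tumor  = "path/to/tumor_sample",
  normal = "path/to/normal_sample"
)
genes_file <- "path/to/genes_coord.tsv"  # hypothetical file name

# Only run the pipeline if the package and all inputs are available.
inputs_ready <- requireNamespace("scCNAutils", quietly = TRUE) &&
  all(dir.exists(unlist(samples))) &&
  file.exists(genes_file)

if (inputs_ready) {
  # sample_names is left NULL, so the function tries to use the list names.
  res <- scCNAutils::auto_cna_signal(
    data = samples,
    genes_coord = genes_file,
    prefix = "example_run",
    nb_cores = 1,
    viz = "tsne"
  )
  head(res)
}
```

All other arguments keep the defaults listed in the usage above.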

Arguments

data

a data.frame with gene expression, or the path to the folder with the 'matrix.mtx', 'genes.tsv' and 'barcodes.tsv' files. A list of these if there are multiple samples.

genes_coord

either a file name or a data.frame with coordinates and gene names.

prefix

the prefix to use for the files created by this function (e.g. graphs).

nb_cores

the number of processors to use.

pause_after_qc

pause after the QC to pick custom QC thresholds.

use_cache

should intermediate files be used to avoid redoing steps?

sample_names

the names of each sample. If NULL, tries to use data's names.

info_df

a data.frame with information about cells.

max_mito_prop

the maximum proportion of mitochondrial RNA.

min_total_exp

the minimum total expression per cell.

cells_sel

consider only these cells. Other cells are filtered out no matter what.

chrs

the chromosome names to keep. NULL to include all the chromosomes.

cell_cycle

if non-null, either a file or data.frame to compute cell cycle scores. See details.

bin_mean_exp

the desired minimum mean expression in the bin.

rm_cv_quant

the quantile threshold to remove CV outliers. Default NULL (i.e. not used).

z_wins_th

the threshold to winsorize Z-scores. Default is 3.

smooth_wsize

the window size for smoothing. Default is 3.

cc_sd_th

the number of SD used for the thresholds when defining cycling cells.

nb_pcs

the number of PCs used in the community detection or tSNE.

comm_k

the number of nearest neighbors for the KNN graph. Default is 100.

viz

which method to use for visualization ('tsne', 'umap' or 'both'). Default is 'tsne'.

tsne.seed

the seed for the tSNE.

rcpp

use the Rcpp functions. Default is TRUE. They are more memory-efficient and faster when running on one core.
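
To make z_wins_th and smooth_wsize more concrete, here is a small base-R sketch of what winsorizing Z-scores at a threshold and smoothing with a window of 3 can look like. It is an illustration under assumptions, not the package's implementation: the helper names are hypothetical, and the running median is only one possible choice of smoother.

```r
# Hypothetical helper: clamp Z-scores to the interval [-th, th].
winsorize_z <- function(z, th = 3) {
  pmin(pmax(z, -th), th)
}

# Hypothetical helper: running median over an odd-sized window.
# Using stats::runmed() is an assumption; the package may smooth differently.
smooth_window <- function(x, wsize = 3) {
  as.numeric(stats::runmed(x, k = wsize, endrule = "keep"))
}

# Toy Z-scores along consecutive bins, with a few extreme values.
z <- c(-5.2, -1.0, 0.3, 4.8, 0.1, 0.2, 7.5, 0.0)

z_w <- winsorize_z(z, th = 3)       # extreme values are capped at -3 and 3
z_s <- smooth_window(z_w, wsize = 3) # isolated spikes are damped

print(z_w)
print(z_s)
```

With a larger threshold fewer values are capped, and with a larger window the signal is smoothed over more neighboring bins.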

Value

a data.frame with QC, community and tSNE for each cell.

Author(s)

Jean Monlong


jmonlong/scCNAutils documentation built on May 3, 2022, 4:34 a.m.