generate_coverage_tracks: Generate cell cluster pseudo-bulk coverage tracks

View source: R/coverage.R

generate_coverage_tracks    R Documentation

Description

Generate cell cluster pseudo-bulk coverage tracks. First, scBED files are concatenated into the cell clusters given in the 'by' column of your SingleCellExperiment object. To do so, for each sample in the given list, the barcodes of each cluster are matched (grepped) and the corresponding BED files are merged into one pseudo-bulk BED file per cluster (C1, C2, ...). Two cells from different samples can have the same barcode ID, because cells are assigned to clusters sample by sample. Then the coverage of each pseudo-bulk BED file is calculated by averaging and smoothing reads over small genomic windows (150 bp by default). The pseudo-bulk BED files and BigWig coverage tracks are written to the output directory. This functionality is not available on Windows, as it relies on the Unix utilities 'cat' and 'gzip'.
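
The averaging and smoothing step described above can be illustrated with a minimal base R sketch. It is not the ChromSCape implementation: the helper name bin_and_smooth, its arguments, and the treatment of edge bins (left as NA) are assumptions made for illustration, and read_size is ignored here.

```r
# Hypothetical helper (not part of ChromSCape): count read starts per
# fixed-width bin, then smooth with a centered moving average that spans
# n_smooth bins on each side of the current bin.
bin_and_smooth <- function(read_starts, chrom_length,
                           bin_width = 150, n_smooth = 5) {
  n_bins <- ceiling(chrom_length / bin_width)
  # Map each 1-based read start to its bin index
  bin_idx <- (read_starts - 1) %/% bin_width + 1
  counts <- tabulate(bin_idx, nbins = n_bins)
  # Centered moving average; bins without a full window become NA
  k <- 2 * n_smooth + 1
  smoothed <- stats::filter(counts, rep(1 / k, k), sides = 2)
  list(counts = counts, smoothed = as.numeric(smoothed))
}

# Toy chromosome of 1500 bp with four reads, smoothed over 1 bin each side
res <- bin_and_smooth(c(10, 160, 170, 900), chrom_length = 1500, n_smooth = 1)
print(res$counts)
print(res$smoothed)
```

A smaller bin_width gives more bins for the same region, which is why resolution and runtime both increase as the bin shrinks.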

Usage

generate_coverage_tracks(
  scExp_cf,
  input,
  odir,
  format = "scBED",
  ref_genome = c("hg38", "mm10", "ce11")[1],
  bin_width = 150,
  n_smoothBin = 5,
  read_size = 101,
  quantile_for_peak_calling = 0.85,
  by = "cell_cluster",
  progress = NULL
)

Arguments

scExp_cf

A SingleCellExperiment with clusters selected (see choose_cluster_scExp). A minimum of ~100 cells per cluster is recommended in order to obtain smooth tracks.

input

Either a named list of character vectors of paths to single-cell BED files, or a sparse raw matrix of small bins (<<500bp). If a named list of scBED files is given, the names MUST correspond to the 'sample_id' column of your SingleCellExperiment object, and the single-cell BED file names MUST match the barcode names in your SingleCellExperiment (column 'barcode'). The scBED files can be gzipped or not.

odir

The output directory to write the cumulative BED and BigWig files.

format

File format of the input, either "raw_mat", "BED" or "BAM". The default shown in Usage is "scBED".

ref_genome

The reference genome, used to restrict the tracks to canonical chromosomes. One of 'hg38', 'mm10' or 'ce11' (the choices listed in Usage). Defaults to 'hg38'.

bin_width

The width of the bins used to create the coverage track. The smaller the bin, the higher the resolution and the longer the runtime. Defaults to 150.

n_smoothBin

Number of bins to the left and right over which to average ('smooth') the signal. Defaults to 5.

read_size

The estimated size of reads. Defaults to 101.

quantile_for_peak_calling

The quantile defining the threshold above which the signal is considered a peak. Defaults to 0.85.

by

A character string specifying a categorical column of the scExp_cf metadata by which to group cells and generate coverage tracks and peaks.

progress

A Progress object for Shiny. Defaults to NULL.
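
To make the role of quantile_for_peak_calling more concrete, the following base R sketch thresholds a toy signal at its 0.85 quantile. The vector signal is made-up data, and whether ChromSCape applies the quantile to the smoothed signal, per cluster or genome-wide, is not stated on this page, so treat this only as an illustration of quantile thresholding in general.

```r
# Toy per-bin signal: 80 empty bins followed by bins with values 1 to 20
# (hypothetical data, not produced by ChromSCape)
signal <- c(rep(0, 80), 1:20)

# Threshold at the 0.85 quantile, mirroring the default in Usage
threshold <- stats::quantile(signal, probs = 0.85, names = FALSE)

# Bins strictly above the threshold are flagged as peak candidates
is_peak <- signal > threshold
print(threshold)
print(sum(is_peak))
```

Raising the quantile makes the threshold stricter, so fewer bins are flagged.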

Value

Generates coverage tracks (.bigwig) for each group in the "by" column of the SingleCellExperiment.

Examples

## Not run: 
data(scExp)
input_files_coverage = list(
  "scChIP_Jurkat_K4me3" = paste0("/path/to/",scExp$barcode[1:51],".bed"),
  "scChIP_Ramos_K4me3" = paste0("/path/to/",scExp$barcode[52:106],".bed")
)
generate_coverage_tracks(scExp, input_files_coverage, "/path/to/output",
ref_genome = "hg38")

## End(Not run)
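
Because the names of the input list must match 'sample_id' and the file names must match 'barcode', it can help to check a list like the one above before running the function. The helper below is a hypothetical base R sketch, not part of ChromSCape. It assumes files are named <barcode>.bed or <barcode>.bed.gz, as in the example above, and it takes plain character vectors rather than a SingleCellExperiment.

```r
# Hypothetical pre-flight check (not a ChromSCape function).
# input:      named list of character vectors of scBED paths
# sample_ids: values of the 'sample_id' column
# barcodes:   values of the 'barcode' column
check_scbed_input <- function(input, sample_ids, barcodes) {
  # List names that are absent from the sample_id column
  missing_samples <- setdiff(names(input), unique(sample_ids))
  # Strip directory and .bed / .bed.gz extension to recover the barcode
  paths <- unlist(input, use.names = FALSE)
  file_barcodes <- sub("\\.bed(\\.gz)?$", "", basename(paths))
  # File barcodes that are absent from the barcode column
  missing_barcodes <- setdiff(file_barcodes, barcodes)
  list(missing_samples = missing_samples,
       missing_barcodes = missing_barcodes)
}

# Toy input: sample "B" and barcode "bc9" are unknown to the metadata
toy_input <- list(
  A = c("/path/to/bc1.bed", "/path/to/bc2.bed.gz"),
  B = "/path/to/bc9.bed"
)
chk <- check_scbed_input(toy_input,
                         sample_ids = c("A", "A", "C"),
                         barcodes = c("bc1", "bc2", "bc3"))
print(chk)
```

With a real object, sample_ids and barcodes would be taken from the 'sample_id' and 'barcode' columns, for instance scExp$sample_id and scExp$barcode.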

vallotlab/ChromSCape documentation built on Oct. 15, 2023, 1:47 p.m.