alignRsubread: A wrapper to 'Rsubread' read alignment function 'align'


alignRsubread    R Documentation

A wrapper to Rsubread read alignment function align

Description

Aligns cell-specific reads to a reference genome and writes the sequence alignment results to the output directory. This is a wrapper around the align function in the Rsubread package; for details, please refer to the Rsubread manual. This function is not available on Windows.

Usage

alignRsubread(
  sce,
  index,
  unique = FALSE,
  nBestLocations = 1,
  format = "BAM",
  outDir = "./Alignment",
  cores = max(1, parallelly::availableCores() - 2),
  threads = 1,
  summaryPrefix = "alignment",
  overwrite = FALSE,
  verbose = FALSE,
  logfilePrefix = format(Sys.time(), "%Y%m%d_%H%M%S"),
  ...
)

Arguments

sce

A SingleCellExperiment object whose colData slot contains a fastq_path column with the paths to the input cell-specific FASTQ files.

index

Path to the Rsubread index of the reference genome. To generate Rsubread indices, please refer to the buildindex function in the Rsubread package.

unique

Argument passed to the align function in the Rsubread package. Boolean indicating whether only uniquely mapped reads should be reported. A uniquely mapped read has a single mapping location with fewer mismatched bases than any other candidate location. If set to FALSE, multi-mapping reads are reported in addition to uniquely mapped reads. The number of alignments reported for each multi-mapping read is determined by the nBestLocations parameter. Default is FALSE.

nBestLocations

Argument passed to the align function in the Rsubread package. Numeric value specifying the maximum number of equally-best mapping locations reported for a multi-mapping read. Default is 1. Allowed values range from 1 to 16 (inclusive). In the mapping output, the "NH" tag indicates how many alignments are reported for the read, and the "HI" tag numbers the alignments reported for the same read. This argument only applies when unique is set to FALSE. The scruff package does not support counting alignment files generated with nBestLocations > 1.

format

File format of the sequence alignment results, either "BAM" or "SAM". Default is "BAM".

outDir

Output directory for alignment results. Sequence alignment files are stored in folders inside this directory. Make sure the directory is empty. Default is "./Alignment".

cores

Number of cores used for parallelization. Default is max(1, parallelly::availableCores() - 2), i.e. the number of available cores minus 2, with a minimum of 1.

threads

Number of threads/CPUs used for mapping by each core. Refer to the align function in Rsubread for details. Default is 1; it should not be changed in most cases.

summaryPrefix

Prefix for alignment summary filename. Default is "alignment".

overwrite

Boolean indicating whether to overwrite the output directory. Default is FALSE.

verbose

Boolean indicating whether to print log messages. Useful for debugging. Default is FALSE.

logfilePrefix

Prefix for the log file. Default is the current date and time, formatted as format(Sys.time(), "%Y%m%d_%H%M%S").

...

Additional arguments passed to the align function in the Rsubread package.
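
The default expressions for cores and logfilePrefix can be evaluated on their own to see what they produce. The following is a minimal sketch in base R; defaultCores is a hypothetical helper introduced only for illustration and is not part of scruff, and the parallelly call is guarded because that package may not be installed.

```r
# Hypothetical helper (not part of scruff) mirroring the documented
# default for `cores`: leave two cores free, but never go below one.
defaultCores <- function(available) {
    max(1, available - 2)
}

defaultCores(8)  # 6
defaultCores(2)  # 1: the floor of one core applies

# On a real machine, feed in the detected core count.
# Guarded because the parallelly package may not be installed.
if (requireNamespace("parallelly", quietly = TRUE)) {
    cores <- defaultCores(parallelly::availableCores())
}

# Timestamp built with the same format string as the default
# `logfilePrefix`: eight date digits, an underscore, six time digits.
prefix <- format(Sys.time(), "%Y%m%d_%H%M%S")
nchar(prefix)  # 15
```

Because the prefix is recomputed at every call, two runs started in different seconds get different log file prefixes unless logfilePrefix is set explicitly.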

Value

A SingleCellExperiment object containing the alignment summary information in the colData slot. The alignment_path column of the cell annotation table contains the paths to the output alignment files.

Examples

# The SingleCellExperiment object returned by the demultiplex function is
# required for running the alignRsubread function

## Not run: 
data(barcodeExample, package = "scruff")
fastqs <- list.files(system.file("extdata", package = "scruff"),
    pattern = "\\.fastq\\.gz", full.names = TRUE)

de <- demultiplex(
    project = "example",
    experiment = c("1h1"),
    lane = c("L001"),
    read1Path = c(fastqs[1]),
    read2Path = c(fastqs[2]),
    barcodeExample,
    bcStart = 1,
    bcStop = 8,
    umiStart = 9,
    umiStop = 12,
    keep = 75,
    overwrite = TRUE)

# Alignment
library(Rsubread)
# Create index files for GRCm38_MT.
fasta <- system.file("extdata", "GRCm38_MT.fa", package = "scruff")
# Specify the basename for Rsubread index
indexBase <- "GRCm38_MT"
buildindex(basename = indexBase, reference = fasta, indexSplit = FALSE)

al <- alignRsubread(de, indexBase, overwrite = TRUE)

## End(Not run)

campbio/scruff documentation built on April 2, 2024, 12:53 a.m.