generateSummaries: A function to generate summaries from BC32 fastq files.


View source: R/generateSummaries.R

Description

A function to generate summaries from BC32 fastq files.

Usage

generateSummaries(
  pat,
  restriction,
  sampname,
  base_q = 20,
  idx_mis = 1,
  bb_mis = 1,
  indels = 1
)

Arguments

pat

The barcode backbone pattern for matching.

restriction

The nucleotide sequence of the restriction site flanking the barcode at the 3' end.

sampname

A data frame with 3 columns containing the name of each sample to process, its file name and the expected multiplex index sequence.

base_q

The minimum mean base quality over the first 90 nucleotides required to keep a read for processing. Defaults to 20.

idx_mis

The number of mismatches allowed during index matching. Defaults to 1.

bb_mis

The number of mismatches allowed during barcode matching. Defaults to 1.

indels

The number of indels allowed during barcode matching. This strongly impacts the speed of the function. Try to keep it low. Defaults to 1.
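
The filtering arguments above can be illustrated with a small base R sketch. It is not taken from the package source: the helper names mean_quality and min_edit_distance are hypothetical, the quality strings are assumed to use Phred+33 encoding, and utils::adist counts substitutions and indels in one combined distance, whereas generateSummaries takes separate bb_mis and indels limits.

```r
# Illustrative base R sketch; not the package's actual implementation.

# Mean Phred quality over the first n bases of a FASTQ quality string.
# Assumption: Phred+33 (Sanger / Illumina 1.8+) encoding.
mean_quality <- function(qual, n = 90) {
  scores <- utf8ToInt(qual) - 33L
  mean(head(scores, n))
}

# Smallest edit distance between a pattern and any substring of a read.
# adist() pools substitutions, insertions and deletions into one cost,
# so this is a single budget, not separate mismatch and indel limits.
min_edit_distance <- function(pattern, read) {
  as.integer(utils::adist(pattern, read, partial = TRUE))
}

# Quality filter, mirroring base_q = 20.
qual_good <- strrep("I", 100)  # "I" encodes Q40
qual_bad  <- strrep("#", 100)  # "#" encodes Q2
keep <- c(mean_quality(qual_good), mean_quality(qual_bad)) >= 20

# Approximate matching of the backbone pattern used in the Examples section.
pat        <- "CTAGCCAGTT"
read_sub   <- "GGCTAGCCTGTTCC"   # one substituted base inside the pattern
read_indel <- "GGCTAGCCAAGTTCC"  # one inserted base inside the pattern
d_sub   <- min_edit_distance(pat, read_sub)
d_indel <- min_edit_distance(pat, read_indel)
```

Allowing indels enlarges the set of alignments that must be considered for every read, which is consistent with the advice to keep indels low.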

Details

This function extracts the number of different barcode sequences found in the provided list of fastq files and exports a summary table, quality control plots, and a list of the most common sequences that failed index or barcode matching.

Value

Returns a nested list with one entry per sample. Each entry contains a list of sequences not matching the index, QC data, and a list of sequences not matching the barcode.

Examples

# sampname must already exist: a 3-column data frame (see Arguments).
bc_data <- generateSummaries(pat = "CTAGCCAGTT",
                             restriction = "CTCGAG",
                             sampname = sampname)
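
The call above needs a sampname data frame to exist beforehand. A minimal sketch of one follows, using the column order given under Arguments (sample name, file name, expected index sequence). The column names, file names, and index sequences are hypothetical placeholders, and whether generateSummaries reads the columns by position or by name is not stated here.

```r
# Hypothetical sample sheet; column names and all values are placeholders.
sampname <- data.frame(
  sample = c("sample_A", "sample_B"),              # name of each sample
  file   = c("sample_A.fastq", "sample_B.fastq"),  # fastq file name
  index  = c("ACGTAC", "TGCATG"),                  # expected multiplex index
  stringsAsFactors = FALSE                         # keep sequences as character
)
```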

vroh/BC32_BarSeq documentation built on Jan. 25, 2021, 9:24 p.m.