amplicanAlign: Align reads to amplicons.


View source: R/amplicanAlign.R

Description

amplicanAlign takes a configuration file and FASTQ reads and prepares alignments and a summary. It uses the global Needleman-Wunsch algorithm with parameters optimized for CRISPR experiments. After alignment, an AlignmentsExperimentSet object is returned, which can be coerced into GRanges (the plus strand is for forward reads and the minus strand for reverse reads). Alignments can also be output in other, additional formats.

Usage

amplicanAlign(config, fastq_folder, use_parallel = FALSE,
  average_quality = 30, min_quality = 20,
  scoring_matrix = Biostrings::nucleotideSubstitutionMatrix(match = 5,
  mismatch = -4, baseOnly = TRUE, type = "DNA"), gap_opening = 25,
  gap_extension = 0, fastqfiles = 0.5, primer_mismatch = 0,
  donor_mismatch = 3)

Arguments

config

(string) The path to your configuration file. For example: system.file("extdata", "config.csv", package = "amplican"). The configuration file can contain additional columns, but the first 11 columns must follow the example config specification.

fastq_folder

(string) Path to the FASTQ files. If not specified, the FASTQ files should be in the same directory as the config file.

use_parallel

(boolean) Set to TRUE if you have registered a multicore back-end.

average_quality

(numeric) FASTQ files store a quality score for each nucleotide; the encoding depends on the sequencing technology, and many formats exist. This package uses readFastq to parse the reads. If the average quality of a read falls below average_quality, the read is filtered out. Default is 30, as shown in Usage.

min_quality

(numeric) Similar to average_quality, but sets the minimum quality for ALL nucleotides in a given read. If any nucleotide has a quality BELOW min_quality, the read is filtered out. Default is 20.
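
The two quality thresholds can be illustrated with a small base-R sketch. It is not the package's implementation: the helper passes_quality and the toy Phred scores are hypothetical, and the sketch assumes a read is kept only when its mean quality is at least average_quality and no base falls below min_quality.

```r
# Hypothetical helper (not part of amplican): TRUE when a read passes
# both quality thresholds described above.
passes_quality <- function(phred, average_quality = 30, min_quality = 20) {
  mean(phred) >= average_quality && all(phred >= min_quality)
}

# Toy per-base Phred scores for three reads (made-up values).
reads <- list(
  good        = c(35, 36, 34, 38),  # mean 35.75, minimum 34
  low_average = c(25, 26, 24, 27),  # mean 25.5, below average_quality
  one_bad     = c(40, 40, 12, 40)   # mean 33, but one base below min_quality
)

kept <- vapply(reads, passes_quality, logical(1))
kept
```

Only the first read survives: the second fails the average threshold, and the third fails because a single base is below min_quality even though its average is high.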

scoring_matrix

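(matrix) Substitution matrix used to score the alignments. Default is the NUC44-style matrix built in Usage with Biostrings::nucleotideSubstitutionMatrix(match = 5, mismatch = -4, baseOnly = TRUE, type = "DNA"). Pass a different matrix using nucleotideSubstitutionMatrix.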

gap_opening

(numeric) The opening gap score.

gap_extension

(numeric) The gap extension score.
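
As a rough illustration of how the scoring parameters interact, the base-R sketch below rebuilds a 4 x 4 match/mismatch matrix with the default scores and computes the penalty for a single gap. It assumes the common affine convention in which a gap of length k costs gap_opening + k * gap_extension. The helpers ungapped_score and gap_penalty are hypothetical, and this is not the aligner itself.

```r
bases <- c("A", "C", "G", "T")
# 5 on the diagonal (match), -4 elsewhere (mismatch), as in the default call.
scoring <- matrix(-4, nrow = 4, ncol = 4, dimnames = list(bases, bases))
diag(scoring) <- 5

# Hypothetical helper: score of two equal-length sequences without gaps.
ungapped_score <- function(a, b, m = scoring) {
  a <- strsplit(a, "")[[1]]
  b <- strsplit(b, "")[[1]]
  sum(m[cbind(a, b)])  # look up each aligned pair by row/column name
}

# Hypothetical helper: penalty for one gap of length k (assumed convention).
gap_penalty <- function(k, gap_opening = 25, gap_extension = 0) {
  gap_opening + k * gap_extension
}

ungapped_score("ACGT", "ACGA")  # three matches, one mismatch: 3 * 5 - 4 = 11
gap_penalty(1)                  # 25
gap_penalty(10)                 # 25, because gap_extension = 0
```

Under this assumption and the default values, a 10-base gap is penalized no more than a 1-base gap.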

fastqfiles

(numeric) Normally you want to use both FASTQ files. But in some special cases, you may want to use only the forward file, or only the reverse file. Possible options:

  • 0 Use both FASTQ files.

  • 0.5 Use both FASTQ files, but only one of the reads (forward or reverse) is required to have its primer perfectly matched to the sequence, e.g. use when reverse reads are trimmed of primers, but forward reads have the forward primer in the sequence.

  • 1 Use only the forward FASTQ file.

  • 2 Use only the reverse FASTQ file.
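
One way to read the four options is as a rule for which primer matches a read pair needs. The base-R sketch below encodes that reading. The helper pair_is_assigned is hypothetical, and the rule for option 0 (both primers required) is an assumption inferred from the contrast with 0.5, not taken from the package source.

```r
# Hypothetical helper (not part of amplican): decide whether a read pair
# counts as assigned, given whether each read's primer was found.
pair_is_assigned <- function(fwd_primer_ok, rev_primer_ok, fastqfiles = 0.5) {
  switch(as.character(fastqfiles),
    "0"   = fwd_primer_ok && rev_primer_ok,  # both files (assumed: both must match)
    "0.5" = fwd_primer_ok || rev_primer_ok,  # both files, one match suffices
    "1"   = fwd_primer_ok,                   # forward file only
    "2"   = rev_primer_ok,                   # reverse file only
    stop("fastqfiles must be one of 0, 0.5, 1, 2")
  )
}

# Reverse read was trimmed of its primer; forward read still carries one.
pair_is_assigned(TRUE, FALSE, fastqfiles = 0)    # FALSE
pair_is_assigned(TRUE, FALSE, fastqfiles = 0.5)  # TRUE
```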

primer_mismatch

(numeric) How many mismatches are allowed during primer matching of the reads, which groups reads by experiment. When primer_mismatch = 0, no mismatches are allowed, which can increase the number of unassigned reads.
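
The effect of this parameter can be sketched in base R. The example assumes the primer sits at the very start of the read and that mismatches are substitutions only; the package may locate primers differently. The helper primer_matches and both sequences are hypothetical.

```r
# Hypothetical helper (not part of amplican): does the start of `read`
# match `primer` with at most `primer_mismatch` substitutions?
primer_matches <- function(read, primer, primer_mismatch = 0) {
  n <- nchar(primer)
  if (nchar(read) < n) return(FALSE)
  read_start   <- strsplit(substr(read, 1, n), "")[[1]]
  primer_chars <- strsplit(primer, "")[[1]]
  sum(read_start != primer_chars) <= primer_mismatch
}

primer <- "ACGTAC"        # made-up primer
read   <- "ACGAACTTGGCC"  # made-up read with one substitution in the primer region

primer_matches(read, primer, primer_mismatch = 0)  # FALSE
primer_matches(read, primer, primer_mismatch = 1)  # TRUE
```

With primer_mismatch = 0 this read would stay unassigned; allowing one mismatch assigns it.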

donor_mismatch

(numeric) How many events of length 1 (mismatches, and deletions and insertions of length 1) are allowed when aligning to the donor template. This parameter is only used when a donor template is specified. The higher the value, the less strict the algorithm is in accepting a read as HDR. Set to 0 if only perfect alignments to the donor template should be marked as HDR; this is not advised because of sequencer error rates.

Value

(AlignmentsExperimentSet) Check AlignmentsExperimentSet class for details. You can use lookupAlignment to examine alignments visually.

See Also

Other analysis steps: amplicanConsensus, amplicanFilter, amplicanMap, amplicanNormalize, amplicanOverlap, amplicanPipelineConservative, amplicanPipeline, amplicanReport, amplicanSummarize

Examples

# path to example config file
config <- system.file("extdata", "config.csv", package = "amplican")
# path to example fastq files
fastq_folder <- system.file("extdata", package = "amplican")
aln <- amplicanAlign(config, fastq_folder)
aln

amplican documentation built on Nov. 8, 2020, 11:10 p.m.