rmEndAdapter | R Documentation |
Removes end adapters from the reads in a fastq file.
rmEndAdapter(
fn,
nRead = 1e+08,
EndAdapter = "P7_last10",
adapter.mismatch = 0,
verbose = FALSE
)
fn |
Fully qualified name (i.e. the complete path) of the fastq file |
nRead |
The number of bytes or characters to be read at one time. See
|
EndAdapter |
A character vector with the sequence of the end adapter, or one of the keywords "P7" or "P7_last10" (see Details) |
adapter.mismatch |
The maximum number of allowed mismatches (see Details) |
verbose |
Whether to print information on hits (default: FALSE) |
As mentioned in the general description of this package, most functions are tailored to Illumina architecture. rmEndAdapter was developed to remove the P7 adapter at the end of single reads, but it can be used to remove any 'tail'. Reads are trimmed at the first position of the match with the passed EndAdapter pattern. The sequence of the EndAdapter is passed (as a character vector) in the 5' to 3' direction and is internally reversed and complemented. Instead of a sequence, it is possible to pass the character vector "P7" or "P7_last10". The former selects the full sequence of the P7 adapter (CAAGCAGAAGACGGCATACGAGAT); the latter searches for a partial match (its last 10 bp: CATACGAGAT).
Matches are searched using vmatchPattern, with adapter.mismatch used for max.mismatch (min.mismatch is fixed to zero). The search is conducted with fixed=TRUE, which means (from Biostrings): "an IUPAC ambiguity code in the pattern can only match the same code in the subject, and vice versa".
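The trimming logic described above can be illustrated with a minimal base R sketch. It is not the package's implementation: it uses a simple sliding-window comparison as a stand-in for vmatchPattern so that it runs without Biostrings, it handles plain A/C/G/T only, and the helper names (rev_comp, first_hit, trim_tail) and the two example reads are made up for illustration.

```r
# Reverse-complement a DNA string (plain A/C/G/T only; hypothetical helper).
rev_comp <- function(x) {
  bases <- rev(strsplit(x, "", fixed = TRUE)[[1]])
  chartr("ACGT", "TGCA", paste(bases, collapse = ""))
}

# First 1-based position where `pattern` occurs in `read` with at most
# `max_mismatch` substitutions; NA if there is no such position.
# Letters are compared literally, in the spirit of fixed=TRUE.
first_hit <- function(read, pattern, max_mismatch = 0) {
  r <- strsplit(read, "", fixed = TRUE)[[1]]
  p <- strsplit(pattern, "", fixed = TRUE)[[1]]
  n_start <- length(r) - length(p) + 1
  if (n_start < 1) return(NA_integer_)
  for (i in seq_len(n_start)) {
    window <- r[i:(i + length(p) - 1)]
    if (sum(window != p) <= max_mismatch) return(i)
  }
  NA_integer_
}

# Keep only the part of the read before the first hit; NA if no hit.
# The adapter is given 5' to 3' and reverse-complemented before searching.
trim_tail <- function(read, adapter_5to3, max_mismatch = 0) {
  pos <- first_hit(read, rev_comp(adapter_5to3), max_mismatch)
  if (is.na(pos)) NA_character_ else substr(read, 1, pos - 1)
}

adapter <- "CATACGAGAT"                  # the "P7_last10" sequence from the text
reads <- c("ACGTACGTTTATCTCGTATGCCGT",   # made-up insert followed by the adapter tail
           "ACGTACGTTTGGCCAATTGGCCAA")   # made-up read without the adapter

trimmed <- vapply(reads, trim_tail, character(1),
                  adapter_5to3 = adapter, USE.NAMES = FALSE)
# trimmed[1] is "ACGTACGTTT"; trimmed[2] is NA (adapter not found)
```

Reads returning NA here correspond to reads in which the adapter was not found. Raising max_mismatch plays the role that adapter.mismatch plays for max.mismatch in the real search.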
A fastq file containing the reads in which the end adapter was found (and removed) is saved in the same location as the input file. The file is named with the suffix "_EndAdRm". A list with the total number of reads that were processed and retained is also returned.
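A call might look like the sketch below, which uses only the arguments shown in the usage above. The file path is hypothetical, and the guard simply skips the call when the function or the file is not available. Since the element names of the returned list are not documented here, the result is inspected with str() rather than accessed by name.

```r
# Hypothetical path; replace with the full path to a real fastq file.
fq <- "/path/to/reads.fastq"

# Assumes the package providing rmEndAdapter is already attached.
if (exists("rmEndAdapter") && file.exists(fq)) {
  res <- rmEndAdapter(
    fn = fq,
    EndAdapter = "P7_last10",   # search for the last 10 bp of P7
    adapter.mismatch = 1,       # allow one mismatch in the match
    verbose = TRUE
  )
  str(res)  # counts of processed and retained reads
}
```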