extractSubseq: Extract subsequence

Description

Extract an arbitrary read subsequence corresponding to positions of the aligned adaptor.

Usage

extractSubseq(aligned, subseq1, subseq2, number=1e5, BPPARAM=SerialParam()) 

Arguments

aligned

A DataFrame containing the output of adaptorAlign.

subseq1

A list of two integer vectors of equal length giving start and end positions (named starts and ends in the Examples). Parallel entries specify the start and end positions on adaptor 1 for which the aligned read subsequence should be extracted.

subseq2

Same as subseq1 but for adaptor 2.

number

Integer scalar specifying the number of records to read at once from the FASTQ file (see ?FastqStreamer).

BPPARAM

A BiocParallelParam object specifying how the parallelization is to be performed.

Details

This function aligns the adaptors in aligned to the start and end of each read (see ?adaptorAlign). From each alignment, it extracts the subsequence of the read corresponding to the positions on the adaptor sequence specified in subseq1 or subseq2. This is useful for other functions such as expectedDist, which rely on read sequences corresponding to constant regions of the adaptor.

At least one of subseq1 or subseq2 must be specified.
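
To illustrate how parallel entries in subseq1 or subseq2 define regions on the adaptor, the following is a minimal base R sketch. The adaptor string is hypothetical and chosen only for illustration, and the list names starts and ends follow the Examples. It does not call extractSubseq; it only shows which adaptor regions the specification refers to.

```r
# Hypothetical adaptor sequence (an assumption, not taken from the package).
adaptor <- "ACGTACGTNNNNNNNNGGCCTTAA"

# Two regions on the adaptor: positions 1-8 and 17-24.
# Entry i of 'starts' is paired with entry i of 'ends'.
subseq1 <- list(starts=c(1L, 17L), ends=c(8L, 24L))

# Basic sanity checks on the specification.
stopifnot(length(subseq1$starts) == length(subseq1$ends))
stopifnot(all(subseq1$starts <= subseq1$ends))

# Adaptor regions to which the extracted read subsequences would correspond.
regions <- substring(adaptor, subseq1$starts, subseq1$ends)
regions
```

Here regions holds one string per pair of start and end positions, in this case the two constant flanks on either side of the run of N's.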

Value

A list containing up to two DataFrames, one per adaptor. Each DataFrame contains the extracted read subsequences, with each row corresponding to a row of aligned. A DataFrame is only returned for an adaptor if the corresponding subseq1 or subseq2 was specified.

Author(s)

Aaron Lun

See Also

adaptorAlign to generate aligned.

Examples

example(adaptorAlign)

# Let's say we want to take the first part of 'a1'.
substr(a1, 1, 9)
extractSubseq(out, subseq1=list(starts=1, ends=9))

# Let's say we also want to take some part of 'a2'.
substr(a2, 5, 11)
extractSubseq(out, subseq1=list(starts=1, ends=9),
    subseq2=list(starts=5, ends=11))

MarioniLab/sarlacc documentation built on May 13, 2019, 12:51 p.m.