extractSeqs: Extract sequences for a feature in the sampleInfo object.


View source: R/hiReadsProcessor.R

Description

Given a sampleInfo object, the function extracts sequences for a defined feature.

Usage

extractSeqs(
  sampleInfo,
  sector = NULL,
  samplename = NULL,
  feature = "genomic",
  trim = TRUE,
  minReadLength = 1,
  sideReturn = NULL,
  pairReturn = "both",
  strict = FALSE
)

Arguments

sampleInfo

sample information SimpleList object, which holds per-sector/quadrant sample information along with other metadata.

sector

specific sector to extract sequences from. Default is NULL, which extracts all sectors.

samplename

specific sample to extract sequences from. Default is NULL, which extracts all samples.

feature

which part of the sequence to extract (case sensitive). Options include: primed, !primed, LTRed, !LTRed, linkered, !linkered, primerIDs, genomic, genomicLinkered, decoded, and unDecoded. A '!' in front of a feature extracts the inverse. If a sample was primerIDed and processed by primerIDAlignSeqs, then any of the rejected and unmatched attributes can be prepended to the feature. Examples: vectored, Rejectedlinkered, RejectedprimerIDslinkered, Absentlinkered, or unAnchoredprimerIDslinkered. When feature is genomic, the output includes sequences which are primed, LTRed, linkered, and !linkered. The genomicLinkered feature is the same as genomic minus the !linkered sequences. When feature is decoded, the output includes everything that was demultiplexed. unDecoded sequences are only available if returnUnmatched was TRUE in findBarcodes. If findVector was run and a "vectored" feature is present in the sampleInfo object, then the genomic and genomicLinkered output will have vectored reads removed.
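The relationship between genomic, genomicLinkered, and vectored reads described above can be pictured as set operations on read names. The following is only a base R sketch of that idea, not the package's internal code; every read name and vector in it is hypothetical.

```r
# Hypothetical read names that would qualify for the "genomic" feature.
genomicReads <- c("r1", "r2", "r3", "r4")

# Hypothetical subset of those reads in which no linker was found ("!linkered").
notLinkered <- c("r2", "r4")

# genomicLinkered is described as genomic minus the !linkered reads.
genomicLinkered <- setdiff(genomicReads, notLinkered)

# If a "vectored" feature is present, those reads are dropped as well.
vectored <- c("r3")
genomicLinkeredNoVector <- setdiff(genomicLinkered, vectored)
```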

trim

whether to trim the given feature from sequences or keep it. Default is TRUE. This option is ignored for features prefixed with '!'.

minReadLength

threshold for minimum length of trimmed sequences to return.
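As a rough illustration of what a minimum-length threshold does, the filter can be sketched in base R on plain character vectors. This is not the package's implementation: the trimmed vector and the keepMinLength helper are hypothetical, and the inclusive comparison (length >= threshold) is an assumption.

```r
# Hypothetical trimmed reads; the real function works on DNAStringSet objects.
trimmed <- c(read1 = "ACGTACGT", read2 = "", read3 = "ACG")

# Hypothetical helper: keep reads that are at least `minReadLength` bases long.
# Whether the package compares inclusively is an assumption made here.
keepMinLength <- function(seqs, minReadLength = 1) {
  seqs[nchar(seqs) >= minReadLength]
}

keptDefault <- keepMinLength(trimmed)     # drops only the empty read
keptFive <- keepMinLength(trimmed, 5)     # keeps only read1
```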

sideReturn

if trim=TRUE, which side of the sequence to return: left, middle, or right. Defaults to NULL, in which case the side is determined automatically. Does not apply to the features decoded, genomic, or genomicLinkered.

pairReturn

if the data is paired end, which pair to return the feature from. Options are "pair1", "pair2", or "both" (the default). Ignored if the data is single end.

strict

this option is used when feature is either 'genomic' or 'genomicLinkered'. When a sample has no LTRed reads, primer ends are used as starting points by default to extract the genomic part. Enabling this option will strictly ensure that only reads with primer and LTR are trimmed for the 'genomic' or 'genomicLinkered' feature. Default is FALSE.

Value

a list of DNAStringSet objects structured by sector, then by sample. Note: when feature is 'genomic' or 'genomicLinkered' and the data is paired end, "pair2" includes the union of reads from both pairs in which an LTR was found.
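A nested list organized by sector and then by sample can be navigated with ordinary list indexing. The sketch below uses a hypothetical stand-in for the return value, with plain character vectors in place of DNAStringSet objects; the sector and sample names are made up for illustration.

```r
# Hypothetical stand-in for the return value: a list keyed by sector,
# each element holding a list keyed by sample name.
seqs <- list(
  "1" = list(sampleA = c(r1 = "ACGT", r2 = "GGCATT")),
  "2" = list(sampleB = c(r3 = "TTGACA"), sampleC = character(0))
)

# Reads for one sample within one sector.
sampleBReads <- seqs[["2"]][["sampleB"]]

# Number of reads per sample within each sector.
readCounts <- lapply(seqs, function(sector) vapply(sector, length, integer(1)))
```

With real output, the same indexing pattern would apply, and sequence-level operations would come from the Biostrings package rather than base R.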

See Also

findPrimers, findLTRs, findLinkers, trimSeqs, extractFeature, getSectorsForSamples

Examples

# Load the example sampleInfo object (seqProps) shipped with the package.
load(file.path(system.file("data", package = "hiReadsProcessor"),
               "FLX_seqProps.RData"))
# Samples of interest, all taken from sector 2 below.
samples <- c('Roth-MLV3p-CD4TMLVWell6-Tsp509I',
             'Roth-MLV3p-CD4TMLVWell6-MseI', 'Roth-MLV3p-CD4TMLVwell5-MuA')
# Extract each feature in turn; '!' requests the inverse of a feature.
extractSeqs(seqProps, sector='2', samplename=samples, feature="primed")
extractSeqs(seqProps, sector='2', samplename=samples, feature="!primed")
extractSeqs(seqProps, sector='2', samplename=samples, feature="linkered")
extractSeqs(seqProps, sector='2', samplename=samples, feature="genomic")

malnirav/hiReadsProcessor documentation built on July 29, 2021, 6:33 a.m.