identifySTRRegions: Identify the STR regions of a fastq-file or...

identifySTRRegions    R Documentation

Identify the STR regions of a fastq-file or ShortReadQ-object.

Description

identifySTRRegions takes a fastq-file location or a ShortReadQ-object and identifies the STR regions based on the directly adjacent flanking regions. The function allows for mutations in the flanking regions through the numberOfMutation argument.
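
The idea of locating a region between two flanks while tolerating a few mismatches can be sketched in base R. This is only an illustration under stated assumptions, not the STRMPS implementation: the helper names (hammingDistance, findWithMismatches, extractBetweenFlanks), the toy read, and the flank sequences are all hypothetical, and only substitutions are counted as mutations.

```r
# Illustrative sketch only: NOT the STRMPS implementation.
# Count mismatching positions between two equal-length strings.
hammingDistance <- function(a, b) {
    sum(strsplit(a, "")[[1]] != strsplit(b, "")[[1]])
}

# Return the first start position (>= from) where `pattern` occurs in `read`
# with at most `maxMismatch` substitutions, or NA if there is none.
findWithMismatches <- function(read, pattern, maxMismatch = 1L, from = 1L) {
    width <- nchar(pattern)
    lastStart <- nchar(read) - width + 1L
    if (lastStart < from) return(NA_integer_)
    for (start in from:lastStart) {
        window <- substr(read, start, start + width - 1L)
        if (hammingDistance(window, pattern) <= maxMismatch) return(start)
    }
    NA_integer_
}

# Extract the region lying between a forward and a reverse flank.
extractBetweenFlanks <- function(read, forwardFlank, reverseFlank,
                                 maxMismatch = 1L) {
    forwardStart <- findWithMismatches(read, forwardFlank, maxMismatch)
    if (is.na(forwardStart)) return(NA_character_)
    regionStart <- forwardStart + nchar(forwardFlank)
    reverseStart <- findWithMismatches(read, reverseFlank, maxMismatch,
                                       from = regionStart)
    if (is.na(reverseStart)) return(NA_character_)
    substr(read, regionStart, reverseStart - 1L)
}

# Hypothetical toy read: the forward flank "ACGTAC" appears with one
# substitution (ACGTTC), followed by four TCTA repeats and the reverse
# flank "GGATCC".
toyRead <- "TTACGTTCTCTATCTATCTATCTAGGATCCAA"
extractBetweenFlanks(toyRead, "ACGTAC", "GGATCC", maxMismatch = 1L)
```

With one mismatch allowed, the toy call recovers the repeat stretch between the flanks; with maxMismatch = 0L the forward flank is not found and the helper returns NA, which mirrors the role numberOfMutation plays in tolerating base-calling errors.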

Usage

identifySTRRegions(reads, flankingRegions, numberOfMutation, control)

Arguments

reads

either a fastq-file location or a ShortReadQ-object

flankingRegions

a tabular object with one row per marker, containing the marker ID/name and the directly adjacent forward and reverse flanking regions used for identification.

numberOfMutation

the maximum number of mutations (base-calling errors) allowed during flanking region identification.

control

an identifySTRRegions.control-object.

Value

The returned object is a list of lists. If the reverse complement strings are not included, or if control$combineLists == TRUE, the function returns a list containing lists of untrimmed and trimmed strings for each row in flankingRegions. If control$combineLists == FALSE, the function returns a list of two such lists, one for the forward strings and one for the reverse complement strings.

Examples

library("Biostrings")
library("ShortRead")

# Path to file
readPath <- system.file('extdata', "sampleSequences.fastq", package = 'STRMPS')

# Flanking regions
data("flankingRegions")

# Read the file into memory
readFile <- readFastq(readPath)
sread(readFile)
quality(readFile)

# Identify the STRs in the file; either readPath or readFile can be used.
identifySTRRegions(
    reads = readFile,
    flankingRegions = flankingRegions,
    numberOfMutation = 1,
    control = identifySTRRegions.control(
        numberOfThreads = 1,
        includeReverseComplement = FALSE)
)


svilsen/STRMPS documentation built on Feb. 22, 2025, 4:51 p.m.