motif_detection: DNA-motif detection in a given DNAStringSet, a given...


Description

This function searches for a given DNA motif in a DNA sequence. The argument seqName can be either a DNAStringSet object or the name of a fasta file. Alternatively, a region of a reference genome can be analyzed by specifying a species, a chromosome, a start position, and an end position. By default, a region of the human genome is analyzed. Optionally, one can also specify the number of mismatches allowed in the DNA motif and whether the reverse complement is searched.
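To illustrate what searching with a number of allowed mismatches means, here is a minimal base-R sketch. It is not the STRAH implementation: find_motif is a hypothetical helper, and treating "N" in the motif as a wildcard that matches any base is an assumption made for this illustration.

```r
# Hypothetical helper, not part of STRAH: return the start positions at which
# `motif` matches `sequence` with at most `max_mismatch` mismatching bases.
# Assumption: "N" in the motif matches any base.
find_motif <- function(sequence, motif, max_mismatch = 0) {
  s <- strsplit(sequence, "")[[1]]
  m <- strsplit(motif, "")[[1]]
  n_windows <- length(s) - length(m) + 1
  if (n_windows < 1) return(integer(0))
  starts <- seq_len(n_windows)
  # Count mismatches in every window, ignoring wildcard positions
  mismatches <- vapply(starts, function(i) {
    window <- s[i:(i + length(m) - 1)]
    sum(m != "N" & m != window)
  }, integer(1))
  starts[mismatches <= max_mismatch]
}

hits_exact <- find_motif("ACGTAGCTAC", "ACNT", max_mismatch = 0)
hits_fuzzy <- find_motif("ACGTAGCTAC", "ACNT", max_mismatch = 1)
```

With no mismatches allowed, only the window starting at position 1 matches; allowing one mismatch also admits the window starting at position 5.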

Usage

motif_detection(seqName, chrs, start.position, end.position, motif,
  nr.mismatch = 0, reverse.comp = F, print.status = T,
  species = BSgenome.Hsapiens.UCSC.hg19::Hsapiens)

Arguments

seqName

A character string which can either be the name of a DNAStringSet object or a sequence name referring to a fasta file to be analyzed. This argument can only be omitted if chrs, start.position, and end.position are specified.

chrs

A character string giving the chromosome under study (for the human genome, "chr" followed by an integer from 1 to 22, "X", or "Y"). This argument can also be a vector of strings to study several chromosomes.

start.position

An integer value reflecting the start position of the region to be analyzed. If set to NA the analysis starts from the beginning of the chromosome.

end.position

An integer value reflecting the end position of the region to be analyzed. If set to NA the analysis is performed until the end of the chromosome.

motif

A character string reflecting the specified DNA-motif to be searched for in the DNA-sequence.

nr.mismatch

This integer specifies the number of allowed mismatches when searching for the specified DNA-motif.

reverse.comp

A logical value, FALSE by default. If set to TRUE, the reverse complement of the sequence is searched.

print.status

A logical value indicating whether the current progress through the sequence (relative to the sequence length) is printed (TRUE) or not (FALSE).

species

The reference genome to be used. The human genome (hg19) is the default, but an alternative genome can be provided. For chimpanzees the parameter has to be BSgenome.Ptroglodytes.UCSC.panTro5 (provided that the data package is installed).
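For readers unfamiliar with the term used for reverse.comp, the following base-R sketch shows what a reverse complement is. The helper reverse_complement is hypothetical and not part of STRAH; how the package computes the reverse complement internally is not described here.

```r
# Hypothetical helper, not part of STRAH: reverse a DNA string and swap each
# base for its complement (A <-> T, C <-> G). Other characters, such as "N",
# are left unchanged by chartr().
reverse_complement <- function(sequence) {
  bases <- strsplit(toupper(sequence), "")[[1]]
  chartr("ACGT", "TGCA", paste(rev(bases), collapse = ""))
}

rc <- reverse_complement("CCACCT")
```

Here "CCACCT" is first reversed to "TCCACC" and then complemented base by base, giving "AGGTGG".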

Value

The output of the function is a list with the following content:

Species

The name of the species under study

Sequence Name

The name of the region under study

Reverse Complement

Indicator whether the reverse complement was searched

Number of Matches

The number of occurrences of the DNA-motif found in the region under study

Start Positions of Matches

The start positions of the found DNA-motifs

Number of allowed Mismatches

The number of allowed mismatches when searching for the DNA-motif

Matched Segments

The list of the segments containing the DNA-motif

Author(s)

Philipp Hermann, philipp.hermann@jku.at, Monika Heinzl, monika.heinzl@edumail.at, Angelika Heissl, Irene Tiemann-Boege, Andreas Futschik

References

Heissl, A., et al. (2018) Length asymmetry and heterozygosity strongly influences the evolution of poly-A microsatellites at meiotic recombination hotspots. doi: https://doi.org/10.1101/431841

See Also

getflank2

Examples

data(chr6_1580213_1582559)
motif_detection(seqName = chr6_1580213_1582559, start.position = NA, end.position = NA,
                motif = "CCNCCNTNNCCNC", nr.mismatch = 1, reverse.comp = FALSE,
                print.status = FALSE)

motif_detection(chrs = "chr6", start.position = 1580213, end.position = 1582559,
                motif = "CCNCCNTNNCCNC", nr.mismatch = 1, reverse.comp = FALSE,
                print.status = FALSE)

# If you want to use the function with a different reference genome,
# choose one and install it beforehand:
if (requireNamespace("BSgenome.Ptroglodytes.UCSC.panTro5")) {
  motif_detection(chrs = "chr1", start.position = 222339618, end.position = 222339660,
                  motif = "A", nr.mismatch = 0, reverse.comp = FALSE, print.status = FALSE,
                  species = BSgenome.Ptroglodytes.UCSC.panTro5::BSgenome.Ptroglodytes.UCSC.panTro5)
}

STRAH documentation built on May 2, 2019, 11:03 a.m.
