Given a vector of sequences representing DNA fragments digested by a restriction enzyme, the function returns the DNA fragments that do not contain a specified restriction site. This is typically used to reduce the number of loci in the RESTseq method. The function can be used repeatedly to exclude fragments containing several restriction sites.
exclude.seqsite(sequences, site, verbose=TRUE)
sequences
a vector of DNA sequences representing DNA fragments after digestion by restriction enzyme(s), typically the output of another function such as
site
restriction site used to identify the DNA fragments to exclude. This typically corresponds to the recognition site of a frequent-cutter restriction enzyme.
verbose
If TRUE (the default), reports the number of fragments excluded and kept. FALSE makes the function silent, which is convenient when it is called in a loop.
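To make the roles of the three arguments concrete, here is a minimal base R sketch of the filtering they describe. It is an illustration only, not SimRAD's actual implementation: the helper name exclude_site_sketch, the use of plain substring matching, and the wording of the messages are all assumptions.

```r
# Hypothetical stand-in for exclude.seqsite(), written in base R only.
# Assumption: a fragment "contains" the site if the site occurs as a
# literal substring of the fragment.
exclude_site_sketch <- function(sequences, site, verbose = TRUE) {
  has_site <- grepl(site, sequences, fixed = TRUE)  # TRUE where the site occurs
  kept <- sequences[!has_site]                      # keep fragments without it
  if (verbose) {
    cat("Fragments excluded:", sum(has_site), "\n")
    cat("Fragments kept:", length(kept), "\n")
  }
  kept
}

# Toy fragments (made up for illustration)
frags <- c("CTGCAGTTAAG", "CTGCAGGGCCG", "CTGCAACGTG")
kept <- exclude_site_sketch(frags, "TTAA")
print(kept)
```

The sketch filters with a logical mask rather than a negative index, because in R `x[-integer(0)]` returns an empty vector, which would drop every fragment whenever no fragment contains the site.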
Frequent-cutter restriction enzymes can be used to further reduce the number of fragments, as demonstrated by the RESTseq method. This approach is of interest for species with complex genomes, as it allows removing parts of the genome containing highly repetitive CG- and/or AT-rich sequences.
This function can be used directly after a single-enzyme digestion with the insilico.digest function to remove fragments containing the restriction site of a second enzyme. An equivalent alternative would be to simulate a double digestion using insilico.digest followed by adapt.select with type = "AA", which would remove fragments containing the restriction site of enzyme 2 (see example below).
Any number of exclusion steps using different restriction enzymes can be simulated by running the function on the output of a previous execution of the function (see example below).
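Chained exclusion steps of this kind amount to a fold over a vector of restriction sites. The sketch below shows one way to write that with Reduce. The helper names exclude_many and drop_site are hypothetical, and drop_site is a base R stand-in (assuming plain substring matching) so that the code runs without SimRAD installed.

```r
# Fold a vector of restriction sites over a set of fragments.
# `exclude_fun` is any function(fragments, site) returning the kept fragments.
# With SimRAD loaded it could be, for example:
#   function(f, s) exclude.seqsite(f, s, verbose = FALSE)
exclude_many <- function(fragments, sites, exclude_fun) {
  Reduce(exclude_fun, sites, fragments)
}

# Stand-in filter (assumption: literal substring matching), not SimRAD code.
drop_site <- function(fragments, site) {
  fragments[!grepl(site, fragments, fixed = TRUE)]
}

# Toy fragments (made up for illustration)
frags <- c("ACGTTAACG", "GGGAATTCC", "ACACACAC", "TTGGCCAA")
kept <- exclude_many(frags, c("TTAA", "AATT", "GGCC"), drop_site)
print(kept)
```

Passing verbose = FALSE in the SimRAD variant keeps the fold from printing a report at every step.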
A vector of DNA fragment sequences.
Olivier Lepais
Lepais O & Weir JT. 2014. SimRAD: an R package for simulation-based prediction of the number of loci expected in RADseq and similar genotyping by sequencing approaches. Molecular Ecology Resources, 14, 1314-1321. DOI: 10.1111/1755-0998.12273.
Stolle & Moritz 2013. RESTseq - Efficient benchtop population genomics with RESTriction fragment SEQuencing. PLoS ONE 8: e63960. doi:10.1371/journal.pone.0063960
### Example 1:
# simulating some sequence:
simseq <- sim.DNAseq(size=1000000, GCfreq=0.433)
#Restriction Enzyme 1
#PstI
cs_5p1 <- "CTGCA"
cs_3p1 <- "G"
#Restriction Enzyme 2
#MseI
cs_5p2 <- "T"
cs_3p2 <- "TAA"
# hence, recognition site: "TTAA"
# single digestion:
simseq.dig <- insilico.digest(simseq, cs_5p1, cs_3p1, cs_5p1, cs_3p1, verbose=TRUE)
# excluding fragments containing the restriction site of enzyme 2
simseq.exc <- exclude.seqsite(simseq.dig, "TTAA")
length(simseq.exc)
## which is equivalent to:
simseq.dig2 <- insilico.digest(simseq, cs_5p1, cs_3p1, cs_5p2, cs_3p2, verbose=TRUE)
simseq.selectAA <- adapt.select(simseq.dig2, type="AA", cs_5p1, cs_3p1, cs_5p2, cs_3p2)
length(simseq.selectAA)
### Example 2:
simseq <- sim.DNAseq(size=1000000, GCfreq=0.51)
#Restriction Enzyme 1
#TaqI
cs_5p1 <- "T"
cs_3p1 <- "CGA"
simseq.dig <- insilico.digest(simseq, cs_5p1, cs_3p1, cs_5p1, cs_3p1, verbose=TRUE)
# removing fragments containing restriction sites of MseI ("TTAA"), MluCI ("AATT"),
# HaeIII ("GGCC"), MspI ("CCGG") and HinP1I ("GCGC"):
excl1 <- exclude.seqsite(simseq.dig, "TTAA")
excl2 <- exclude.seqsite(excl1, "AATT")
excl3 <- exclude.seqsite(excl2, "GGCC")
excl4 <- exclude.seqsite(excl3, "CCGG")
excl5 <- exclude.seqsite(excl4, "GCGC")
# which can be followed by a size selection step.