View source: R/shuffle_sequences.R
shuffle_sequences  R Documentation 
Given a set of input sequences, shuffle the letters within those sequences with any k-let size.
shuffle_sequences(sequences, k = 1, method = "euler", nthreads = 1, rng.seed = sample.int(10000, 1), window = FALSE, window.size = 0.1, window.overlap = 0.01)
sequences 

k 

method 

nthreads 

rng.seed 

window 

window.size 

window.overlap 

If method = 'markov', then a Markov model is used to
generate sequences which will maintain (on average) the k-let
frequencies. Please note that this method is not a 'true' shuffling, and
for short sequences (e.g. <100bp) this can result in slightly more
dissimilar sequences versus true shuffling. See
Fitch (1983) for a discussion on the topic.
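To illustrate the idea behind the 'markov' method, here is a minimal base-R sketch for the dinucleotide (k = 2) case. It is not the package's implementation: the helper name markov_shuffle_sketch and the fallback used for letters with no observed successor are assumptions made for this example only.

```r
# Hypothetical helper (not part of the package): generate a new sequence from
# a first-order Markov model fitted to the input, so that dinucleotide
# frequencies are maintained on average rather than exactly.
markov_shuffle_sketch <- function(x) {
  chars <- strsplit(x, "")[[1]]
  n <- length(chars)
  alphabet <- sort(unique(chars))
  # Transition counts: rows are the current letter, columns the next letter
  trans <- table(factor(chars[-n], levels = alphabet),
                 factor(chars[-1], levels = alphabet))
  # Overall letter counts, used for the first letter and as a fallback
  background <- as.numeric(table(factor(chars, levels = alphabet)))
  out <- character(n)
  out[1] <- alphabet[sample.int(length(alphabet), 1, prob = background)]
  for (i in seq_len(n - 1) + 1) {
    w <- as.numeric(trans[out[i - 1], ])
    # Assumption: fall back to background counts if no successor was observed
    if (sum(w) == 0) w <- background
    out[i] <- alphabet[sample.int(length(alphabet), 1, prob = w)]
  }
  paste(out, collapse = "")
}

set.seed(1)
markov_shuffle_sketch("ACAGATAGACCC")
```

The output has the same length and alphabet as the input, but its letter and dinucleotide counts will generally differ from the original, which is why this is not a 'true' shuffling.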
If method = 'euler', then the sequence shuffling method proposed by
Altschul and Erickson (1985) is used. As opposed
to the 'markov' method, this one preserves exact k-let frequencies. This
is done by creating a k-let edge graph, then following a
random Eulerian walk through the graph. Not all walks will use up all
available letters, however, so the cycle-popping algorithm proposed by
Propp and Wilson (1998) is used to find a
random Eulerian path. A side effect of using this method is that the
starting and ending sequence letters will remain unshuffled.
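To make "preserves exact k-let frequencies" concrete, the following base-R sketch counts overlapping k-lets in two hand-picked strings. The helper count_klets is hypothetical and not part of the package, and the second string is a manually chosen reordering, not output from the 'euler' method.

```r
# Hypothetical helper (not part of the package): count overlapping k-lets.
count_klets <- function(x, k) {
  starts <- seq_len(nchar(x) - k + 1)
  table(substring(x, starts, starts + k - 1))
}

original  <- "ATAGA"  # dinucleotides: AT, TA, AG, GA
reordered <- "AGATA"  # dinucleotides: AG, GA, AT, TA

# Same dinucleotide counts, and the first and last letters are unchanged
identical(count_klets(original, 2), count_klets(reordered, 2))  # TRUE
```

Both strings correspond to different Eulerian paths through the same dinucleotide edge graph, which is why the counts match and why the first and last letters stay fixed.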
If method = 'linear', then the input sequences are split linearly
every k letters. For example, for k = 3
'ACAGATAGACCC' becomes
'ACA GAT AGA CCC'; after which these 3-lets
are shuffled randomly.
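The splitting step described above can be sketched in base R as follows. The helper linear_shuffle_sketch is hypothetical; in particular, treating a shorter trailing chunk like any other chunk is an assumption of this sketch, not documented package behaviour.

```r
# Hypothetical helper (not part of the package): split a string into
# consecutive non-overlapping chunks of k letters, then permute the chunks.
linear_shuffle_sketch <- function(x, k) {
  n <- nchar(x)
  starts <- seq(1, n, by = k)
  # Assumption: a shorter final chunk is kept and shuffled like the others
  chunks <- substring(x, starts, pmin(starts + k - 1, n))
  paste(chunks[sample.int(length(chunks))], collapse = "")
}

set.seed(42)
linear_shuffle_sketch("ACAGATAGACCC", k = 3)
```

The result is some ordering of the chunks ACA, GAT, AGA and CCC, so the letters inside each 3-let keep their original order.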
Do note, however, that the method parameter is only relevant for k > 1.
For k = 1, a simple shuffling is performed using the shuffle function
from the C++ standard library.
XStringSet
The input sequences will be returned with
identical names and lengths.
Benjamin Jean-Marie Tremblay, benjamin.tremblay@uwaterloo.ca
Altschul SF, Erickson BW (1985). “Significance of Nucleotide Sequence Alignments: A Method for Random Sequence Permutation That Preserves Dinucleotide and Codon Usage.” Molecular Biology and Evolution, 2, 526-538.
Fitch WM (1983). “Random sequences.” Journal of Molecular Biology, 163, 171-176.
Propp JG, Wilson DW (1998). “How to get a perfectly random sample from a generic Markov chain and generate a random spanning tree of a directed graph.” Journal of Algorithms, 27, 170-217.
create_sequences(), scan_sequences(), enrich_motifs(), shuffle_motifs()
if (R.Version()$arch != "i386") {
  # Generate random sequences, then shuffle them preserving dinucleotide counts
  sequences <- create_sequences()
  sequences.shuffled <- shuffle_sequences(sequences, k = 2)
}