utils-sequence: Sequence-related utility functions.

Description

Sequence-related utility functions.

Usage

count_klets(string, k = 1, alph)

get_klets(lets, k = 1)

mask_seqs(seqs, pattern, RC = FALSE, letter = "-")

shuffle_string(string, k = 1, method = c("euler", "linear", "markov"),
  rng.seed = sample.int(10000, 1))

Arguments

string

character(1) A length one character vector.

k

integer(1) K-let size.

alph

character(1) A single character string containing the desired sequence alphabet. If missing, the unique letters found in the string are used.

lets

character A character vector where each element will be considered a single unit.

seqs

XStringSet Sequences to mask. Cannot be BStringSet.

pattern

character(1) Pattern to mask.

RC

logical(1) Whether to mask the reverse complement of the pattern.

letter

character(1) Character to use for masking.

method

character(1) Shuffling method. One of c("euler", "linear", "markov"). See shuffle_sequences().

rng.seed

numeric(1) Seed for the random number generator. Because shuffling in shuffle_sequences() can run simultaneously in multiple threads using C++, it cannot communicate with the regular R random number generator state and therefore requires an independent seed. shuffle_string() uses the same underlying code as shuffle_sequences(), so it also requires a separate seed even when it is run serially.
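
The default shown under Usage is rng.seed = sample.int(10000, 1), which draws from R's own generator. Calling set.seed() beforehand should therefore make the default seed reproducible as well. The sketch below illustrates only that mechanism in base R; seed_used() is a hypothetical stand-in with the same default argument, not a function from universalmotif.

```r
# Hypothetical stand-in: same default argument as shuffle_string(),
# but it only returns the seed that would be passed on.
seed_used <- function(rng.seed = sample.int(10000, 1)) rng.seed

# The default is evaluated lazily at call time, so it follows R's RNG state.
set.seed(123)
first <- seed_used()
set.seed(123)
second <- seed_used()

identical(first, second)  # TRUE: same R RNG state, same default seed
```

Passing rng.seed explicitly avoids relying on the state of R's generator at all.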

Value

For count_klets(): A data.frame with columns lets and counts.

For get_klets(): A character vector of k-lets.

For mask_seqs(): The masked XStringSet object.

For shuffle_string(): A single character string.
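
To make the return shapes of get_klets() and count_klets() concrete, here is a minimal base R sketch of the same idea. The functions get_klets_sketch() and count_klets_sketch() are hypothetical illustrations, not the package's implementation. The sketch assumes that k-lets are ordered with the last position varying fastest and that k-lets with zero occurrences are kept in the output; the package functions may differ on either point.

```r
# All length-k combinations of the units in 'lets'; each element of 'lets'
# is treated as a single unit. (Hypothetical sketch, base R only.)
get_klets_sketch <- function(lets, k = 1) {
  grid <- expand.grid(rep(list(lets), k), stringsAsFactors = FALSE)
  # expand.grid varies the first column fastest; reverse the columns so
  # that the last position of the k-let varies fastest instead.
  cols <- unname(as.list(grid[, rev(seq_len(k)), drop = FALSE]))
  do.call(paste0, cols)
}

# Count overlapping k-lets in a single string with a sliding window.
count_klets_sketch <- function(string, k = 1, alph) {
  chars <- strsplit(string, "", fixed = TRUE)[[1]]
  if (missing(alph)) {
    alph <- sort(unique(chars))            # fall back to letters in the string
  } else {
    alph <- strsplit(alph, "", fixed = TRUE)[[1]]
  }
  lets <- get_klets_sketch(alph, k)
  starts <- seq_len(max(length(chars) - k + 1, 0))
  windows <- substring(string, starts, starts + k - 1)
  # tabulate() keeps zero counts and ignores windows outside the alphabet
  counts <- tabulate(match(windows, lets), nbins = length(lets))
  data.frame(lets = lets, counts = counts, stringsAsFactors = FALSE)
}

get_klets_sketch(c("AA", "B"), k = 2)
# "AAAA" "AAB"  "BAA"  "BB"

count_klets_sketch("GCAAATGTACGCAGGGCCGA", k = 2)
```

For a string of length n, the sliding window yields n - k + 1 overlapping k-lets, so the counts column of the last call sums to 19.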

Author(s)

Benjamin Jean-Marie Tremblay, b2tremblay@uwaterloo.ca

See Also

create_sequences(), shuffle_sequences()

Examples

#######################################################################
## count_klets
## Count k-lets for any string of characters
count_klets("GCAAATGTACGCAGGGCCGA", k = 2)
## The default 'k' value (1) counts individual letters
count_klets("GCAAATGTACGCAGGGCCGA")

#######################################################################
## get_klets
## Generate all possible k-lets for a set of characters
get_klets(c("A", "C", "G", "T"), 3)
## Note that each element in 'lets' is considered a single unit;
## see:
get_klets(c("AA", "B"), k = 2)

#######################################################################
## mask_seqs
## Mask repetitive sequences
data(ArabidopsisPromoters)
mask_seqs(ArabidopsisPromoters, "AAAAAA")

#######################################################################
## shuffle_string
## Shuffle any string of characters
shuffle_string("ASDADASDASDASD", k = 2)

universalmotif documentation built on April 8, 2021, 6 p.m.