generate_sequences: function generates sequences with injected motifs


View source: R/generate_sequences.R

Description

function generates n_seq sequences of length l_seq over the given alphabet and injects motifs from motifs_list into the positive sequences

Usage

generate_sequences(
  n_seq,
  l_seq,
  alphabet,
  motifs_list,
  n_motifs,
  fraction = 0.5,
  seqProbs = NULL,
  n = 4,
  d = 6
)

Arguments

n_seq

number of sequences to be generated

l_seq

sequence length

alphabet

elements used to build sequence

motifs_list

list of injected motifs

n_motifs

number of motifs injected into each positive sequence

fraction

fraction of positive sequences (TODO in source: document this argument / change approach)

seqProbs

alphabet probabilities for sequences

n

maximum number of alphabet elements in n-gram

d

maximum number of gaps in n-gram

Value

generated sequences

Examples

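# 20 sequences of length 1200 over a 4-letter alphabet
n_seq <- 20
len <- 1200
alph <- letters[1:4]
# motifs to inject, built from the same alphabet
motifs <- generate_motifs(alph, 2)
# inject 1 motif into each positive sequence (n_motifs = 1)
results <- generate_sequences(n_seq, len, alph, motifs, 1)
# same call with non-uniform alphabet probabilities
results <- generate_sequences(n_seq, len, alph, motifs, 1, seqProbs = c(0.7, 0.1, 0.1, 0.1))
# inspect the sequences and the attributes attached to the result
results
attributes(results)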
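
To make the roles of alphabet, seqProbs, fraction and the injected motifs more concrete, here is a minimal base-R sketch of the general idea. It is not the QuiPTsim implementation: the matrix layout, the separate `target` vector, the contiguous motif `c("d", "d", "c")`, and the choice to mark the first rows as positive are all assumptions made for illustration, and gapped motifs (arguments n and d) are not covered.

```r
# Hypothetical base-R sketch; NOT the QuiPTsim implementation.
set.seed(1)

alphabet  <- letters[1:4]
seq_probs <- c(0.7, 0.1, 0.1, 0.1)   # plays the role of seqProbs
n_seq     <- 20
l_seq     <- 30
fraction  <- 0.5                      # assumed: share of positive sequences
motif     <- c("d", "d", "c")         # assumed contiguous motif (no gaps)

# Draw every sequence independently from the alphabet with the given
# probabilities; rows are sequences, columns are positions.
seqs <- t(replicate(
  n_seq,
  sample(alphabet, l_seq, replace = TRUE, prob = seq_probs)
))

# Mark the first round(fraction * n_seq) rows as positive and overwrite a
# randomly placed window in each of them with the motif.
n_pos  <- round(fraction * n_seq)
target <- c(rep(TRUE, n_pos), rep(FALSE, n_seq - n_pos))
for (i in which(target)) {
  start <- sample.int(l_seq - length(motif) + 1, 1)
  seqs[i, start:(start + length(motif) - 1)] <- motif
}

dim(seqs)     # 20 rows (sequences) by 30 columns (positions)
sum(target)   # 10 positive sequences
```

The sketch keeps the positive/negative labels in a separate logical vector for simplicity; how generate_sequences itself stores this information is best checked with attributes(results) as in the example above.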

jakubkala/QuiPTsim documentation built on Jan. 17, 2022, 11:27 p.m.