seq2ngrams: Extract n-grams from sequence


View source: R/ngrams.R

Description

Extracts the n-grams present in one or more sequences.

Usage

seq2ngrams(seq, n, u, d = 0, pos = FALSE)

Arguments

seq

a vector or matrix describing sequence(s).

n

integer size of n-gram.

u

integer, numeric or character vector of all possible unigrams.

d

integer vector of distances between elements of n-gram (0 means consecutive elements). See Details.

pos

logical; if TRUE, position-specific n-grams are extracted.

Details

The format of the d vector is described in the Details section of count_ngrams.
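
To make the role of d more concrete, here is a minimal base R sketch of how distances could select the elements of an n-gram. It is not the package's implementation: sketch_ngrams is a hypothetical helper, the recycling of a length-1 d to n - 1 distances is an assumption, and the label format (elements joined by ".", then "_", then the distances) is inferred from the example output on this page.

```r
# Hypothetical helper, not part of biogram: builds n-gram labels from one
# sequence, where d[i] is the number of skipped positions between
# element i and element i + 1 of the n-gram.
sketch_ngrams <- function(seq, n, d = 0) {
  # Assumption: a single distance is recycled to n - 1 distances
  if (length(d) == 1) d <- rep(d, n - 1)
  stopifnot(length(d) == n - 1)
  # Offsets of the n elements relative to the start of the n-gram
  offsets <- c(0, cumsum(d + 1))
  n_starts <- max(0, length(seq) - max(offsets))
  vapply(seq_len(n_starts), function(i) {
    paste0(paste(seq[i + offsets], collapse = "."), "_",
           paste(d, collapse = "."))
  }, character(1))
}

x <- c(3L, 2L, 1L, 1L, 3L)
sketch_ngrams(x, 3)           # consecutive trigrams (d = 0)
sketch_ngrams(x, 3, c(1, 0))  # one position skipped after the first element
```

With d = c(1, 0), each trigram takes an element, skips one position, and then takes two consecutive elements.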

Value

A character matrix of n-grams, where every row corresponds to a different sequence.
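
The entries of the returned matrix are strings such as "3.2.1_0.0". The following base R sketch splits such a label back into its elements and distances. parse_label is a hypothetical helper, not a biogram function, and it assumes the non-position-specific format (pos = FALSE) seen in the example output on this page.

```r
# Hypothetical helper, not part of biogram: splits a label of the form
# "<elements joined by .>_<distances joined by .>" into its two parts.
parse_label <- function(label) {
  parts <- strsplit(label, "_", fixed = TRUE)[[1]]
  list(
    elements  = strsplit(parts[1], ".", fixed = TRUE)[[1]],
    distances = as.integer(strsplit(parts[2], ".", fixed = TRUE)[[1]])
  )
}

parsed <- parse_label("3.2.1_0.0")
parsed$elements   # unigrams that make up the n-gram, as character
parsed$distances  # gaps between neighbouring elements, as integer
```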

Examples

library(biogram)

# trigrams from multiple sequences:
# 12 random sequences (rows) of length 50 over the alphabet 1:4
seqs <- matrix(sample(1L:4, 600, replace = TRUE), ncol = 50)
seq2ngrams(seqs, 3, 1L:4)

Example output

Loading required package: slam
      [,1]        [,2]        [,3]        [,4]        [,5]        [,6]       
 [1,] "3.2.1_0.0" "2.1.1_0.0" "1.1.3_0.0" "1.3.3_0.0" "3.3.1_0.0" "3.1.4_0.0"
 [2,] "2.1.3_0.0" "1.3.3_0.0" "3.3.2_0.0" "3.2.2_0.0" "2.2.2_0.0" "2.2.1_0.0"
 [3,] "2.1.3_0.0" "1.3.3_0.0" "3.3.4_0.0" "3.4.4_0.0" "4.4.1_0.0" "4.1.1_0.0"
 [4,] "1.3.1_0.0" "3.1.4_0.0" "1.4.4_0.0" "4.4.1_0.0" "4.1.1_0.0" "1.1.2_0.0"
 [5,] "3.2.3_0.0" "2.3.2_0.0" "3.2.4_0.0" "2.4.2_0.0" "4.2.4_0.0" "2.4.3_0.0"
 [6,] "1.3.3_0.0" "3.3.3_0.0" "3.3.3_0.0" "3.3.4_0.0" "3.4.4_0.0" "4.4.1_0.0"
 [7,] "4.3.2_0.0" "3.2.3_0.0" "2.3.2_0.0" "3.2.4_0.0" "2.4.4_0.0" "4.4.3_0.0"
 [8,] "2.1.2_0.0" "1.2.3_0.0" "2.3.4_0.0" "3.4.1_0.0" "4.1.4_0.0" "1.4.2_0.0"
 [9,] "3.2.1_0.0" "2.1.4_0.0" "1.4.1_0.0" "4.1.4_0.0" "1.4.2_0.0" "4.2.3_0.0"
[10,] "1.4.2_0.0" "4.2.4_0.0" "2.4.2_0.0" "4.2.4_0.0" "2.4.4_0.0" "4.4.3_0.0"
[11,] "2.3.1_0.0" "3.1.4_0.0" "1.4.2_0.0" "4.2.1_0.0" "2.1.2_0.0" "1.2.2_0.0"
[12,] "1.4.4_0.0" "4.4.1_0.0" "4.1.4_0.0" "1.4.2_0.0" "4.2.2_0.0" "2.2.3_0.0"
      [,7]        [,8]        [,9]        [,10]       [,11]       [,12]      
 [1,] "1.4.4_0.0" "4.4.1_0.0" "4.1.1_0.0" "1.1.1_0.0" "1.1.3_0.0" "1.3.2_0.0"
 [2,] "2.1.3_0.0" "1.3.2_0.0" "3.2.1_0.0" "2.1.4_0.0" "1.4.3_0.0" "4.3.1_0.0"
 [3,] "1.1.1_0.0" "1.1.3_0.0" "1.3.3_0.0" "3.3.2_0.0" "3.2.4_0.0" "2.4.3_0.0"
 [4,] "1.2.2_0.0" "2.2.1_0.0" "2.1.1_0.0" "1.1.3_0.0" "1.3.3_0.0" "3.3.2_0.0"
 [5,] "4.3.1_0.0" "3.1.4_0.0" "1.4.4_0.0" "4.4.1_0.0" "4.1.3_0.0" "1.3.4_0.0"
 [6,] "4.1.1_0.0" "1.1.3_0.0" "1.3.2_0.0" "3.2.3_0.0" "2.3.4_0.0" "3.4.2_0.0"
 [7,] "4.3.4_0.0" "3.4.3_0.0" "4.3.3_0.0" "3.3.4_0.0" "3.4.3_0.0" "4.3.4_0.0"
 [8,] "4.2.4_0.0" "2.4.4_0.0" "4.4.2_0.0" "4.2.2_0.0" "2.2.3_0.0" "2.3.3_0.0"
 [9,] "2.3.3_0.0" "3.3.1_0.0" "3.1.4_0.0" "1.4.4_0.0" "4.4.4_0.0" "4.4.1_0.0"
[10,] "4.3.3_0.0" "3.3.4_0.0" "3.4.3_0.0" "4.3.4_0.0" "3.4.3_0.0" "4.3.2_0.0"
[11,] "2.2.2_0.0" "2.2.3_0.0" "2.3.1_0.0" "3.1.4_0.0" "1.4.3_0.0" "4.3.1_0.0"
[12,] "2.3.3_0.0" "3.3.4_0.0" "3.4.3_0.0" "4.3.1_0.0" "3.1.4_0.0" "1.4.4_0.0"
      [,13]       [,14]       [,15]       [,16]       [,17]       [,18]      
 [1,] "3.2.2_0.0" "2.2.4_0.0" "2.4.1_0.0" "4.1.3_0.0" "1.3.3_0.0" "3.3.2_0.0"
 [2,] "3.1.1_0.0" "1.1.4_0.0" "1.4.2_0.0" "4.2.2_0.0" "2.2.4_0.0" "2.4.4_0.0"
 [3,] "4.3.1_0.0" "3.1.2_0.0" "1.2.2_0.0" "2.2.1_0.0" "2.1.3_0.0" "1.3.3_0.0"
 [4,] "3.2.3_0.0" "2.3.2_0.0" "3.2.1_0.0" "2.1.4_0.0" "1.4.3_0.0" "4.3.2_0.0"
 [5,] "3.4.4_0.0" "4.4.3_0.0" "4.3.4_0.0" "3.4.4_0.0" "4.4.4_0.0" "4.4.2_0.0"
 [6,] "4.2.3_0.0" "2.3.2_0.0" "3.2.1_0.0" "2.1.3_0.0" "1.3.3_0.0" "3.3.1_0.0"
 [7,] "3.4.2_0.0" "4.2.1_0.0" "2.1.3_0.0" "1.3.3_0.0" "3.3.3_0.0" "3.3.2_0.0"
 [8,] "3.3.4_0.0" "3.4.4_0.0" "4.4.2_0.0" "4.2.3_0.0" "2.3.4_0.0" "3.4.2_0.0"
 [9,] "4.1.3_0.0" "1.3.3_0.0" "3.3.3_0.0" "3.3.4_0.0" "3.4.4_0.0" "4.4.1_0.0"
[10,] "3.2.2_0.0" "2.2.1_0.0" "2.1.1_0.0" "1.1.2_0.0" "1.2.1_0.0" "2.1.2_0.0"
[11,] "3.1.2_0.0" "1.2.4_0.0" "2.4.1_0.0" "4.1.3_0.0" "1.3.2_0.0" "3.2.1_0.0"
[12,] "4.4.1_0.0" "4.1.2_0.0" "1.2.2_0.0" "2.2.3_0.0" "2.3.1_0.0" "3.1.2_0.0"
      [,19]       [,20]       [,21]       [,22]       [,23]       [,24]      
 [1,] "3.2.4_0.0" "2.4.4_0.0" "4.4.1_0.0" "4.1.3_0.0" "1.3.4_0.0" "3.4.4_0.0"
 [2,] "4.4.4_0.0" "4.4.1_0.0" "4.1.2_0.0" "1.2.4_0.0" "2.4.3_0.0" "4.3.4_0.0"
 [3,] "3.3.2_0.0" "3.2.3_0.0" "2.3.1_0.0" "3.1.3_0.0" "1.3.1_0.0" "3.1.2_0.0"
 [4,] "3.2.4_0.0" "2.4.1_0.0" "4.1.4_0.0" "1.4.3_0.0" "4.3.3_0.0" "3.3.4_0.0"
 [5,] "4.2.2_0.0" "2.2.2_0.0" "2.2.2_0.0" "2.2.2_0.0" "2.2.1_0.0" "2.1.2_0.0"
 [6,] "3.1.4_0.0" "1.4.1_0.0" "4.1.4_0.0" "1.4.2_0.0" "4.2.2_0.0" "2.2.3_0.0"
 [7,] "3.2.1_0.0" "2.1.4_0.0" "1.4.1_0.0" "4.1.1_0.0" "1.1.1_0.0" "1.1.2_0.0"
 [8,] "4.2.2_0.0" "2.2.2_0.0" "2.2.4_0.0" "2.4.2_0.0" "4.2.3_0.0" "2.3.2_0.0"
 [9,] "4.1.2_0.0" "1.2.1_0.0" "2.1.1_0.0" "1.1.1_0.0" "1.1.3_0.0" "1.3.1_0.0"
[10,] "1.2.4_0.0" "2.4.1_0.0" "4.1.1_0.0" "1.1.4_0.0" "1.4.3_0.0" "4.3.3_0.0"
[11,] "2.1.3_0.0" "1.3.4_0.0" "3.4.1_0.0" "4.1.2_0.0" "1.2.2_0.0" "2.2.4_0.0"
[12,] "1.2.3_0.0" "2.3.3_0.0" "3.3.2_0.0" "3.2.3_0.0" "2.3.2_0.0" "3.2.4_0.0"
      [,25]       [,26]       [,27]       [,28]       [,29]       [,30]      
 [1,] "4.4.2_0.0" "4.2.2_0.0" "2.2.1_0.0" "2.1.4_0.0" "1.4.1_0.0" "4.1.4_0.0"
 [2,] "3.4.4_0.0" "4.4.1_0.0" "4.1.4_0.0" "1.4.1_0.0" "4.1.2_0.0" "1.2.3_0.0"
 [3,] "1.2.1_0.0" "2.1.2_0.0" "1.2.4_0.0" "2.4.4_0.0" "4.4.1_0.0" "4.1.3_0.0"
 [4,] "3.4.4_0.0" "4.4.2_0.0" "4.2.1_0.0" "2.1.2_0.0" "1.2.2_0.0" "2.2.1_0.0"
 [5,] "1.2.4_0.0" "2.4.2_0.0" "4.2.1_0.0" "2.1.4_0.0" "1.4.3_0.0" "4.3.3_0.0"
 [6,] "2.3.2_0.0" "3.2.2_0.0" "2.2.1_0.0" "2.1.3_0.0" "1.3.3_0.0" "3.3.2_0.0"
 [7,] "1.2.2_0.0" "2.2.1_0.0" "2.1.4_0.0" "1.4.4_0.0" "4.4.3_0.0" "4.3.2_0.0"
 [8,] "3.2.3_0.0" "2.3.1_0.0" "3.1.1_0.0" "1.1.1_0.0" "1.1.2_0.0" "1.2.2_0.0"
 [9,] "3.1.2_0.0" "1.2.1_0.0" "2.1.2_0.0" "1.2.4_0.0" "2.4.2_0.0" "4.2.4_0.0"
[10,] "3.3.4_0.0" "3.4.1_0.0" "4.1.1_0.0" "1.1.2_0.0" "1.2.2_0.0" "2.2.4_0.0"
[11,] "2.4.1_0.0" "4.1.1_0.0" "1.1.4_0.0" "1.4.4_0.0" "4.4.4_0.0" "4.4.1_0.0"
[12,] "2.4.3_0.0" "4.3.3_0.0" "3.3.1_0.0" "3.1.4_0.0" "1.4.3_0.0" "4.3.2_0.0"
      [,31]       [,32]       [,33]       [,34]       [,35]       [,36]      
 [1,] "1.4.2_0.0" "4.2.2_0.0" "2.2.2_0.0" "2.2.1_0.0" "2.1.3_0.0" "1.3.1_0.0"
 [2,] "2.3.3_0.0" "3.3.4_0.0" "3.4.2_0.0" "4.2.4_0.0" "2.4.4_0.0" "4.4.1_0.0"
 [3,] "1.3.4_0.0" "3.4.2_0.0" "4.2.4_0.0" "2.4.1_0.0" "4.1.1_0.0" "1.1.1_0.0"
 [4,] "2.1.2_0.0" "1.2.1_0.0" "2.1.1_0.0" "1.1.1_0.0" "1.1.3_0.0" "1.3.1_0.0"
 [5,] "3.3.1_0.0" "3.1.4_0.0" "1.4.3_0.0" "4.3.4_0.0" "3.4.2_0.0" "4.2.2_0.0"
 [6,] "3.2.2_0.0" "2.2.2_0.0" "2.2.3_0.0" "2.3.2_0.0" "3.2.1_0.0" "2.1.4_0.0"
 [7,] "3.2.3_0.0" "2.3.4_0.0" "3.4.1_0.0" "4.1.2_0.0" "1.2.1_0.0" "2.1.2_0.0"
 [8,] "2.2.4_0.0" "2.4.3_0.0" "4.3.3_0.0" "3.3.4_0.0" "3.4.1_0.0" "4.1.1_0.0"
 [9,] "2.4.2_0.0" "4.2.2_0.0" "2.2.1_0.0" "2.1.2_0.0" "1.2.4_0.0" "2.4.1_0.0"
[10,] "2.4.2_0.0" "4.2.3_0.0" "2.3.3_0.0" "3.3.3_0.0" "3.3.3_0.0" "3.3.2_0.0"
[11,] "4.1.2_0.0" "1.2.4_0.0" "2.4.2_0.0" "4.2.2_0.0" "2.2.3_0.0" "2.3.1_0.0"
[12,] "3.2.2_0.0" "2.2.2_0.0" "2.2.1_0.0" "2.1.1_0.0" "1.1.1_0.0" "1.1.4_0.0"
      [,37]       [,38]       [,39]       [,40]       [,41]       [,42]      
 [1,] "3.1.2_0.0" "1.2.4_0.0" "2.4.2_0.0" "4.2.2_0.0" "2.2.3_0.0" "2.3.2_0.0"
 [2,] "4.1.3_0.0" "1.3.2_0.0" "3.2.2_0.0" "2.2.2_0.0" "2.2.4_0.0" "2.4.4_0.0"
 [3,] "1.1.4_0.0" "1.4.2_0.0" "4.2.3_0.0" "2.3.3_0.0" "3.3.2_0.0" "3.2.3_0.0"
 [4,] "3.1.4_0.0" "1.4.1_0.0" "4.1.4_0.0" "1.4.4_0.0" "4.4.2_0.0" "4.2.1_0.0"
 [5,] "2.2.4_0.0" "2.4.3_0.0" "4.3.2_0.0" "3.2.1_0.0" "2.1.2_0.0" "1.2.4_0.0"
 [6,] "1.4.4_0.0" "4.4.4_0.0" "4.4.4_0.0" "4.4.2_0.0" "4.2.1_0.0" "2.1.4_0.0"
 [7,] "1.2.1_0.0" "2.1.3_0.0" "1.3.4_0.0" "3.4.4_0.0" "4.4.2_0.0" "4.2.4_0.0"
 [8,] "1.1.4_0.0" "1.4.3_0.0" "4.3.3_0.0" "3.3.4_0.0" "3.4.4_0.0" "4.4.1_0.0"
 [9,] "4.1.4_0.0" "1.4.4_0.0" "4.4.2_0.0" "4.2.4_0.0" "2.4.1_0.0" "4.1.2_0.0"
[10,] "3.2.2_0.0" "2.2.1_0.0" "2.1.1_0.0" "1.1.2_0.0" "1.2.2_0.0" "2.2.2_0.0"
[11,] "3.1.4_0.0" "1.4.4_0.0" "4.4.1_0.0" "4.1.4_0.0" "1.4.2_0.0" "4.2.4_0.0"
[12,] "1.4.2_0.0" "4.2.2_0.0" "2.2.4_0.0" "2.4.3_0.0" "4.3.3_0.0" "3.3.3_0.0"
      [,43]       [,44]       [,45]       [,46]       [,47]       [,48]      
 [1,] "3.2.2_0.0" "2.2.4_0.0" "2.4.4_0.0" "4.4.1_0.0" "4.1.3_0.0" "1.3.2_0.0"
 [2,] "4.4.4_0.0" "4.4.2_0.0" "4.2.3_0.0" "2.3.2_0.0" "3.2.3_0.0" "2.3.1_0.0"
 [3,] "2.3.3_0.0" "3.3.2_0.0" "3.2.1_0.0" "2.1.2_0.0" "1.2.1_0.0" "2.1.3_0.0"
 [4,] "2.1.3_0.0" "1.3.2_0.0" "3.2.2_0.0" "2.2.2_0.0" "2.2.2_0.0" "2.2.2_0.0"
 [5,] "2.4.4_0.0" "4.4.4_0.0" "4.4.1_0.0" "4.1.2_0.0" "1.2.4_0.0" "2.4.1_0.0"
 [6,] "1.4.1_0.0" "4.1.1_0.0" "1.1.3_0.0" "1.3.1_0.0" "3.1.2_0.0" "1.2.4_0.0"
 [7,] "2.4.1_0.0" "4.1.4_0.0" "1.4.4_0.0" "4.4.1_0.0" "4.1.1_0.0" "1.1.1_0.0"
 [8,] "4.1.4_0.0" "1.4.4_0.0" "4.4.4_0.0" "4.4.2_0.0" "4.2.4_0.0" "2.4.3_0.0"
 [9,] "1.2.4_0.0" "2.4.1_0.0" "4.1.4_0.0" "1.4.3_0.0" "4.3.4_0.0" "3.4.1_0.0"
[10,] "2.2.3_0.0" "2.3.3_0.0" "3.3.4_0.0" "3.4.2_0.0" "4.2.2_0.0" "2.2.2_0.0"
[11,] "2.4.3_0.0" "4.3.4_0.0" "3.4.2_0.0" "4.2.3_0.0" "2.3.2_0.0" "3.2.4_0.0"
[12,] "3.3.4_0.0" "3.4.1_0.0" "4.1.1_0.0" "1.1.3_0.0" "1.3.4_0.0" "3.4.2_0.0"

biogram documentation built on March 31, 2020, 5:14 p.m.