seq2ngrams: Extract n-grams from sequence

View source: R/ngrams.R

seq2ngrams    R Documentation

Extract n-grams from sequence

Description

Extracts a vector of n-grams present in the sequence(s).

Usage

seq2ngrams(seq, n, u, d = 0, pos = FALSE)

Arguments

seq

a vector or matrix describing sequence(s).

n

integer, size of the n-gram.

u

integer, numeric or character vector of all possible unigrams.

d

integer vector of distances between elements of n-gram (0 means consecutive elements). See Details.

pos

logical, if TRUE position-specific n-grams are counted.

Details

The format of the d vector is discussed in the Details section of count_ngrams.
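
As a rough illustration of how a distance vector selects elements, the base R sketch below builds gapped n-grams from a single sequence. It assumes that d holds n - 1 gaps (one between each pair of neighbouring elements) and that a single value is recycled to that length. The helper gapped_ngrams and its dot-separated labels are hypothetical and not part of biogram, so the labels produced by seq2ngrams may be formatted differently.

```r
# Hypothetical helper, not part of biogram: gapped n-grams from one sequence.
# Assumption: d[i] is the number of skipped elements between the i-th and
# (i + 1)-th element of the n-gram; a single value is recycled to length n - 1.
gapped_ngrams <- function(x, n, d = 0) {
  d <- rep_len(d, n - 1)
  # offsets of the n-gram elements relative to its first element
  offsets <- c(0, cumsum(d + 1))
  # number of start positions at which the whole n-gram still fits
  n_starts <- length(x) - max(offsets)
  if (n_starts < 1) return(character(0))
  vapply(seq_len(n_starts), function(start) {
    paste(x[start + offsets], collapse = ".")
  }, character(1))
}

x <- c(1, 2, 3, 4, 1)
gapped_ngrams(x, 2)               # "1.2" "2.3" "3.4" "4.1"
gapped_ngrams(x, 2, d = 1)        # "1.3" "2.4" "3.1"
gapped_ngrams(x, 3, d = c(1, 0))  # "1.3.4" "2.4.1"
```

With d = 0 every n-gram is made of neighbouring elements. Each unit of distance skips one position, which also shortens the list of n-grams that fit into the sequence.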

Value

A character matrix of n-grams, where every row corresponds to a different sequence.
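
To make the shape of such a result concrete, the following base R sketch builds a comparable matrix for consecutive n-grams only. It is an illustration under stated assumptions: consecutive_ngram_matrix is a hypothetical name, and the dot-separated labels are chosen for the example, so the labels returned by seq2ngrams may look different.

```r
# Hypothetical helper, not part of biogram: one row of n-grams per sequence.
consecutive_ngram_matrix <- function(seqs, n) {
  # treat a plain vector as a single sequence
  if (!is.matrix(seqs)) seqs <- matrix(seqs, nrow = 1)
  n_grams <- ncol(seqs) - n + 1
  stopifnot(n_grams >= 1)
  res <- vapply(seq_len(n_grams), function(start) {
    cols <- seqs[, start:(start + n - 1), drop = FALSE]
    # collapse the n elements of each row into one label per sequence
    apply(cols, 1, paste, collapse = ".")
  }, character(nrow(seqs)))
  # vapply returns a plain vector for a single sequence, so restore the matrix
  matrix(res, nrow = nrow(seqs))
}

seqs <- matrix(c(1, 2, 3, 4,
                 4, 3, 2, 1), nrow = 2, byrow = TRUE)
res <- consecutive_ngram_matrix(seqs, 3)
dim(res)   # 2 2
res[1, ]   # "1.2.3" "2.3.4"
res[2, ]   # "4.3.2" "3.2.1"
```

In this sketch a sequence of length L yields L - n + 1 consecutive n-grams, so the matrix has one row per sequence and L - n + 1 columns.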

Examples

# trigrams from multiple sequences
seqs <- matrix(sample(1L:4, 600, replace = TRUE), ncol = 50)
seq2ngrams(seqs, 3, 1L:4)

michbur/biogram documentation built on Feb. 4, 2024, 6:38 p.m.