count_total: Count total number of n-grams


View source: R/indices_and_positions.R

Description

Computes the total number of n-grams that can be extracted from sequences.

Usage

count_total(seq, n, d)

Arguments

seq

a vector or matrix describing sequence(s).

n

integer size of n-gram.

d

integer vector of distances between elements of n-gram (0 means consecutive elements). See Details.

Details

The maximum number of possible n-grams is limited by the length of the sequences, the size of the n-gram, and the distances between elements of the n-gram.
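As a rough illustration of this limit, the sketch below counts how many positions a gapped n-gram can occupy in a single sequence without NA values. It assumes that the n-gram spans n + sum(d) elements and that d has one entry per gap (length n - 1). max_ngrams_single is a hypothetical helper written for this page, not a biogram function; count_total itself may treat matrices and NA values differently.

```r
# Hypothetical helper (not part of biogram): number of start positions at
# which an n-gram of size n with gaps d fits into a sequence of length len.
max_ngrams_single <- function(len, n, d) {
  # assumption of this sketch: one distance per pair of neighbouring elements
  stopifnot(length(d) == n - 1)
  span <- n + sum(d)       # elements covered by the n-gram, gaps included
  max(len - span + 1, 0)   # nothing fits when the span exceeds the sequence
}

# a 3-gram with distances c(1, 0) covers 4 positions, so 50 - 4 + 1 windows
max_ngrams_single(50, 3, c(1, 0))
# [1] 47
```

Under the same assumption, a set of equally long sequences would yield this value once per sequence.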

Value

An integer representing the total number of n-grams.

Note

The format of the d vector is discussed in the Details of count_ngrams.

Examples

seqs <- matrix(sample(1L:4, 600, replace = TRUE), ncol = 50)
# make several sequences shorter by replacing them partially with NA
seqs[8L:11, 46L:50] <- NA
seqs[1L, 31L:50] <- NA
count_total(seqs, 3, c(1, 0))

biogram documentation built on March 31, 2020, 5:14 p.m.