preprocessSemiRedundant: Preprocess string to semi-redundant one-hot vector


View source: R/preprocess.R

Description

Generates a semi-redundant set of chunks from an input character string: the string is collapsed, tokenized, and vectorized. The input must be a single character string. For example, if the input text is ABCDEFGHI and maxlen is 5, the generated chunks are: X(1): ABCDE and Y(1): F; X(2): BCDEF and Y(2): G; X(3): CDEFG and Y(3): H; X(4): DEFGH and Y(4): I.
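
The chunking in the example above can be sketched in base R as follows. This is an illustration only, not the package's implementation; the variable names (text, chars, x, y) are hypothetical, and a step of one character between chunks is assumed because that is what the example shows.

```r
# Illustrative sketch only: reproduces the chunking from the example
# in base R. It is not the package's implementation.
text <- "ABCDEFGHI"
maxlen <- 5

chars <- strsplit(text, "")[[1]]      # split into single characters
n_chunks <- length(chars) - maxlen    # 9 - 5 = 4 chunks

# X(i) is the window of maxlen characters starting at position i
x <- vapply(
  seq_len(n_chunks),
  function(i) paste(chars[i:(i + maxlen - 1)], collapse = ""),
  character(1)
)

# Y(i) is the single character that follows X(i)
y <- chars[seq_len(n_chunks) + maxlen]

print(data.frame(x = x, y = y))
```

Each X(i) overlaps its predecessor in all but one character, which is why the set is called semi-redundant.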

Usage

preprocessSemiRedundant(
  char,
  maxlen = 250,
  vocabulary = c("l", "p", "a", "c", "g", "t"),
  verbose = F
)

Arguments

char

Input text: a character vector of length one

maxlen

Length, in characters, of each semi-redundant sequence

vocabulary

Character vector containing the vocabulary of the input char. If no vocabulary is supplied, it is generated from the input char.

verbose

Logical, TRUE or FALSE (default FALSE)
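
To show how a vocabulary relates to a one-hot vector, here is a minimal base R sketch. It is an assumption about the general technique, not the function's actual code or output format; the chunk value and the names tokens and one_hot are hypothetical, and every character is assumed to be present in the vocabulary.

```r
# Illustrative sketch only, not the package's implementation.
vocabulary <- c("l", "p", "a", "c", "g", "t")  # default from Usage
chunk <- "acgt"                                 # hypothetical input window

# Tokenize: map each character to its position in the vocabulary
tokens <- match(strsplit(chunk, "")[[1]], vocabulary)
stopifnot(!anyNA(tokens))  # assumes every character is in the vocabulary

# Vectorize: one row per character, one column per vocabulary entry
one_hot <- matrix(
  0L,
  nrow = length(tokens),
  ncol = length(vocabulary),
  dimnames = list(NULL, vocabulary)
)
one_hot[cbind(seq_along(tokens), tokens)] <- 1L

print(one_hot)
```

Each row contains exactly one 1, in the column of the corresponding character.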


hiddengenome/altum documentation built on April 22, 2020, 9:33 p.m.