ngram_sequence_matching: N-Gram Sequence Matching


View source: R/ngram_sequence_matching.R

Description

Finds the positions of n-grams in each of two document versions that match an n-gram in the other version.
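
The idea described above can be illustrated with a minimal base R sketch. It is not the SpeedReader implementation: the helpers `make_ngrams` and `match_ngram_positions` are hypothetical names introduced here, the documents are assumed to be already split into tokens, and the shape of the returned list is an assumption, not the documented return value.

```r
# Hypothetical helper (not part of SpeedReader): build every n-gram
# from a vector of tokens, joined with single spaces.
make_ngrams <- function(tokens, n) {
  if (length(tokens) < n) return(character(0))
  starts <- seq_len(length(tokens) - n + 1)
  vapply(starts,
         function(i) paste(tokens[i:(i + n - 1)], collapse = " "),
         character(1))
}

# Hypothetical helper: start positions of the n-grams in each version
# that also occur somewhere in the other version.
match_ngram_positions <- function(tokens_1, tokens_2, n) {
  ngrams_1 <- make_ngrams(tokens_1, n)
  ngrams_2 <- make_ngrams(tokens_2, n)
  list(
    matches_in_1 = which(ngrams_1 %in% ngrams_2),
    matches_in_2 = which(ngrams_2 %in% ngrams_1)
  )
}

v1 <- strsplit("the quick brown fox jumps", " ", fixed = TRUE)[[1]]
v2 <- strsplit("a quick brown fox sleeps", " ", fixed = TRUE)[[1]]
result <- match_ngram_positions(v1, v2, 2)
```

Here the bigrams "quick brown" and "brown fox" appear in both versions, so each version reports matches at n-gram positions 2 and 3.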

Usage

ngram_sequence_matching(document_1, document_2, ngram_size,
  use_hashmap = FALSE, tokenized_strings_provided = FALSE)

Arguments

document_1

A string (or a character vector) representing the earlier document version.

document_2

A string (or a character vector) representing the later document version.

ngram_size

The length (n) of the n-grams to be compared.

use_hashmap

Defaults to FALSE. If TRUE, then a hashmap is used for faster lookup and comparisons.

tokenized_strings_provided

Defaults to FALSE. If TRUE, then pre-tokenized strings are expected as character vectors.
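
To show why a hash map can speed up the lookups that `use_hashmap` refers to, here is a small base R sketch. It rests on assumptions: an R environment stands in for the hash map, and `build_ngram_index` and `lookup_positions` are hypothetical names. The data structure SpeedReader uses internally may differ.

```r
# Assumption: an R environment serves as the hash map; this is an
# illustration only, not SpeedReader's internal data structure.
build_ngram_index <- function(ngrams) {
  index <- new.env(hash = TRUE, parent = emptyenv())
  for (g in unique(ngrams)) assign(g, TRUE, envir = index)
  index
}

# Positions of the n-grams whose key is present in the index.
lookup_positions <- function(ngrams, index) {
  found <- vapply(ngrams,
                  function(g) exists(g, envir = index, inherits = FALSE),
                  logical(1), USE.NAMES = FALSE)
  which(found)
}

index_2 <- build_ngram_index(c("a quick", "quick brown", "brown fox"))
hits <- lookup_positions(c("the quick", "quick brown", "brown fox"), index_2)
```

Each membership test is a keyed lookup in the index rather than a scan over every n-gram of the other document, which is the kind of saving a hash map is typically used for.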

Value

A list object.


matthewjdenny/SpeedReader documentation built on March 25, 2020, 5:32 p.m.