PNcheck: Spell checking using ternary search trees

Description

Spell checking using a ternary search tree (TST) and Peter Norvig's approach.

Usage

PNcheck(tree, string, useUpper = FALSE)

Arguments

tree

a ternary search tree containing the dictionary terms.

string

the misspelled string to correct.

useUpper

if TRUE, uppercase letters are also used to construct insertions and alterations of the string. Default is FALSE.

Details

The literature on spelling correction claims that around 80% of spelling errors are at an edit distance of 1 from the target. For a word of length n, there are n deletions, n-1 transpositions, 36n alterations, and 36(n+1) insertions, for a total of 74n+35 variations (of which a few are typically duplicates). PNcheck computes all of these variations and searches for them in a ternary search tree.
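The counts above can be checked with a minimal sketch that enumerates the distance-1 variations of a word. This is not the package's implementation: the helper `edits1` is hypothetical, and the 36-symbol alphabet is assumed here to be the lowercase letters plus the digits 0-9. The sketch also assumes a word of at least one character.

```r
# Sketch: generate all edit-distance-1 variations of a word.
# Assumption: the 36-symbol alphabet is a-z plus 0-9 (not confirmed by the docs).
edits1 <- function(word, alphabet = c(letters, 0:9)) {
  chars <- strsplit(word, "")[[1]]
  n <- length(chars)
  join <- function(x) paste(x, collapse = "")

  # n deletions: drop the character at each position
  deletions <- vapply(seq_len(n), function(i) join(chars[-i]), character(1))

  # n - 1 transpositions: swap each adjacent pair of characters
  transpositions <- vapply(seq_len(n - 1), function(i) {
    x <- chars
    x[c(i, i + 1)] <- x[c(i + 1, i)]
    join(x)
  }, character(1))

  # 36n alterations: replace each position with each symbol
  alterations <- unlist(lapply(seq_len(n), function(i) {
    vapply(alphabet, function(a) {
      x <- chars
      x[i] <- a
      join(x)
    }, character(1))
  }), use.names = FALSE)

  # 36(n + 1) insertions: insert each symbol at each of the n + 1 gaps
  insertions <- unlist(lapply(0:n, function(i) {
    vapply(alphabet, function(a) join(append(chars, a, after = i)), character(1))
  }), use.names = FALSE)

  c(deletions, transpositions, alterations, insertions)
}

variations <- edits1("lamon")
length(variations)        # 74 * 5 + 35 = 405, duplicates included
"lemon" %in% variations   # TRUE: "lemon" is one alteration away from "lamon"
```

Each generated string would then be looked up in the dictionary tree; only those that are found are returned as corrections.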

For distance 2, the number of variations grows to (74n+35)^2, which makes PNcheck 3 orders of magnitude more expensive than SDcheck.
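To get a feel for this growth, the two formulas can be tabulated for a few word lengths. The function names `candidates1` and `candidates2` are hypothetical helpers for illustration only, and the figures are upper bounds that still include duplicates.

```r
# Upper bounds on the number of candidate strings for a word of length n.
candidates1 <- function(n) 74 * n + 35        # edit distance 1
candidates2 <- function(n) candidates1(n)^2   # edit distance 2

n <- c(3, 5, 8)
counts <- data.frame(n = n,
                     distance1 = candidates1(n),
                     distance2 = candidates2(n))
print(counts)
```

For a five-letter word, this gives 405 candidates at distance 1 and 164,025 at distance 2.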

Value

A vector with the corrected words.

See Also

newTree

Examples

fruitTree <- newTree(c("Apple", "orange", "lemon"))
PNcheck(fruitTree, "lamon")
PNcheck(fruitTree, "apple", useUpper = TRUE)

Example output

Tree created with 3 words and 16 nodes
[1] "lemon"
[1] "Apple"

TSTr documentation built on May 1, 2019, 9:16 p.m.