rahocorasick

An R interface to a fast Java implementation of the Aho-Corasick algorithm, which searches a text for exact matches of many keywords in a single pass.

Disclaimer

This package is at a very early stage of development, and significant changes are expected in future versions.

Installation

# Install devtools first if it is not already available
if (!requireNamespace("devtools", quietly = TRUE)) {
  install.packages("devtools")
}
# Install rahocorasick from GitHub
devtools::install_github("jullybobble/rahocorasick")

Usage

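library(rahocorasick)
# Each name is a keyword to search for; its element is the value reported on a hit
dictionary <- list(one = 1, two = 2, three = 3)
# Build the trie once so it can be reused across searches
trie <- ac_build_list(dictionary)
text_1 <- "one apple"
# ac_search() returns a list with one tibble of hits per input text
hit_1 <- ac_search(text_1, trie)[[1]]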
hit_1
## # A tibble: 1 x 3
##   value begin   end
##   <chr> <int> <int>
## 1     1     1     3
# TRUE here: the hit covers the whole word "one"
ac_are_boundary_chars(text_1, hit_1$begin, hit_1$end)
## [1] TRUE
text_2 <- "a twoonie in my pocket"
hit_2 <- ac_search(text_2, trie)[[1]]
hit_2
## # A tibble: 1 x 3
##   value begin   end
##   <chr> <int> <int>
## 1     2     3     5
# FALSE here: "two" is only part of the longer word "twoonie"
ac_are_boundary_chars(text_2, hit_2$begin, hit_2$end)
## [1] FALSE
text_3 <- "nothing left"
# Build the trie and search several texts in a single call
ac_build_and_search_list(c(text_1, text_2, text_3), dictionary)
## [[1]]
## # A tibble: 1 x 3
##   value begin   end
##   <chr> <int> <int>
## 1     1     1     3
## 
## [[2]]
## # A tibble: 1 x 3
##   value begin   end
##   <chr> <int> <int>
## 1     2     3     5
## 
## [[3]]
## # A tibble: 0 x 3
## # ... with 3 variables: value <chr>, begin <int>, end <int>
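
The two boundary checks above suggest that ac_are_boundary_chars reports whether a hit stands alone as a word. As a rough illustration of that idea in base R, and not the package's actual implementation, the sketch below defines a hypothetical helper, is_whole_word. It assumes that a boundary is the start or end of the text or any non-alphanumeric character; the package may use a different rule.

```r
# Hypothetical helper, not part of rahocorasick: TRUE when the span
# text[begin..end] has no letter or digit directly before or after it.
is_whole_word <- function(text, begin, end) {
  # Neighbouring characters; "" when the span touches the start or end of the text
  before <- if (begin > 1) substr(text, begin - 1, begin - 1) else ""
  after  <- if (end < nchar(text)) substr(text, end + 1, end + 1) else ""
  # Assumption: only alphanumeric neighbours count as "inside a word"
  !grepl("[[:alnum:]]", before) && !grepl("[[:alnum:]]", after)
}

is_whole_word("one apple", 1, 3)              # TRUE
is_whole_word("a twoonie in my pocket", 3, 5) # FALSE
```

A check of this kind can be used to discard hits that fall inside longer words, such as "two" in "twoonie".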


jullybobble/rahocorasick documentation built on May 30, 2019, 8:14 a.m.