README.md
In mdlincoln/kwm: Explicit Regex Matching Implemented As Model-like Objects

kwm

kwm provides very simple wrapper functions for building KeyWord Models: model-like objects that produce classification predictions from explicit lists of regular expression patterns. Because the package supplies a generic `predict()` method for these models, it is easy to compare the performance of very simple regex matching against other, more complicated text classification models within the same pipeline.

You can install kwm from GitHub with:

# install.packages("devtools")
devtools::install_github("mdlincoln/kwm")

library(kwm)

month_df <- data.frame(month = month.name, stringsAsFactors = FALSE)

# Locate all matches that INCLUDE either "a" or "e" but EXCLUDE any ending in "r"
month_model <- kwm(include = c("a", "e"), exclude = "r$", varname = "month")

predict(month_model, newdata = month_df, return_names = TRUE)
#>   January  February     March     April       May      June      July 
#>      TRUE      TRUE      TRUE     FALSE      TRUE      TRUE     FALSE 
#>    August September   October  November  December 
#>     FALSE     FALSE     FALSE     FALSE     FALSE

# You can pass options to the underlying search function as well
caseless_month_model <- kwm(include = c("a", "e"), exclude = "r$", 
                            varname = "month", 
                            search_opts = list(ignore_case = TRUE))

predict(caseless_month_model, newdata = month_df, return_names = TRUE)
#>   January  February     March     April       May      June      July 
#>      TRUE      TRUE      TRUE      TRUE      TRUE      TRUE     FALSE 
#>    August September   October  November  December 
#>      TRUE     FALSE     FALSE     FALSE     FALSE
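
To make the include/exclude logic concrete, here is a minimal base R sketch that reproduces the two results above. It is not kwm's implementation: `keyword_match()` and its arguments are hypothetical names, and the rule it encodes (match at least one `include` pattern and no `exclude` pattern) is an assumption inferred from the example output.

```r
# Hypothetical helper, NOT part of kwm: an element is a hit when it matches
# at least one `include` pattern and none of the `exclude` patterns.
keyword_match <- function(x, include, exclude = character(0),
                          ignore_case = FALSE) {
  # TRUE for each element of x that matches any pattern in `patterns`
  any_match <- function(patterns) {
    hits <- lapply(patterns, grepl, x = x, ignore.case = ignore_case)
    Reduce(`|`, hits, rep(FALSE, length(x)))
  }
  any_match(include) & !any_match(exclude)
}

# Case-sensitive: "April" and "August" fail because "A" is not "a"
months_cased <- keyword_match(month.name, include = c("a", "e"),
                              exclude = "r$")
month.name[months_cased]
#> [1] "January"  "February" "March"    "May"      "June"

# Case-insensitive: "April" and "August" now match on their leading "A"
months_caseless <- keyword_match(month.name, include = c("a", "e"),
                                 exclude = "r$", ignore_case = TRUE)
month.name[months_caseless]
#> [1] "January"  "February" "March"    "April"    "May"      "June"     "August"
```

Either logical vector can be compared against known labels with `table()` or `mean()`, in the same way as predictions from any other classifier.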

mdlincoln/kwm documentation built on May 14, 2019, 2:15 p.m.
