README.md

kwm

lifecycle Travis-CI Build
Status AppVeyor Build
Status

kwm provides very simiple wrapper functions to produce KeyWord Models that produce classification predictions based on explicit lists of regular expression pattern matches. By supplying a generic prediction function for such lists, it is easy to compare the performance of very simple regex matching to other, more complicated text classification models within the same pipeline.

Installation

You can install kwm from github with:

# install.packages("devtools")
devtools::install_github("mdlincoln/kwm")

Example

library(kwm)

month_df <- data.frame(month = month.name, stringsAsFactors = FALSE)

# Locate all matches that INCLUDE either "a" or "e" but EXCLUDE any ending in "r"
month_model <- kwm(include = c("a", "e"), exclude = "r$", varname = "month")

predict(month_model, newdata = month_df, return_names = TRUE)
#>   January  February     March     April       May      June      July 
#>      TRUE      TRUE      TRUE     FALSE      TRUE      TRUE     FALSE 
#>    August September   October  November  December 
#>     FALSE     FALSE     FALSE     FALSE     FALSE

# You can pass options to the underlying search function as well
caseless_month_model <- kwm(include = c("a", "e"), exclude = "r$", 
                            varname = "month", 
                            search_opts = list(ignore_case = TRUE))

predict(caseless_month_model, newdata = month_df, return_names = TRUE)
#>   January  February     March     April       May      June      July 
#>      TRUE      TRUE      TRUE      TRUE      TRUE      TRUE     FALSE 
#>    August September   October  November  December 
#>      TRUE     FALSE     FALSE     FALSE     FALSE


mdlincoln/kwm documentation built on May 14, 2019, 2:15 p.m.