sbo


sbo provides utilities for building and evaluating text predictors in R, based on Stupid Back-off N-gram models.

Installation

Released version

You can install the latest release of sbo from CRAN:

install.packages("sbo")

Development version

You can install the development version of sbo from GitHub:

# install.packages("devtools")
devtools::install_github("vgherard/sbo")

Example

This example shows how to build a text predictor with sbo:

library(sbo)
p <- sbo_predictor(sbo::twitter_train, # 50k tweets, example dataset
                   N = 3, # Train a 3-gram model
                   dict = sbo::twitter_dict, # Top 1k words appearing in corpus
                   .preprocess = sbo::preprocess, # Preprocessing transformation
                   EOS = ".?!:;" # End-Of-Sentence characters
                   )

The object p can now be used to generate predictive text as follows:

predict(p, "i love") # a character vector
#> [1] "you" "it"  "my"
predict(p, "you love") # another character vector
#> [1] "<EOS>" "me"    "the"
predict(p, 
        c("i love", "you love", "she loves", "we love", "you love", "they love")
        ) # a character matrix
#>      [,1]    [,2]  [,3] 
#> [1,] "you"   "it"  "my" 
#> [2,] "<EOS>" "me"  "the"
#> [3,] "you"   "my"  "me" 
#> [4,] "you"   "our" "it" 
#> [5,] "<EOS>" "me"  "the"
#> [6,] "to"    "you" "and"
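
To give a feel for what happens behind predict(), here is a minimal base-R sketch of Stupid Back-off scoring. It is not the implementation used by sbo: the toy corpus and the helper names (count_kgrams, get_count, sbo_score) are hypothetical, the back-off penalty lambda = 0.4 is the value commonly quoted for this method, and the sketch ignores dictionaries, unknown words and End-Of-Sentence handling.

```r
# Toy corpus: each element is one already-tokenized sentence (hypothetical data)
corpus <- list(
  c("i", "love", "you"),
  c("i", "love", "it"),
  c("you", "love", "me"),
  c("i", "love", "you")
)

# Count all k-grams of order k; returns a named integer vector
# whose names are the space-joined k-grams
count_kgrams <- function(sentences, k) {
  grams <- unlist(lapply(sentences, function(s) {
    if (length(s) < k) return(character(0))
    vapply(seq_len(length(s) - k + 1),
           function(i) paste(s[i:(i + k - 1)], collapse = " "),
           character(1))
  }))
  tab <- table(grams)
  setNames(as.integer(tab), names(tab))
}

# Look up a k-gram count, returning 0 for unseen k-grams
get_count <- function(counts_k, key) {
  if (key %in% names(counts_k)) counts_k[[key]] else 0L
}

# Stupid Back-off score of `word` given `context` (a character vector):
# use the relative frequency at the highest order where the k-gram was seen,
# otherwise drop the oldest context word and multiply by lambda
sbo_score <- function(word, context, counts, lambda = 0.4) {
  if (length(context) == 0) {
    # Base case: unigram relative frequency
    return(get_count(counts[[1]], word) / sum(counts[[1]]))
  }
  k <- length(context) + 1
  ctx_key <- paste(context, collapse = " ")
  num <- get_count(counts[[k]], paste(ctx_key, word))
  if (num > 0) {
    return(num / get_count(counts[[k - 1]], ctx_key))
  }
  lambda * sbo_score(word, context[-1], counts, lambda)
}

# k-gram counts for orders 1 to 3 (a 3-gram model)
counts <- lapply(1:3, function(k) count_kgrams(corpus, k))

# Score every word in the vocabulary after "i love" and keep the top 3
vocab <- names(counts[[1]])
scores <- vapply(vocab, sbo_score, numeric(1),
                 context = c("i", "love"), counts = counts)
top3 <- head(sort(scores, decreasing = TRUE), 3)
names(top3)
#> [1] "you" "it"  "me"
```

In this toy corpus "you" and "it" are scored from trigram counts directly, while "me" never follows "i love" and is only reached by backing off to the bigram "love me", which costs one factor of lambda.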

Help

For help, see the sbo website.



sbo documentation built on Dec. 6, 2020, 1:06 a.m.