R/tokenizers-package.r

#' Tokenizers
#'
#' A collection of functions with a consistent interface to convert natural
#' language text into tokens.
#'
#' The tokenizers in this package have a consistent interface. They all take
#' either a character vector of any length, or a list where each element is a
#' character vector of length one. The idea is that each element comprises one
#' text. Each function then returns a list of the same length as the input
#' vector, where each element of the list contains the tokens generated by the
#' function. If the input character vector or list is named, then the names
#' are preserved.
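#'
#' As a minimal illustration of this interface, the sketch below applies
#' \code{tokenize_words()} (one of the tokenizers in this package) to an
#' invented named character vector; the result is a named list with one
#' character vector of word tokens per input text.
#'
#' @examples
#' # Two texts in a named character vector
#' texts <- c(first = "How many roads must a man walk down?",
#'            second = "The answer, my friend, is blowing in the wind.")
#' # Returns a named list of length two, one character vector of tokens per
#' # input text; the names "first" and "second" are preserved
#' tokenize_words(texts)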
#'
#' @name tokenizers
#' @docType package
NULL

#' @useDynLib tokenizers, .registration = TRUE
#' @importFrom Rcpp sourceCpp
NULL
