R/mallet_stopwords.R

Defines functions remove_file_extension mallet_supported_stoplists mallet_stoplist_file_path

Documented in mallet_stoplist_file_path mallet_supported_stoplists

#' @title
#' Return the file path to the mallet stoplists
#'
#' @details
#' Returns the path to the mallet stop word list.
#' See [mallet_supported_stoplists()] for the stoplists that are included.
#'
#' @param language Language code of the stoplist to return. Defaults to English (`"en"`).
#'
#' @export
mallet_stoplist_file_path <- function(language = "en"){
  checkmate::assert_choice(language, choices = mallet_supported_stoplists())
  fp <- system.file("stoplists", package = "mallet")
  file.path(fp, paste0(language, ".txt"))
}

#' @export
#' @rdname mallet_stoplist_file_path
mallet.stoplist.file.path <- mallet_stoplist_file_path


#' @title Mallet supported stoplists
#'
#' @details Returns a character vector with the names of the included stoplists.
#'
#' @export
mallet_supported_stoplists <- function(){
  fns <- dir(system.file("stoplists", package = "mallet"))
  fns <- fns[grepl("\\.txt$", fns)]
  remove_file_extension(fns)
}

#' @export
#' @rdname mallet_supported_stoplists
mallet.supported.stoplists <- mallet_supported_stoplists

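# Internal helper: drop the directory part and the last file extension,
# e.g. "stoplists/en.txt" becomes "en".
remove_file_extension <- function(x){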
  sub(pattern = "(.*)\\..*$", replacement = "\\1", basename(x))
}
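
The lookup logic above can be tried without installing mallet. The following is a minimal sketch, not the package's own code: it assumes a temporary directory as a stand-in for the installed `stoplists` folder, and the names `stoplist_dir`, `strip_ext`, `demo_supported` and `demo_path` are hypothetical. It also swaps `checkmate::assert_choice()` for a plain base R check.

```r
# Hypothetical stand-in for the package's installed "stoplists" directory.
stoplist_dir <- file.path(tempdir(), "stoplists_demo")
dir.create(stoplist_dir, showWarnings = FALSE)
writeLines(c("the", "and", "of"), file.path(stoplist_dir, "en.txt"))
writeLines(c("der", "die", "das"), file.path(stoplist_dir, "de.txt"))
writeLines("not a stoplist", file.path(stoplist_dir, "README.md"))

# Same regular expression as remove_file_extension(): the greedy "(.*)"
# keeps everything up to the last dot, so only the final extension is dropped.
strip_ext <- function(x) {
  sub(pattern = "(.*)\\..*$", replacement = "\\1", basename(x))
}

# Mirrors mallet_supported_stoplists(): keep *.txt files, strip the extension.
demo_supported <- function(dir_path) {
  fns <- dir(dir_path)
  strip_ext(fns[grepl("\\.txt$", fns)])
}

# Mirrors mallet_stoplist_file_path(), validating with base R
# instead of checkmate (an assumption made for this sketch).
demo_path <- function(language = "en", dir_path = stoplist_dir) {
  if (!language %in% demo_supported(dir_path)) {
    stop("Unsupported stoplist language: ", language)
  }
  file.path(dir_path, paste0(language, ".txt"))
}

supported <- demo_supported(stoplist_dir)  # README.md is filtered out
en_words <- readLines(demo_path("en"))     # contents of the demo English list
```

With the package installed, the equivalent call would be `readLines(mallet_stoplist_file_path("en"))`. Note that the regular expression strips only the last extension, so a name such as `archive.tar.gz` would become `archive.tar`.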

mallet documentation built on July 20, 2022, 5:08 p.m.