R/unnest_ptb.R

Defines functions unnest_ptb

Documented in unnest_ptb

#' Wrapper around unnest_tokens for Penn Treebank Tokenizer
#'
#' This function is a wrapper around `unnest_tokens( token = "ptb" )`.
#'
#' @seealso
#' + [unnest_tokens()]
#'
#' @inheritParams unnest_tokens
#' @inheritParams tokenizers::tokenize_ptb
#'
#' @param ... Extra arguments passed on to [tokenizers][tokenizers::tokenizers]
#'
#' @export
#' @importFrom dplyr enquo
#'
#' @examples
#' library(dplyr)
#' library(janeaustenr)
#'
#' d <- tibble(txt = prideprejudice)
#'
#' d %>%
#'   unnest_ptb(word, txt)
#'
unnest_ptb <- function(
  tbl,
  output,
  input,
  format = c("text", "man", "latex", "html", "xml"),
  to_lower = TRUE,
  drop = TRUE,
  collapse = NULL,
  ...
){
  format <- arg_match(format)
  unnest_tokens(tbl,
                !! enquo(output),
                !! enquo(input),
                format = format,
                to_lower = to_lower,
                drop = drop,
                collapse = collapse,
                token = "ptb",
                ...
  )
}

Try the tidytext package in your browser

Any scripts or data that you put into this service are public.

tidytext documentation built on May 29, 2024, 5:42 a.m.