R/token2mwe.R

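#' Combine multi-word expressions listed in a dictionary into single tokens, via quanteda
#'
#' @name token2mwe
#' @param tok A list of character vectors, one vector of tokens per document
#' @param mwe A character vector of multi-word expressions, with the words of each expression separated by whitespace
#' @param concat A string used to join the words of a matched expression (default '_')
#' @return A list of character vectors, one per document, in which each matched expression is a single token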
#'
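#' @examples
#' # A minimal usage sketch, not taken from the package itself; the sample
#' # tokens and expressions below are made up for illustration.
#' tok <- list(doc1 = c("I", "live", "in", "New", "York", "City"),
#'             doc2 = c("machine", "learning", "is", "fun"))
#' mwe <- c("New York City", "machine learning")
#' token2mwe(tok, mwe)
#' # The same call with a different concatenator
#' token2mwe(tok, mwe, concat = "-")
#'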
#' @export
#' @rdname token2mwe
#'
#'
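token2mwe <- function(tok, mwe, concat = '_'){

  # Coerce the list of token vectors to a quanteda tokens object
  x1 <- quanteda::as.tokens(tok)
  # Join each token sequence that matches an expression in `mwe`;
  # phrase() makes whitespace-separated patterns match as sequences
  x2 <- quanteda::tokens_compound(x1,
                                  pattern = quanteda::phrase(mwe),
                                  concatenator = concat)

  # Convert back to a plain list of character vectors
  return(as.list(x2))
}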
jaytimm/text2df documentation built on July 21, 2023, 1:58 a.m.