compositr: Efficient tools for preprocessing text data in R.


View source: R/textprep.R

Takes a text variable from a data frame and runs a number of standard text preprocessing procedures on it, such as removing HTML tags, removing stopwords, and converting to lowercase. Preprocessing techniques and tokenization are applied in an interactive yes/no console session with the user. A list of the procedures used is saved in a local .txt file in a directory specified by the user.
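To make the description concrete, here is a minimal, non-interactive sketch of three of the steps it names (HTML tag removal, lowercasing, stopword removal) in base R. This is not the package's implementation: `clean_text_sketch` and its tiny `stop_words` vector are hypothetical names used for illustration only, and textprep itself asks about each step in the console.

```r
# Hypothetical helper, not part of compositr: applies three of the steps
# named in the description without any interactive prompts.
clean_text_sketch <- function(x, stop_words = c("the", "a", "is", "of", "and")) {
  x <- gsub("<[^>]+>", " ", x)             # strip HTML tags
  x <- tolower(x)                           # convert to lowercase
  # Build one whole-word pattern from the (illustrative) stopword vector.
  pattern <- paste0("\\b(", paste(stop_words, collapse = "|"), ")\\b")
  x <- gsub(pattern, " ", x, perl = TRUE)   # drop stopwords
  x <- gsub("\\s+", " ", x)                 # collapse runs of whitespace
  trimws(x)
}

df <- data.frame(
  text = c("<p>The cat is on the mat</p>", "A <b>bold</b> claim"),
  stringsAsFactors = FALSE
)
df$text_clean <- clean_text_sketch(df$text)
df$text_clean
```

Writing the result to a new column keeps the original text available for comparison; whether textprep overwrites the column or adds one is not stated in this page.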

textprep(
  textdata,
  textvar,
  type = "docs",
  language = "english",
  outdir = NA,
  outname = "/transformations.txt"
)
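The leading slash in the default `outname` suggests that the summary file path is formed by pasting `outdir` and `outname` together. That is an assumption drawn from the defaults, not something this page confirms. Under that assumption, a sketch of how such a log could be written looks like this; `applied_steps` is a hypothetical stand-in for whatever the console session records.

```r
# Assumption: the log path is outdir and outname pasted together.
outdir   <- tempdir()                  # stand-in for a user-chosen directory
outname  <- "/transformations.txt"     # default shown in the usage block
log_path <- paste0(outdir, outname)

# Hypothetical record of the steps a user accepted during the session.
applied_steps <- c("remove html tags", "convert to lowercase", "remove stopwords")

# Write one procedure per line, then read the file back to inspect it.
writeLines(applied_steps, log_path)
readLines(log_path)
```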

@param textdata a data frame containing a text variable
@param textvar the name of the column in textdata that contains the text
@param type currently only one type, "docs", is available
@param language user-specified language; determines which tm::stopwords dictionary is used
@param outdir the directory in which the output .txt file is saved
@param outname the name of the transformations summary .txt file; defaults to "/transformations.txt" and can be renamed
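Since `language` selects a tm stopword dictionary, it can help to inspect that dictionary directly before running the function. The sketch below calls `tm::stopwords()`, which is part of tm; the `tokens` vector is made up for illustration, and how textprep applies the list internally is not shown on this page.

```r
language <- "english"

# Look up the stopword list for the chosen language; fall back to an empty
# vector when tm is not installed so the rest of the sketch still runs.
sw <- if (requireNamespace("tm", quietly = TRUE)) {
  tm::stopwords(language)
} else {
  character(0)
}

# Illustrative tokens: keep only those that are not in the stopword list.
tokens <- c("the", "cat", "sat")
kept   <- tokens[!tokens %in% sw]
kept
```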

@return the dataframe with a cleaned and/or tokenized text variable

@export

@import textclean
@import dplyr
@import tidytext
@import textstem
@import SnowballC
@import tibble
@import tm
@import magrittr
@import crayon

@examples
## Not run:
results <- textprep(df, "text", language = "english", outdir = "~/Desktop/files")
## End(Not run)
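The example above assumes a data frame `df` with a `text` column already exists. A self-contained variant might look like the following sketch. The sample rows and the temporary output directory are made up for illustration, the directory is created up front in case the function does not create it, and the call is guarded because textprep needs an interactive console and an installed compositr.

```r
# Made-up sample data with a character column named "text".
df <- data.frame(
  id   = 1:2,
  text = c("<p>Hello World</p>", "Stopwords are the most common words"),
  stringsAsFactors = FALSE
)

# Use a temporary directory rather than a fixed path on the user's machine.
outdir <- file.path(tempdir(), "files")
dir.create(outdir, showWarnings = FALSE)

# Only call textprep in an interactive session with compositr installed,
# since the function prompts for yes/no answers in the console.
if (interactive() && requireNamespace("compositr", quietly = TRUE)) {
  results <- compositr::textprep(df, "text", language = "english", outdir = outdir)
}
```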

alexlusco/compositr documentation built on Jan. 19, 2021, 8:33 p.m.


textprep: clean and/or tokenize your text data in a single function