standardize.text: Standardize Text


View source: R/standardize.text.R

Description

Takes a dataframe containing a column of text, or a vector of text, and returns the same data structure with the text standardized against a set of standard text.

Usage

standardize.text(input, input.column = NULL, standard, standard.column = NULL, regex = NULL, codes = NULL, match = FALSE, only.names = FALSE, na.rm = FALSE, suggest = "prompt", print.changes = TRUE, verbose = FALSE)

Arguments

input

A dataframe containing a column of text, or a vector of text

input.column

The column containing text if input is a dataframe, identified by name or number; ignored if input is a vector

standard

A dataframe containing a column of standard text, or a vector of standard text

standard.column

The column containing standard text if standard is a dataframe, identified by name or number; ignored if standard is a vector

regex

An optional vector of regular expressions; if NULL, regular expressions are generated from standard

codes

An optional vector of identifying codes; if NULL, codes are generated automatically

match

Set to TRUE if there is a one-to-one correspondence between the provided standard and the provided regex

only.names

If TRUE, return only a vector of standardized names

na.rm

If TRUE, remove any entries whose text does not appear in the standard set

suggest

How to handle suggestions for inexact matches: "prompt" lets the user select which suggestions to apply, "auto" applies all of them, "none" applies none

print.changes

If TRUE, print which text entries were changed

verbose

If TRUE, print full output, including unidentified text
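The relationship between standard and regex when match = TRUE can be pictured with a small base R sketch. This is only an illustration of a one-to-one pairing between regular expressions and standard text, not the package's implementation: the helper standardize_sketch and the vector sample.regex are hypothetical names introduced here, and the case-insensitive matching is an assumption.

```r
# Hypothetical helper, not part of StandardizeText: replaces each input string
# with the standard text whose paired regular expression matches it.
# Strings that match no pattern are left unchanged.
standardize_sketch <- function(input, standard, regex) {
  # One-to-one correspondence: the i-th regex belongs to the i-th standard.
  stopifnot(length(standard) == length(regex))
  out <- input
  for (i in seq_along(regex)) {
    # Match against the original input, ignoring case (an assumption).
    hit <- grepl(regex[i], input, ignore.case = TRUE)
    out[hit] <- standard[i]
  }
  out
}

sample.text <- c("blue car", "STOP", "email", "tree")
sample.std <- c("the tree", "car", "e-mail", "stop")
# Hypothetical patterns, one per entry of sample.std.
sample.regex <- c("tree", "car", "e-?mail", "stop")

result <- standardize_sketch(sample.text, sample.std, sample.regex)
print(result)
```

When regex is left as NULL, the function is documented to generate the patterns from standard itself, so hand-written patterns like these are only needed when the automatic ones are not suitable.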

Value

If input is a dataframe, returns the same dataframe with the text column standardized; if input is a vector of text, returns the standardized vector
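The shape of the return value for a dataframe input can be sketched in base R without calling the package. The standardized values below are copied from the example output on this page; the object names out.df, standardized, and kept are hypothetical, and the row filter is only an approximation of what na.rm = TRUE is described as doing.

```r
# Base R illustration only; it does not call standardize.text().
sample.df <- data.frame(foo = 2:5,
                        bar = c("blue car", "STOP", "email", "tree"),
                        stringsAsFactors = FALSE)

# Assumed standardized text, taken from the example output on this page.
standardized <- c("car", "stop", "e-mail", "the tree")

# Same dataframe, with only the text column replaced.
out.df <- sample.df
out.df$bar <- standardized

# Rough analogue of na.rm = TRUE: keep rows whose text is in the standard set.
# "stop" is deliberately left out of this reduced standard set.
standard <- c("the tree", "car", "e-mail")
kept <- out.df[out.df$bar %in% standard, , drop = FALSE]
print(kept)
```

The other columns keep their original values and order, which is the behaviour the Value section describes for dataframe input.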

Examples

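library(StandardizeText)

# Text to standardize and the standard text to map it onto
sample.text <- c("blue car", "STOP", "email", "tree")
sample.std <- c("the tree", "car", "e-mail", "stop")
sample.df <- data.frame(foo = 2:5, bar = sample.text, baz = 7:4, qux = sample.std)

# Vector input: returns a standardized vector
out.a <- standardize.text(sample.text, standard = sample.std, suggest = "auto")

# Dataframe input: text in column 2, standard text in column "qux"
out.b <- standardize.text(sample.df, 2, sample.df, "qux", suggest = "auto")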

Example output

The following names were changed:
  Original Modified
1     STOP     stop
2    email   e-mail
3     tree the tree

The following suggested changes were applied:
  Original Suggested
1 blue car       car

The following names were changed:
  Original Modified
1     STOP     stop
2    email   e-mail
3     tree the tree

The following suggested changes were applied:
  Original Suggested
1 blue car       car

StandardizeText documentation built on May 1, 2019, 9:31 p.m.