rmSharedWords: Trim/Remove Redundant Words

View source: R/rmSharedWords.R

rmSharedWords    R Documentation

Trim/Remove Redundant Words

Description

This function removes words that are shared among the entries of a character vector, i.e. it trims each entry to its non-redundant words.

Usage

rmSharedWords(
  x,
  sep = c("_", " ", "."),
  anySep = TRUE,
  newSep = NULL,
  minLe = 2,
  na.omit = FALSE,
  fixed = TRUE,
  silent = FALSE,
  debug = FALSE,
  callFrom = NULL
)

Arguments

x

(character) main input, the text to be made non-redundant

sep

(character) separator(s) to be used

anySep

(logical) if TRUE, all separators will be considered at the same time; thus combinations with different separators won't be distinguished

newSep

(character) new (uniform) separator between words; if NULL, the first value/separator of sep will be used

minLe

(integer) minimum length required for being recognised as a 'word'

na.omit

(logical) if TRUE, NAs will be removed from the output

fixed

(logical) passed on to argument fixed of strsplit(); if TRUE, separators are matched literally; if FALSE, they are interpreted as regular expressions

silent

(logical) suppress messages

debug

(logical) additional messages for debugging

callFrom

(character) allows easier tracking of messages produced
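Since fixed is handed to strsplit(), the following base R sketch may help to see what that argument changes when a separator such as "." is also a regular-expression metacharacter. It only illustrates strsplit() itself; the variable names (s, lit, rex, esc) are made up for this illustration and are not part of wrMisc.

```r
# Base R only: how 'fixed' changes the way strsplit() reads the separator "."
s <- "aa.bb"

# fixed = TRUE: "." is matched literally, so the text is split at the dot
lit <- strsplit(s, ".", fixed = TRUE)[[1]]

# fixed = FALSE: "." is a regular expression matching ANY character,
# so every character acts as a separator and only empty strings remain
rex <- strsplit(s, ".", fixed = FALSE)[[1]]

# fixed = FALSE with an escaped dot behaves like the literal split again
esc <- strsplit(s, "\\.", fixed = FALSE)[[1]]

print(lit)   # "aa" "bb"
print(esc)   # "aa" "bb"
```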

Details

Leading separators will be removed in any case (even if not followed by a 'word').

Special characters will be automatically protected. When looking for repeated words, the order of such words does NOT matter; multiple repeats will be removed, too.
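The idea of removing shared words irrespective of their order can be sketched in base R as shown here. This is a simplified illustration under stated assumptions, not the implementation used in wrMisc: the helper dropSharedWords() is hypothetical, it assumes that a 'word' is shared when it occurs in every non-NA entry, that minLe counts characters, and that all separators are treated alike (as with anySep=TRUE).

```r
# Hypothetical helper for illustration only; NOT the wrMisc implementation
dropSharedWords <- function(x, sep = c("_", " ", "."), minLe = 2) {
  y <- x
  # Unify all separators to the first one; fixed = TRUE matches them literally
  for (s in sep[-1]) y <- gsub(s, sep[1], y, fixed = TRUE)
  words <- strsplit(y, sep[1], fixed = TRUE)
  ok <- !is.na(x)
  # Assumption: a word is 'shared' if it occurs in every non-NA entry
  shared <- Reduce(intersect, words[ok])
  # Assumption: minLe is the minimum number of characters of a 'word'
  shared <- shared[nchar(shared) >= minLe]
  # Drop shared words (and empty pieces), then re-join with the first separator
  out <- vapply(words, function(w) {
    paste(w[!(w %in% shared) & nzchar(w)], collapse = sep[1])
  }, character(1))
  out[!ok] <- NA_character_   # keep NA entries as NA
  out
}

x1 <- c("aa_A1 yy_zz.txt", NA, "B2 yy_aa_aa_zz.txt")
dropSharedWords(x1)   # "A1" NA "B2"
```

In this sketch "aa", "yy", "zz" and "txt" occur in both non-NA entries, in different order and partly repeated, so all of them are dropped.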

Value

This function returns a character vector of the same length as x (unless na.omit=TRUE), with modified text content

See Also

trimRedundText

Examples

x1 <- c("aa_A1 yy_zz.txt", NA, "B2 yy_aa_aa_zz.txt")
rmSharedWords(x1)
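Some of the other arguments from the Usage section might be tried as in the sketch below. It assumes that the wrMisc package is installed and exports rmSharedWords(); the calls are skipped otherwise. The names res1 and res2 are chosen for this illustration only, and no output is shown since it depends on the installed version.

```r
x1 <- c("aa_A1 yy_zz.txt", NA, "B2 yy_aa_aa_zz.txt")

# Run only if wrMisc is available and provides rmSharedWords()
if (requireNamespace("wrMisc", quietly = TRUE) &&
    "rmSharedWords" %in% getNamespaceExports("wrMisc")) {
  # Use "-" as uniform separator between the remaining words
  res1 <- wrMisc::rmSharedWords(x1, newSep = "-")
  # Remove NA entries from the output (result may be shorter than x1)
  res2 <- wrMisc::rmSharedWords(x1, na.omit = TRUE)
  print(res1)
  print(res2)
}
```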
 

wrMisc documentation built on Sept. 11, 2024, 6:10 p.m.