limpiar_slang: Clean slang from multiple Spanish dialects

View source: R/limpiar_slang.R

limpiar_slang R Documentation

Clean slang from multiple Spanish dialects

Description

Replaces slang phrases from various Spanish dialects with everyday terms. The function's primary use is to normalise text for a deep learning sentiment algorithm. Take care when using it: for example, 'panda' is replaced with 'grupo', because that is by far the most common usage in the texts we use. In a data set where many people talk about panda bears ('oso panda'), this will produce unwanted changes. I have tried to avoid this problem where possible by matching whole phrases such as 'me la suda' rather than the single word 'suda'.
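The trade-off described above can be illustrated with a minimal base R sketch. It is not LimpiaR's implementation: the lookup table `slang`, the helper `replace_slang()`, and the replacement 'no me importa' are hypothetical names and values chosen for illustration. Only the 'panda' -> 'grupo' mapping and the 'me la suda' phrase come from the description.

```r
# Hypothetical slang lookup: names are regex patterns, values are replacements.
# Illustrative only; this is not LimpiaR's actual table.
slang <- c(
  "\\bme la suda\\b" = "no me importa",  # whole phrase, so plain 'suda' is untouched
  "\\bpanda\\b"      = "grupo"           # single word, so 'oso panda' is also changed
)

# Apply each pattern in turn to a character vector.
replace_slang <- function(text, lookup = slang) {
  for (pattern in names(lookup)) {
    text <- gsub(pattern, lookup[[pattern]], text, ignore.case = TRUE, perl = TRUE)
  }
  text
}

df <- data.frame(
  mention_content = c(
    "me la suda lo que digan",
    "suda mucho en verano",
    "vi un oso panda"
  ),
  stringsAsFactors = FALSE
)

df$mention_content <- replace_slang(df$mention_content)
print(df$mention_content)
```

In this sketch the first string has its slang phrase replaced, the second is left alone because only the full phrase is matched, and the third shows the unwanted 'oso panda' change that the description warns about.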

Usage

limpiar_slang(df, text_var = mention_content)

Arguments

df

Name of Data Frame or Tibble object

text_var

Name of text variable/character vector

Value

Data Frame or Tibble object with text variable altered

Examples

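## Not run: 
library(LimpiaR)
library(magrittr)  # provides %>%
# text_var takes the unquoted name of the text column in df
df %>%
  limpiar_slang(text_var = text_var)
## End(Not run)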

jpcompartir/LimpiaR documentation built on April 6, 2024, 5:22 a.m.