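noise: Detect noise

noise    R Documentation

Detect noise

Description

Detect noise, i.e. terms that are unlikely to be informative: stopwords, numbers, terms containing special characters, very short terms and, for a document-term matrix, rare or sparse terms and terms with a low mean tf-idf.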

Usage

noise(.Object, ...)

## S4 method for signature 'DocumentTermMatrix'
noise(
  .Object,
  minTotal = 2,
  minTfIdfMean = 0.005,
  sparse = 0.995,
  stopwordsLanguage = "german",
  minNchar = 2L,
  specialChars = getOption("polmineR.specialChars"),
  numbers = "^[0-9\\.,]+$",
  verbose = TRUE
)

## S4 method for signature 'TermDocumentMatrix'
noise(.Object, ...)

## S4 method for signature 'character'
noise(
  .Object,
  stopwordsLanguage = "german",
  minNchar = 2,
  specialChars = getOption("polmineR.specialChars"),
  numbers = "^[0-9\\.,]+$",
  verbose = TRUE
)

## S4 method for signature 'textstat'
noise(.Object, p_attribute, ...)

Arguments

.Object

An object of class DocumentTermMatrix, TermDocumentMatrix, character or textstat.

...

Further arguments passed on to methods.

minTotal

Minimum column sum (total count of a term across all documents of a DocumentTermMatrix) to qualify a term as non-noise.

minTfIdfMean

Minimum mean tf-idf value to qualify a term as non-noise.

sparse

Maximal allowed sparsity of a term; passed to tm::removeSparseTerms().

stopwordsLanguage

Language of the stopword list, e.g. "german". Stopwords are taken from the tm package.

minNchar

Minimum number of characters to qualify a term as non-noise.

specialChars

Special characters to drop; defaults to the option polmineR.specialChars.

numbers

Regular expression matching numbers, which are dropped.

verbose

Logical, whether to print progress messages.

p_attribute

The p-attribute to evaluate; relevant only if noise() is applied to a textstat object.
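
The term-level criteria described by numbers, minNchar and stopwordsLanguage can be illustrated with a minimal base-R sketch. This is not the polmineR implementation: the vector `terms`, the tiny stopword list `stopwords_demo` and the reading of minNchar as "fewer than minNchar characters counts as noise" are assumptions made for illustration only.

```r
# Base-R sketch of per-term noise checks; NOT the polmineR implementation.
terms <- c("der", "12,5", "a", "Haus", "2022", "EU")

# Default value of the `numbers` argument: terms made up of digits, dots and commas.
numbers <- "^[0-9\\.,]+$"
is_number <- grepl(numbers, terms)

# Assumption: a term with fewer than `minNchar` characters is treated as noise.
minNchar <- 2L
is_short <- nchar(terms) < minNchar

# Hypothetical stopword list; noise() takes its stopwords from the tm package.
stopwords_demo <- c("der", "die", "das")
is_stopword <- terms %in% stopwords_demo

# Collect the flagged terms per criterion in a list.
flagged <- list(
  stopwords = terms[is_stopword],
  numbers = terms[is_number],
  minNchar = terms[is_short]
)
str(flagged)
```

Keeping one list element per criterion makes it easy to see why a term was flagged before deciding what to drop.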

Value

a list
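
Examples

A hedged usage sketch for the character method, following the signature shown under Usage. It assumes that the polmineR and tm packages are installed; the token vector is made up for illustration, and the names of the returned list elements are not documented here, so the result is only inspected with str().

```r
# Made-up token vector for illustration.
tokens <- c("der", "12,5", "a", "Haus", "2022", "EU")

res <- NULL
# Run only if polmineR and tm (source of the stopword lists) are available.
if (requireNamespace("polmineR", quietly = TRUE) &&
    requireNamespace("tm", quietly = TRUE)) {
  res <- polmineR::noise(
    tokens,
    stopwordsLanguage = "german",
    minNchar = 2,
    verbose = FALSE
  )
  str(res)  # inspect the returned list
}
```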


polmineR documentation built on Aug. 26, 2022, 5:15 p.m.