FeatureExtraction: Feature extraction.

View source: R/features.R

FeatureExtraction    R Documentation

Feature extraction.

Description

Computes a set of simple text-based features.

Usage

FeatureExtraction(text)

Arguments

text

The text from which the features are computed.

Details

The following features are computed:

ratio.caps

The ratio of uppercase letters.

ratio.specials

The ratio of special characters.

ratio.numbers

The ratio of numeric characters.

length.words

The average word length.

stopwords

The ratio of English stopwords (using the first tokenizer).

stopwords2

The ratio of English stopwords (using the second tokenizer).

last.char.nl

Boolean indicating whether the text ends with an NL character.

last.char.code

Boolean indicating whether the text ends with a code character.

first.3.chars.letters

The number of letters among the first three characters.

emoticons

The number of emoticons.

first.char.at

Boolean indicating whether the line begins with the @ character.
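
To give a feel for what such features look like, here is a minimal base R sketch of three of them. It is not the package's implementation: the helper name sketch_ratios is hypothetical, and it assumes that each ratio is a character count divided by the total number of characters, that uppercase letters are A-Z and numeric characters are 0-9.

```r
# Hypothetical helper, NOT part of NLoN: illustrates three of the listed features.
# Assumption: a ratio is a character count divided by the total character count.
sketch_ratios <- function(text) {
  n <- nchar(text)
  # Count the matches of a regular expression in each element of `text`.
  count <- function(pattern) {
    lengths(regmatches(text, gregexpr(pattern, text, perl = TRUE)))
  }
  data.frame(
    ratio.caps = count("[A-Z]") / n,       # assumed: ASCII uppercase letters
    ratio.numbers = count("[0-9]") / n,    # assumed: ASCII digits
    first.char.at = startsWith(text, "@")  # TRUE when the string starts with @
  )
}

sketch_ratios(c("Hello World 42", "@user thanks"))
```

The sketch returns one row per input string, which mirrors the one-value-per-feature layout described under Value, although it uses a plain data.frame rather than a data.table.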

Value

A data.table with values of the 11 features.
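
Examples

A possible call is sketched below. It assumes that the NLoN package is installed, that FeatureExtraction is exported, and that text may be a character vector with one string per element; the input strings are made up for illustration.

```r
# Made-up inputs: one natural-language string and one code-like string.
text <- c("This is natural language.", "for (i in 1:10) { x <- x + i }")

# Only run the call when NLoN is available (assumption: it is installed separately).
if (requireNamespace("NLoN", quietly = TRUE)) {
  features <- NLoN::FeatureExtraction(text)
  print(features)
}
```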


M3SOulu/NLoN documentation built on June 20, 2022, 6 p.m.