features: Features.
In M3SOulu/NLoN: Natural Language or Not

Features.

Module containing functions that each extract one simple feature from a text.

features

An object of class module (inherits from list) of length 19.

Most functions have a single text parameter. The module contains the following functions:

Stopwords: Number of stopwords. Takes two optional parameters: Tokenize, the word tokenizer to use, and stopwords, the list of stopwords to use.
Tokenize1: First tokenizer available for Stopwords.
Tokenize2: Second tokenizer available for Stopwords.
StopwordsRatio1: Ratio of stopwords using Tokenize1.
StopwordsRatio2: Ratio of stopwords using Tokenize2.
Caps: Number of uppercase letters.
CapsRatio: Ratio of uppercase letters.
SpecialChars: Number of special characters.
SpecialCharsRatio: Ratio of special characters.
Numbers: Number of digit characters.
NumbersRatio: Ratio of digit characters.
Words: Number of words.
AverageWordLength: Average word length.
LastCharCode: Boolean indicating whether the text ends with a code character.
LastCharNL: Boolean indicating whether the text ends with a natural language character.
First3Chars: Returns the first three non-whitespace characters.
First3CharsLetters: Number of the first three non-whitespace characters that are letters.
Emoticons: Number of emoticons.
StartWithAt: Boolean for the use of @ at the start of the text.
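
To make a few of the descriptions above concrete, here is a minimal base-R sketch of how such features could be computed. It is not the NLoN source: the lower-case function names (count_matches, caps, caps_ratio, numbers, first3_chars, first3_chars_letters, start_with_at) are hypothetical, the ratio is assumed to be relative to the total number of characters, and "letters" is assumed to mean ASCII letters. The package itself may define these differently.

```r
# Illustrative re-implementations only; names and definitions are assumptions,
# not the functions exported by the NLoN features module.

# Count regex matches per element of a character vector.
count_matches <- function(text, pattern) {
  # gregexpr() returns -1 when nothing matches, so count positive positions only.
  vapply(gregexpr(pattern, text, perl = TRUE),
         function(m) sum(m > 0), integer(1))
}

# Number of uppercase ASCII letters (compare: Caps).
caps <- function(text) count_matches(text, "[A-Z]")

# Uppercase letters divided by total characters (assumed meaning of CapsRatio).
caps_ratio <- function(text) caps(text) / nchar(text)

# Number of digit characters (compare: Numbers).
numbers <- function(text) count_matches(text, "[0-9]")

# First three non-whitespace characters (compare: First3Chars).
first3_chars <- function(text) {
  substr(gsub("[[:space:]]+", "", text), 1, 3)
}

# How many of those three characters are letters (compare: First3CharsLetters).
first3_chars_letters <- function(text) {
  count_matches(first3_chars(text), "[A-Za-z]")
}

# Whether the text starts with "@" (compare: StartWithAt).
start_with_at <- function(text) startsWith(text, "@")

example <- c("@Bob see issue 42", "x = y[0];")

caps(example)                  # 1 0
numbers(example)               # 2 1
first3_chars(example)          # "@Bo" "x=y"
first3_chars_letters(example)  # 2 2
start_with_at(example)         # TRUE FALSE
```

Each helper is vectorised over text, which mirrors the single text parameter that most functions in the module take. With the package installed, the module's own functions would presumably be reached as elements of the features object (for example features$Caps), since the module inherits from list.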