Man pages for textTools
Functions for Text Cleansing and Text Analysis

as.text.table                    Convert a data.table column of character vectors into a...
flag_words                       Flag rows in a text.table with specific words
label_parts_of_speech            Add a column with the parts of speech for each word in a...
l_pos                            Parts of speech for English words from the Moby Project.
ngrams                           Create n-grams
pos                              Parts of speech for English words from the Moby Project.
regex_paragraph                  Regular expression that might be used to split strings of...
regex_sentence                   Regular expression that might be used to split strings of...
regex_word                       Regular expression that might be used to split strings of...
rm_frequent_words                Delete rows in a text.table where the number of identical...
rm_infrequent_words              Delete rows in a text.table where the number of identical...
rm_long_words                    Delete rows in a text.table where the word has more than a...
rm_no_overlap                    Delete rows in a text.table where the records within a group...
rm_overlap                       Delete rows in a text.table where the records within a group...
rm_parts_of_speech               Delete rows in a text.table where the word has a certain part...
rm_regexp_match                  Delete rows in a text.table where the record has a certain...
rm_short_words                   Delete rows in a text.table where the word has less than a...
rm_words                         Remove rows from a text.table with specific words
sampleStr                        Generates (pseudo)random strings of the specified char length
stopwords                        Vector of lowercase English stop words.
str_any_match                    Detect if there are any words in a vector also found in...
str_count_intersect              Count the intersecting words in a vector that are found in...
str_count_jaccard_similarity     Calculates the intersect divided by union of two vectors of...
str_count_match                  Count the words in a vector that are found in another vector.
str_count_nomatch                Count the words in a vector that are not found in another...
str_count_positional_match       Count words from a vector that are found in the same position...
str_count_positional_nomatch     Count words from a vector that are not found in the same...
str_counts                       Create a list of a vector of unique words found in x and a...
str_count_setdiff                Count the words in a vector that don't intersect with another...
str_dt_col_combine               Combine columns of a data.table into a list in a new column,...
str_extract_match                Extract words from a vector that are found in another vector.
str_extract_nomatch              Extract words from a vector that are not found in another...
str_extract_positional_match     Extract words from a vector that are found in the same...
str_extract_positional_nomatch   Extract words from a vector that are not found in the same...
str_rm_blank_space               Remove and replace excess white space from strings.
str_rm_long_words                Remove words from a vector that have more than a maximum...
str_rm_non_alphanumeric          Remove and replace non-alphanumeric characters from strings.
str_rm_non_printable             Remove and replace non-printable characters from strings.
str_rm_numbers                   Remove and replace numbers from strings.
str_rm_punctuation               Remove and replace punctuation from strings.
str_rm_regexp_match              Remove words from a vector that match a regular expression.
str_rm_short_words               Remove words from a vector that don't have a minimum number...
str_rm_words                     Remove words from a vector of words found in another vector...
str_rm_words_by_length           Remove words from a vector based on the number of characters...
str_stopwords_by_part_of_speech  Create a vector of English words associated with particular...
str_tolower                      Calls base::tolower(), which converts letters to lowercase....
str_weighted_count_match         Weighted count of the words in a vector that are found in...
textTools documentation built on Feb. 5, 2021, 5:07 p.m.