Man pages for eellpp/textutils
Utilities for text processing while building models

addNLTKStopwords	add NLTK stopwords to the current list of stopwords
addWordsToDB	Add keywords to an sqlite database
centerScaleData	center and scale a numeric dataset
cleanTitleString	Get a clean string which is based on heuristic for document...
createFreqOfKeywordsTerms	Get freq of keywords in dataset
createTextVectorFromDataset	create text vector from dataset, using title
getAcronymRatio	get Acronym Ratio
getAUCandPlotROC	Generic method to quickly plot ROC curve and print the AUC...
getBOWfeatures_Binary	get BOW features binary - y/n
getBOWfeatures_freq	get bag of words features with freq
getBOWfeaturesTestDataset	get bag of words test dataset
getBOWfeatures_Tfidf	get bag of words with tfidf
getBOWKeywords	get bagofwords keywords from a dataset
getCamelCaseKeywords	Get camel case keywords from string
getCapLettersToCharactersRatio	get capital letters to character ratio
getCharacterVector	Get Character Vector
getCorpusFromTextVector	get corpus from text vector
getCountByPattern	get count by pattern
getDataframeFromWordVector	Creates a dataframe whose columns are words in wordvector...
getDataSetWithFeatures	get data set with features
getDigitCount	Get digit count
getDocProbabilityDistribution	Get the topic probability distribution for the docs
getFeaturesAboveThreshold	get the features above a threshold value
getFeaturesForDataset	Given a dataset, it returns the dataset updated with features
getFreqOfKeywordInDataset	Get freq of keywords in dataset
getFreqWordFeaturesFromTdm	get freq words from tdm
getKeywordWeights	Get the sum of weights of all keywords in string
getLDATopicForDocs	Get LDA top Topic terms for docs
getLDATopTopicTerms	Get top terms from lda object
getModelPerformance	show the model performance statistics like AUC, ROC,...
getPredictionsForModel	get predicted results for the model built for text only...
getRatio	Apply a regex pattern on each word of a string and find the...
getStopWords	Get the stopwords to be used for feature building
getStopWordsCount	Get stop Words Count in string
getStopWordsRatio	Get the ratio of stop words
getSymbolCount	Get symbol count
getSymbolToWordsRatio	get Symbol to words ratio
getTestTrainDomainDataset	Get test and train dataset
getTestTrainGenericDataset	get the test train generic dataset
getTextVectorAll	Get clean text from dataframe
getWordFeaturesFromTdm	get the feature vector from term document matrix (tdm)
getWordFeatureVector	Apply a function to create feature vector from dataset
getWordFeatureVectorWithdbh	Apply a function to create feature vector from dataset with...
getWords	Get Words
isDictWord	Checks if the word is a dict word
recreateAndRunModelWithSelectedFeatures	Given a model, will detect the significant features and rerun...
removeNonNumericFeatures	remove non numeric features from dataframe
runLDAGibbs	Run LDA based on gibbs method
setAllBOWFeaturesInDataFrame	set all the words in a dataframe
setBOWFeatureInDataFrame	set bag of words in a data frame
setBOWkeywords	return the dataset with bag of words keywords set

eellpp/textutils documentation built on May 16, 2019, 12:12 a.m.
