addNLTKStopwords | add NLTK stopwords to the current list of stopwords |
addWordsToDB | Add keywords to an sqlite database |
centerScaleData | center and scale a numeric dataset |
cleanTitleString | Get a clean string which is based on heuristic for document... |
createFreqOfKeywordsTerms | Get freq of keywords in dataset |
createTextVectorFromDataset | Create a text vector from the dataset title |
getAcronymRatio | get Acronym Ratio |
getAUCandPlotROC | Generic method to quickly plot ROC curve and print the AUC... |
getBOWfeatures_Binary | get BOW features binary - y/n |
getBOWfeatures_freq | get bag of words features with freq |
getBOWfeaturesTestDataset | get bag of words test dataset |
getBOWfeatures_Tfidf | get bag of words with tfidf |
getBOWKeywords | get bag of words keywords from a dataset |
getCamelCaseKeywords | Get camel case keywords from string |
getCapLettersToCharactersRatio | get capital letters to character ratio |
getCharacterVector | Get Character Vector |
getCorpusFromTextVector | get corpus from text vector |
getCountByPattern | get count by pattern |
getDataframeFromWordVector | Creates a dataframe whose columns are words in wordvector... |
getDataSetWithFeatures | get data set with features |
getDigitCount | Get digit count |
getDocProbabilityDistribution | Get the topic probability distribution for the docs |
getFeaturesAboveThreshold | get the features above a threshold value |
getFeaturesForDataset | Given a dataset, it returns the dataset updated with features |
getFreqOfKeywordInDataset | Get freq of keywords in dataset |
getFreqWordFeaturesFromTdm | get freq words from tdm |
getKeywordWeights | Get the sum of weights of all keywords in string |
getLDATopicForDocs | Get LDA top Topic terms for docs |
getLDATopTopicTerms | Get top terms from lda object |
getModelPerformance | show the model performance statistics like AUC, ROC,... |
getPredictionsForModel | get predicted results for the model built for text only... |
getRatio | Apply a regex pattern on each word of a string and find the... |
getStopWords | Get the stopwords to be used for feature building |
getStopWordsCount | Get stop Words Count in string |
getStopWordsRatio | Get the ratio of stop words |
getSymbolCount | Get symbol count |
getSymbolToWordsRatio | get Symbol to words ratio |
getTestTrainDomainDataset | Get test and train dataset |
getTestTrainGenericDataset | get the test train generic dataset |
getTextVectorAll | Get clean text from dataframe |
getWordFeaturesFromTdm | get the feature vector from term document matrix (tdm) |
getWordFeatureVector | Apply a function to create feature vector from dataset |
getWordFeatureVectorWithdbh | Apply a function to create feature vector from dataset with... |
getWords | Get Words |
isDictWord | Checks if the word is a dict word |
recreateAndRunModelWithSelectedFeatures | Given a model, will detect the significant features and rerun... |
removeNonNumericFeatures | remove non numeric features from dataframe |
runLDAGibbs | Run LDA based on gibbs method |
setAllBOWFeaturesInDataFrame | set all the words in a dataframe |
setBOWFeatureInDataFrame | set bag of words in a data frame |
setBOWkeywords | return the dataset with bag of words keywords set |