| addNLTKStopwords | add NLTK stopwords to the current list of stopwords |
| addWordsToDB | Add keywords to an sqlite database |
| centerScaleData | center and scale a numeric dataset |
| cleanTitleString | Get a clean string based on a heuristic for document... |
| createFreqOfKeywordsTerms | Get freq of keywords in dataset |
| createTextVectorFromDataset | Create a text vector from the titles in a dataset |
| getAcronymRatio | get Acronym Ratio |
| getAUCandPlotROC | Generic method to quickly plot ROC curve and print the AUC... |
| getBOWfeatures_Binary | get BOW features binary - y/n |
| getBOWfeatures_freq | get bag of words features with frequencies |
| getBOWfeaturesTestDataset | get bag of words test dataset |
| getBOWfeatures_Tfidf | get bag of words with tfidf |
| getBOWKeywords | get bag-of-words keywords from a dataset |
| getCamelCaseKeywords | Get camel case keywords from string |
| getCapLettersToCharactersRatio | get capital letters to character ratio |
| getCharacterVector | Get Character Vector |
| getCorpusFromTextVector | get corpus from text vector |
| getCountByPattern | get count by pattern |
| getDataframeFromWordVector | Creates a dataframe whose columns are words in wordvector... |
| getDataSetWithFeatures | get data set with features |
| getDigitCount | Get digit count |
| getDocProbabilityDistribution | Get the topic probability distribution for the docs |
| getFeaturesAboveThreshold | get the features above a threshold value |
| getFeaturesForDataset | Given a dataset, it returns the dataset updated with features |
| getFreqOfKeywordInDataset | Get freq of keywords in dataset |
| getFreqWordFeaturesFromTdm | get freq words from tdm |
| getKeywordWeights | Get the sum of weights of all keywords in string |
| getLDATopicForDocs | Get LDA top Topic terms for docs |
| getLDATopTopicTerms | Get top terms from LDA object |
| getModelPerformance | show the model performance statistics like AUC, ROC,... |
| getPredictionsForModel | get predicted results for the model built for text only... |
| getRatio | Apply a regex pattern on each word of a string and find the... |
| getStopWords | Get the stopwords to be used for feature building |
| getStopWordsCount | Get stop Words Count in string |
| getStopWordsRatio | Get the ratio of stop words |
| getSymbolCount | Get symbol count |
| getSymbolToWordsRatio | get Symbol to words ratio |
| getTestTrainDomainDataset | Get test and train dataset |
| getTestTrainGenericDataset | get the test train generic dataset |
| getTextVectorAll | Get clean text from dataframe |
| getWordFeaturesFromTdm | get the feature vector from term document matrix (tdm) |
| getWordFeatureVector | Apply a function to create feature vector from dataset |
| getWordFeatureVectorWithdbh | Apply a function to create feature vector from dataset with... |
| getWords | Get Words |
| isDictWord | Checks if the word is a dict word |
| recreateAndRunModelWithSelectedFeatures | Given a model, will detect the significant features and rerun... |
| removeNonNumericFeatures | remove non numeric features from dataframe |
| runLDAGibbs | Run LDA based on Gibbs method |
| setAllBOWFeaturesInDataFrame | set all the words in a dataframe |
| setBOWFeatureInDataFrame | set bag of words in a data frame |
| setBOWkeywords | return the dataset with bag of words keywords set |
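
Several entries in this index describe simple per-string features (a capital-letters-to-characters ratio, a digit count, a stop-word ratio). The Python sketch below illustrates what features of that kind might look like. It is not the package's implementation: the function names, the choice of denominators, the tokenization rule, and the tiny stop-word list are all assumptions made for illustration.

```python
import re

# Hypothetical stand-in stop-word list; the package's own list is not shown in this index.
STOP_WORDS = {"a", "an", "the", "of", "and", "in", "to", "for"}


def cap_letters_to_characters_ratio(text: str) -> float:
    """Share of non-whitespace characters that are uppercase letters (assumed definition)."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0  # avoid division by zero on empty input
    return sum(c.isupper() for c in chars) / len(chars)


def digit_count(text: str) -> int:
    """Number of digit characters in the string."""
    return sum(c.isdigit() for c in text)


def stop_words_ratio(text: str, stop_words=STOP_WORDS) -> float:
    """Share of words that appear in the stop-word list (assumed definition)."""
    # Assumed tokenization: runs of ASCII letters and apostrophes, lowercased.
    words = re.findall(r"[A-Za-z']+", text.lower())
    if not words:
        return 0.0
    return sum(w in stop_words for w in words) / len(words)


if __name__ == "__main__":
    title = "The History of NLP in 2020"
    print(cap_letters_to_characters_ratio(title))
    print(digit_count(title))
    print(stop_words_ratio(title))
```

Each helper returns a plain number, so the results can be placed side by side as columns of a feature table for a classifier.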