TfIdfVectorizer: TF-IDF (Term Frequency-Inverse Document Frequency) Vectorizer

Description

Provides an easy way to create a tf-idf feature matrix in R. It exposes fit and transform methods (similar to scikit-learn) for generating tf-idf features.

Format

R6Class object.

Usage

For usage details, see the Methods and Examples sections.

tf_object = TfIdfVectorizer$new(max_df=1, min_df=1, max_features=1, smooth_idf=TRUE)
tf_object$fit(sentences)
tf_matrix = tf_object$transform(sentences)
tf_matrix = tf_object$fit_transform(sentences) ## alternate

Methods

$new()

Initialises an instance of the vectorizer.

$fit()

Learns the count-vectorizer encodings from the input sentences and stores them in the object. It does not return anything.

$transform()

Returns the tf-idf matrix for the input, based on the encodings learned in the fit method.

$fit_transform()

Fits on the input and returns the tf-idf matrix in a single call, as an alternative to calling fit and then transform.
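
To make the fit/transform split more concrete, here is a minimal base-R sketch of what a tf-idf vectorizer of this kind typically computes. It is not superml's implementation. The helper names (tokenize, fit_tfidf, transform_tfidf), the simple tokenizer, the smoothed IDF formula log((1 + n) / (1 + df)) + 1 (the form scikit-learn uses when smoothing is on), and the L2 row normalisation are all assumptions made for illustration.

```r
# Hypothetical helpers, not part of superml: a base-R sketch of fit/transform.

# Assumed tokenizer: lowercase, keep letters/digits/underscore, split on whitespace.
tokenize <- function(sentences) {
  cleaned <- gsub("[^a-z0-9_ ]", " ", tolower(sentences))
  lapply(strsplit(trimws(cleaned), "\\s+"), function(tok) tok[nzchar(tok)])
}

# "fit": learn the vocabulary and the (assumed) smoothed inverse document frequencies.
fit_tfidf <- function(sentences) {
  tokens <- tokenize(sentences)
  vocab <- sort(unique(unlist(tokens)))
  n_docs <- length(tokens)
  # Document frequency: number of sentences that contain each term.
  df <- vapply(vocab, function(term) {
    sum(vapply(tokens, function(tok) term %in% tok, logical(1)))
  }, numeric(1))
  idf <- log((1 + n_docs) / (1 + df)) + 1
  list(vocab = vocab, idf = idf)
}

# "transform": count terms per sentence, weight by idf, L2-normalise each row.
transform_tfidf <- function(model, sentences) {
  tokens <- tokenize(sentences)
  counts <- do.call(rbind, lapply(tokens, function(tok) {
    as.numeric(table(factor(tok, levels = model$vocab)))
  }))
  colnames(counts) <- model$vocab
  weighted <- sweep(counts, 2, model$idf, `*`)
  norms <- sqrt(rowSums(weighted^2))
  norms[norms == 0] <- 1  # avoid dividing by zero for sentences with no known terms
  weighted / norms
}

sentences <- c("i am alone in dark.", "alone in the dark?")
model <- fit_tfidf(sentences)                # learns vocabulary and idf weights
tfidf <- transform_tfidf(model, sentences)   # one row per sentence, one column per term
```

The separation mirrors the methods above: the fit step only stores state (vocabulary and weights), while the transform step reuses that state, so the same learned encoding can be applied to new sentences.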

Examples

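library(superml)

# Four short example sentences to vectorize
df <- data.frame(sents = c('i am alone in dark.',
                           'mother_mary a lot',
                           'alone in the dark?',
                           'many mothers in the lot....'))
# Create the vectorizer, then fit and transform in one step
tf <- TfIdfVectorizer$new(smooth_idf = TRUE, min_df = 0.3)
tf_features <- tf$fit_transform(df$sents)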
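
The min_df = 0.3 argument in the example filters the vocabulary by document frequency. As a rough illustration only, the base-R sketch below assumes the threshold is read as a proportion of documents (as scikit-learn does for fractional values) and uses a simple assumed tokenizer. superml's own tokenization and thresholding may differ, so the surviving terms here are not guaranteed to match the columns of tf_features.

```r
# Illustration only (not superml code): which terms occur in at least 30% of sentences?
sents <- c("i am alone in dark.", "mother_mary a lot",
           "alone in the dark?", "many mothers in the lot....")

# Assumed tokenizer: lowercase, keep letters/digits/underscore, split on whitespace.
cleaned <- trimws(gsub("[^a-z0-9_ ]", " ", tolower(sents)))
tokens <- lapply(strsplit(cleaned, "\\s+"), unique)  # unique terms per sentence

# Share of sentences that contain each term.
doc_freq <- table(unlist(tokens)) / length(sents)

# Terms that would survive a proportional threshold of 0.3 under these assumptions.
kept <- names(doc_freq[doc_freq >= 0.3])
```

Under these assumptions, only terms that occur in at least two of the four sentences are kept, and terms such as "mother_mary" that occur in a single sentence are dropped.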

ssi-ashraf/superml documentation built on Nov. 5, 2019, 9:18 a.m.