text_intersect: intersection of words or letters in tokenized text
In textTinyR: Text Processing for Small or Big Data Files

Description

Computes the count or ratio of words or letters shared between tokenized text sequences.

Usage

# utl <- text_intersect$new(token_list1 = NULL, token_list2 = NULL)

Details

This class includes methods for computing the word or letter intersection between tokenized texts. If both distinct and letters are FALSE, the simple word intersection (count or ratio) is computed.
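The distinction between the count and the ratio, and the effect of distinct and letters, can be sketched in base R. The helpers below (as_items, count_common, ratio_common) are hypothetical and not part of textTinyR; in particular, the ratio denominator (the number of items in the first sequence) is an assumption, and the package may define these quantities differently.

```r
# Hypothetical base-R helpers (NOT part of textTinyR) illustrating one
# possible reading of the count and ratio intersection.

# Optionally split tokens into single letters.
as_items <- function(tokens, letters = FALSE) {
  if (letters) unlist(strsplit(tokens, "")) else tokens
}

# Number of items of x that also occur in y.
count_common <- function(x, y, distinct = FALSE, letters = FALSE) {
  x <- as_items(x, letters)
  y <- as_items(y, letters)
  if (distinct) {
    length(intersect(x, y))   # each shared item counted once
  } else {
    sum(x %in% y)             # every occurrence in x counted
  }
}

# Assumed ratio: shared count divided by the number of items in x.
ratio_common <- function(x, y, distinct = FALSE, letters = FALSE) {
  items <- as_items(x, letters)
  denom <- if (distinct) length(unique(items)) else length(items)
  if (denom == 0) return(NA_real_)
  count_common(x, y, distinct, letters) / denom
}

tok1 <- list(c('compare', 'this', 'text'), c('and', 'this', 'text'))
tok2 <- list(c('with', 'another', 'set'), c('of', 'text', 'documents'))

# Apply the helpers pair by pair (first with first, second with second).
word_counts <- mapply(count_common, tok1, tok2)
word_ratios <- mapply(ratio_common, tok1, tok2)
```

With these helpers the first pair shares no word and the second pair shares one ('text'), so word_counts holds 0 and 1 and word_ratios holds 0 and 1/3.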

Value

a numeric vector

Methods

text_intersect$new(token_list1 = NULL, token_list2 = NULL)
--------------
count_intersect(distinct = FALSE, letters = FALSE)
--------------
ratio_intersect(distinct = FALSE, letters = FALSE)

Public methods

Method `new()`

Usage

text_intersect$new(token_list1 = NULL, token_list2 = NULL)

Arguments

token_list1: a list in which each element is a tokenized text sequence (token_list1 should have the same length as token_list2)
token_list2: a list in which each element is a tokenized text sequence (token_list2 should have the same length as token_list1)
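Token lists of this shape can be built from raw strings with base R. This is a minimal sketch, assuming that lower-casing and whitespace splitting are adequate tokenization; the variable names are illustrative, and the constructor call is skipped when textTinyR is not installed.

```r
docs1 <- c("Compare this text", "and this text")
docs2 <- c("with another set", "of text documents")

# strsplit() returns a list with one character vector of tokens per input string
token_list1 <- strsplit(tolower(docs1), "\\s+")
token_list2 <- strsplit(tolower(docs2), "\\s+")

# both lists should have the same length before they are passed to new()
stopifnot(length(token_list1) == length(token_list2))

if (requireNamespace("textTinyR", quietly = TRUE)) {
  init <- textTinyR::text_intersect$new(token_list1, token_list2)
}
```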

Method `count_intersect()`

Usage

text_intersect$count_intersect(distinct = FALSE, letters = FALSE)

Arguments

distinct: either TRUE or FALSE. If TRUE then the intersection of distinct words (or letters) will be taken into account
letters: either TRUE or FALSE. If TRUE then the intersection of letters in the text sequences will be computed

Method `ratio_intersect()`

Usage

text_intersect$ratio_intersect(distinct = FALSE, letters = FALSE)

Arguments

distinct: either TRUE or FALSE. If TRUE then the intersection of distinct words (or letters) will be taken into account
letters: either TRUE or FALSE. If TRUE then the intersection of letters in the text sequences will be computed

Method `clone()`

The objects of this class are cloneable with this method.

Usage

text_intersect$clone(deep = FALSE)

Arguments

deep: Whether to make a deep clone.
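Since the class follows the R6 style ($new(), $clone()), plain assignment is expected to copy only a reference, whereas clone() returns a separate object. This is a brief sketch under that assumption; the variable names are illustrative, and the code is skipped when textTinyR is not installed.

```r
same_ref <- NA
is_copy <- NA

if (requireNamespace("textTinyR", quietly = TRUE)) {
  tok1 <- list(c('compare', 'this', 'text'), c('and', 'this', 'text'))
  tok2 <- list(c('with', 'another', 'set'), c('of', 'text', 'documents'))

  init <- textTinyR::text_intersect$new(tok1, tok2)

  alias <- init              # refers to the same underlying object
  copy <- init$clone()       # a separate (shallow) copy

  same_ref <- identical(alias, init)
  is_copy <- !identical(copy, init)
}
```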

References

https://www.kaggle.com/c/home-depot-product-search-relevance/discussion/20427 by Igor Buinyi

Examples


library(textTinyR)

# two lists of tokenized texts of equal length
tok1 = list(c('compare', 'this', 'text'),
            c('and', 'this', 'text'))

tok2 = list(c('with', 'another', 'set'),
            c('of', 'text', 'documents'))

init = text_intersect$new(tok1, tok2)

# count intersection of distinct words
init$count_intersect(distinct = TRUE, letters = FALSE)

# ratio intersection of letters
init$ratio_intersect(distinct = FALSE, letters = TRUE)

textTinyR documentation built on June 24, 2024, 5:16 p.m.
