jaccard_coef: Jaccard coefficient
In gvegayon/twitterreport: Out-of-the-Box Analysis and Reporting Tools for Twitter

Calculate the Jaccard (similarity) coefficient between words.

jaccard_coef(x, ...)

## S3 method for class 'list'
jaccard_coef(x, max.size = 1000, dist = FALSE, ...)

## S3 method for class 'character'
jaccard_coef(x, max.size = 1000,
  stopwds = unique(c(tm::stopwords(), letters)), ignore.case = TRUE,
  dist = FALSE, ...)

`x`	Character vector with the phrases (tweets) to be analyzed, or a list of character vectors (see Methods).
`...`	Further arguments to be passed to the method.
`max.size`	Maximum number of words to analyze.
`dist`	Logical. When `TRUE`, computes one minus the Jaccard coefficient (a distance rather than a similarity).
`stopwds`	Character vector of stopwords.
`ignore.case`	Logical. When `TRUE`, converts all text to lower case.

The Jaccard index measures the similarity between two elements. For a given pair of elements x and y, it is calculated as

J(S, T) = |S ∩ T| / |S ∪ T|

where S is the set of groups in which x is present and T is the set of groups in which y is present. The resulting value lies between 0 and 1, where 0 corresponds to no similarity at all (the elements have no group in common) and 1 represents perfect similarity (both elements are present in exactly the same groups).
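The definition above can be illustrated with a minimal base R sketch. It is not the package's implementation: `jaccard_pair` and the toy `tweets` list are hypothetical names introduced here, and each list element is assumed to be one group (a tweet already split into words).

```r
# Toy data: each element is one "group" (a tweet already split into words).
tweets <- list(
  c("data", "science", "rocks"),
  c("data", "science"),
  c("data", "visualization"),
  c("science", "fiction")
)

# Hypothetical helper (not part of twitterreport): Jaccard index of two words.
jaccard_pair <- function(groups, x, y) {
  # S and T: indices of the groups in which each word is present
  S  <- which(vapply(groups, function(g) x %in% g, logical(1)))
  T_ <- which(vapply(groups, function(g) y %in% g, logical(1)))
  n_union <- length(union(S, T_))
  if (n_union == 0) return(NA_real_)  # neither word occurs in any group
  length(intersect(S, T_)) / n_union
}

jaccard_pair(tweets, "data", "science")  # 0.5: groups {1,2} shared out of {1,2,3,4}
jaccard_pair(tweets, "data", "fiction")  # 0: no group in common
```

With `dist = TRUE`, the corresponding values would be one minus these results, per the argument description.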

A list that includes a lower triangular sparse matrix of class dgCMatrix.

list: Processes a list of character vectors, such as the one obtained from tw_extract().
character: Computes the coefficient from a character vector (splits the text into words).
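The relationship between the two methods can be pictured as a tokenizing step that turns a character vector into a list of word sets. The sketch below is an assumption about what such a step might look like, written in base R; `split_words` is a hypothetical helper, and the splitting rule used by the package itself is not described on this page.

```r
# Hypothetical helper (not part of twitterreport): split phrases into word sets.
split_words <- function(x, stopwds = character(0), ignore.case = TRUE) {
  if (ignore.case) x <- tolower(x)
  # Assumed rule: split on anything that is not a letter, digit, '#', '@' or '_'
  words <- strsplit(x, "[^[:alnum:]#@_]+")
  lapply(words, function(w) {
    w <- unique(w[nzchar(w)])   # drop empty tokens and repeated words
    w[!(w %in% stopwds)]        # drop stopwords
  })
}

phrases <- c("The data is BIG", "Big data, big plans")
split_words(phrases, stopwds = c("the", "is"))
# A list with c("data", "big") and c("big", "data", "plans")
```

A list of this shape is what the list method expects as input, one character vector per phrase.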

Conover, M., Ratkiewicz, J., & Francisco, M. (2011). "Political polarization on twitter". ICWSM, 133(26), 89–96. http://doi.org/10.1021/ja202932e

gvegayon/twitterreport documentation built on May 17, 2019, 9:30 a.m.