Local and global weighting functions.


`x`

A numeric matrix.

There are many local and global weighting functions. In this package, local weighting functions are prefixed with `lw_` and global weighting functions with `gw_`, so users can define their own weighting functions.
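As a minimal sketch of what a user-defined weighting function following this naming convention might look like, the hypothetical `lw_sqrt` below takes the square root of every cell. Both the name and the transform are assumptions chosen for illustration; neither is part of the package.

```r
# Hypothetical user-defined local weighting function (not part of the package):
# dampens large counts by taking the square root of every cell.
lw_sqrt <- function(x) {
  x <- as.matrix(x)
  stopifnot(is.numeric(x))
  sqrt(x)
}

# Small made-up document-term matrix: documents as rows, terms as columns.
m <- matrix(c(4, 0, 1,
              0, 9, 1),
            nrow = 2, byrow = TRUE,
            dimnames = list(c("doc1", "doc2"), c("a", "b", "c")))
lw_sqrt(m)
```

Using the `lw_` prefix keeps such a function recognizable as a local weighting alongside the built-in ones.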

Local weighting functions (i.e. weighting every cell in the matrix):

`lw_tf`

Term frequency: *f(x) = x*.

`lw_raw`

Raw frequency, which is the same as the term frequency: *f(x) = x*.

`lw_log`

Logarithm: *f(x) = log(x + 1)*.

`lw_bin`

Binary: *f(x) = 1* if *x > 0* and *0* otherwise.
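The local weighting formulas above can be reproduced in base R on a small made-up matrix. This is a sketch of the formulas as stated, not the package's own implementation; it assumes `log` means the natural logarithm, as R's `log()` does by default, and the variable names are illustrative.

```r
# Small made-up frequency matrix (2 documents x 3 terms).
m <- matrix(c(2, 0, 1,
              0, 3, 1),
            nrow = 2, byrow = TRUE)

tf_weights  <- m            # term (raw) frequency: f(x) = x
log_weights <- log(m + 1)   # logarithm: f(x) = log(x + 1), natural log assumed
bin_weights <- (m > 0) * 1  # binary: 1 if x > 0, 0 otherwise

log_weights
bin_weights
```

Each formula is applied cell by cell, so every result has the same dimensions as the input matrix.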

Global weighting functions weight the columns of the matrix. They therefore behave as expected for a document-term matrix, i.e. one with the documents as the rows and the terms as the columns:

`gw_idf`

Inverse document frequency: *f(x) = log( nrow(x) / n + 1)* where *n =* the number of rows in which the column *>0*.

`gw_idf_alt`

Alternative definition of the inverse document frequency: *f(x) = log( nrow(x) / n) + 1* where *n =* the number of rows in which the column *>0*.

`gw_gfidf`

Global frequency multiplied by inverse document frequency: *f(x) = colSums(x) / n* where *n =* the number of rows in which the column *>0*.

`gw_nor`

Normal(ized) frequency: *f(x) = x / colSums(x^2)*.

`gw_ent`

Entropy: *f(x) = 1 +* the relative Shannon entropy.

`gw_bin`

Binary: *f(x) = 1*.

`gw_raw`

Raw, which is the same as binary: *f(x) = 1*.
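To make the role of *n* concrete, the per-column quantities in the `gw_idf`, `gw_idf_alt` and `gw_gfidf` formulas can be computed in base R as below. This is a sketch of the formulas as written, on a made-up matrix with illustrative variable names. It shows only the per-column weights; how the package arranges its return value is not shown here.

```r
# Small made-up document-term matrix (3 documents x 3 terms).
m <- matrix(c(2, 0, 1,
              0, 3, 1,
              1, 0, 4),
            nrow = 3, byrow = TRUE)

# n: number of rows (documents) in which each column (term) is > 0.
n <- colSums(m > 0)

idf     <- log(nrow(m) / n + 1)  # f(x) = log(nrow(x) / n + 1)
idf_alt <- log(nrow(m) / n) + 1  # f(x) = log(nrow(x) / n) + 1
gfidf   <- colSums(m) / n        # f(x) = colSums(x) / n

n
idf
idf_alt
gfidf
```

In this sketch, a column with no positive entries would give *n = 0* and therefore non-finite weights, so such columns would need separate handling.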

The functions return a numeric matrix.

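```
library(svs)

# Read the example data shipped with the svs package (tab-separated, UTF-8).
SndT_Fra <- read.table(system.file("extdata", "SndT_Fra.txt", package = "svs"),
    header = TRUE, sep = "\t", quote = "\"", encoding = "UTF-8")
# Cross-tabulate into a frequency table, then apply a local and a global weighting.
tab.SndT_Fra <- table(SndT_Fra)
lw_log(tab.SndT_Fra)
gw_idf(tab.SndT_Fra)
```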
