make_sampling_table: Generates a word rank-based probabilistic sampling table.
In dfalbel/keras: R Interface to 'Keras'


Generates a word rank-based probabilistic sampling table.

make_sampling_table(size, sampling_factor = 1e-05)

`size`	Int, number of possible words to sample.
`sampling_factor`	The sampling factor in the word2vec formula.

Used for generating the sampling_table argument for skipgrams(). sampling_table[[i]] is the probability of sampling the i-th most common word in a dataset (more common words should be sampled less frequently, for balance).

The sampling probabilities are generated according to the sampling distribution used in word2vec:

p(word) = min(1, sqrt(word_frequency / sampling_factor) / (word_frequency / sampling_factor))
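As a minimal sketch of this formula in base R: `keep_probability` is a hypothetical helper introduced here for illustration and is not part of the keras package. It assumes `word_frequency` is a relative frequency (a proportion of all tokens).

```r
# Hypothetical helper, not part of the keras package: evaluates the
# word2vec sampling formula for one or more relative word frequencies.
keep_probability <- function(word_frequency, sampling_factor = 1e-05) {
  ratio <- word_frequency / sampling_factor
  # sqrt(ratio) / ratio simplifies to 1 / sqrt(ratio); pmin caps it at 1
  pmin(1, sqrt(ratio) / ratio)
}

# Rare words are always kept; frequent words are kept less often.
keep_probability(c(1e-07, 1e-05, 1e-03, 1e-01))
```

With the default sampling factor, any word whose relative frequency is at or below 1e-05 gets probability 1, and the probability falls off as the inverse square root of the frequency above that threshold.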

We assume that the word frequencies follow Zipf's law (s=1) to derive a numerical approximation of frequency(rank):

frequency(rank) ~ 1/(rank * (log(rank) + gamma) + 1/2 - 1/(12*rank))

where gamma is the Euler-Mascheroni constant.
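The two formulas above can be combined into a rough base-R sketch of how such a table might be built. This is an illustration of the documented formulas only, not the package's own implementation; `sampling_table_sketch` is a hypothetical name, ranks are assumed to start at 1, and the value of gamma is truncated.

```r
# Illustrative sketch only; not the keras package's implementation.
sampling_table_sketch <- function(size, sampling_factor = 1e-05) {
  gamma <- 0.5772156649  # Euler-Mascheroni constant (truncated)
  rank <- seq_len(size)  # assumption: rank 1 is the most common word
  # Zipf-based approximation of the relative frequency at each rank
  frequency <- 1 / (rank * (log(rank) + gamma) + 1 / 2 - 1 / (12 * rank))
  # word2vec sampling formula applied to the approximated frequencies
  ratio <- frequency / sampling_factor
  pmin(1, sqrt(ratio) / ratio)
}

tbl <- sampling_table_sketch(10000)
head(tbl, 3)  # most common words: small sampling probabilities
tail(tbl, 3)  # rarest words: probabilities close to 1
```

Because the approximated frequency decreases with rank, the resulting probabilities never decrease along the table, which matches the stated goal of sampling common words less often.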

Returns an array of length size whose i-th entry is the probability that a word of rank i should be sampled.

The word2vec formula is: p(word) = min(1, sqrt(word_frequency / sampling_factor) / (word_frequency / sampling_factor))

Other text preprocessing: pad_sequences(), skipgrams(), text_hashing_trick(), text_one_hot(), text_to_word_sequence()

dfalbel/keras documentation built on Nov. 27, 2019, 8:16 p.m.
