make_sampling_table: Generates a word rank-based probabilistic sampling table.
In keras: R Interface to 'Keras'

make_sampling_table

R Documentation

Generates a word rank-based probabilistic sampling table.

Description

Generates a word rank-based probabilistic sampling table.

Usage

make_sampling_table(size, sampling_factor = 1e-05)

Arguments

`size`	Int, number of possible words to sample.
`sampling_factor`	The sampling factor in the word2vec formula.

Details

Used for generating the sampling_table argument for skipgrams(). sampling_table[[i]] is the probability of sampling the i-th most common word in a dataset (more common words should be sampled less frequently, for balance).

The sampling probabilities are generated according to the sampling distribution used in word2vec:

p(word) = min(1, sqrt(word_frequency / sampling_factor) / (word_frequency / sampling_factor))
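As a minimal sketch, the formula above can be written as a vectorised R function. The helper name `word2vec_keep_prob` is hypothetical and not part of keras; it only restates the expression, which simplifies algebraically to min(1, sqrt(sampling_factor / word_frequency)).

```r
# Hypothetical helper (not part of keras): the sampling formula as stated above.
word2vec_keep_prob <- function(word_frequency, sampling_factor = 1e-05) {
  # Ratio of the word's relative frequency to the sampling factor
  ratio <- word_frequency / sampling_factor
  # sqrt(ratio) / ratio equals 1 / sqrt(ratio); pmin() caps each probability at 1
  pmin(1, sqrt(ratio) / ratio)
}

# Frequent words get a low probability; rare words are capped at 1
word2vec_keep_prob(c(1e-01, 1e-03, 1e-05, 1e-07))
```

With the default sampling factor, a word whose relative frequency is 1e-01 gets a probability of about 0.01, while any word with a frequency at or below the sampling factor gets a probability of 1.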

We assume that the word frequencies follow Zipf's law (s=1) to derive a numerical approximation of frequency(rank):

frequency(rank) ~ 1/(rank * (log(rank) + gamma) + 1/2 - 1/(12*rank))

where gamma is the Euler-Mascheroni constant.
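The two formulas can be combined into a rough pure-R sketch of how such a table could be built. This is an illustration under stated assumptions, not the keras implementation: the function name `sampling_table_sketch` is hypothetical, ranks are taken as 1, 2, ..., size, and the value used for gamma is an approximation, so the library's own indexing and constants may differ.

```r
# Hypothetical sketch (not the keras implementation): builds a sampling table
# from the Zipf approximation and the word2vec formula described above.
sampling_table_sketch <- function(size, sampling_factor = 1e-05) {
  gamma <- 0.5772156649                # Euler-Mascheroni constant (approximate)
  rank <- seq_len(size)                # assumption: ranks run from 1 to size
  # Approximate 1 / frequency(rank) under Zipf's law with s = 1
  inv_frequency <- rank * (log(rank) + gamma) + 1 / 2 - 1 / (12 * rank)
  frequency <- 1 / inv_frequency
  # word2vec sampling probability, capped at 1
  ratio <- frequency / sampling_factor
  pmin(1, sqrt(ratio) / ratio)
}

tab <- sampling_table_sketch(1000)
head(tab)   # smallest probabilities, for the most common ranks
tail(tab)   # larger probabilities, for rarer ranks
```

Because the approximated frequency decreases with rank, the resulting probabilities never decrease as the rank grows, which matches the intent that more common words are sampled less often.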

Value

An array of length `size` where the i-th entry is the probability that a word of rank i should be sampled.

Note

The word2vec formula is: p(word) = min(1, sqrt(word_frequency / sampling_factor) / (word_frequency / sampling_factor))
