Splits the index into several smaller sub-indexes (“shards”),
which are disk-based. If your entire index fits in memory
(roughly one million documents per 1 GB of RAM), you can also use
similarity_matrix. It is simpler but
does not scale as well: it keeps the entire index in RAM
and does no sharding. It also does not support adding new documents
to the index dynamically.
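As an illustrative sketch only (plain Python, not the gensim or gensimr implementation), the in-memory strategy described above amounts to holding every document vector in RAM and scoring a query against all of them with cosine similarity:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

# Tiny corpus: 3 documents over a 4-term vocabulary (num_features = 4).
# The whole index lives in this one in-memory list, which is why the
# in-memory approach stops scaling once the corpus outgrows RAM.
index = [
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [1.0, 1.0, 0.0, 0.0],
]

query = [1.0, 0.0, 1.0, 0.0]
scores = [cosine(query, doc) for doc in index]
```

Here `scores` holds one similarity per indexed document; the first document is identical to the query and scores 1.0.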
similarity(corpus, ...)

## S3 method for class 'gensim.corpora.mmcorpus.MmCorpus'
similarity(corpus, num_features, ...)

## S3 method for class 'mm_file'
similarity(corpus, num_features, ...)

## S3 method for class 'python.builtin.tuple'
similarity(corpus, num_features, ...)
corpus: A corpus.
...: Any other parameters to pass to the Python function; see the official documentation.
num_features: Size of the dictionary, i.e. the number of features.
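The disk-based sharding strategy described at the top of this page can be sketched as follows (plain Python, not the gensim or gensimr implementation; `shard_size` and the helper names are invented for illustration). Each shard is scored independently, so only one shard needs to be resident in memory at a time:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def shards(docs, shard_size):
    """Split the document vectors into fixed-size chunks (sub-indexes)."""
    for i in range(0, len(docs), shard_size):
        yield docs[i:i + shard_size]

def query_sharded(docs, query, shard_size=2):
    """Score a query shard by shard; in a real sharded index each
    chunk would be loaded from disk rather than sliced from a list."""
    scores = []
    for chunk in shards(docs, shard_size):
        scores.extend(cosine(query, doc) for doc in chunk)
    return scores

docs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
result = query_sharded(docs, [1.0, 0.0], shard_size=2)
```

The concatenated `result` is the same as scoring the whole index at once; the difference is purely in how much of the index must be in memory at any moment.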