Utilities for preprocessing text corpora into data structures suitable for natural language models, such as integer sequences or matrices, vocabulary embedding matrices, and term-document, document-term, and term co-occurrence matrices. All functions allow full or partial hashing of the terms in the vocabulary.

| Package details | |
|---|---|
| Maintainer | |
| License | GPL-3 |
| Version | 0.1 |
| URL | https://github.com/vspinu/mlvocab/ |
| Package repository | GitHub |
| Installation | Install the latest version of this package from within R, using the GitHub repository listed above. |