Utilities for preprocessing of text corpora into data structures suitable for natural language models: integer sequences or matrices, vocabulary embedding matrices, term-doc, doc-term, term co-occurrence matrices etc. All functions allow for full or partial hashing of the terms in the vocabulary.
Package details |
|
---|---|
Maintainer | |
License | GPL-3 |
Version | 0.1 |
URL | https://github.com/vspinu/mlvocab/ |
Package repository | View on GitHub |
Installation |
Install the latest version of this package by entering the following in R:
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.