Provides the vocabulary data that the wordpiece algorithm uses to tokenize text into somewhat meaningful chunks (word pieces). The included vocabularies were retrieved from <https://huggingface.co/bert-base-cased/resolve/main/vocab.txt> and <https://huggingface.co/bert-base-uncased/resolve/main/vocab.txt> and parsed into an R-friendly format.
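To make the role of a vocabulary concrete, here is a minimal sketch of greedy longest-match-first word-piece splitting in base R. It is an illustration only, not this package's code: `toy_vocab` and `tokenize_word()` are hypothetical names, the seven-entry vocabulary is invented for the example, and the `##` prefix for non-initial pieces follows the convention used in the BERT vocabulary files.

```r
# Hypothetical toy vocabulary; the real BERT vocabularies are far larger.
toy_vocab <- c("[UNK]", "un", "##aff", "##able", "token", "##ize", "##s")

# Split one word into pieces by repeatedly taking the longest prefix of the
# remaining characters that appears in the vocabulary. Pieces that do not
# start the word are looked up with a "##" prefix.
tokenize_word <- function(word, vocab, unk_token = "[UNK]") {
  chars <- strsplit(word, "")[[1]]
  n <- length(chars)
  pieces <- character(0)
  start <- 1
  while (start <= n) {
    end <- n
    found <- NA_character_
    # Shrink the candidate from the right until it matches a vocabulary entry.
    while (end >= start) {
      candidate <- paste(chars[start:end], collapse = "")
      if (start > 1) candidate <- paste0("##", candidate)
      if (candidate %in% vocab) {
        found <- candidate
        break
      }
      end <- end - 1
    }
    # If no piece matches, the whole word maps to the unknown token.
    if (is.na(found)) return(unk_token)
    pieces <- c(pieces, found)
    start <- end + 1
  }
  pieces
}

tokenize_word("unaffable", toy_vocab)  # "un" "##aff" "##able"
tokenize_word("tokenizes", toy_vocab)  # "token" "##ize" "##s"
tokenize_word("xyz", toy_vocab)        # "[UNK]"
```

The sketch handles a single, already-normalized word; splitting text on whitespace and punctuation, and lowercasing for an uncased vocabulary, would happen before this step.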
Package details | |
---|---|
Author | Jonathan Bratt [aut] (<https://orcid.org/0000-0003-2859-0076>), Jon Harmon [aut, cre] (<https://orcid.org/0000-0003-4781-4346>), Bedford Freeman & Worth Pub Grp LLC DBA Macmillan Learning [cph], Google, Inc [cph] (original BERT vocabularies) |
Maintainer | Jon Harmon <jonthegeek@gmail.com> |
License | Apache License (>= 2) |
Version | 2.0.0 |
URL | https://github.com/macmillancontentscience/wordpiece.data |
Package repository | CRAN |
Installation | Install the latest version of this package from within R. |
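The installation command itself did not survive on this page. Because the package is listed on CRAN, the standard installer should work; the call below is an assumption based on that listing, not text recovered from the original.

```r
# Assumes the package is available on CRAN under the name "wordpiece.data".
install.packages("wordpiece.data")
```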