View source: R/cas_read_corpus.R
cas_read_corpus | R Documentation |
Read datasets created with cas_write_dataset
cas_read_corpus(
  ...,
  update = FALSE,
  path = NULL,
  file_format = "parquet",
  partition = NULL,
  token = "full_text",
  corpus_folder = "corpus"
)
... |
Passed to |
update |
Logical, defaults to FALSE. If FALSE, only checks whether the relevant corpus has been previously stored. If TRUE, checks whether more recent contents are available in the local database. |
path |
Defaults to NULL. If NULL, path is set to the project/website/export/dataset/file_format folder. |
file_format |
Defaults to "parquet". Currently, other options are not implemented. |
partition |
Defaults to NULL. If NULL, the parquet file is not partitioned. "year" is a common alternative: if set to "year", the parquet file is partitioned by year. If a |
token |
Defaults to "full_text", which does not tokenise the text column. If different from |
A dataset as an ArrowObject.
## Not run:
cas_read_corpus()
## End(Not run)
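The following is a minimal sketch of how a parquet dataset partitioned by year can be written and then read back as an Arrow object, which is roughly the kind of object the Value section describes. It does not call cas_read_corpus() and is not taken from the package source: it uses the arrow and dplyr packages directly, and the data frame, its column names, and the temporary folder are hypothetical stand-ins chosen for illustration.

```r
# Sketch only: illustrates a year-partitioned parquet dataset with arrow,
# not the internals of cas_read_corpus().
library(arrow)
library(dplyr)

# Hypothetical stand-in for a small corpus (column names are assumptions)
corpus_df <- data.frame(
  id = 1:4,
  year = c(2022L, 2022L, 2023L, 2023L),
  text = c("first article", "second article", "third article", "fourth article"),
  stringsAsFactors = FALSE
)

# Hypothetical location; a real project would use its own export folder
dataset_path <- file.path(tempdir(), "corpus_parquet_sketch")

# Write one sub-folder per year (year=2022, year=2023)
write_dataset(
  corpus_df,
  dataset_path,
  format = "parquet",
  partitioning = "year"
)

# Open lazily as an Arrow Dataset; nothing is loaded into memory yet
corpus_ds <- open_dataset(dataset_path)

# Filter on the partition column, then collect into a regular data frame
recent_df <- corpus_ds |>
  filter(year == 2023) |>
  collect()
```

Because the dataset is opened lazily, the filter on the partition column is applied before any rows are collected, which is the usual reason for partitioning a large corpus by year.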