download: Download full GermaParl corpus.

Description

The GermaParl R package includes only a small subset of the GermaParl corpus (GERMAPARLMINI). The full corpus is deposited with Zenodo, an open science repository for research data. The germaparl_download_corpus function downloads a tarball with the indexed corpus from the Zenodo repository and moves the corpus data to the system corpus storage. If a corpus registry has not yet been created, an interactive dialogue assists in creating one. The download is about 1 GB, so a stable internet connection is recommended.

Usage

germaparl_download_corpus(
  doi = "https://doi.org/10.5281/zenodo.3742113",
  registry_dir = cwb_registry_dir(),
  corpus_dir = cwb_corpus_dir(registry_dir),
  verbose = interactive(),
  ask = interactive(),
  sample = FALSE
)

Arguments

doi

The DOI (Digital Object Identifier) of the GermaParl tarball at Zenodo, stated as a hyperlink. Defaults to the latest version of GermaParl.

registry_dir

Path to the system registry directory. Defaults to the value returned by cwbtools::cwb_registry_dir(), which guesses the registry directory. We recommend stating the registry directory explicitly.

corpus_dir

Directory where the data directories of corpora are located. By default, the directory is guessed using cwbtools::cwb_corpus_dir(), based on registry_dir. We recommend stating the directory explicitly.

verbose

A logical value, whether to show messages. Defaults to the value of interactive().

ask

A logical value, whether to ask for user input before replacing an existing corpus.

sample

A logical value, whether to download sample data (GERMAPARLSAMPLE) rather than the full corpus (GERMAPARL) for testing purposes.
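
The following is a minimal sketch of a call that states both directories explicitly, as recommended above, and fetches only the sample data. The paths under ~/cwb are hypothetical and chosen for illustration only; whether the directories have to exist beforehand is an assumption, so the sketch creates them to be safe.

```r
library(GermaParl)

# Hypothetical locations for CWB data; adjust these to your own setup.
cwb_dir <- file.path(path.expand("~"), "cwb")
registry_dir <- file.path(cwb_dir, "registry")
corpus_dir <- file.path(cwb_dir, "indexed_corpora")

# Create the directories if they are missing (assumption: harmless if they exist).
dir.create(registry_dir, recursive = TRUE, showWarnings = FALSE)
dir.create(corpus_dir, recursive = TRUE, showWarnings = FALSE)

# Download the sample data (GERMAPARLSAMPLE) rather than the full corpus.
germaparl_download_corpus(
  registry_dir = registry_dir,
  corpus_dir = corpus_dir,
  verbose = TRUE,   # show progress messages even in non-interactive sessions
  ask = FALSE,      # do not prompt before replacing an existing corpus
  sample = TRUE
)
```

Setting ask = FALSE suits scripted runs. In an interactive session, keeping the default guards against replacing an existing corpus by accident.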

Details

After the tarball with the CWB-indexed corpus has been downloaded and installed, the DOI and the corpus version are added to the registry file of the GERMAPARL corpus. This information is then used by citation(package = "GermaParl") to provide citation information that matches the corpus version in use.

Value

Logical value. TRUE if the corpus has been installed successfully.
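
As a hedged sketch, the return value can be used to guard follow-up steps such as printing the citation information. The variable name installed is illustrative, and how the function behaves on failure (returning a non-TRUE value or raising an error) is an assumption here.

```r
library(GermaParl)

# Note: this downloads the full corpus (about 1 GB).
installed <- germaparl_download_corpus()

if (isTRUE(installed)) {
  # The registry file now carries the DOI and corpus version,
  # so the citation information reflects the installed corpus.
  print(citation(package = "GermaParl"))
} else {
  warning("GermaParl corpus was not installed; check the messages above.")
}
```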

See Also

An example of using the germaparl_download_corpus function is included in the examples section of the overview documentation of the GermaParl package.


GermaParl documentation built on Oct. 23, 2020, 8:27 p.m.