europarle_sample: European Parliament Proceedings Parallel Corpus 1996-2011...

europarle_sample    R Documentation

European Parliament Proceedings Parallel Corpus 1996-2011 (Spanish-English)

Description

The Europarl Parallel Corpus is extracted from the proceedings of the European Parliament. This dataset is a sample of the Spanish-English language pair.

Usage

data("europarle_sample")

Format

A data frame with 200,000 observations on the following 3 variables.

type

Whether the sentence is in the Source or the Target language

sentence_id

Id to index the sentence pairs

sentence

Each line from the proceedings, including comments
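
The long layout described above (one row per sentence, two rows per sentence pair) can be illustrated with a small hand-made data frame. This is only a sketch: the rows are invented and not taken from the corpus, and the exact labels stored in type ("Source" and "Target" here), as well as Spanish being the source side, are assumptions.

```r
# Toy stand-in for europarle_sample: invented rows, same three columns.
toy <- data.frame(
  type = c("Source", "Target", "Source", "Target"),
  sentence_id = c(1, 1, 2, 2),
  sentence = c("Muchas gracias.", "Thank you very much.",
               "Tiene la palabra el ponente.", "The rapporteur has the floor."),
  stringsAsFactors = FALSE
)

# Split by type, then join on sentence_id to get one row per sentence pair.
src <- toy[toy$type == "Source", c("sentence_id", "sentence")]
tgt <- toy[toy$type == "Target", c("sentence_id", "sentence")]
pairs <- merge(src, tgt, by = "sentence_id",
               suffixes = c("_source", "_target"))

pairs
```

Joining on sentence_id keeps the pairing explicit and does not rely on row order. The same steps should carry over to the full data frame if its type labels match the ones assumed here.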

Details

Sampled from the version 7 release of the corpus.

Source

https://www.statmt.org/europarl/

References

Koehn, P. (2005, September). Europarl: A parallel corpus for statistical machine translation. In MT summit (Vol. 5, pp. 79-86).

Examples

data(europarle_sample)
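
A slightly fuller sketch for inspecting the data after loading. It assumes the dataset is shipped by the tadr package (francojc/tadr) and that the package is installed; the guard skips the block otherwise.

```r
# Assumes the tadr package is installed; nothing runs if it is not.
if (requireNamespace("tadr", quietly = TRUE)) {
  data("europarle_sample", package = "tadr")

  # Column names and types; the Format section lists
  # type, sentence_id and sentence.
  str(europarle_sample)

  # Number of rows for each value of type.
  print(table(europarle_sample$type))
}
```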

francojc/tadr documentation built on April 26, 2022, 7:55 p.m.