data_corpus_udhr: Universal Declaration of Human Rights


Description

A corpus object containing the Universal Declaration of Human Rights in 464 languages. The files were downloaded from https://unicode.org/udhr/, where the UDHR in Unicode Project has converted them into plain text format.

Usage

data_corpus_udhr
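As a minimal sketch of loading the object, the snippet below assumes that both quanteda and quanteda.corpora are installed (the latter is distributed from the quanteda/quanteda.corpora repository). The guard around the calls and the names `udhr_available` and `udhr` are illustrative additions, not part of the package.

```r
# Check that the required packages are present before touching the data.
udhr_available <- requireNamespace("quanteda", quietly = TRUE) &&
  requireNamespace("quanteda.corpora", quietly = TRUE)

if (udhr_available) {
  # The corpus is referenced directly from the package namespace.
  udhr <- quanteda.corpora::data_corpus_udhr

  # Number of documents (one per translation) in the corpus.
  print(quanteda::ndoc(udhr))
}
```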

Format

The corpus includes the following document variables:

Key

The internal key used in the "UDHR in Unicode" database to identify the translations. It has no meaning of its own and no relation to any system of tags.

Name

The Ethnologue entry for the language. This is the primary language name given by the Ethnologue, which may be followed by a qualifier in parentheses. If you have difficulty finding a translation by language name, consult the Ethnologue to determine the primary language name.

ISO

ISO 639-3 code of the language

Direction

Text runs from left-to-right (ltr) or right-to-left (rtl)
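The document variables can be inspected and used for subsetting. The sketch below assumes that the docvar names in the object match the labels listed above (`Key`, `Name`, `ISO`, `Direction`) and that `Direction` holds the values `ltr` and `rtl`. Check `names(docvars(udhr))` if the subset call fails. The names `udhr`, `udhr_rtl` and `has_pkgs` are hypothetical.

```r
# Run only when both packages are installed.
has_pkgs <- requireNamespace("quanteda", quietly = TRUE) &&
  requireNamespace("quanteda.corpora", quietly = TRUE)

if (has_pkgs) {
  library(quanteda)
  udhr <- quanteda.corpora::data_corpus_udhr

  # Show the first rows of the document variables.
  print(head(docvars(udhr)))

  # Keep only translations written right-to-left.
  # Assumption: the docvar is named "Direction" and uses the value "rtl".
  udhr_rtl <- corpus_subset(udhr, Direction == "rtl")
  print(ndoc(udhr_rtl))
}
```

Subsetting on `ISO` instead works the same way when a specific ISO 639-3 code is known.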

Details

This corpus includes only texts at stage 4 or 5 of conversion to Unicode in the UDHR in Unicode Project. See https://unicode.org/udhr/translations.html for details.

Source

The UDHR in Unicode Project https://unicode.org/udhr/


quanteda/quanteda.corpora documentation built on Nov. 16, 2020, 12:45 a.m.