ISO_639: ISO 639 Language Codes


Description

International Organization for Standardization (ISO) codes for the representation of languages. The standard consists of four parts, with further parts in progress. ISO 639-1 consists of 185 two-letter (alpha-2) codes used to identify the world's major languages. ISO 639-2 has three-letter (alpha-3) codes for 485 languages. ISO 639-3 extends the ISO 639-2 alpha-3 codes with the aim of covering all known natural languages. ISO 639-5 defines alpha-3 codes for language families.

Usage

ISO_639_2
ISO_639_3
ISO_639_3_Retirements
ISO_639_5

Format

ISO_639_2 is a character data frame with variables Alpha_3_B and Alpha_3_T (the ISO 639-2 bibliographic and terminological codes), Alpha_2 (the corresponding ISO 639-1 alpha-2 code if available), and Name (the English name of the language).
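
As an illustration of how this layout can be used, the following sketch maps alpha-2 codes to English names. It runs on a small stand-in data frame with the same column names rather than on the packaged data; the helper alpha2_to_name is hypothetical, and the sketch assumes that a missing alpha-2 code is stored as NA.

```r
# Toy stand-in with the same columns as ISO_639_2 (a few rows for illustration).
iso2 <- data.frame(
  Alpha_3_B = c("ger", "fre", "eng", "ace"),
  Alpha_3_T = c("deu", "fra", "eng", "ace"),
  Alpha_2   = c("de", "fr", "en", NA),
  Name      = c("German", "French", "English", "Achinese"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: look up English names by alpha-2 code.
# incomparables = NA keeps an NA input from matching rows without an alpha-2 code.
alpha2_to_name <- function(codes, table) {
  table$Name[match(codes, table$Alpha_2, incomparables = NA)]
}

alpha2_to_name(c("fr", "xx"), iso2)  # unknown codes yield NA
```

The same function should work on the packaged ISO_639_2 object, provided its columns follow the description above.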

ISO_639_3 is a data frame with the following variables:

Id:

a character vector with the ISO 639-3 3-letter (alpha-3) identifiers.

Part2B:

a character vector with the equivalent ISO 639-2 B-code identifiers of the bibliographic applications code set (if any).

Part2T:

a character vector with the equivalent ISO 639-2 T-code identifiers of the terminology applications code set (if any).

Part1:

a character vector with the equivalent ISO 639-1 2-letter (alpha-2) identifiers (if any).

Scope:

a factor with levels "I" (Individual), "M" (Macrolanguage) and "S" (Special).

Type:

a factor with levels "L" (Living languages), "E" (Extinct languages), "A" (Ancient languages), "H" (Historic languages), "C" (Constructed languages), and "S" (Special).

Name:

a character vector with the reference language names.

Comment:

a character vector with a comment relating to one or more of the other variables.

Family:

a character vector with the generic English name of each language's family or macrolanguage.

eng:

a character vector with the language names in English.

fra:

a character vector with the language names in French (if available).

spa:

a character vector with the language names in Spanish (if available).

zho:

a character vector with the language names in Chinese (if available).

rus:

a character vector with the language names in Russian (if available).

deu:

a character vector with the language names in German (if available).

The variables Family and eng through deu are extracted from the Wikipedia pages listing the ISO 639-3 language codes.
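
The Scope and Type factors lend themselves to simple filtering. The sketch below selects living individual languages from a toy data frame that mimics a subset of the ISO_639_3 columns; the rows are a small illustrative sample, not the packaged data.

```r
# Toy stand-in mimicking some ISO_639_3 columns.
iso3 <- data.frame(
  Id    = c("eng", "lat", "epo", "ara", "und"),
  Scope = factor(c("I", "I", "I", "M", "S"), levels = c("I", "M", "S")),
  Type  = factor(c("L", "A", "C", "L", "S"),
                 levels = c("L", "E", "A", "H", "C", "S")),
  Name  = c("English", "Latin", "Esperanto", "Arabic", "Undetermined"),
  stringsAsFactors = FALSE
)

# Individual (Scope "I") languages that are living (Type "L").
living_individual <- subset(iso3, Scope == "I" & Type == "L")
living_individual$Id

# Number of entries per language type, including unused levels.
table(iso3$Type)
```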

ISO_639_3_Retirements is a data frame listing the language codes retired from ISO 639-3, with variables:

Id:

a character vector with the retired codes.

Ret_Reason:

a factor with levels "C" (change), "D" (duplicate), "N" (non-existent), "S" (split), and "M" (merge).

Change_To:

a character vector which, for the retirement reasons C, D, and M, gives the identifier to which all instances of the Id should be changed.

Ret_Remedy:

a character vector with instructions for updating an instance of the retired (split) identifier.

Effective:

a Date object giving the date the retirement became effective.
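
One way the retirement table might be used is to bring a vector of possibly outdated codes up to date. The sketch below uses made-up rows (the codes xxa, xxb, xxc and their successors are placeholders, not actual retirements) and a hypothetical helper update_codes. It assumes Change_To is NA where no single successor exists, and it does not follow chains of successive retirements.

```r
# Made-up rows with the same columns as ISO_639_3_Retirements (subset).
retirements <- data.frame(
  Id         = c("xxa", "xxb", "xxc"),
  Ret_Reason = factor(c("C", "M", "S"), levels = c("C", "D", "N", "S", "M")),
  Change_To  = c("yya", "yyb", NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: replace retired codes that have a single successor.
update_codes <- function(codes, ret) {
  replacement <- ret$Change_To[match(codes, ret$Id)]
  # Keep the original code when it is not retired or has no single successor
  # (for example a split, which needs manual handling via Ret_Remedy).
  ifelse(is.na(replacement), codes, replacement)
}

update_codes(c("xxa", "eng", "xxc"), retirements)
```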

ISO_639_5 is a data frame with the following variables:

Id:

a character vector with the 3-letter (alpha-3) ISO 639-5 identifiers.

English_Name:

the family names in English.

French_Name:

the family names in French.

Part2:

a factor indicating how the family relates to ISO 639-2, with levels "g" (group: consists of several related languages), "r" (remainder group: a group of several related languages from which some specific languages have been excluded), or "" (no ISO 639-2 code).

Hierarchy:

an indication of which other language families or groups the current language family or group is a member of (given as ISO 639-5 identifiers separated by colons, ":").
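
The colon-separated Hierarchy strings can be split into vectors of identifiers. The following sketch works on a toy data frame; it assumes that each string lists the chain from the top-level family down to the family itself, which should be checked against the actual data before relying on it.

```r
# Toy stand-in mimicking two ISO_639_5 columns.
iso5 <- data.frame(
  Id        = c("ine", "gem", "gmw"),
  Hierarchy = c("ine", "ine:gem", "ine:gem:gmw"),
  stringsAsFactors = FALSE
)

# Split each hierarchy string on ":" and strip any surrounding whitespace.
chains <- lapply(strsplit(iso5$Hierarchy, ":", fixed = TRUE), trimws)
names(chains) <- iso5$Id

chains[["gmw"]]

# Identifiers whose chain passes through "gem" (including "gem" itself).
iso5$Id[vapply(chains, function(ch) "gem" %in% ch, logical(1))]
```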

Details

Most languages have a single code in the ISO 639-2 standard, but twenty-two of them have two three-letter codes: a “bibliographic” code (ISO 639-2/B, B-code), which is derived from the English name for the language and was a necessary legacy feature, and a “terminological” code (ISO 639-2/T, T-code), which is derived from the native name for the language. The range qaa to qtz is reserved for local use.
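
To make the B-code/T-code distinction and the local-use range concrete, here is a small sketch on toy data. It assumes that languages with a single code carry the same value in Alpha_3_B and Alpha_3_T; the helper is_local_use is hypothetical.

```r
# Toy stand-in mimicking some ISO_639_2 columns.
iso2 <- data.frame(
  Alpha_3_B = c("ger", "fre", "eng"),
  Alpha_3_T = c("deu", "fra", "eng"),
  Name      = c("German", "French", "English"),
  stringsAsFactors = FALSE
)

# Languages whose bibliographic and terminological codes differ.
has_two_codes <- iso2$Alpha_3_B != iso2$Alpha_3_T
iso2[has_two_codes, ]

# Hypothetical helper: does a code fall in the local-use range qaa to qtz?
# The second letter must lie in a..t, the third can be any letter a..z.
is_local_use <- function(code) grepl("^q[a-t][a-z]$", code)

is_local_use(c("qaa", "qtz", "qua", "eng"))
```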

ISO 639-3 is a superset of ISO 639-1 and of the individual languages in ISO 639-2. ISO 639-1 and ISO 639-2 focused on major languages, most frequently represented in the total body of the world's literature. Since ISO 639-2 also includes language collections, whereas Part 3 does not, ISO 639-3 is not a superset of ISO 639-2. Where B and T codes exist in ISO 639-2, ISO 639-3 uses the T-codes.

ISO 639-2 contains codes for both individual languages and language groups, so any code in it is in either ISO 639-3 or ISO 639-5; conversely, some ISO 639-5 families have no ISO 639-2 code.
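
The relationship between the parts can be sketched as a simple classification: given the identifier vectors of ISO 639-3 and ISO 639-5, decide where an ISO 639-2 code belongs. The function classify_part2 and the short identifier vectors below are illustrative assumptions. Since ISO 639-3 uses the T-codes, T-codes should be passed in for languages that have both.

```r
# Hypothetical helper: report which part an ISO 639-2 (T-)code appears in.
classify_part2 <- function(codes, ids_639_3, ids_639_5) {
  ifelse(codes %in% ids_639_3, "639-3",
         ifelse(codes %in% ids_639_5, "639-5", NA_character_))
}

# Tiny illustrative identifier sets, standing in for ISO_639_3$Id and ISO_639_5$Id.
ids3 <- c("eng", "deu")
ids5 <- c("gem", "ine")

classify_part2(c("eng", "gem", "zzz"), ids3, ids5)
```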

Source

http://www.loc.gov/standards/iso639-2/ for ISO 639-2;
http://www-01.sil.org/iso639-3/download.asp for ISO 639-3;
http://www.loc.gov/standards/iso639-5/ for ISO 639-5.

References

http://en.wikipedia.org/wiki/ISO_639


ISOcodes documentation built on June 30, 2018, 5:05 p.m.
