glottolog: Glottolog data from <URL: http://www.glottolog.org>

Description

Data from Glottolog 2016 with added WALS codes and speaker-community sizes. Various minor corrections and additions were made during preparation of the data (see Details). All stocks (i.e. the largest reconstructable units) are linked to macroareas, which in turn are linked to a single root node called 'World'.

Usage

data("glottolog")

Format

A data frame with 22007 observations on the following 10 variables.

name

a character vector with the name of the entity.

father

a character vector with the name of the direct parent entity.

stock

a factor with the highest reconstructable unit. This column is added just for convenience; it does not add any new information.

glottocode

a character vector with the glottocode. The same identifier is added as rownames of the data.

iso

a character vector with ISO 639-3 language codes

wals

a character vector with WALS language codes

level

a factor with levels dialect, family and language, plus the added level area (see Details)

longitude

a numeric vector with the longitude, as available in Glottolog

latitude

a numeric vector with the latitude, as available in Glottolog

population

a numeric vector with speaker community size from an old Ethnologue version (13th Edition), licensed to the MPI-EVA in Leipzig.
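Because the glottocodes double as row names, single entries can be looked up directly by glottocode. The sketch below uses a small hand-built data frame (toy, a hypothetical stand-in with a few rows and columns copied from the example output on this page) so that it runs without the package; with the real data, toy would be replaced by glottolog.

```r
# Toy stand-in for a few rows of the glottolog data frame (values taken
# from the example output on this page; `toy` is not part of qlcData).
toy <- data.frame(
  name       = c("Ramoaaina", "Kandas", "Aalawa", "Kandas-Duke of York"),
  father     = c("Kandas-Duke of York", "Kandas-Duke of York",
                 "Ramoaaina", "Label-Bilur"),
  glottocode = c("ramo1244", "kand1301", "aala1237", "kand1307"),
  iso        = c("rai", "kqw", NA, NA),
  level      = factor(c("language", "language", "dialect", "family")),
  population = c(10266, 480, NA, NA),
  stringsAsFactors = FALSE
)
rownames(toy) <- toy$glottocode

# Look up one entry by its glottocode (row name)
toy["ramo1244", c("name", "father", "level")]

# Translate an ISO 639-3 code into the corresponding glottocode
iso_to_glottocode <- rownames(toy)[match("kqw", toy$iso)]

# Select all languages (as opposed to dialects and families)
languages <- toy[toy$level == "language", ]
```

Using match() for the ISO lookup avoids the NA rows that a plain comparison such as toy$iso == "kqw" would produce for entries without an ISO code.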

Details

For Glottolog data: the names were made unique by adding a glottocode when a name occurs more than once (typically when a language and a family share the same name). Entries classified as 'bookkeeping', 'unattested', 'artificial language', 'sign language', 'speech register' and 'unclassifiable' were removed. Links to WALS codes were added: note that about 20 links are missing, and for the non-unique links one link was chosen by data availability. Some macro codes from ISO 639-3 were added.
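The idea of making names unique by adding a glottocode can be illustrated with a minimal sketch. The names and codes below are made up, and the bracket format is an assumption; the exact form used when preparing the data is not documented here.

```r
# Hypothetical input: two entries sharing one name (codes are made up)
name       <- c("Alpha", "Beta", "Alpha")
glottocode <- c("alph1111", "beta1111", "alph2222")

# Flag every name that occurs more than once
dup <- name %in% name[duplicated(name)]

# Append the glottocode to the flagged names only; the bracket format
# is an assumption, not necessarily the one used in the package
unique_name <- ifelse(dup, paste0(name, " [", glottocode, "]"), name)
unique_name
```

Names that were already unique are left untouched, so only the ambiguous entries carry the extra code.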

A level 'area' was added to the tree, separating all languages into six areas: Eurasia, Africa, Southeast Asia, Sahul, North America and South America. This is reminiscent of the proposal from Dryer (1992), though Austronesian is grouped with Southeast Asia here, because that makes more sense genealogically. Still, these nodes are surely not monophyletic! Mixed languages are not assigned to an area.

Please note that the data provided here is not identical to the online version of Glottolog, as the online version is constantly being updated! This is Glottolog 2016. Updates might be made available when they are provided for download from the website.

The format of the glottolog data might seem a bit convoluted, but with getTree it is easy to extract genealogical parts of the data, and with FromDataFrameNetwork (from the data.tree package) these can be plotted and converted into the various tree formats used in R.
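The tree is stored as name/father pairs, so the ancestors of any entity can be found by following the father links upwards. The base R sketch below illustrates this on a few pairs copied from the example output on this page. The helper ancestors() is hypothetical and is not how getTree is implemented; it assumes unique names and no cycles in the data.

```r
# Toy stand-in: name/father pairs copied from the example output on this page
toy <- data.frame(
  name   = c("Aalawa", "Ramoaaina", "Kandas-Duke of York", "Label-Bilur"),
  father = c("Ramoaaina", "Kandas-Duke of York", "Label-Bilur",
             "St George linkage"),
  stringsAsFactors = FALSE
)

# Follow `father` links upwards until a parent is no longer listed
# as an entity in `data` (hypothetical helper, not part of qlcData)
ancestors <- function(data, start) {
  path <- start
  current <- start
  repeat {
    parent <- data$father[match(current, data$name)]
    if (is.na(parent)) break
    path <- c(path, parent)
    current <- parent
  }
  path
}

ancestors(toy, "Aalawa")
```

In the full data the same walk would continue through the stock and the area up to the root node 'World'.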

Source

Glottolog 2016 data from http://www.glottolog.org. WALS 2013 data from http://www.glottolog.org. Information on macrolanguages from http://www-01.sil.org/iso639-3/macrolanguages.asp. All data downloaded in March 2017. Population numbers are from the 13th edition of the Ethnologue, licensed to the MPI-EVA in Leipzig.

Examples

# use getTree() to select genealogical parts of the data
library(qlcData)
data(glottolog)

( aalawa <- getTree(up = "aala1237", kind = "glottocode") )
( kandas <- getTree(down = "Kandas-Duke of York") )
( treeFull <- getTree(up = c("deu", "eng", "ind", "cha"), kind = "iso") )
( treeReduced <- getTree(up = c("deu", "eng", "ind", "cha"), kind = "iso", reduce = TRUE) )

# check out areas
( areas <- glottolog[glottolog$level == "area", "name"] )
# stocks in Southeast Asia
glottolog[glottolog$father == areas[1], "name"]

## Not run: 
# use FromDataFrameNetwork() to visualize the tree
# and export it into various other tree formats in R

library(data.tree)
treeF <- FromDataFrameNetwork(treeFull)
treeR <- FromDataFrameNetwork(treeReduced)

plot(treeF)
plot(treeR)

# turn into phylo format from library 'ape'
phylo <- as.phylo.Node(treeR)
plot(phylo)

# turn into dendrogram
dend <- as.dendrogram(treeF)
plot(dend, center = TRUE)

## End(Not run)

Example output

                                            name
aala1237                                  Aalawa
area0001                          Southeast Asia
aust1307                            Austronesian
cent2237       Central-Eastern Malayo-Polynesian
east2712               Eastern Malayo-Polynesian
kand1307                     Kandas-Duke of York
labe1241                             Label-Bilur
mala1545                       Malayo-Polynesian
meso1253                 Meso Melanesian linkage
newi1242 New Ireland-Northwest Solomonic linkage
nucl1752                    Nuclear Austronesian
ocea1241                                 Oceanic
ramo1244                               Ramoaaina
stge1234                       St George linkage
west2818                 Western Oceanic linkage
                                          father        stock glottocode  iso
aala1237                               Ramoaaina Austronesian   aala1237 <NA>
area0001                                   World         <NA>       <NA> <NA>
aust1307                          Southeast Asia Austronesian   aust1307 <NA>
cent2237                       Malayo-Polynesian Austronesian   cent2237 <NA>
east2712       Central-Eastern Malayo-Polynesian Austronesian   east2712 <NA>
kand1307                             Label-Bilur Austronesian   kand1307 <NA>
labe1241                       St George linkage Austronesian   labe1241 <NA>
mala1545                    Nuclear Austronesian Austronesian   mala1545 <NA>
meso1253                 Western Oceanic linkage Austronesian   meso1253 <NA>
newi1242                 Meso Melanesian linkage Austronesian   newi1242 <NA>
nucl1752                            Austronesian Austronesian   nucl1752 <NA>
ocea1241               Eastern Malayo-Polynesian Austronesian   ocea1241 <NA>
ramo1244                     Kandas-Duke of York Austronesian   ramo1244  rai
stge1234 New Ireland-Northwest Solomonic linkage Austronesian   stge1234 <NA>
west2818                                 Oceanic Austronesian   west2818 <NA>
         wals    level longitude latitude population
aala1237 <NA>  dialect        NA       NA         NA
area0001 <NA>     area        NA       NA         NA
aust1307 <NA>   family        NA       NA         NA
cent2237 <NA>   family        NA       NA         NA
east2712 <NA>   family        NA       NA         NA
kand1307 <NA>   family        NA       NA         NA
labe1241 <NA>   family        NA       NA         NA
mala1545 <NA>   family        NA       NA         NA
meso1253 <NA>   family        NA       NA         NA
newi1242 <NA>   family        NA       NA         NA
nucl1752 <NA>   family        NA       NA         NA
ocea1241 <NA>   family        NA       NA         NA
ramo1244 <NA> language   152.451 -4.17306      10266
stge1234 <NA>   family        NA       NA         NA
west2818 <NA>   family        NA       NA         NA
                                            name
aala1237                                  Aalawa
area0001                          Southeast Asia
aust1307                            Austronesian
cent2237       Central-Eastern Malayo-Polynesian
east2712               Eastern Malayo-Polynesian
kand1301                                  Kandas
kand1307                     Kandas-Duke of York
labe1241                             Label-Bilur
maka1306                                  Makada
mala1545                       Malayo-Polynesian
meso1253                 Meso Melanesian linkage
molo1260                                   Molot
newi1242 New Ireland-Northwest Solomonic linkage
nucl1752                    Nuclear Austronesian
ocea1241                                 Oceanic
ramo1244                               Ramoaaina
stge1234                       St George linkage
west2818                 Western Oceanic linkage
                                          father        stock glottocode  iso
aala1237                               Ramoaaina Austronesian   aala1237 <NA>
area0001                                   World         <NA>       <NA> <NA>
aust1307                          Southeast Asia Austronesian   aust1307 <NA>
cent2237                       Malayo-Polynesian Austronesian   cent2237 <NA>
east2712       Central-Eastern Malayo-Polynesian Austronesian   east2712 <NA>
kand1301                     Kandas-Duke of York Austronesian   kand1301  kqw
kand1307                             Label-Bilur Austronesian   kand1307 <NA>
labe1241                       St George linkage Austronesian   labe1241 <NA>
maka1306                               Ramoaaina Austronesian   maka1306 <NA>
mala1545                    Nuclear Austronesian Austronesian   mala1545 <NA>
meso1253                 Western Oceanic linkage Austronesian   meso1253 <NA>
molo1260                               Ramoaaina Austronesian   molo1260 <NA>
newi1242                 Meso Melanesian linkage Austronesian   newi1242 <NA>
nucl1752                            Austronesian Austronesian   nucl1752 <NA>
ocea1241               Eastern Malayo-Polynesian Austronesian   ocea1241 <NA>
ramo1244                     Kandas-Duke of York Austronesian   ramo1244  rai
stge1234 New Ireland-Northwest Solomonic linkage Austronesian   stge1234 <NA>
west2818                                 Oceanic Austronesian   west2818 <NA>
         wals    level longitude latitude population
aala1237 <NA>  dialect        NA       NA         NA
area0001 <NA>     area        NA       NA         NA
aust1307 <NA>   family        NA       NA         NA
cent2237 <NA>   family        NA       NA         NA
east2712 <NA>   family        NA       NA         NA
kand1301 <NA> language   152.781 -4.36520        480
kand1307 <NA>   family        NA       NA         NA
labe1241 <NA>   family        NA       NA         NA
maka1306 <NA>  dialect        NA       NA         NA
mala1545 <NA>   family        NA       NA         NA
meso1253 <NA>   family        NA       NA         NA
molo1260 <NA>  dialect        NA       NA         NA
newi1242 <NA>   family        NA       NA         NA
nucl1752 <NA>   family        NA       NA         NA
ocea1241 <NA>   family        NA       NA         NA
ramo1244 <NA> language   152.451 -4.17306      10266
stge1234 <NA>   family        NA       NA         NA
west2818 <NA>   family        NA       NA         NA
                                   name                         father
angl1264                  Anglo-Frisian             North Sea Germanic
angl1265                        Anglian                  Anglo-Frisian
area0001                 Southeast Asia                          World
area0004                        Eurasia                          World
aust1307                   Austronesian                 Southeast Asia
cham1312                       Chamorro              Malayo-Polynesian
fran1268                     Franconian                  West Germanic
germ1287                       Germanic                  Indo-European
high1287                High Franconian                     Franconian
indo1316                     Indonesian   Indonesian Archipelago Malay
indo1319                  Indo-European                        Eurasia
indo1326   Indonesian Archipelago Malay                Nuclear Malayic
macr1271                  Macro-English                        Mercian
mala1536                Malayo-Sumbawan              Malayo-Polynesian
mala1538                        Malayic North and East Malayo-Sumbawan
mala1545              Malayo-Polynesian           Nuclear Austronesian
merc1242                        Mercian                        Anglian
nort3152             Northwest Germanic                       Germanic
nort3170 North and East Malayo-Sumbawan                Malayo-Sumbawan
nort3175             North Sea Germanic                  West Germanic
nucl1733                Nuclear Malayic                        Malayic
nucl1752           Nuclear Austronesian                   Austronesian
stan1293               Standard English                  Macro-English
stan1295                Standard German                High Franconian
west2793                  West Germanic             Northwest Germanic
                 stock glottocode  iso wals    level longitude latitude
angl1264 Indo-European   angl1264 <NA> <NA>   family        NA       NA
angl1265 Indo-European   angl1265 <NA> <NA>   family        NA       NA
area0001          <NA>       <NA> <NA> <NA>     area        NA       NA
area0004          <NA>       <NA> <NA> <NA>     area        NA       NA
aust1307  Austronesian   aust1307 <NA> <NA>   family        NA       NA
cham1312  Austronesian   cham1312  cha  cha language  145.2760 14.33070
fran1268 Indo-European   fran1268 <NA> <NA>   family        NA       NA
germ1287 Indo-European   germ1287 <NA> <NA>   family        NA       NA
high1287 Indo-European   high1287 <NA> <NA>   family        NA       NA
indo1316  Austronesian   indo1316  ind  ind language  109.7160 -7.33458
indo1319 Indo-European   indo1319 <NA> <NA>   family        NA       NA
indo1326  Austronesian   indo1326 <NA> <NA>   family        NA       NA
macr1271 Indo-European   macr1271 <NA> <NA>   family        NA       NA
mala1536  Austronesian   mala1536 <NA> <NA>   family        NA       NA
mala1538  Austronesian   mala1538 <NA> <NA>   family        NA       NA
mala1545  Austronesian   mala1545 <NA> <NA>   family        NA       NA
merc1242 Indo-European   merc1242 <NA> <NA>   family        NA       NA
nort3152 Indo-European   nort3152 <NA> <NA>   family        NA       NA
nort3170  Austronesian   nort3170 <NA> <NA>   family        NA       NA
nort3175 Indo-European   nort3175 <NA> <NA>   family        NA       NA
nucl1733  Austronesian   nucl1733  msa <NA>   family        NA       NA
nucl1752  Austronesian   nucl1752 <NA> <NA>   family        NA       NA
stan1293 Indo-European   stan1293  eng  eng language   -1.0000 53.00000
stan1295 Indo-European   stan1295  deu  ger language   12.4676 48.64900
west2793 Indo-European   west2793 <NA> <NA>   family        NA       NA
         population
angl1264         NA
angl1265         NA
area0001         NA
area0004         NA
aust1307         NA
cham1312      92700
fran1268         NA
germ1287         NA
high1287         NA
indo1316   23187680
indo1319         NA
indo1326         NA
macr1271         NA
mala1536         NA
mala1538         NA
mala1545         NA
merc1242         NA
nort3152         NA
nort3170         NA
nort3175         NA
nucl1733         NA
nucl1752         NA
stan1293  328008138
stan1295   90294110
west2793         NA
                      name            father         stock glottocode  iso wals
cham1312          Chamorro Malayo-Polynesian  Austronesian   cham1312  cha  cha
indo1316        Indonesian Malayo-Polynesian  Austronesian   indo1316  ind  ind
mala1545 Malayo-Polynesian             World  Austronesian   mala1545 <NA> <NA>
stan1293  Standard English     West Germanic Indo-European   stan1293  eng  eng
stan1295   Standard German     West Germanic Indo-European   stan1295  deu  ger
west2793     West Germanic             World Indo-European   west2793 <NA> <NA>
            level longitude latitude population
cham1312 language  145.2760 14.33070      92700
indo1316 language  109.7160 -7.33458   23187680
mala1545   family        NA       NA         NA
stan1293 language   -1.0000 53.00000  328008138
stan1295 language   12.4676 48.64900   90294110
west2793   family        NA       NA         NA
[1] "Southeast Asia" "Sahul"          "Africa"         "Eurasia"       
[5] "South America"  "North America" 
 [1] "Austroasiatic"    "Austronesian"     "Dravidian"        "Great Andamanese"
 [5] "Hmong-Mien"       "Hruso"            "Jarawa-Onge"      "Kusunda"         
 [9] "Nihali"           "Shom Peng"        "Sino-Tibetan"     "Tai-Kadai"       

qlcData documentation built on May 2, 2019, 8:29 a.m.