Data from Glottolog 2016 with added WALS codes and speaker-community size. Various minor corrections and additions were made in the preparation of the data (see Details). All stocks (i.e. largest reconstructable units) are linked to macroareas, and these are in turn linked to a single root node called 'World'.
data("glottolog")
A data frame with 22007 observations on the following 10 variables.
name
a character vector with the name of the entity.
father
a character vector with the name of the direct parent entity.
stock
a factor with the highest reconstructable unit. This column is added for convenience only; it does not add any new information.
glottocode
a character vector with the glottocode. The same identifier is added as rownames of the data.
iso
a character vector with ISO 639-3 language codes
wals
a character vector with WALS language codes
level
a factor with levels dialect, family and language
longitude
a numeric vector with the geographic longitude as available in Glottolog
latitude
a numeric vector with the geographic latitude as available in Glottolog
population
a numeric vector with speaker community size from an old Ethnologue version (13th Edition), licensed to the MPI-EVA in Leipzig.
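The structure described above can be checked with a few lines of base R. The sketch below does not load the real data; it builds a three-row stand-in (`toy`, a hypothetical name) from rows that appear in the example output of this page, with the same ten columns, and verifies that the row names match the glottocodes.

```r
# Toy stand-in for three rows of the glottolog data frame. Values are copied
# from the example output on this page; 'toy' is a hypothetical name.
toy <- data.frame(
  name       = c("Kandas-Duke of York", "Ramoaaina", "Aalawa"),
  father     = c("Label-Bilur", "Kandas-Duke of York", "Ramoaaina"),
  stock      = factor(rep("Austronesian", 3)),
  glottocode = c("kand1307", "ramo1244", "aala1237"),
  iso        = c(NA, "rai", NA),
  wals       = NA_character_,
  level      = factor(c("family", "language", "dialect")),
  longitude  = c(NA, 152.451, NA),
  latitude   = c(NA, -4.17306, NA),
  population = c(NA, 10266, NA),
  stringsAsFactors = FALSE
)
rownames(toy) <- toy$glottocode

# The glottocode doubles as row name, so rows can be looked up directly
toy["ramo1244", c("name", "father", "level")]

# Sanity checks that could be run on the full data in the same way
stopifnot(identical(rownames(toy), toy$glottocode))
stopifnot(ncol(toy) == 10)
table(toy$level)
```

The same two `stopifnot()` checks should hold for the full `glottolog` data frame if it matches the description given here.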
For Glottolog data: names were made unique by appending the glottocode when a name occurs more than once (typically when a language and a family share the same name). Entries classified as 'bookkeeping', 'unattested', 'artificial language', 'sign language', 'speech register' and 'unclassifiable' were removed. Links to WALS codes were added; note that about 20 links are missing, and where links were not unique, one link was chosen by data availability. Some macro codes from ISO 639-3 were added.
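How making names unique with a glottocode could work is sketched below. The exact label format used in the dataset is not stated here, so the `name [glottocode]` pattern, the helper name `uniquify`, and the toy names and codes are all assumptions made for illustration only.

```r
# Hypothetical helper: append the glottocode to every name that occurs more
# than once. The "name [code]" format is an assumption, not the dataset's.
uniquify <- function(name, glottocode) {
  dup <- duplicated(name) | duplicated(name, fromLast = TRUE)
  name[dup] <- paste0(name[dup], " [", glottocode[dup], "]")
  name
}

# Made-up entries: a family and a language sharing one name
name <- c("Alpha", "Alpha", "Beta")
code <- c("alph0001", "alph0002", "beta0001")

unique_names <- uniquify(name, code)
unique_names

# After the step, no name occurs twice
stopifnot(!anyDuplicated(unique_names))
```

Names that are already unique are left untouched, so only the ambiguous entries carry a code.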
A level 'area' was added to the tree, dividing all languages into six areas: Eurasia, Africa, Southeast Asia, Sahul, North America and South America. This is reminiscent of the proposal by Dryer (1992), though Austronesian is grouped with Southeast Asia here because that makes more sense genealogically. Still, these area nodes are surely not monophyletic! Mixed languages are not assigned to an area.
Please note that the data provided here is not identical to the online version of Glottolog, as the online version is constantly being updated! This is Glottolog 2016. Updates might be made available when they are provided for download from the website.
The format of the glottolog data might seem a bit convoluted, but with getTree it is easy to extract genealogical parts of the data, and with FromDataFrameNetwork (from the data.tree package) these can be plotted and converted into the various tree formats used in R.
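To see why the name/father layout is enough to recover a genealogy, the following base R sketch walks from one entry up to the root by repeatedly looking up `father`. It is not how `getTree` is implemented; `lineage` is a hypothetical helper, and the chain of names is copied from the Aalawa example output on this page.

```r
# Names along the Aalawa lineage, ordered from the dialect up to the area
# node, as printed in the example output on this page.
chain <- c(
  "Aalawa", "Ramoaaina", "Kandas-Duke of York", "Label-Bilur",
  "St George linkage", "New Ireland-Northwest Solomonic linkage",
  "Meso Melanesian linkage", "Western Oceanic linkage", "Oceanic",
  "Eastern Malayo-Polynesian", "Central-Eastern Malayo-Polynesian",
  "Malayo-Polynesian", "Nuclear Austronesian", "Austronesian",
  "Southeast Asia"
)

# In a chain each node's father is the next node; the area hangs under 'World'
tree <- data.frame(name = chain, father = c(chain[-1], "World"),
                   stringsAsFactors = FALSE)
# Shuffle rows to show that the lookup does not depend on row order
tree <- tree[order(tree$name), ]

# Hypothetical helper: follow 'father' links until the root is reached
lineage <- function(start, name, father, root = "World") {
  path <- start
  current <- start
  while (current != root) {
    parent <- father[match(current, name)]
    if (is.na(parent)) stop("no parent found for: ", current)
    if (length(path) > length(name)) stop("cycle detected")
    path <- c(path, parent)
    current <- parent
  }
  path
}

path <- lineage("Ramoaaina", tree$name, tree$father)
path
```

Read from the end, the path gives the root, then the area, then the stock, which is the information that the `stock` column repeats for convenience.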
Glottolog 2016 data from http://www.glottolog.org. WALS 2013 data from http://www.glottolog.org. Information on macrolanguages from http://www-01.sil.org/iso639-3/macrolanguages.asp. All data downloaded in March 2017. Population numbers are from the 13th edition of the Ethnologue, licensed to the MPI-EVA in Leipzig.
# use getTree() to select genealogical parts of the data
data(glottolog)
( aalawa <- getTree(up = "aala1237", kind = "glottocode") )
( kandas <- getTree(down = "Kandas-Duke of York") )
( treeFull <- getTree(up = c("deu", "eng", "ind", "cha"), kind = "iso") )
( treeReduced <- getTree(up = c("deu", "eng", "ind", "cha"), kind = "iso", reduce = TRUE) )
# check out areas
( areas <- glottolog[glottolog$level == "area", "name"] )
# stocks in Southeast Asia
glottolog[glottolog$father == areas[1], "name"]
## Not run:
# use FromDataFrameNetwork() to visualize the tree
# and export it into various other tree formats in R
library(data.tree)
treeF <- FromDataFrameNetwork(treeFull)
treeR <- FromDataFrameNetwork(treeReduced)
plot(treeF)
plot(treeR)
# turn into phylo format from library 'ape'
phylo <- as.phylo.Node(treeR)
plot(phylo)
# turn into dendrogram
dend <- as.dendrogram(treeF)
plot(dend, center = TRUE)
## End(Not run)
         name
aala1237 Aalawa
area0001 Southeast Asia
aust1307 Austronesian
cent2237 Central-Eastern Malayo-Polynesian
east2712 Eastern Malayo-Polynesian
kand1307 Kandas-Duke of York
labe1241 Label-Bilur
mala1545 Malayo-Polynesian
meso1253 Meso Melanesian linkage
newi1242 New Ireland-Northwest Solomonic linkage
nucl1752 Nuclear Austronesian
ocea1241 Oceanic
ramo1244 Ramoaaina
stge1234 St George linkage
west2818 Western Oceanic linkage
         father                                  stock        glottocode iso
aala1237 Ramoaaina                               Austronesian aala1237   <NA>
area0001 World                                   <NA>         <NA>       <NA>
aust1307 Southeast Asia                          Austronesian aust1307   <NA>
cent2237 Malayo-Polynesian                       Austronesian cent2237   <NA>
east2712 Central-Eastern Malayo-Polynesian       Austronesian east2712   <NA>
kand1307 Label-Bilur                             Austronesian kand1307   <NA>
labe1241 St George linkage                       Austronesian labe1241   <NA>
mala1545 Nuclear Austronesian                    Austronesian mala1545   <NA>
meso1253 Western Oceanic linkage                 Austronesian meso1253   <NA>
newi1242 Meso Melanesian linkage                 Austronesian newi1242   <NA>
nucl1752 Austronesian                            Austronesian nucl1752   <NA>
ocea1241 Eastern Malayo-Polynesian               Austronesian ocea1241   <NA>
ramo1244 Kandas-Duke of York                     Austronesian ramo1244   rai
stge1234 New Ireland-Northwest Solomonic linkage Austronesian stge1234   <NA>
west2818 Oceanic                                 Austronesian west2818   <NA>
         wals level    longitude latitude population
aala1237 <NA> dialect  NA        NA       NA
area0001 <NA> area     NA        NA       NA
aust1307 <NA> family   NA        NA       NA
cent2237 <NA> family   NA        NA       NA
east2712 <NA> family   NA        NA       NA
kand1307 <NA> family   NA        NA       NA
labe1241 <NA> family   NA        NA       NA
mala1545 <NA> family   NA        NA       NA
meso1253 <NA> family   NA        NA       NA
newi1242 <NA> family   NA        NA       NA
nucl1752 <NA> family   NA        NA       NA
ocea1241 <NA> family   NA        NA       NA
ramo1244 <NA> language 152.451   -4.17306 10266
stge1234 <NA> family   NA        NA       NA
west2818 <NA> family   NA        NA       NA
         name
aala1237 Aalawa
area0001 Southeast Asia
aust1307 Austronesian
cent2237 Central-Eastern Malayo-Polynesian
east2712 Eastern Malayo-Polynesian
kand1301 Kandas
kand1307 Kandas-Duke of York
labe1241 Label-Bilur
maka1306 Makada
mala1545 Malayo-Polynesian
meso1253 Meso Melanesian linkage
molo1260 Molot
newi1242 New Ireland-Northwest Solomonic linkage
nucl1752 Nuclear Austronesian
ocea1241 Oceanic
ramo1244 Ramoaaina
stge1234 St George linkage
west2818 Western Oceanic linkage
         father                                  stock        glottocode iso
aala1237 Ramoaaina                               Austronesian aala1237   <NA>
area0001 World                                   <NA>         <NA>       <NA>
aust1307 Southeast Asia                          Austronesian aust1307   <NA>
cent2237 Malayo-Polynesian                       Austronesian cent2237   <NA>
east2712 Central-Eastern Malayo-Polynesian       Austronesian east2712   <NA>
kand1301 Kandas-Duke of York                     Austronesian kand1301   kqw
kand1307 Label-Bilur                             Austronesian kand1307   <NA>
labe1241 St George linkage                       Austronesian labe1241   <NA>
maka1306 Ramoaaina                               Austronesian maka1306   <NA>
mala1545 Nuclear Austronesian                    Austronesian mala1545   <NA>
meso1253 Western Oceanic linkage                 Austronesian meso1253   <NA>
molo1260 Ramoaaina                               Austronesian molo1260   <NA>
newi1242 Meso Melanesian linkage                 Austronesian newi1242   <NA>
nucl1752 Austronesian                            Austronesian nucl1752   <NA>
ocea1241 Eastern Malayo-Polynesian               Austronesian ocea1241   <NA>
ramo1244 Kandas-Duke of York                     Austronesian ramo1244   rai
stge1234 New Ireland-Northwest Solomonic linkage Austronesian stge1234   <NA>
west2818 Oceanic                                 Austronesian west2818   <NA>
         wals level    longitude latitude population
aala1237 <NA> dialect  NA        NA       NA
area0001 <NA> area     NA        NA       NA
aust1307 <NA> family   NA        NA       NA
cent2237 <NA> family   NA        NA       NA
east2712 <NA> family   NA        NA       NA
kand1301 <NA> language 152.781   -4.36520 480
kand1307 <NA> family   NA        NA       NA
labe1241 <NA> family   NA        NA       NA
maka1306 <NA> dialect  NA        NA       NA
mala1545 <NA> family   NA        NA       NA
meso1253 <NA> family   NA        NA       NA
molo1260 <NA> dialect  NA        NA       NA
newi1242 <NA> family   NA        NA       NA
nucl1752 <NA> family   NA        NA       NA
ocea1241 <NA> family   NA        NA       NA
ramo1244 <NA> language 152.451   -4.17306 10266
stge1234 <NA> family   NA        NA       NA
west2818 <NA> family   NA        NA       NA
         name                           father
angl1264 Anglo-Frisian                  North Sea Germanic
angl1265 Anglian                        Anglo-Frisian
area0001 Southeast Asia                 World
area0004 Eurasia                        World
aust1307 Austronesian                   Southeast Asia
cham1312 Chamorro                       Malayo-Polynesian
fran1268 Franconian                     West Germanic
germ1287 Germanic                       Indo-European
high1287 High Franconian                Franconian
indo1316 Indonesian                     Indonesian Archipelago Malay
indo1319 Indo-European                  Eurasia
indo1326 Indonesian Archipelago Malay   Nuclear Malayic
macr1271 Macro-English                  Mercian
mala1536 Malayo-Sumbawan                Malayo-Polynesian
mala1538 Malayic                        North and East Malayo-Sumbawan
mala1545 Malayo-Polynesian              Nuclear Austronesian
merc1242 Mercian                        Anglian
nort3152 Northwest Germanic             Germanic
nort3170 North and East Malayo-Sumbawan Malayo-Sumbawan
nort3175 North Sea Germanic             West Germanic
nucl1733 Nuclear Malayic                Malayic
nucl1752 Nuclear Austronesian           Austronesian
stan1293 Standard English               Macro-English
stan1295 Standard German                High Franconian
west2793 West Germanic                  Northwest Germanic
         stock         glottocode iso  wals level    longitude latitude
angl1264 Indo-European angl1264   <NA> <NA> family   NA        NA
angl1265 Indo-European angl1265   <NA> <NA> family   NA        NA
area0001 <NA>          <NA>       <NA> <NA> area     NA        NA
area0004 <NA>          <NA>       <NA> <NA> area     NA        NA
aust1307 Austronesian  aust1307   <NA> <NA> family   NA        NA
cham1312 Austronesian  cham1312   cha  cha  language 145.2760  14.33070
fran1268 Indo-European fran1268   <NA> <NA> family   NA        NA
germ1287 Indo-European germ1287   <NA> <NA> family   NA        NA
high1287 Indo-European high1287   <NA> <NA> family   NA        NA
indo1316 Austronesian  indo1316   ind  ind  language 109.7160  -7.33458
indo1319 Indo-European indo1319   <NA> <NA> family   NA        NA
indo1326 Austronesian  indo1326   <NA> <NA> family   NA        NA
macr1271 Indo-European macr1271   <NA> <NA> family   NA        NA
mala1536 Austronesian  mala1536   <NA> <NA> family   NA        NA
mala1538 Austronesian  mala1538   <NA> <NA> family   NA        NA
mala1545 Austronesian  mala1545   <NA> <NA> family   NA        NA
merc1242 Indo-European merc1242   <NA> <NA> family   NA        NA
nort3152 Indo-European nort3152   <NA> <NA> family   NA        NA
nort3170 Austronesian  nort3170   <NA> <NA> family   NA        NA
nort3175 Indo-European nort3175   <NA> <NA> family   NA        NA
nucl1733 Austronesian  nucl1733   msa  <NA> family   NA        NA
nucl1752 Austronesian  nucl1752   <NA> <NA> family   NA        NA
stan1293 Indo-European stan1293   eng  eng  language -1.0000   53.00000
stan1295 Indo-European stan1295   deu  ger  language 12.4676   48.64900
west2793 Indo-European west2793   <NA> <NA> family   NA        NA
         population
angl1264 NA
angl1265 NA
area0001 NA
area0004 NA
aust1307 NA
cham1312 92700
fran1268 NA
germ1287 NA
high1287 NA
indo1316 23187680
indo1319 NA
indo1326 NA
macr1271 NA
mala1536 NA
mala1538 NA
mala1545 NA
merc1242 NA
nort3152 NA
nort3170 NA
nort3175 NA
nucl1733 NA
nucl1752 NA
stan1293 328008138
stan1295 90294110
west2793 NA
         name              father            stock         glottocode iso  wals
cham1312 Chamorro          Malayo-Polynesian Austronesian  cham1312   cha  cha
indo1316 Indonesian        Malayo-Polynesian Austronesian  indo1316   ind  ind
mala1545 Malayo-Polynesian World             Austronesian  mala1545   <NA> <NA>
stan1293 Standard English  West Germanic     Indo-European stan1293   eng  eng
stan1295 Standard German   West Germanic     Indo-European stan1295   deu  ger
west2793 West Germanic     World             Indo-European west2793   <NA> <NA>
         level    longitude latitude population
cham1312 language 145.2760  14.33070 92700
indo1316 language 109.7160  -7.33458 23187680
mala1545 family   NA        NA       NA
stan1293 language -1.0000   53.00000 328008138
stan1295 language 12.4676   48.64900 90294110
west2793 family   NA        NA       NA
[1] "Southeast Asia" "Sahul" "Africa" "Eurasia"
[5] "South America" "North America"
[1] "Austroasiatic" "Austronesian" "Dravidian" "Great Andamanese"
[5] "Hmong-Mien" "Hruso" "Jarawa-Onge" "Kusunda"
[9] "Nihali" "Shom Peng" "Sino-Tibetan" "Tai-Kadai"