glottodist_subdata: Calculate construction-based distances between languages

View source: R/glottodist.R

glottodist_subdata    R Documentation

Calculate construction-based distances between languages



glottodist_subdata(
  glottosubdata,
  metric = NULL,
  index_type = NULL,
  avg_idx = NULL,
  fixed_idx = NULL
)



glottosubdata: a glottosubdata object

metric: either "gower" or "anderberg"

index_type: either "mci", "ri" or "fmi"

avg_idx: the feature indices over which the average of distances is computed. It must be given when index_type is either "ri" or "fmi".

fixed_idx: the feature indices over which the distance between two constructions is computed. It must be given when index_type is either "ri" or "fmi".


object of class dist
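Because the return value is an ordinary "dist" object, the usual base R tools for distance objects should apply to it. The sketch below uses a hand-made "dist" with three hypothetical language labels as a stand-in; the numbers are invented for illustration and are not output of glottodist_subdata.

```r
# Hypothetical stand-in for the returned object: distances between three made-up languages.
m <- matrix(c(0,   0.2, 0.7,
              0.2, 0,   0.5,
              0.7, 0.5, 0),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("lang_a", "lang_b", "lang_c"),
                            c("lang_a", "lang_b", "lang_c")))
d <- as.dist(m)

class(d)       # "dist"
as.matrix(d)   # expand to a full symmetric matrix with a zero diagonal

# Hierarchical clustering works directly on a "dist" object.
hc <- hclust(d, method = "average")
hc$labels
```

Converting with as.matrix() is convenient for looking up a single language pair by name, while hclust() and similar functions accept the "dist" object directly.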


The function “glottodist_subdata” takes a glottosubdata object and returns a “dist” object containing construction-based distances between languages. The observations of each language are referred to as constructions. The distance d(A_i, B_j) between a construction A_i in a language A and a construction B_j in a language B is determined by the argument “metric”, whose value is either “gower” or “anderberg”.

When “index_type” is “mci”, the function returns the “matching constructions index”:

MCI(A, B) = \frac{1}{2|A|}\sum\limits_{A_i\in A}\min\limits_{B_j\in B}d(A_i, B_j) + \frac{1}{2|B|}\sum\limits_{B_j\in B}\min\limits_{A_i\in A}d(A_i, B_j).

When “index_type” is “ri”, the function returns the “relative index”:

RI(A, B) = \frac{1}{|M|}\sum\limits_{s\in M}\textrm{AVG}_{A_i(s) = 1 \textrm{ and } B_j(s) = 1}d(A_i^F, B_j^F).

Here M is the set of variable indices given by the argument “avg_idx”, F is the set of variable indices given by the argument “fixed_idx”, and A_i^F and B_j^F are the constructions A_i and B_j restricted to the variables in F.

When “index_type” is “fmi”, the function returns the “form-meaning index”:

FMI(A, B) = \frac{1}{|M||F|} \sum\limits_{s\in M, p\in F} \Big(1 - SIM(\{A_i\in A : A_i^M(s)=1 \textrm{ and }A_i^F(p)=1\}, \{B_j\in B : B_j^M(s) = 1 \textrm{ and }B_j^F(p) = 1\})\Big).

Here SIM(X, Y)=\min(|X|/|Y|, |Y|/|X|), with SIM(X, Y)=1 when both X and Y are empty.
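To make the three indices concrete, the following base R sketch evaluates the formulas as written on two tiny made-up languages. It is not the package's implementation. The helper names (mismatch, pairwise, mci, ri, fmi, sim) and the toy data are hypothetical. The share of differing features is used as an assumed stand-in for the "gower" metric, which it matches only for purely nominal features without missing values. The ri sketch further assumes that every feature in avg_idx is present in at least one construction of each language.

```r
# Toy data (hypothetical): rows are constructions, columns are binary features.
A <- matrix(c(1, 0, 1,
              0, 1, 1), nrow = 2, byrow = TRUE)
B <- matrix(c(1, 0, 0,
              0, 1, 1,
              1, 1, 0), nrow = 3, byrow = TRUE)

# Assumed stand-in for d(A_i, B_j): share of features on which two constructions differ.
mismatch <- function(x, y) mean(x != y)

# Matrix of d(A_i, B_j): rows index constructions of A, columns those of B.
pairwise <- function(A, B) {
  outer(seq_len(nrow(A)), seq_len(nrow(B)),
        Vectorize(function(i, j) mismatch(A[i, ], B[j, ])))
}

# MCI: mean nearest-construction distance in each direction, averaged.
mci <- function(A, B) {
  D <- pairwise(A, B)
  0.5 * mean(apply(D, 1, min)) + 0.5 * mean(apply(D, 2, min))
}

# RI: for each s in avg_idx, average d over all pairs of constructions that
# have feature s, comparing them on the fixed_idx features only.
ri <- function(A, B, avg_idx, fixed_idx) {
  per_feature <- sapply(avg_idx, function(s) {
    As <- A[A[, s] == 1, fixed_idx, drop = FALSE]
    Bs <- B[B[, s] == 1, fixed_idx, drop = FALSE]
    mean(pairwise(As, Bs))
  })
  mean(per_feature)
}

# SIM on set sizes: ratio of the smaller to the larger count, 1 if both are 0.
sim <- function(nx, ny) {
  if (nx == 0 && ny == 0) return(1)
  min(nx / ny, ny / nx)
}

# FMI: for each (s, p), count constructions having both features in each
# language and accumulate 1 - SIM of the two counts.
fmi <- function(A, B, avg_idx, fixed_idx) {
  total <- 0
  for (s in avg_idx) for (p in fixed_idx) {
    nx <- sum(A[, s] == 1 & A[, p] == 1)
    ny <- sum(B[, s] == 1 & B[, p] == 1)
    total <- total + (1 - sim(nx, ny))
  }
  total / (length(avg_idx) * length(fixed_idx))
}

mci(A, B)                                # 0.25
ri(A, B, avg_idx = 1:2, fixed_idx = 3)   # 0.75
fmi(A, B, avg_idx = 1:2, fixed_idx = 3)  # 0.5
```

In this toy case the MCI is symmetric in its arguments and equals 0 when a language is compared with itself, as the formula implies.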


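# Load the demo glottosubdata object
glottosubdata_cnstn <- glottoget(glottodata = "demosubdata_cnstn")
# Matching constructions index with Gower distances
glottodist_subdata(glottosubdata = glottosubdata_cnstn, metric = "gower", index_type = "mci")
# Relative index: average over features 1 to 4, distances on features 5 to 7
glottodist_subdata(glottosubdata = glottosubdata_cnstn, metric = "gower", index_type = "ri",
                   avg_idx = 1:4, fixed_idx = 5:7)
# Form-meaning index (no metric is passed in this call)
glottodist_subdata(glottosubdata = glottosubdata_cnstn, index_type = "fmi",
                   avg_idx = 1:4, fixed_idx = 5:7)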

SietzeN/glottospace documentation built on June 15, 2024, 10:45 p.m.