R/RcppExports.R

Defines functions vc_leven leven distance_matrix

Documented in distance_matrix leven vc_leven

# Generated by using Rcpp::compileAttributes() -> do not edit by hand
# Generator token: 10BE3573-1514-4C36-9D1C-5A225CD40393

#' Distance matrix for Dialectometry
#'
#' Computes a distance matrix between dialect varieties, the results of which may be used for further analyses and plotting.
#'
#' @param dialect_data A data frame of dialect data, transcribed in the International Phonetic Alphabet.
#' @param funname The distance metric to use, one of "leven" or "vc_leven".
#' @param alignment_normalization A logical value, indicating whether or not the distance scores should be normalized by alignment length.
#' @param delim An optional delimiter that separates multiple responses within a single entry.
#' @return A distance matrix in which each value is the edit-distance-based difference between a pair of dialects.
#' @examples
#' data(Dutch)
#' Dutch <- Dutch[1:3,1:3]
#' distance_matrix(Dutch, funname = "vc_leven", alignment_normalization = TRUE)
distance_matrix <- function(dialect_data, funname, alignment_normalization = FALSE, delim = NULL) {
    .Call(`_dialectR_distance_matrix`, dialect_data, funname, alignment_normalization, delim)
}
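
# Illustrative sketch only, not part of dialectR and not generated by Rcpp.
# It shows one way a pairwise matrix could be assembled from word-level
# distances. Assumptions (not confirmed by the package): rows are varieties,
# columns are items, and a pair of varieties is scored by the mean of its
# word distances. The name distance_matrix_sketch and the word_dist argument
# are hypothetical; the default word distance is base R's utils::adist().
distance_matrix_sketch <- function(dialect_data,
                                   word_dist = function(a, b) utils::adist(a, b)[1, 1]) {
    n <- nrow(dialect_data)
    out <- matrix(0, n, n,
                  dimnames = list(rownames(dialect_data), rownames(dialect_data)))
    for (i in seq_len(n)) {
        row_i <- as.character(unlist(dialect_data[i, ]))
        for (j in seq_len(n)) {
            row_j <- as.character(unlist(dialect_data[j, ]))
            # Compare the two varieties item by item, then average.
            d <- mapply(word_dist, row_i, row_j)
            out[i, j] <- mean(d, na.rm = TRUE)
        }
    }
    out
}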

#' Edit distance for Dialectometry
#'
#' An edit distance for use in Dialectometry. Allows for normalization by dividing by alignment length, and for accommodating multiple responses with Bilbao distance, as proposed by Aurrekoetxea et al. (2020).
#'
#' @param vec1 A vector of words.
#' @param vec2 A vector of words to be compared against.
#' @param alignment_normalization A logical value, indicating whether or not the difference scores are to be normalized by alignment length.
#' @param delim An optional delimiter that separates multiple responses within a single entry.
#' @return The number of edit operations needed to transform one string into the other, optionally normalized by alignment length.
#' @references
#' Aurrekoetxea, G., Nerbonne, J., and Rubio, J. 2020. Unifying Analyses of Multiple Responses. \emph{Dialectologia}, 25:59–86.
#' @examples
#' leven("hit", "hot/hit", alignment_normalization = TRUE, delim = "/")
leven <- function(vec1, vec2, alignment_normalization = FALSE, delim = NULL) {
    .Call(`_dialectR_leven`, vec1, vec2, alignment_normalization, delim)
}
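
# Illustrative sketch only, not part of dialectR and not generated by Rcpp.
# A minimal pure-R reference for the plain edit distance described above,
# using the standard dynamic-programming recurrence with unit costs. It
# handles a single pair of strings and omits alignment normalization and
# multiple responses, whose exact definitions live in the compiled code.
# The name leven_sketch is hypothetical.
leven_sketch <- function(a, b) {
    x <- strsplit(a, "")[[1]]
    y <- strsplit(b, "")[[1]]
    n <- length(x)
    m <- length(y)
    # d[i + 1, j + 1] holds the distance between the first i segments of x
    # and the first j segments of y.
    d <- matrix(0, n + 1, m + 1)
    d[, 1] <- 0:n
    d[1, ] <- 0:m
    for (i in seq_len(n)) {
        for (j in seq_len(m)) {
            cost <- if (x[i] == y[j]) 0 else 1
            d[i + 1, j + 1] <- min(d[i, j + 1] + 1,  # deletion
                                   d[i + 1, j] + 1,  # insertion
                                   d[i, j] + cost)   # match or substitution
        }
    }
    d[n + 1, m + 1]
}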

#' VC-sensitive edit distance for Dialectometry
#'
#' An edit distance that is sensitive to vowel and consonant alignment. If two differing aligned segments form a vowel-consonant pair, the difference is penalized with a score of 2; otherwise, 1. Allows for normalization by dividing by alignment length, and for accommodating multiple responses with Bilbao distance, as proposed by Aurrekoetxea et al. (2020).
#'
#' @param vec1 A vector of words.
#' @param vec2 A vector of words to be compared against.
#' @param alignment_normalization A logical value, indicating whether or not the difference scores are to be normalized by alignment length.
#' @param delim An optional delimiter that separates multiple responses within a single entry.
#' @return The cost of transforming one string into the other, optionally normalized by alignment length.
#' @references
#' Aurrekoetxea, G., Nerbonne, J., and Rubio, J. 2020. Unifying Analyses of Multiple Responses. \emph{Dialectologia}, 25:59–86.
#' @examples
#' vc_leven("hit", "hot/hit", alignment_normalization = TRUE, delim = "/")
vc_leven <- function(vec1, vec2, alignment_normalization = FALSE, delim = NULL) {
    .Call(`_dialectR_vc_leven`, vec1, vec2, alignment_normalization, delim)
}
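
# Illustrative sketch only, not part of dialectR and not generated by Rcpp.
# A minimal pure-R reading of the vowel/consonant penalty described above:
# substituting a vowel for a consonant (or the reverse) costs 2, any other
# substitution costs 1. Assumptions (not confirmed by the package):
# insertions and deletions cost 1, and the default vowel inventory is a
# small placeholder rather than a full IPA vowel set. The name
# vc_leven_sketch and the vowels argument are hypothetical.
vc_leven_sketch <- function(a, b, vowels = c("a", "e", "i", "o", "u")) {
    x <- strsplit(a, "")[[1]]
    y <- strsplit(b, "")[[1]]
    n <- length(x)
    m <- length(y)
    d <- matrix(0, n + 1, m + 1)
    d[, 1] <- 0:n
    d[1, ] <- 0:m
    for (i in seq_len(n)) {
        for (j in seq_len(m)) {
            if (x[i] == y[j]) {
                cost <- 0
            } else if ((x[i] %in% vowels) != (y[j] %in% vowels)) {
                cost <- 2  # vowel aligned with consonant
            } else {
                cost <- 1  # vowel with vowel, or consonant with consonant
            }
            d[i + 1, j + 1] <- min(d[i, j + 1] + 1,  # deletion
                                   d[i + 1, j] + 1,  # insertion
                                   d[i, j] + cost)   # match or substitution
        }
    }
    d[n + 1, m + 1]
}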


dialectR documentation built on May 20, 2021, 9:06 a.m.