R/featurise.R

data("ipa_symbols", envir = environment())

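#' Add features to a list of phones
#'
#' Counts how often each phone occurs in `phlist` and joins basic phonetic
#' features from `ipa_symbols`. Features are matched on each phone's base
#' symbol, i.e. the phone with diacritics removed and only its first
#' character kept.
#'
#' @param phlist A list of phones or the output of `phonetise()`.
#'
#' @return A tibble with one row per unique phone, containing `phone`, `count`,
#'   `base`, and the feature columns of `ipa_symbols`, sorted by ascending
#'   count.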
#' @export
#'
#' @examples
#' ipa <- c("ada", "buba", "kiki", "sa\u0283a")
#' ip_ph <- phonetise(ipa)
#' featurise(ip_ph)
#'
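featurise <- function(phlist) {
    feats <- tibble::tibble(
        # Flatten the (possibly nested) list into a single vector of phones.
        phone = unlist(phlist)
    ) %>%
    # One row per unique phone, with its frequency, sorted by ascending count.
    dplyr::count(phone, name = "count") %>%
    dplyr::arrange(count) %>%
    dplyr::mutate(
        # Strip diacritics to obtain the base symbol.
        base = stringr::str_remove_all(phone, rm_diacritics_regex),
        # If more than one character remains, keep only the first.
        base = ifelse(
            stringr::str_count(base) > 1,
            stringr::str_sub(base, 1, 1),
            base
        )
    ) %>%
    # Attach features; phones with no match in `ipa_symbols` get NA features.
    dplyr::left_join(y = ipa_symbols, by = c("base" = "IPA"))

    return(feats)
}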
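
The steps in `featurise()` (count, strip diacritics, reduce to one base character, join features) can be sketched in base R without the package installed. This is only an illustration under stated assumptions: `demo_diacritics_regex` and `demo_symbols` are hypothetical stand-ins for the package's `rm_diacritics_regex` and `ipa_symbols`, whose real contents are not shown here, and `merge()` replaces `dplyr::left_join()`, so row order differs from the real function.

```r
# Hypothetical stand-ins: the package's real `rm_diacritics_regex` and
# `ipa_symbols` are not shown here, so both objects below are invented.
demo_diacritics_regex <- "[\u02B0\u02D0]"  # only aspiration and length marks
demo_symbols <- data.frame(
  IPA = c("a", "k", "t"),
  type = c("vowel", "consonant", "consonant"),
  stringsAsFactors = FALSE
)

phones <- c("k\u02B0", "a", "t", "a\u02D0", "k\u02B0", "ts", "\u0283")

# Count each distinct phone (the role of dplyr::count()).
counts <- as.data.frame(
  table(phone = phones),
  responseName = "count",
  stringsAsFactors = FALSE
)

# Strip the demo diacritics, then keep only the first remaining character.
counts$base <- gsub(demo_diacritics_regex, "", counts$phone)
counts$base <- ifelse(
  nchar(counts$base) > 1,
  substr(counts$base, 1, 1),
  counts$base
)

# Left join on the base symbol; unmatched phones keep NA in `type`.
feats <- merge(counts, demo_symbols, by.x = "base", by.y = "IPA", all.x = TRUE)
print(feats)
```

In this sketch the final phone has no entry in the demo table, so its `type` is `NA`, which mirrors how a left join keeps phones that are missing from the feature table.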

phonetisr documentation built on April 3, 2025, 10:49 p.m.