match_word | R Documentation |
The match_word() function has been replaced by match_item(). This wrapper has been kept for backwards compatibility with code written for older versions of LexOPS. It may be removed in the future, so please update your code to use match_item().
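Migrating is usually a matter of renaming the call. The sketch below assumes that match_item() accepts the same arguments as this wrapper, which the description implies but does not spell out, and that the LexOPS package is installed. If it is not installed, the call is skipped.

```r
# Assumption: match_item() takes the same arguments as match_word().
# The call runs only if LexOPS is installed; otherwise `matches` stays NULL.
matches <- NULL
if (requireNamespace("LexOPS", quietly = TRUE)) {
  # Old call, kept only for backwards compatibility:
  #   LexOPS::match_word(LexOPS::lexops, "thicket", Syllables.CMU)
  # Updated call:
  matches <- LexOPS::match_item(LexOPS::lexops, "thicket", Syllables.CMU)
}
```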
match_word(
  df = LexOPS::lexops,
  target,
  ...,
  id_col = "string",
  filter = TRUE,
  standard_eval = FALSE
)
df | A data frame to reorder, containing the target string (default = LexOPS::lexops). |
target | The target string. |
... | Should specify the variables and tolerances in the form |
id_col | A character vector specifying the column identifying unique observations (e.g. in |
filter | Logical. If TRUE, matches outside the tolerances specified in ... are removed. If FALSE, a new column, matchFilter, is added, indicating whether or not the string is within all variables' tolerances. (Default = TRUE.) |
standard_eval | Logical; bypasses non-standard evaluation and allows a more standard R object (a list) to be supplied. If |
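As a rough illustration of what a tolerance means, the base R sketch below reimplements the idea on a toy data frame. It is not the LexOPS implementation: the data, the column values, and the helper within_tolerance() are hypothetical. It assumes, as the examples further down suggest, that a numeric tolerance is an offset from the target's own value and that a variable given without a tolerance must match exactly.

```r
# Toy data; values are made up for illustration only.
toy <- data.frame(
  string = c("thicket", "ticket", "bracket", "cat"),
  Length = c(7, 6, 7, 3),
  Zipf   = c(2.9, 4.8, 3.0, 5.2),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE where x lies within
# [target_value + lower, target_value + upper].
within_tolerance <- function(x, target_value, lower = 0, upper = 0) {
  x >= target_value + lower & x <= target_value + upper
}

target_row <- toy[toy$string == "thicket", ]

# Length must match exactly; Zipf may differ by up to 0.2 either way.
is_match <- within_tolerance(toy$Length, target_row$Length) &
  within_tolerance(toy$Zipf, target_row$Zipf, -0.2, 0.2)

# Drop the target from its own candidate list (a choice made for this sketch).
candidates <- toy$string[is_match & toy$string != "thicket"]
candidates
```

Expressing each tolerance as a lower and upper offset keeps exact matching as the special case where both offsets are zero.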
Returns a data frame based on df. If filter == TRUE, it will only contain matches. If filter == FALSE, it will be the original df object, with a new column, "matchFilter".
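The two return shapes can be pictured with a small base R sketch. It is an illustration on made-up data, not the package's code: only the matchFilter column name comes from the description here, and both the helper apply_filter() and the precomputed within_tol flag are hypothetical.

```r
# Made-up data; `within_tol` stands in for "within all variables' tolerances".
toy <- data.frame(
  string     = c("ticket", "bracket", "cat"),
  within_tol = c(FALSE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Hypothetical helper mimicking the documented behaviour of `filter`.
apply_filter <- function(df, is_match, filter = TRUE) {
  if (filter) {
    df[is_match, , drop = FALSE]   # only the matching rows are returned
  } else {
    df$matchFilter <- is_match     # every row kept, flagged TRUE or FALSE
    df
  }
}

only_matches <- apply_filter(toy["string"], toy$within_tol, filter = TRUE)
all_flagged  <- apply_filter(toy["string"], toy$within_tol, filter = FALSE)
```

Keeping every row with a flag is convenient when the non-matches are still needed later, for example to inspect how many candidates a given tolerance excludes.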
lexops for the default data frame and associated variables.
# Match by number of syllables exactly
lexops |>
  match_word("thicket", Syllables.CMU)
# Match by number of syllables exactly, but keep all entries in the original dataframe
lexops |>
  match_word("thicket", Syllables.CMU, filter = FALSE)
# Match by number of syllables exactly, and rhyme
lexops |>
  match_word("thicket", Syllables.CMU, Rhyme.CMU)
# Match by length exactly, and closely by frequency (within 0.2 Zipf either way)
lexops |>
  match_word("thicket", Length, Zipf.SUBTLEX_UK = -0.2:0.2)
# The syntax makes matching by multiple variables easy and readable
lexops |>
  match_word(
    "elephant",
    BG.SUBTLEX_UK = -0.005:0.005,
    Length = 0:0,
    Zipf.SUBTLEX_UK = -0.1:0.1,
    PoS.SUBTLEX_UK,
    RT.ELP = -10:10
  )
# Match using standard evaluation
lexops |>
  match_word(
    "thicket",
    list("Length", c("Zipf.SUBTLEX_UK", -0.2, 0.2)),
    standard_eval = TRUE
  )
# Find matches within an orthographic Levenshtein distance of 5 from "thicket":
library(dplyr)
library(stringdist)
targ_word <- "thicket"
lexops |>
  mutate(old = stringdist(targ_word, string, method = "lv")) |>
  match_word(targ_word, old = 0:5)
# Find matches within a phonological Levenshtein distance of 2 from "thicket":
# (note that this method requires 1-letter phonological transcriptions)
library(dplyr)
library(stringdist)
targ_word <- "thicket"
# Look up the target's 1-letter phonological transcription
targ_word_pronun <- lexops |>
  filter(string == targ_word) |>
  pull(eSpeak.br_1letter)
lexops |>
  mutate(pld = stringdist(targ_word_pronun, eSpeak.br_1letter, method = "lv")) |>
  match_word(targ_word, pld = 0:2)