euc_dists | R Documentation |
Calculates the Euclidean distance of a word from all other words in a data frame, on selected variables.
euc_dists(
  df = LexOPS::lexops,
  target,
  vars = "all",
  scale = TRUE,
  center = TRUE,
  weights = NA,
  standardise_weights = TRUE,
  id_col = "string",
  standard_eval = FALSE
)
df |
A data frame. |
target |
The target string (word) for which Euclidean distances are required. |
vars |
The variables to be used as dimensions over which Euclidean distance should be calculated. Can be a vector of variable names (e.g. |
scale , center |
How should variables be scaled and/or centred before calculating Euclidean distance? For options, see the |
weights |
An (optional) list of weights, in the same order as |
standardise_weights |
Logical; should the weights be standardised to average to 1 (i.e., sum to the length of |
id_col |
The column containing the strings (default = "string"). |
standard_eval |
Logical; bypasses non-standard evaluation, and allows more standard R objects in |
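As an illustration of what standardising weights "to average to 1" means, here is a minimal base R sketch. The helper standardise_w is hypothetical and is not part of LexOPS; it only shows the arithmetic described for standardise_weights, not how the package applies the weights internally.

```r
# Hypothetical helper (not a LexOPS function): rescale weights so that
# they average to 1, i.e. so that they sum to the number of weights.
standardise_w <- function(w) {
  w <- unlist(w)            # accept either a list or a numeric vector
  w * length(w) / sum(w)    # after rescaling, sum(w) equals length(w)
}

w <- standardise_w(list(1, 2, 3))
w        # 0.5 1.0 1.5
mean(w)  # 1
sum(w)   # 3
```

Under this reading, only the relative sizes of the weights matter once they are standardised.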
Returns a vector of Euclidean distances, in the order of rows in df.
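To make the returned vector concrete, the following base R sketch computes distances from a target row on a toy data frame. It is an illustration under assumptions, not the LexOPS implementation: the helper toy_euc_dists and the toy columns x and y are made up, scaling and centring are assumed to behave like base::scale(), the target is assumed to match exactly one row, and weights are left out.

```r
# Toy data; the column names x and y are made up for illustration.
toy <- data.frame(
  string = c("a", "b", "c"),
  x = c(1, 2, 3),
  y = c(10, 20, 60)
)

# Hypothetical helper (not the LexOPS implementation).
toy_euc_dists <- function(df, target, vars, id_col = "string") {
  # Centre and scale each selected column (assumed to mirror base::scale()).
  m <- scale(as.matrix(df[, vars]), center = TRUE, scale = TRUE)
  # Scaled values of the target row (assumes exactly one matching row).
  t_row <- m[df[[id_col]] == target, ]
  # Subtract the target's values from every row, column by column.
  diffs <- sweep(m, 2, t_row)
  # Euclidean distance: square root of the summed squared differences.
  sqrt(rowSums(diffs^2))
}

d <- toy_euc_dists(toy, "b", c("x", "y"))
d  # one distance per row of toy, in row order; the entry for "b" is 0
```

The result has one element per row, in the same order as the input, which is what allows the distances to be attached to the data frame as a new column.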
# Get the distance of every entry in the `lexops` dataset from the word "thicket".
# (Note: This will be calculated using the dimensions of frequency, arousal, and size)
lexops |>
  euc_dists("thicket", c(Zipf.SUBTLEX_UK, AROU.Warriner, SIZE.Glasgow_Norms))
# No scaling or centering
lexops |>
  euc_dists(
    "thicket",
    c(Zipf.SUBTLEX_UK, AROU.Warriner, SIZE.Glasgow_Norms),
    scale = FALSE,
    center = FALSE
  )
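# Add Euclidean distance as a new column
# (Also sort in ascending order of distance; "barbara" has a distance of 0 from itself, so it comes first)
# (The magrittr pipe `%>%` is used in the first step because the `.` placeholder is needed inside mutate())
lexops %>%
  dplyr::mutate(ed = euc_dists(., "barbara", c(Length, Zipf.SUBTLEX_UK, BG.SUBTLEX_UK))) |>
  dplyr::arrange(ed)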
# Bypass non-standard evaluation
lexops |>
  euc_dists(
    "thicket",
    c("Zipf.SUBTLEX_UK", "AROU.Warriner", "SIZE.Glasgow_Norms"),
    standard_eval = TRUE
  )