split_by: Add a split to the data.
In JackEdTaylor/LexOPS: A Package and Shiny App for Generating Matched Stimuli

split_by

R Documentation

Description

Specifies splits for one IV for a factorial design. Can be called multiple times for multiple splits.

Usage

split_by(x, var, levels, filter = TRUE, standard_eval = FALSE)

Arguments

`x`	A data frame containing the IV and strings, or a LexOPS_pipeline object resulting from one of `split_by()`, `control_for()`, etc.
`var`	The column to treat as an independent variable (non-standard evaluation).
`levels`	The boundaries to use as levels of this variable (non-standard evaluation). Specify these in the form `1:3 ~ 4:6 ~ 7:9` or `c(1, 3) ~ c(4, 6) ~ c(7, 9)` for numeric variables, and (e.g.) `"noun" ~ "verb" ~ "adjective"` for character variables, with levels separated by the `~` operator. Levels must be non-overlapping.
`filter`	Logical. If `TRUE`, strings that fit none of the levels are removed. Default = `TRUE`.
`standard_eval`	Logical; bypasses non-standard evaluation, and allows more standard R objects in `var` and `levels`. If `TRUE`, `var` should be a character vector naming a column in `x` (e.g. `"Zipf.SUBTLEX_UK"`), and `levels` should be a list with one element per level. For a numeric variable, each element is a vector of length 2 giving the boundaries of that level's bin (e.g. `list(c(1, 3), c(4, 6), c(7, 20))`). Default = `FALSE`.

Value

Returns the data supplied in `x`, with a new column (its name is set by the `cond_col` argument of `set_options()`) identifying which level of the IV each string belongs to.
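
To illustrate what a split amounts to, the base R sketch below labels each row of a toy data frame with the level whose bounds contain its value, then drops rows that match no level (the behaviour described for `filter = TRUE`). It is not LexOPS's implementation. The helper `assign_level()`, the toy data, the column name `cond`, the `A1`-style labels, and the assumption that bounds are inclusive are all invented for this example.

```r
# Hypothetical helper, not part of LexOPS: label each value with the
# level whose bounds contain it, or NA if no level matches.
# Assumption: bounds are inclusive at both ends.
assign_level <- function(values, bins, prefix = "A") {
  out <- rep(NA_character_, length(values))
  for (i in seq_along(bins)) {
    # min()/max() accept both the 1:3 and the c(1, 3) style of bounds
    lower <- min(bins[[i]])
    upper <- max(bins[[i]])
    in_bin <- !is.na(values) & values >= lower & values <= upper
    out[in_bin] <- paste0(prefix, i)
  }
  out
}

# Toy data standing in for a word list with a syllable count (made up)
toy <- data.frame(
  string = c("w1", "w2", "w3", "w4", "w5"),
  syllables = c(1, 3, 5, 8, 25),
  stringsAsFactors = FALSE
)

# Three non-overlapping levels: 1-3, 4-6 and 7-20
bins <- list(c(1, 3), c(4, 6), c(7, 20))
toy$cond <- assign_level(toy$syllables, bins)

# Analogue of filter = TRUE: drop rows that fit no level
toy_filtered <- toy[!is.na(toy$cond), ]
print(toy_filtered)
```

In this sketch the row with 25 syllables falls outside every bin, so it is labelled `NA` and removed by the filtering step; keeping such rows would correspond to `filter = FALSE`.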

Examples


# Create 3 levels of syllables, for 1-3, 4-6, and 7-20 syllables
lexops |>
  split_by(Syllables.CMU, 1:3 ~ 4:6 ~ 7:20)

# Same split as above, but supplying boundaries as vectors
lexops |>
  split_by(Syllables.CMU, c(1, 3) ~ c(4, 6) ~ c(7, 20))

# Create 2 levels of part of speech, noun and verb
lexops |>
  split_by(PoS.SUBTLEX_UK, "noun" ~ "verb")

# Split into two levels: (1) nouns or names, and (2) adjectives or adverbs
lexops |>
  split_by(PoS.SUBTLEX_UK, c("noun", "name") ~ c("adjective", "adverb"))

# Perform two splits
lexops |>
  split_by(Syllables.CMU, 1:3 ~ 4:6 ~ 7:20) |>
  split_by(PoS.SUBTLEX_UK, c("noun", "name") ~ c("adjective", "adverb"))

# Bypass non-standard evaluation
lexops |>
  split_by("Syllables.CMU", list(c(1, 3), c(4, 6), c(7, 20)), standard_eval = TRUE) |>
  split_by("PoS.SUBTLEX_UK", list(c("noun", "name"), "verb"), standard_eval = TRUE)

JackEdTaylor/LexOPS documentation built on Jan. 18, 2025, 10:37 a.m.