split_by: Add a split to the data.

View source: R/split_by.R

split_by    R Documentation

Add a split to the data.

Description

Specifies the levels (splits) of one independent variable (IV) in a factorial design. Call it once per IV to split by several variables.

Usage

split_by(x, var, levels, filter = TRUE, standard_eval = FALSE)

Arguments

x

A data frame containing the IV and strings, or a LexOPS_pipeline object resulting from one of split_by(), control_for(), etc.

var

The column to treat as an independent variable (non-standard evaluation).

levels

The boundaries to use as levels of this variable (non-standard evaluation). Levels are separated by the ~ operator. For numeric variables, specify them in the form 1:3 ~ 4:6 ~ 7:9 or c(1, 3) ~ c(4, 6) ~ c(7, 9); for character variables, in the form (e.g.) "noun" ~ "verb" ~ "adjective". Levels must be non-overlapping.

filter

Logical. If TRUE (the default), strings that fit none of the levels are removed.

standard_eval

Logical; bypasses non-standard evaluation, and allows more standard R objects in var and levels. If TRUE, var should be a character vector referring to a column in x (e.g. "Zipf.SUBTLEX_UK"), and levels should be a list containing multiple vectors of length 2, each specifying the boundaries of one level's bin (e.g. list(c(1, 3), c(4, 6), c(7, 20))). Default = FALSE.

Value

Returns the data in x, with a new column (name defined by the cond_col argument of set_options()) identifying which level of the IV each string belongs to.

See Also

lexops for the default data frame and associated variables.

Examples


# Create 3 levels of syllables, for 1-3, 4-6, and 7-20 syllables
lexops |>
  split_by(Syllables.CMU, 1:3 ~ 4:6 ~ 7:20)

# Same split as above, but supplying boundaries as vectors
lexops |>
  split_by(Syllables.CMU, c(1, 3) ~ c(4, 6) ~ c(7, 20))

# Create 2 levels of part of speech, noun and verb
lexops |>
  split_by(PoS.SUBTLEX_UK, "noun" ~ "verb")

# Split into two levels: (1) nouns or names, and (2) adjectives or adverbs
lexops |>
  split_by(PoS.SUBTLEX_UK, c("noun", "name") ~ c("adjective", "adverb"))

# Perform two splits
lexops |>
  split_by(Syllables.CMU, 1:3 ~ 4:6 ~ 7:20) |>
  split_by(PoS.SUBTLEX_UK, c("noun", "name") ~ c("adjective", "adverb"))

# Bypass non-standard evaluation
lexops |>
  split_by("Syllables.CMU", list(c(1, 3), c(4, 6), c(7, 20)), standard_eval = TRUE) |>
  split_by("PoS.SUBTLEX_UK", list(c("noun", "name"), "verb"), standard_eval = TRUE)
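
The following base-R sketch illustrates what inclusive, non-overlapping bins and filter = TRUE mean for a numeric IV. It is not the LexOPS implementation: the toy data frame, the level column, and the assign_level() helper are all hypothetical names used only for illustration.

```r
# Toy data standing in for a word list (hypothetical; not the lexops data frame)
words <- data.frame(
  string = c("cat", "banana", "university", "apple"),
  Syllables = c(1, 3, 5, 2),
  stringsAsFactors = FALSE
)

# Each level is an inclusive c(lower, upper) bin, in the same shape as the
# list passed to levels when standard_eval = TRUE
bins <- list(c(1, 2), c(4, 6))

# Hypothetical helper: label each value with the index of the bin it falls in,
# or NA if it falls in no bin
assign_level <- function(values, bins) {
  level <- rep(NA_integer_, length(values))
  for (i in seq_along(bins)) {
    in_bin <- values >= bins[[i]][1] & values <= bins[[i]][2]
    level[in_bin] <- i
  }
  level
}

words$level <- assign_level(words$Syllables, bins)

# Analogue of filter = TRUE: drop rows that fit no level
# ("banana", with 3 syllables, falls between the two bins)
filtered <- words[!is.na(words$level), ]
```

Because the bins do not overlap, each string receives at most one level, which is why the levels argument requires non-overlapping boundaries.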


JackEdTaylor/LexOPS documentation built on Jan. 18, 2025, 10:37 a.m.