term_prep: Create New Terms for Data

View source: R/S03_Utilities.R

term_prep R Documentation

Create New Terms for Data

Description

Takes a list that specifies base variables and the new terms to create from them, and updates a data frame accordingly. This makes it easy to pre-process multiple data frames in the same way before an analysis (e.g., when preparing training and validation sets).
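
The idea of processing several data frames "in the same manner" can be illustrated in base R. The sketch below is not the package's implementation. It assumes, following the comments in the Examples section, that the centering and scaling constants (mean and SD) are estimated from one data set and then reused for the other. The helper add_log_hp and the variable names train and valid are hypothetical and exist only for this illustration.

```r
# Base-R sketch only; this does not call or reproduce arfpam code.
data("mtcars")
train <- mtcars[seq(1, 32, 2), ]  # odd rows
valid <- mtcars[seq(2, 32, 2), ]  # even rows

# Estimate the scaling constants from the training set only
log_hp_mean <- mean(log(train$hp))
log_hp_sd <- sd(log(train$hp))

# Hypothetical helper: transform first, then scale with the supplied constants
add_log_hp <- function(df, m, s) {
  df$log_hp <- (log(df$hp) - m) / s
  df
}

# Apply the identical transformation to both data frames
train <- add_log_hp(train, log_hp_mean, log_hp_sd)
valid <- add_log_hp(valid, log_hp_mean, log_hp_sd)

# The training column has mean 0 and SD 1 by construction;
# the validation column is only approximately standardized.
round(c(mean(train$log_hp), sd(train$log_hp)), 3)
```

Reusing the training-set constants keeps the new term on the same scale in both data frames, so a model fit to one set can be applied to the other without rescaling.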

Usage

term_prep(x, settings, output = "x")

Arguments

x

A data frame.

settings

A named list of the form list(column = list(new1 = list(...), new2 = list(...))), where column is an existing variable in x and new1, new2, etc. are the new terms to create from column. To also combine new terms (e.g., into interaction effects), provide a list with two elements: (1) 'new', holding the terms to create, and (2) 'combo', holding the terms to combine (see the example).

output

A character string, the type of output to return, where 'x' (the default) returns an updated data frame, 'settings' returns an updated list, and 'both' returns a list with both the data frame and list of settings.

Value

Either a data frame, a list of settings, or a list with both the data frame and list of settings.

Examples

data("mtcars")
# Split into two sets of data
x1 <- mtcars[ seq( 1, 32, 2), ] # Odd rows
x2 <- mtcars[ seq( 2, 32, 2), ] # Even rows

lst <- list(
  new = list(
    mpg = list(
      outcome = term_new(
        label = 'Miles per gallon'
      )
    ),
    hp = list(
      log_hp = term_new(
        label = 'Log(Horsepower)',
        transformation = 'log(x)',
        scale = TRUE,
        order = c( 't', 's' )
      )
    ),
    vs = list(
      vs_0v1 = term_new(
        label = 'Engine: V-shaped vs. straight',
        coding = term_coding_effect( 1, 0 ),
        scale = TRUE,
        order = c( 'c', 's' )
      )
    ),
    am = list(
      am_0v1 = term_new(
        label = 'Transmission: Automatic vs. manual',
        coding = term_coding_effect( 1, 0 ),
        scale = TRUE,
        order = c( 'c', 's' )
      )
    )
  ),
  combo = list(
    vs_x_am = term_combo(
      combine = c( vs = 'vs_0v1', am = 'am_0v1' ),
      transformation = 'vs*am',
      scale = TRUE
    )
  )
)
# Add info on mean/SD from 'x1' data
lst <- x1 |> term_prep( lst, output = 'settings' )
# Update 'x1' and 'x2'
x1 <- x1 |> term_prep( lst )
x2 <- x2 |> term_prep( lst )

# Fit 'x1' data
fit <- lm( outcome ~ log_hp + vs_0v1 + am_0v1 + vs_x_am, data = x1 )
# Predict 'x2' data
predict( fit, newdata = x2 )

rettopnivek/arfpam documentation built on Oct. 20, 2024, 7:24 p.m.