turf: TURF Analysis

View source: R/turf.R

turf    R Documentation

TURF Analysis

Description

Runs a Total Unduplicated Reach and Frequency (TURF) analysis, with options for case weights, item weights, constraints on combinations, and a greedy algorithm with three entry methods.
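To make the two quantities concrete, here is a minimal base R sketch of reach and frequency for a single combination of items. The toy data and the helper combo_reach_freq are made up for illustration and are not part of the package; frequency is computed as the average number of selected items per respondent, which is one common definition and may differ from what turf reports.

```r
# Toy one-zero data: 5 respondents, 3 items (made-up values).
toy <- data.frame(
    A = c(1, 0, 0, 1, 0),
    B = c(0, 1, 0, 1, 0),
    C = c(0, 0, 0, 1, 1)
)

# Hypothetical helper, not part of the package.
combo_reach_freq <- function(data, combo) {
    # Number of items in the combination that each respondent selected
    hits <- rowSums(data[, combo, drop = FALSE])
    c(
        reach = mean(hits >= 1),  # share of respondents selecting at least one item
        freq  = mean(hits)        # assumed definition: average selections per respondent
    )
}

combo_reach_freq(toy, c("A", "B"))
# reach = 0.6, freq = 0.8
```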

Usage

turf(
  data,
  items,
  case_weights,
  item_weights,
  k = 1,
  depth = 1,
  force_in,
  force_in_together,
  force_out,
  force_out_together,
  greedy_begin = Inf,
  greedy_entry = "shapley",
  progress = FALSE
)

Arguments

data

A data frame.

items

Columns on which to run TURF. Must contain only ones, zeros, or NA. Consider checking the columns with is_onezero beforehand.

case_weights

An optional column of case weights to use in reach calculations. Rows with NA will be removed from the base.
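As a rough illustration of how case weights can enter a reach calculation, the base R sketch below computes a weighted reach for one combination. It assumes that "rows with NA" means rows whose weight is NA, and that reach is the weighted share of the remaining respondents; weighted_reach and the toy data are hypothetical, not the package's code.

```r
# Toy data: two items plus a weight column (made-up values).
toy <- data.frame(
    A = c(1, 0, 0, 0),
    B = c(0, 1, 0, 0),
    w = c(2, 1, NA, 1)
)

# Hypothetical helper, not part of the package.
weighted_reach <- function(data, combo, weight_col) {
    # Assumption: rows with a missing weight are dropped from the base
    data <- data[!is.na(data[[weight_col]]), , drop = FALSE]
    w <- data[[weight_col]]
    reached <- rowSums(data[, combo, drop = FALSE]) >= 1
    # Weighted share of the base that is reached
    sum(w[reached]) / sum(w)
}

weighted_reach(toy, c("A", "B"), "w")
# 0.75: weights 2 and 1 are reached, out of a base of 2 + 1 + 1
```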

item_weights

An optional named vector of non-zero weights to associate with each item. Items not specified will be given a default weight of 1.

Common examples are profit, revenue, or simply relative importance weights.

k

Set size: the number of items to choose in each combination. Can be a vector of values from 1 to the number of items; values outside that range are silently ignored. Decimal values are rounded down with floor. Default is 1.
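The handling of k described above can be sketched in base R as follows. The helper normalize_k is hypothetical, and the order of operations (rounding down first, then dropping out-of-range values) is an assumption. The last line shows how many combinations each set size produces before any constraints are applied.

```r
# Hypothetical helper, not part of the package.
normalize_k <- function(k, n_items) {
    k <- floor(k)               # round accidental decimals down
    k[k >= 1 & k <= n_items]    # drop values outside 1..n_items
}

k_clean <- normalize_k(c(0, 1, 2.7, 3, 12), n_items = 10)
k_clean
# 1 2 3

# Unconstrained number of combinations for each set size
choose(10, k_clean)
# 10 45 120
```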

depth

Number of items in a combination that a respondent must have selected in order to be considered "reached." Can be any number from 1 to the number of items. Default is 1.
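The effect of depth can be seen in a small base R sketch: the same combination reaches fewer respondents as the required number of selected items grows. The helper reach_at_depth and the toy data are illustrative assumptions, not the package's implementation.

```r
# Toy one-zero data: 4 respondents, 3 items (made-up values).
toy <- data.frame(
    A = c(1, 1, 0, 0),
    B = c(1, 0, 0, 1),
    C = c(1, 0, 0, 1)
)

# Hypothetical helper, not part of the package.
reach_at_depth <- function(data, combo, depth = 1) {
    # A respondent counts as reached when at least `depth` items were selected
    mean(rowSums(data[, combo, drop = FALSE]) >= depth)
}

by_depth <- sapply(1:3, function(d) reach_at_depth(toy, c("A", "B", "C"), depth = d))
by_depth
# 0.75 0.50 0.25
```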

force_in, force_in_together, force_out, force_out_together

Options for reducing the number of combinations by adding constraints.

force_in and force_out each accept a single tidyselect expression. Every combination is forced to include (force_in) or exclude (force_out) the items specified.

Use force_in_together or force_out_together to make items appear together or not appear together in the combinations. Pass any number of tidyselect expressions to together(). Duplicates, and inclusions or exclusions containing only one item, are silently dropped.
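The sketch below shows, in base R, how such constraints could reduce the set of candidate combinations. It is one plausible reading of the four options rather than the package's actual logic: it assumes "in together" means the items appear either all together or not at all, and "out together" means the items never all appear in the same combination. All object names are made up for illustration.

```r
items  <- c("A", "B", "C", "D", "E")
combos <- combn(items, 3, simplify = FALSE)   # all 10 three-item combinations

force_in     <- "A"            # must be in every combination
force_out    <- "E"            # must be in no combination
in_together  <- c("B", "C")    # assumed: both present or both absent
out_together <- c("C", "D")    # assumed: never both present

keep <- vapply(combos, function(cmb) {
    has_forced_in  <- all(force_in %in% cmb)
    has_forced_out <- any(force_out %in% cmb)
    n_together     <- sum(in_together %in% cmb)
    together_ok    <- n_together == 0 || n_together == length(in_together)
    apart_ok       <- !all(out_together %in% cmb)
    has_forced_in && !has_forced_out && together_ok && apart_ok
}, logical(1))

kept <- combos[keep]
kept
# Only A, B, C satisfies all four constraints
```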

greedy_begin

Set size at which the greedy algorithm takes over. Default is Inf, so the greedy algorithm is not used unless a finite value is supplied.

greedy_entry

Method for entering items into the greedy algorithm. Options are "shapley" (default), which uses approximated Shapley values, or "reach" or "freq", which use the combination with the highest reach or frequency, respectively, from the k[i-1] set size.
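For intuition, here is a minimal base R sketch of a stepwise (greedy) search in the spirit of the "reach" entry method: starting from the best combination found at the previous set size, it repeatedly adds the single item that yields the highest reach. This is a generic illustration under that assumption, not the package's algorithm; reach, greedy_by_reach, and the toy data are hypothetical.

```r
# Toy one-zero data: 8 respondents, 4 items (made-up values).
toy <- data.frame(
    A = c(1, 1, 0, 0, 0, 0, 0, 0),
    B = c(1, 0, 1, 0, 0, 0, 0, 0),
    C = c(0, 0, 0, 1, 1, 1, 1, 0),
    D = c(0, 0, 1, 0, 0, 0, 1, 1)
)

# Hypothetical helpers, not part of the package.
reach <- function(data, combo) {
    mean(rowSums(data[, combo, drop = FALSE]) >= 1)
}

greedy_by_reach <- function(data, start, k_max) {
    chosen <- start   # assumed: best combination from the previous set size
    while (length(chosen) < k_max) {
        candidates <- setdiff(names(data), chosen)
        # Reach of the current set plus each remaining item
        reaches <- vapply(
            candidates,
            function(item) reach(data, c(chosen, item)),
            numeric(1)
        )
        # Add the item giving the highest reach (first one wins on ties)
        chosen <- c(chosen, candidates[which.max(reaches)])
    }
    chosen
}

greedy_by_reach(toy, start = "A", k_max = 3)
# "A" "C" "D"
```

A greedy search evaluates far fewer combinations than an exhaustive one, at the cost of possibly missing the best combination at larger set sizes.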

progress

Whether to display progress. Default is FALSE. Adds slight processing overhead. Useful when the number of items exceeds 20.

Details

Need some dang details here.

Examples

library(dplyr)

# Simple 10-item TURF
x <- turf(FoodSample, Bisque:Ribeye, k = 1:10)

# With items forced in and out
# Forcing in "Ribeye"
# Forcing out items with an individual reach of < 10%
turf(
    data = FoodSample,
    items = Bisque:Ribeye,
    k = 1:10,
    force_in = Ribeye,
    force_out = where(~mean(.x, na.rm = TRUE) < 0.1)
)

# Forcing items in and out together, with case weights
# The greedy algorithm starts at set size 10 and enters items by reach
turf(
    data = FoodSample,
    items = Bisque:Ribeye,
    case_weights = weight,
    k = c(1:4, 6:10),
    force_in_together = together(
        c(Chicken, Salmon),
        c(Chili, Tofu, Turkey)
    ),
    force_out_together = together(
        matches("eye"),
        c(2, 10)
    ),
    greedy_begin = 10,
    greedy_entry = "reach"
)

# Item weights
turf(
    data = FoodSample,
    items = 2:6,
    k = 1:6,  # only five items are selected, so k = 6 is silently ignored
    item_weights = c(
        Bisque = 1.2,
        Chicken = 2.5,
        Tofu = 2.9,
        Chili = 1.7,
        PorkChop = 3.0
    )
)


ttrodrigz/onezero documentation built on May 9, 2023, 2:59 p.m.