TrimControls: Helper to trim controls for shorter run-times with minimal...

View source: R/pre_processing_data.R

TrimControls    R Documentation

Helper to trim controls for shorter run-times with minimal loss of precision.

Description

[Experimental]

As the number of controls in a Geo test increases, so do the model's complexity and the algorithm's run-time. However, adding more control locations yields diminishing marginal returns, especially when their time-series are very similar. TrimControls provides a method to trim the number of controls in order to reduce run-times with minimal loss of precision. In general, it is recommended to have 4 to 5 times as many control locations as test locations.
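
The 4x to 5x guideline above can be turned into a candidate range for max_controls. The snippet below is only a sketch of that arithmetic; the location names are hypothetical and are not taken from the package.

```r
# Rule of thumb from the description: keep roughly 4x-5x as many
# control locations as test locations.
test_locations <- c("chicago", "portland")  # hypothetical test geos
n_test <- length(test_locations)

# Candidate range for max_controls under the 4x-5x guideline.
max_controls_range <- c(lower = 4 * n_test, upper = 5 * n_test)
print(max_controls_range)
```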

Usage

TrimControls(
  data,
  Y_id = "Y",
  time_id = "time",
  location_id = "location",
  max_controls = 20,
  test_locations = c(),
  forced_control_locations = c()
)

Arguments

data

A data.frame containing the historical conversions by geographic unit. It requires a "location" column with the geo name, a "Y" column with the outcome data (units), a "time" column with the indicator of the time period (starting at 1), and covariates. These column names are the defaults; other names can be supplied through location_id, Y_id, and time_id.

Y_id

Name of the outcome variable (String).

time_id

Name of the time variable (String).

location_id

Name of the location variable (String).

max_controls

Maximum number of control locations to keep. A value of 4 to 5 times the number of test locations is recommended.

test_locations

List of test locations.

forced_control_locations

List of locations to be forced as controls.

Value

A data frame with reduced control locations.
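
Examples

The following is a minimal sketch of a call, not an example shipped with the package. It builds a synthetic panel in the shape described under Arguments (the geo names, number of periods, and outcome values are all made up) and passes it to TrimControls using the default column names from Usage. It assumes that GeoLift is installed and that TrimControls is exported; the call is skipped otherwise.

```r
set.seed(42)  # only so the synthetic data is reproducible

# Synthetic long-format panel: 30 hypothetical geos observed for 20 periods.
geos <- sprintf("geo_%02d", 1:30)
n_periods <- 20
geo_data <- expand.grid(
  location = geos,
  time = seq_len(n_periods),  # time index starts at 1
  stringsAsFactors = FALSE
)

# Give each geo its own baseline level plus some noise.
baseline <- setNames(runif(length(geos), min = 100, max = 1000), geos)
geo_data$Y <- unname(baseline[geo_data$location]) +
  rnorm(nrow(geo_data), sd = 10)

test_locations <- c("geo_01", "geo_02")
max_controls <- 5 * length(test_locations)  # upper end of the 4x-5x guideline

# Call TrimControls only if GeoLift is available. The column-name
# arguments simply restate the defaults shown under Usage.
if (requireNamespace("GeoLift", quietly = TRUE)) {
  trimmed <- GeoLift::TrimControls(
    data = geo_data,
    Y_id = "Y",
    time_id = "time",
    location_id = "location",
    max_controls = max_controls,
    test_locations = test_locations,
    forced_control_locations = c("geo_03")
  )
  # Inspect how many distinct locations remain after trimming.
  print(length(unique(trimmed$location)))
}
```

Comparing the number of distinct locations before and after the call is a quick way to confirm that the control pool was actually reduced.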


facebookincubator/GeoLift documentation built on May 31, 2024, 10:09 a.m.