undid_stage_two: Runs UNDID stage two procedures

undid_stage_two R Documentation

Runs UNDID stage two procedures

Description

Based on the information in the received empty_diff_df.csv, computes the appropriate differences in mean outcomes at the local silo and saves them as filled_diff_df_$silo_name.csv. Also saves the trends data as trends_data_$silo_name.csv.

Usage

undid_stage_two(
  empty_diff_filepath,
  silo_name,
  silo_df,
  time_column,
  outcome_column,
  silo_date_format = NULL,
  consider_covariates = TRUE,
  filepath = tempdir(),
  anonymize_weights = FALSE,
  anonymize_size = 5
)

Arguments

empty_diff_filepath

A character filepath to the empty_diff_df.csv.

silo_name

A character indicating the name of the local silo. Ensure spelling is the same as it is written in the empty_diff_df.csv.

silo_df

A data frame of the local silo's data. Ensure any covariates are spelled the same in this data frame as they are in the empty_diff_df.csv.

time_column

A character which indicates the name of the column in the silo_df which contains the date data. Ensure the time_column references a column of character values.

outcome_column

A character which indicates the name of the column in the silo_df which contains the outcome of interest. Ensure the outcome_column references a column of numeric values.

silo_date_format

A character indicating the format in which the date strings in the time_column are written (for example, "yyyy"). Defaults to NULL.

consider_covariates

An optional logical value. If set to FALSE, all computations involving the covariates are skipped. Defaults to TRUE.

filepath

Character value indicating the filepath to save the CSV files. Defaults to tempdir().

anonymize_weights

A logical value (defaults to FALSE) which determines whether the counts n (the number of observations used to calculate a contrast/difference) and n_t (the number of treated observations used to calculate a contrast/difference) should be rounded.

anonymize_size

A numeric value. Counts will be rounded to the nearest multiple of this value if anonymize_weights is TRUE (with a minimum value for any count being set as the value given for anonymize_size).
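
The rounding rule described for anonymize_weights and anonymize_size can be sketched in base R as shown here. The helper name anonymize_counts is hypothetical and is not part of undidR; it only illustrates the documented behaviour, and the package's internal handling (for example, of exact ties) may differ.

```r
# Hypothetical helper, NOT part of undidR: illustrates the documented rule only.
anonymize_counts <- function(counts, anonymize_size = 5) {
  # Round each count to the nearest multiple of anonymize_size
  rounded <- round(counts / anonymize_size) * anonymize_size
  # Enforce the minimum: no count is reported below anonymize_size
  pmax(rounded, anonymize_size)
}

# Example counts chosen to avoid exact ties
anonymized <- anonymize_counts(c(2, 12, 13, 48), anonymize_size = 5)
print(anonymized)  # 5 10 15 50
```

With anonymize_size = 5, a count of 2 would round to 0, so it is raised to the minimum of 5 instead.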

Details

Covariates at the local silo should be renamed to match the spelling used in the empty_diff_df.csv.
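
As a minimal sketch of this preparation step, the base R code here renames a local covariate column and converts the time column to character values, as the time_column argument requires. The data frame, the column names (yr, coll, gender) and the target spelling (sex) are all made up for illustration; substitute the spellings that appear in your own empty_diff_df.csv.

```r
# Toy local data; every column name here is an assumption for illustration.
silo_df <- data.frame(
  yr = c(1990, 1991),
  coll = c(0.21, 0.25),
  gender = c("F", "M"),
  stringsAsFactors = FALSE
)

# Assumed mapping: names are the local spellings,
# values are the spellings used in empty_diff_df.csv.
rename_map <- c(gender = "sex")

# Rename only the columns that appear in the mapping
matched <- names(silo_df) %in% names(rename_map)
names(silo_df)[matched] <- unname(rename_map[names(silo_df)[matched]])

# time_column must reference a column of character values
silo_df$yr <- as.character(silo_df$yr)

str(silo_df)
```

Keeping the mapping in a named vector makes it straightforward to extend when several covariates need to be renamed.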

Value

A list of two data frames: the filled differences data frame and the trends data frame. Access them with $diff_df and $trends_data, respectively.

Examples

# Load data
silo_data <- silo71
empty_diff_path <- system.file("extdata/staggered", "empty_diff_df.csv",
                               package = "undidR")

# Run `undid_stage_two()`
results <- undid_stage_two(
  empty_diff_filepath = empty_diff_path,
  silo_name = "71",
  silo_df = silo_data,
  time_column = "year",
  outcome_column = "coll",
  silo_date_format = "yyyy"
)

# View results
head(results$diff_df)
head(results$trends_data)

# Clean up temporary files
unlink(file.path(tempdir(), c("filled_diff_df_71.csv",
                              "trends_data_71.csv")))

undidR documentation built on Feb. 17, 2026, 1:06 a.m.