test_segments: Hypothesis Testing of the Segmented Conditional Average...
In Netflix/sherlock: Causal Machine Learning for Segment Discovery and Analysis


Perform a one-sided hypothesis test of whether the estimated conditional average treatment effect (CATE) exceeds a given threshold. In particular, this is a test of the null hypothesis H0: CATE <= threshold against the alternative hypothesis H1: CATE > threshold. This function returns the relevant metrics (e.g., standard error, test statistic) for evaluating this hypothesis test, augmenting the input data.table.
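The mechanics of such a one-sided test can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the helper name `one_sided_test`, its output column names, and the use of a Wald-type statistic with a standard normal reference distribution are all assumptions made for this example.

```r
# Hypothetical helper, not part of sherlock: one-sided Wald-type test of
# H0: CATE <= threshold versus H1: CATE > threshold.
one_sided_test <- function(cate, se, threshold = 0) {
  stopifnot(is.numeric(cate), is.numeric(se), all(se > 0))
  # Standardized distance of the estimate above the threshold.
  test_stat <- (cate - threshold) / se
  # Upper-tail p-value, assuming a standard normal reference distribution.
  pval <- stats::pnorm(test_stat, lower.tail = FALSE)
  data.frame(cate = cate, se = se, test_stat = test_stat, pval = pval)
}

# Three made-up segment estimates and standard errors.
res <- one_sided_test(cate = c(0.30, -0.10, 0.05), se = c(0.10, 0.08, 0.05))
print(res)
```

Small p-values favor H1 (the segment's CATE exceeds the threshold), while a segment whose estimate falls below the threshold gets a p-value above 0.5 under this construction.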

test_segments(data_with_cate, threshold, segment_by, type = c("rule", "summary"))

`data_with_cate`	A `data.table` containing the input data, augmented with cross-validated nuisance parameter estimates and an estimate of the CATE. This input object should be created by successive calls to `set_est_data` and `est_cate`, or through a wrapper function that composes these function calls automatically.
`threshold`	A `numeric` value indicating the cutoff to be used in determining the treatment decision based on the estimated CATE. The default of zero assigns treatment to segments that ought to benefit from treatment while withholding treatment from those segments that ought to be harmed. It may be appropriate to adjust this value to suit the problem context.
`segment_by`	A `character` vector specifying the column names in `data_obs` that correspond to the covariates over which segmentation should be performed. This should be a strict subset of `baseline`.
`type`	A `character` string (of length one) indicating whether the hypothesis testing procedure is meant to assign the segment-specific rule to all units in the input `data_with_cate` or to return a simple table summarizing the segment-specific rule assignment and inference across only the segmentation strata.

A data.table, of one of two types:

Either the full input data, augmented with the segment-specific CATE, the estimated standard error of the estimated CATE, a test statistic for evaluating whether the CATE exceeds the threshold, and the segment-specific p-value for this test. Note that this table has the same number of rows as the input data, that is, there is a row for each observation, including the observation-level nuisance estimates.
A summary table containing identifying information for the segments as well as all of the same information noted above for the estimated CATE and the associated hypothesis test. Note that the length of this table matches the number of discovered segments, that is, it is a summary of the segment-specific information; consequently, all observation-level information is discarded from this summary table.
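The difference between the two return shapes can be illustrated on toy data with `data.table`. This is a rough sketch, not the package's code: the column names `segment`, `score`, `cate`, `se`, `test_stat`, and `pval` are hypothetical, and it assumes that the segment-level CATE is the within-segment mean of an observation-level score with standard error `sd / sqrt(n)`.

```r
library(data.table)

# Toy data: `segment` plays the role of a `segment_by` covariate and `score`
# is a hypothetical observation-level CATE score (both names are assumptions).
set.seed(1)
dt <- data.table(
  segment = rep(c("a", "b", "c"), times = c(40, 30, 30)),
  score = rnorm(100, mean = rep(c(0.5, 0, -0.5), times = c(40, 30, 30)))
)
threshold <- 0

# "summary"-style table: one row per segment, observation-level data dropped.
summary_tab <- dt[, .(
  cate = mean(score),
  se = sd(score) / sqrt(.N)
), by = segment]
summary_tab[, test_stat := (cate - threshold) / se]
summary_tab[, pval := pnorm(test_stat, lower.tail = FALSE)]

# "rule"-style table: one row per observation, with the segment-level
# quantities merged back onto the input rows.
rule_tab <- merge(dt, summary_tab, by = "segment", sort = FALSE)

nrow(summary_tab)  # equals the number of segments
nrow(rule_tab)     # equals the number of input rows
```

In this sketch the summary-style table is the natural choice for reporting, while the rule-style table keeps every observation alongside its segment's estimate, which is convenient for downstream row-level work.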

Netflix/sherlock documentation built on Dec. 17, 2021, 5:22 a.m.