data_screen: Data screening

View source: R/data_prep_utils.R

data_screen    R Documentation

Data screening

Description

This function takes a matrix of data and removes:

1. Variables without variation.
2. Dummy variables where one group is nearly empty (optionally, nearly empty in either of the two treatment groups).
3. Redundant (highly correlated) variables.
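The three screening steps can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's implementation: `screen_sketch` is a hypothetical helper, the rule "drop the later column of a highly correlated pair" is an assumed tie-break, and the treatment-group option is left out.

```r
# Hypothetical helper illustrating the three steps; NOT the code of data_screen().
screen_sketch <- function(data, bin_cut = 0.01, corr_cut = 0.99) {
  # Step 1: drop columns without variation (zero variance)
  keep <- apply(data, 2, function(x) stats::var(x) > 0)
  data <- data[, keep, drop = FALSE]

  # Step 2: drop 0/1 dummies where the share of ones is below bin_cut
  # or above 1 - bin_cut (one of the two groups is nearly empty)
  is_bin <- apply(data, 2, function(x) all(x %in% c(0, 1)))
  m <- colMeans(data)
  nearly_empty <- is_bin & (m < bin_cut | m > 1 - bin_cut)
  data <- data[, !nearly_empty, drop = FALSE]

  # Step 3: drop a column if its absolute correlation with any earlier
  # column exceeds corr_cut (assumed tie-break: keep the earlier column)
  cm <- abs(stats::cor(data))
  cm[lower.tri(cm, diag = TRUE)] <- 0
  redundant <- apply(cm, 2, function(x) any(x > corr_cut))
  data[, !redundant, drop = FALSE]
}

# Toy data: each problematic column triggers one of the steps
set.seed(1)
n <- 200
x1 <- rnorm(n)
X <- cbind(
  const   = rep(1, n),            # no variation (step 1)
  x1      = x1,
  x1_copy = 2 * x1 + 3,           # linear function of x1 (step 3)
  rare    = c(1, rep(0, n - 1)),  # dummy with 0.5% ones (step 2)
  d       = rep(c(0, 1), n / 2)   # balanced dummy, kept
)
screened <- screen_sketch(X)
colnames(screened)
```

In this toy example only `x1` and `d` survive the screening. The actual function may order or break ties differently, so its output on the same data is not guaranteed to match.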

Usage

data_screen(data, treat = NULL, bin_cut = 0.01, corr_cut = 0.99, print = FALSE)

Arguments

data

Matrix of the variables to be screened.

treat

Optional binary treatment vector. If provided, screening is done within treatment groups.

bin_cut

Cut-off fraction below which nearly empty binary variables are removed. Default 0.01.

corr_cut

Correlation cut-off above which highly correlated variables are removed. Default 0.99.

print

If TRUE, shows details about the removed variables at each step. Default FALSE.
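To illustrate why a treatment vector can matter for the dummy check, the sketch below flags a dummy that looks well populated overall but is empty within one treatment group. It assumes that "nearly empty" means a share of ones below `bin_cut` or above `1 - bin_cut` within a group; `nearly_empty_by_group` is a hypothetical helper, not part of the package.

```r
# Hypothetical helper: TRUE if the dummy is nearly empty in at least one
# treatment group (assumed reading of the within-group screening).
nearly_empty_by_group <- function(x, treat, bin_cut = 0.01) {
  m <- tapply(x, treat, mean)          # share of ones per treatment group
  any(m < bin_cut | m > 1 - bin_cut)
}

treat <- rep(c(0, 1), each = 100)
# No ones among the untreated, 50% ones among the treated
dummy <- c(rep(0, 100), rep(c(0, 1), 50))

overall_share <- mean(dummy)           # 0.25, passes an overall check
flagged <- nearly_empty_by_group(dummy, treat)
flagged                                # TRUE
```

Based on the usage shown above, the corresponding call with the package installed would pass the treatment vector through the `treat` argument, for example `data_screen(data, treat = treat, bin_cut = 0.01)`.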

Value

Screened matrix


MCKnaus/causalDML documentation built on Aug. 19, 2023, 5:47 p.m.