xtdml_data: Set up data for panel data approaches and up to two cluster...

xtdml_data    R Documentation

Set up data for panel data approaches and up to two cluster variables

Description

Double machine learning (DML) data-backend for data with cluster variables. xtdml_data sets up the data environment for panel data analysis with transformed variables.

xtdml_data objects can be initialized from a data.table or a data.frame. The following functions create a new instance of xtdml_data:

  • xtdml_data$new() for initialization from a data.table.

  • xtdml_data_from_data_frame() for initialization from a data.frame.

Active bindings

all_variables

(character())
All variables available in the data frame.

d_cols

(character())
The treatment variable.

dbar_col

(NULL, character())
The individual mean of the treatment variable.

data

(data.table)
Data object.

data_model

(data.table)
Internal data object that implements the causal panel model as specified by the user via y_col, d_cols, x_cols, dbar_col.

n_obs

(integer(1))
The number of observations.

n_treat

(integer(1))
The number of treatment variables.

treat_col

(character(1))
"Active" treatment variable in the multiple-treatment case.

x_cols

(character())
The covariates.

y_col

(character(1))
The outcome variable.

cluster_cols

(character())
The cluster variable(s).

n_cluster_vars

(integer(1))
The number of cluster variables.

approach

(character(1))
A character() ("fd-exact", "wg-approx" or "cre") specifying the panel data technique to apply to estimate the causal model. Default is "fd-exact".
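The three approach labels can be illustrated on a toy panel. The following is a minimal base R sketch, not the package's implementation: it assumes that "fd" refers to first differences, "wg" to the within-group (demeaning) transformation, and "cre" to the correlated random effects approach that uses individual means such as dbar_col. The data frame panel and its columns id, time and d are hypothetical.

```r
# Toy balanced panel: 2 units observed over 3 periods (hypothetical data).
panel <- data.frame(
  id   = c(1, 1, 1, 2, 2, 2),
  time = c(1, 2, 3, 1, 2, 3),
  d    = c(1, 3, 6, 2, 2, 5)
)
panel <- panel[order(panel$id, panel$time), ]

# First difference within each unit: d_it - d_i,t-1 (NA in the first period).
panel$d_fd <- ave(panel$d, panel$id, FUN = function(v) c(NA, diff(v)))

# Individual mean of the treatment, the kind of variable dbar_col refers to.
panel$d_bar <- ave(panel$d, panel$id, FUN = mean)

# Within-group transformation: deviation from the individual mean.
panel$d_wg <- panel$d - panel$d_bar

print(panel)
```

The same transformations would apply to the outcome and the covariates. How xtdml handles the first period, unbalanced panels, or the approximation implied by "wg-approx" is not shown here.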

transformX

(character(1))
A character() ("no", "minmax" or "poly") specifying the type of transformation to apply to the X data. "no" does not transform the covariates X and is recommended for tree-based learners. "minmax" applies the Min-Max normalization x' = (x-x_{min})/(x_{max}-x_{min}) to the covariates and is recommended with neural networks. "poly" adds polynomials up to order three and interactions between all possible combinations of two and three variables; this is recommended for Lasso. Default is "no".
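The Min-Max formula above can be sketched in base R as follows. The helper minmax() and the data frame X are hypothetical names used for illustration only and are not part of xtdml.

```r
# Min-Max normalization: x' = (x - x_min) / (x_max - x_min).
# Note: a constant column has x_max == x_min and would yield NaN.
minmax <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

# Hypothetical covariates.
X <- data.frame(x1 = c(2, 4, 6, 10), x2 = c(-1, 0, 3, 1))

# Apply the normalization column by column; every column ends up in [0, 1].
X_scaled <- as.data.frame(lapply(X, minmax))
print(X_scaled)
```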

Methods

Public methods


Method new()

Creates a new instance of this R6 class.

Usage
xtdml_data$new(
  data = NULL,
  x_cols = NULL,
  y_col = NULL,
  d_cols = NULL,
  dbar_col = NULL,
  cluster_cols = NULL,
  approach = NULL,
  transformX = NULL
)
Arguments
data

(data.table, data.frame())
Data object.

x_cols

(character())
The covariates.

y_col

(character(1))
The outcome variable.

d_cols

(character(1))
The treatment variable.

dbar_col

(NULL, character())
Individual mean of the treatment variable (used for the CRE approach). Default is NULL.

cluster_cols

(character())
The cluster variable(s).

approach

(character(1))
A character() ("fd-exact", "wg-approx" or "cre") specifying the panel data technique to apply to estimate the causal model. Default is "fd-exact".

transformX

(character(1))
A character() ("no", "minmax" or "poly") specifying the type of transformation to apply to the X data. "no" does not transform the covariates X and is recommended for tree-based learners. "minmax" applies the Min-Max normalization x' = (x-x_{min})/(x_{max}-x_{min}) to the covariates and is recommended with neural networks. "poly" adds polynomials up to order three and interactions between all possible combinations of two and three variables; this is recommended for Lasso. Default is "no".
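To give a sense of what a "poly"-style expansion produces, here is a rough base R sketch. It is one possible reading of the description (squares and cubes of each covariate plus products of every pair and triple of distinct covariates); the exact set of terms and the column names generated by xtdml may differ. The data frame X and the helper interactions() are hypothetical.

```r
# Hypothetical covariates.
X <- data.frame(x1 = c(1, 2, 3), x2 = c(0, 1, 2), x3 = c(2, 2, 1))
vars <- names(X)

# Powers 2 and 3 of each covariate.
powers <- do.call(cbind, lapply(vars, function(v) {
  out <- cbind(X[[v]]^2, X[[v]]^3)
  colnames(out) <- paste0(v, c("_sq", "_cu"))
  out
}))

# Products of every combination of m distinct covariates.
interactions <- function(m) {
  combos <- combn(vars, m, simplify = FALSE)
  out <- vapply(combos, function(cmb) Reduce(`*`, X[cmb]), numeric(nrow(X)))
  colnames(out) <- vapply(combos, paste, character(1), collapse = ":")
  out
}

# Original covariates, powers, two-way and three-way interactions.
X_poly <- cbind(X, powers, interactions(2), interactions(3))
print(names(X_poly))
```

With three covariates this already yields 13 columns, which is why such an expansion is usually paired with a regularized learner such as Lasso.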


Method print()

Print xtdml_data objects.

Usage
xtdml_data$print()

Method set_data_model()

Setter function for data_model. The function implements the causal model as specified by the user via y_col, d_cols, x_cols and cluster_cols and assigns the role for the treatment variables in the multiple-treatment case.

Usage
xtdml_data$set_data_model(treatment_var)
Arguments
treatment_var

(character())
Active treatment variable that will be set to treat_col.


Method clone()

The objects of this class are cloneable with this method.

Usage
xtdml_data$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.


xtdml documentation built on Sept. 9, 2025, 5:54 p.m.