X_RF-X_RF: X-Learner with honest RF for both stages


Description

An implementation of the X-learner that uses honest random forests in both the first and the second stage. The function returns an X_RF object.

Usage

X_RF(feat, tr, yobs, predmode = "propmean",
  relevant_Variable_first = 1:ncol(feat),
  relevant_Variable_second = 1:ncol(feat),
  relevant_Variable_prop = 1:ncol(feat), ntree_first = 1000,
  ntree_second = 1000, ntree_prop = 500, mtry_first = round(ncol(feat) *
  13/20), mtry_second = round(ncol(feat) * 17/20), mtry_prop = ncol(feat),
  min_node_size_spl_first = 2, min_node_size_ave_first = 1,
  min_node_size_spl_second = 5, min_node_size_ave_second = 6,
  min_node_size_spl_prop = 11, min_node_size_ave_prop = 33,
  splitratio_first = 1, splitratio_second = 0.8, splitratio_prop = 0.5,
  replace_first = TRUE, replace_second = TRUE, replace_prop = TRUE,
  sample_fraction_first = 0.8, sample_fraction_second = 0.7,
  sample_fraction_prop = 0.5, nthread = 0, verbose = TRUE,
  middleSplit_first = TRUE, middleSplit_second = TRUE,
  middleSplit_prop = FALSE)
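
The following is a minimal sketch of preparing inputs in the shapes the arguments require, using simulated data. The names n, p and tau and the data-generating model are hypothetical and only serve as illustration. The call to X_RF is guarded so the block also runs when the hte package is not installed, and it assumes X_RF is exported by the package.

```r
set.seed(1)
n <- 200   # hypothetical sample size
p <- 5     # hypothetical number of features

# feat: a data frame of all the features
feat <- as.data.frame(matrix(rnorm(n * p), nrow = n))

# tr: numeric vector, 0 for control and 1 for treated
tr <- as.numeric(rbinom(n, size = 1, prob = 0.5))

# yobs: numeric vector of observed outcomes (illustrative model only)
tau <- 2 * feat[[1]]                      # hypothetical treatment effect
yobs <- feat[[2]] + tr * tau + rnorm(n)

# Fit only if the hte package is available; every other argument keeps
# the default shown in the Usage signature.
if (requireNamespace("hte", quietly = TRUE)) {
  xl <- hte::X_RF(feat = feat, tr = tr, yobs = yobs)
}
```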

Arguments

feat

A data frame of all the features.

tr

A numeric vector containing 0 for control units and 1 for treated units.

yobs

A numeric vector containing the observed outcomes.

predmode

One of "propmean", "control", "treated" or "extreme". Specifies how the two second-stage estimators are aggregated. The default, "propmean", weights them by the propensity score.
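
As a rough illustration of propensity score weighting, the sketch below combines two second-stage estimates into one. It assumes the convention from the X-learner literature, in which the estimator built from the control group is weighted by the propensity score and the other by its complement. Whether X_RF applies exactly this convention internally is an assumption, and all values and variable names are hypothetical.

```r
# Hypothetical second-stage treatment effect estimates for three units
tau_hat_0 <- c(1.0, 2.0, 3.0)   # estimator built from the control group
tau_hat_1 <- c(2.0, 2.0, 1.0)   # estimator built from the treated group

# Hypothetical estimated propensity scores P(tr = 1 | features)
prop_score <- c(0.5, 0.2, 0.75)

# Propensity-weighted average of the two estimators (assumed convention)
tau_hat <- prop_score * tau_hat_0 + (1 - prop_score) * tau_hat_1
print(tau_hat)
```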

relevant_Variable_first

Indices of the variables used in the first stage. Defaults to all columns of feat.

relevant_Variable_second

Indices of the variables used in the second stage. Defaults to all columns of feat.

ntree_first

Number of trees in the first stage.

ntree_second

Number of trees in the second stage.

mtry_first

Number of variables randomly selected as split candidates at each node in the first stage.

mtry_second

Number of variables randomly selected as split candidates at each node in the second stage.
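
The defaults for mtry_first, mtry_second and mtry_prop in the Usage signature depend on the number of columns of feat. The sketch below evaluates those default expressions for a hypothetical data frame with 20 features.

```r
# Hypothetical feature data frame with 20 columns
feat <- as.data.frame(matrix(0, nrow = 4, ncol = 20))

# Default expressions copied from the Usage signature
mtry_first  <- round(ncol(feat) * 13 / 20)
mtry_second <- round(ncol(feat) * 17 / 20)
mtry_prop   <- ncol(feat)

print(c(first = mtry_first, second = mtry_second, prop = mtry_prop))
```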

min_node_size_spl_first

Minimum node size in the first stage for the observations in the splitting set.

min_node_size_ave_first

Minimum node size in the first stage for the observations in the averaging set.

min_node_size_spl_second

Minimum node size in the second stage for the observations in the splitting set.

min_node_size_ave_second

Minimum node size in the second stage for the observations in the averaging set.

splitratio_first

Proportion of the training data used as the splitting dataset in the first stage.

splitratio_second

Proportion of the training data used as the splitting dataset in the second stage.
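
To make the role of the split ratio concrete, the sketch below partitions a set of observation indices into a splitting set and an averaging set. This is only an illustration of the idea behind honest estimation, not the package's internal code, and the variable names are hypothetical.

```r
set.seed(1)
n <- 100            # hypothetical number of training observations
splitratio <- 0.8   # proportion assigned to the splitting set

# Shuffle the indices, then cut them according to the split ratio
idx <- sample(n)
n_split <- round(splitratio * n)
splitting_idx <- idx[seq_len(n_split)]        # used to choose the splits
averaging_idx <- setdiff(idx, splitting_idx)  # used to compute leaf averages

print(c(splitting = length(splitting_idx), averaging = length(averaging_idx)))
```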

replace_first

Whether to sample with replacement in the first stage.

replace_second

Whether to sample with replacement in the second stage.

sample_fraction_first

Fraction of the training data to draw as the sample in the first stage.

sample_fraction_second

Fraction of the training data to draw as the sample in the second stage.
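
The sketch below shows how a replacement flag and a sample fraction could jointly determine the observations drawn for one tree. It assumes the fraction is applied to the number of training observations. It does not reproduce the package's internals, and the variable names are hypothetical.

```r
set.seed(1)
n <- 100                 # hypothetical number of training observations
sample_fraction <- 0.7   # share of observations to draw
replace <- TRUE          # draw with replacement

# Indices of the observations drawn for a single tree
tree_idx <- sample(n, size = round(sample_fraction * n), replace = replace)

print(length(tree_idx))
```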

nthread

Number of threads to use for parallel computation.

verbose

Whether to print messages about the training procedure.

Format

An object of class NULL of length 0.

Value

An 'X_RF' object.


soerenkuenzel/hte documentation built on June 12, 2018, 4:26 p.m.