Provides a method for assessing the performance of uplift models.
performance(pr.y1_ct1, pr.y1_ct0, y, ct, direction = 1, groups = 10)
pr.y1_ct1
the predicted probability Prob(y=1|treated, x).
pr.y1_ct0
the predicted probability Prob(y=1|control, x).
y
the actual observed value of the response.
ct
a binary (numeric) vector representing the treatment assignment (coded as 0/1).
direction
possible values are
groups
number of groups of equal observations in which to partition the data set to show results. The default value is 10 (deciles). Other possible values are 5 and 20.
Model performance is estimated in three steps:
1. Compute the difference between the predicted conditional class probabilities Prob(y=1|treated, x) and Prob(y=1|control, x).
2. Rank the differences and group them into 'buckets' with an equal number of observations each.
3. Compute the actual difference in the mean of the response variable between the treatment and the control groups for each bucket.
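The three steps can be sketched in base R as follows. This is an illustrative approximation, not the package's own implementation: the helper name `sketch_performance` is hypothetical, `y` is assumed to be binary (coded 0/1), and ranking the largest predicted difference into group 1 is an assumption about the ordering.

```r
# Hypothetical helper: a base-R sketch of the three steps described above.
# It is NOT the code used by uplift::performance.
sketch_performance <- function(pr.y1_ct1, pr.y1_ct0, y, ct, groups = 10) {
  # Step 1: difference in the predicted conditional class probabilities
  dif.pred <- pr.y1_ct1 - pr.y1_ct0

  # Step 2: rank the differences (largest first, an assumption) and
  # cut the ranks into `groups` buckets of equal size
  rk <- rank(-dif.pred, ties.method = "first")
  bucket <- ceiling(rk * groups / length(dif.pred))

  # Count how many observations in each bucket satisfy a logical condition
  count <- function(flag) as.vector(tapply(flag, bucket, sum))

  # Step 3: observed response rates per bucket, treated versus control
  n.ct1 <- count(ct == 1)
  n.ct0 <- count(ct == 0)
  n.y1_ct1 <- count(ct == 1 & y == 1)
  n.y1_ct0 <- count(ct == 0 & y == 1)
  r.y1_ct1 <- n.y1_ct1 / n.ct1
  r.y1_ct0 <- n.y1_ct0 / n.ct0

  cbind(group = seq_len(groups), n.ct1, n.ct0, n.y1_ct1, n.y1_ct0,
        r.y1_ct1, r.y1_ct0, uplift = r.y1_ct1 - r.y1_ct0)
}

# Usage on made-up data (random probabilities, for illustration only)
set.seed(1)
n <- 1000
ct <- rbinom(n, 1, 0.5)
p1 <- runif(n)
p0 <- runif(n)
y <- rbinom(n, 1, ifelse(ct == 1, p1, p0))
perf_sketch <- sketch_performance(p1, p0, y, ct, groups = 10)
print(perf_sketch)
```

Each row of the result describes one bucket, so the treated and control counts in a row add up to the bucket size (100 here).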
An object of class performance, which is a matrix with the following columns: (group) the number of groups, (n.ct1) the number of observations in the treated group, (n.ct0) the number of observations in the control group, (n.y1_ct1) the number of observations in the treated group with response = 1, (n.y1_ct0) the number of observations in the control group with response = 1, (r.y1_ct1) the mean of the response for the treated group, (r.y1_ct0) the mean of the response for the control group, and (uplift) the difference between r.y1_ct1 and r.y1_ct0 (if direction = 1).
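As a small worked illustration of how these columns relate, the sketch below builds a two-row toy matrix by hand. The counts are made up, a binary 0/1 response is assumed, and `perf_toy` is a hypothetical name; nothing here calls the package itself.

```r
# Made-up counts for two groups (illustration only, not real results)
perf_toy <- cbind(group    = 1:2,
                  n.ct1    = c(50, 48),
                  n.ct0    = c(50, 52),
                  n.y1_ct1 = c(30, 12),
                  n.y1_ct0 = c(20, 13))

# With a 0/1 response, the mean response is the share of responders
r.y1_ct1 <- perf_toy[, "n.y1_ct1"] / perf_toy[, "n.ct1"]
r.y1_ct0 <- perf_toy[, "n.y1_ct0"] / perf_toy[, "n.ct0"]

# Uplift is treated minus control, matching the direction = 1 case
perf_toy <- cbind(perf_toy, r.y1_ct1, r.y1_ct0,
                  uplift = r.y1_ct1 - r.y1_ct0)
print(perf_toy)
```

Selecting columns by name, as in `perf_toy[, "uplift"]`, is easier to read than positional indexing, provided the matrix carries column names.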
Leo Guelman <leo.guelman@gmail.com>
Guelman, L., Guillen, M., and Perez-Marin, A.M. (2013). Uplift random forests. Cybernetics & Systems, forthcoming.
library(uplift)
set.seed(123)
dd <- sim_pte(n = 1000, p = 20, rho = 0, sigma = sqrt(2), beta.den = 4)
dd$treat <- ifelse(dd$treat == 1, 1, 0)
### fit uplift random forest
fit1 <- upliftRF(y ~ X1 + X2 + X3 + X4 + X5 + X6 + trt(treat),
data = dd,
mtry = 3,
ntree = 100,
split_method = "KL",
minsplit = 200, # need small trees as there are strong uplift effects in the data
verbose = TRUE)
print(fit1)
summary(fit1)
### get variable importance
varImportance(fit1, plotit = TRUE, normalize = TRUE)
### predict on new data
dd_new <- sim_pte(n = 1000, p = 20, rho = 0, sigma = sqrt(2), beta.den = 4)
dd_new$treat <- ifelse(dd_new$treat == 1, 1, 0)
pred <- predict(fit1, dd_new)
### evaluate model performance
perf <- performance(pred[, 1], pred[, 2], dd_new$y, dd_new$treat, direction = 1)
plot(perf[, 8] ~ perf[, 1], type ="l", xlab = "Decile", ylab = "uplift")
Loading required package: RItools
Loading required package: SparseM
Attaching package: 'SparseM'
The following object is masked from 'package:base':
backsolve
Loading required package: MASS
Loading required package: coin
Loading required package: survival
Loading required package: tables
Loading required package: Hmisc
Loading required package: lattice
Loading required package: Formula
Loading required package: ggplot2
Attaching package: 'Hmisc'
The following objects are masked from 'package:base':
format.pval, round.POSIXt, trunc.POSIXt, units
Loading required package: penalized
Welcome to penalized. For extended examples, see vignette("penalized").
uplift: status messages enabled; set "verbose" to false to disable
upliftRF: starting. Wed Dec 13 08:34:06 2017
10 out of 100 trees so far...
20 out of 100 trees so far...
30 out of 100 trees so far...
40 out of 100 trees so far...
50 out of 100 trees so far...
60 out of 100 trees so far...
70 out of 100 trees so far...
80 out of 100 trees so far...
90 out of 100 trees so far...
Call:
upliftRF(formula = y ~ X1 + X2 + X3 + X4 + X5 + X6 + trt(treat),
data = dd, mtry = 3, ntree = 100, split_method = "KL", minsplit = 200,
verbose = TRUE)
Uplift random forest
Number of trees: 100
No. of variables tried at each split: 3
Split method: KL
$call
upliftRF(formula = y ~ X1 + X2 + X3 + X4 + X5 + X6 + trt(treat),
data = dd, mtry = 3, ntree = 100, split_method = "KL", minsplit = 200,
verbose = TRUE)
$importance
var rel.imp
1 X1 39.97286
2 X2 25.07182
3 X4 18.37845
4 X3 16.57687
$ntree
[1] 100
$mtry
[1] 3
$split_method
[1] "KL"
attr(,"class")
[1] "summary.upliftRF"
var rel.imp
1 X1 39.97286
2 X2 25.07182
3 X4 18.37845
4 X3 16.57687