upliftKNN
Implements k-nearest neighbor for uplift modeling.
upliftKNN(train, test, y, ct, k = 1, dist.method = "euclidean",
          p = 2, ties.meth = "min", agg.method = "mean")
train
a matrix or data frame of training set cases.
test
a matrix or data frame of test set cases. A vector will be interpreted as a row vector for a single case.
y
a numeric response variable (must be coded as 0/1 for binary response).
ct
a factor or numeric vector representing the treatment to which each training case is assigned. At least 2 groups are required (e.g. treatment and control). Multiple treatments are also supported.
k
the number of neighbors considered.
dist.method
the distance to be used in calculating the neighbors. Any method supported in function
p
the power of the Minkowski distance.
ties.meth
the method used to handle ties for the k-th neighbor. The default is "min", which uses all ties. Alternatives include "max", which uses none if there are ties for the k-th nearest neighbor; "random", which selects among the ties randomly; and "first", which uses the ties in their order in the data.
agg.method
the method used to combine the responses of the nearest neighbors. Defaults to "mean"; the alternative is "majority".
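The ties.meth options can be illustrated with base R's rank(), which accepts the same tie-breaking names. This is a minimal sketch of the idea, assuming neighbors are selected as the cases whose distance rank is at most k; it is not taken from the package source, and pick() and the distance values are made up for illustration.

```r
# Distances from one test case to six training cases (made-up values).
# Cases 2, 3 and 4 are tied for the second-nearest position.
d <- c(0.5, 1.2, 1.2, 1.2, 2.0, 3.1)
k <- 2

# Hypothetical helper: rank the distances under a tie rule, keep ranks <= k.
pick <- function(d, k, ties) which(rank(d, ties.method = ties) <= k)

nn_min   <- pick(d, k, "min")    # tied cases all get rank 2, so all are kept
nn_max   <- pick(d, k, "max")    # tied cases all get rank 4, so none is kept
nn_first <- pick(d, k, "first")  # tied cases are ranked in data order

print(nn_min)
print(nn_max)
print(nn_first)
```

With "min" the neighborhood can therefore contain more than k cases, and with "max" fewer than k.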
Performs k-nearest neighbor uplift modeling for a test set, given a training set. For each case in the test set, the k nearest training set vectors for each treatment type are found. The response values of these k nearest training vectors are aggregated using the function specified in agg.method. For "majority", classification is decided by majority vote (with ties broken at random).
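The procedure described above can be sketched in base R as follows. This is an illustrative approximation only, assuming Euclidean distance, "mean" aggregation, and ties resolved by data order; knn_uplift_sketch() is a hypothetical helper and is not the package's implementation.

```r
# Sketch only: per-treatment k-NN with mean aggregation.
# knn_uplift_sketch() is a hypothetical name, not part of the uplift package.
knn_uplift_sketch <- function(train, test, y, ct, k = 1) {
  train  <- as.matrix(train)
  test   <- as.matrix(test)
  groups <- sort(unique(ct))
  out <- matrix(NA_real_, nrow(test), length(groups),
                dimnames = list(NULL, as.character(groups)))
  for (i in seq_len(nrow(test))) {
    # Euclidean distance from test case i to every training case
    d <- sqrt(rowSums(sweep(train, 2, test[i, ])^2))
    for (g in seq_along(groups)) {
      idx <- which(ct == groups[g])
      # the k nearest training cases within this treatment group
      nn <- idx[order(d[idx])[seq_len(min(k, length(idx)))]]
      out[i, g] <- mean(y[nn])
    }
  }
  out
}

# Toy data: two well-separated clusters, each containing both treatment groups
train <- rbind(c(0, 0), c(0, 1), c(5, 5), c(5, 6))
y     <- c(0, 1, 1, 0)
ct    <- c(0, 1, 0, 1)
test  <- rbind(c(0.1, 0.2), c(5.1, 5.2))

pred <- knn_uplift_sketch(train, test, y, ct, k = 1)
print(pred)
```

Each row of pred corresponds to a test case and each column to a value of ct, mirroring the shape of the value described for upliftKNN.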
A matrix of predictions, with one row for each test case and one column for each value of ct.
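A common next step is to turn such a matrix into a per-case uplift estimate. The sketch below uses a hand-made matrix in place of real output and assumes that column "1" holds the treatment predictions and column "0" the control predictions; adjust the column names to match the coding of ct.

```r
# Hypothetical prediction matrix with one column per value of ct
pred <- matrix(c(1, 0, 0, 1,    # column "0": predicted response under control
                 0, 0, 1, 1),   # column "1": predicted response under treatment
               ncol = 2, dimnames = list(NULL, c("0", "1")))

# Estimated uplift: treatment prediction minus control prediction
uplift <- pred[, "1"] - pred[, "0"]

# Order test cases from highest to lowest estimated uplift
ranking <- order(uplift, decreasing = TRUE)

print(uplift)
print(ranking)
```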
The code logic closely follows the knn and knnflex packages, the latter currently discontinued from CRAN.
Leo Guelman <leo.guelman@gmail.com>
Su, X., Kang, J., Fan, J., Levine, R. A., and Yan, X. (2012). Facilitating score and causal inference trees for large observational studies. Journal of Machine Learning Research, 13(10): 2955-2994.
Guelman, L., Guillen, M., and Perez-Marin, A. M. (2013). Optimal personalized treatment rules for marketing interventions: A review of methods, a new proposal, and an insurance case study. Submitted.
library(uplift)

### Simulate data for uplift modeling
set.seed(1)
train <- sim_pte(n = 500, p = 10, rho = 0, sigma = sqrt(2), beta.den = 4)
train$treat <- ifelse(train$treat == 1, 1, 0)

### Fit an uplift k-nearest neighbor model and predict on test data
test <- sim_pte(n = 100, p = 10, rho = 0, sigma = sqrt(2), beta.den = 4)
test$treat <- ifelse(test$treat == 1, 1, 0)

fit1 <- upliftKNN(train[, 3:8], test[, 3:8], train$y, train$treat, k = 1,
                  dist.method = "euclidean", p = 2, ties.meth = "min",
                  agg.method = "majority")
head(fit1)
     0 1
[1,] 1 0
[2,] 0 0
[3,] 0 0
[4,] 1 0
[5,] 1 1
[6,] 0 0