Calculate the partial dependence of the response variable on a predictor variable from a random forest classification model. Rather than sequencing through values of the predictor of interest while holding the other predictors at their medians, this technique creates a replicate of the entire dataset for each level of the x variable, from its min to its max. This gives a more realistic idea of the magnitude and direction of the x variable's effect on the response.
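The replicate-and-predict idea described above can be sketched in a few lines of R. This is only an illustration under stated assumptions, not the package's implementation: pd_sketch, its probs argument, and the output column names (xvar, prob, pred) are hypothetical, df is assumed to be a plain data.frame, and xvar is assumed to be numeric.

```r
library(randomForest)

# Hypothetical helper, NOT the package's partialDep(): it only illustrates
# the "replicate the whole dataset per x value" idea.
pd_sketch <- function(model, df, xvar, n = 10, target.class = "1",
                      probs = c(0.05, 0.5, 0.95)) {
  # n evenly spaced values from the min to the max of the x variable
  grid <- seq(min(df[[xvar]]), max(df[[xvar]]), length.out = n)
  rows <- lapply(grid, function(v) {
    rep_df <- df          # full copy of the data
    rep_df[[xvar]] <- v   # overwrite the x variable with one fixed value
    # predicted probability of the target class for every replicated row
    p <- predict(model, newdata = rep_df, type = "prob")[, target.class]
    # summarise the distribution of predictions at this x value
    data.frame(xvar = v, prob = probs,
               pred = quantile(p, probs = probs, names = FALSE))
  })
  do.call(rbind, rows)
}

set.seed(1)
DF <- mtcars
DF$vs <- factor(DF$vs)
rf <- randomForest(vs ~ mpg + cyl + drat + qsec + disp + gear + carb + hp,
                   data = DF, ntree = 100)
out <- pd_sketch(rf, DF, "mpg", n = 5)
```

Because every other predictor keeps its observed values in each replicate, the spread of pred at a given x value reflects how the effect of the x variable varies across the real rows of the data, which is what the quantile bands summarise.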
partialDep(model, df, xvar, n = 10, target.class = "1", ci = c(0.9, 0.5, 0.3))
model | model object used to generate predictions. Currently only built and tested for random forest.
df | data.frame or data.table used to generate predictions.
xvar | character of length one; the x variable in
n | numeric of length one; number of values between the min and max of
target.class | character; which category (class) of the target variable to use for predictions.
ci | numeric; confidence intervals to compute around the median response.
A data.table of output. cnt refers to how many observations from df fall within the fixed-width bin specified by xvar.
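As a rough illustration of what a per-bin count such as cnt could look like, the sketch below counts observations in fixed-width bins using base R only. How partialDep actually defines its bins is not stated here, so the choice of n equal-width bins spanning the range of the x variable, and the names bin_start and bin_end, are assumptions.

```r
# Assumption: n equal-width bins spanning the min to the max of the x variable.
x <- mtcars$mpg
n <- 10
breaks <- seq(min(x), max(x), length.out = n + 1)

# Bin index (1..n) for every observation; all.inside = TRUE keeps the
# maximum value in the last bin instead of opening an extra one.
idx <- findInterval(x, breaks, all.inside = TRUE)

# Number of observations per bin, including empty bins
cnt <- tabulate(idx, nbins = n)

bins <- data.frame(bin_start = breaks[-(n + 1)],
                   bin_end   = breaks[-1],
                   cnt       = cnt)
```

A count like this is useful when reading the output, since predictions in sparsely populated bins rest on few real observations.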
library('randomForest')
library('data.table')
DF <- mtcars
DF$vs <- factor(DF$vs)  # classification needs a factor response
# randomForest's argument for the number of trees is `ntree`
rf <- randomForest(vs~mpg+cyl+drat+qsec+disp+gear+carb+hp, DF, ntree=100)
pd <- partialDep(model=rf, df=DF, xvar='mpg')
pd[ci==0.5,] # median of response when sequenced through 'mpg'
## Plotting
plot(pd[cilev==0, xvar], pd[cilev==0, pred], type='l', ylim=c(0,1))
lines(pd[ci==.95, xvar], pd[ci==.95, pred], type='l', col='red')
lines(pd[ci==.05, xvar], pd[ci==.05, pred], type='l', col='green')