View source: R/average_predicted.R
average_predicted | R Documentation |
Calculates average predictions over the values of one or multiple features specified by X. Shows the combined effect of a feature and other (correlated) features.
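The idea behind "average predictions over the values of a feature" can be sketched in base R for a single discrete feature. This is only an illustration under made-up data, not the package's implementation; the names `toy` and `avg_pred` are hypothetical.

```r
# Toy data (hypothetical): x2 is correlated with x1 by construction
toy <- data.frame(
  x1 = c("a", "a", "b", "b", "c", "c"),
  x2 = c(1, 2, 3, 4, 5, 6)
)

# Pretend predictions from a model that uses x2 only
pred <- 10 * toy$x2

# Average prediction per value of x1
avg_pred <- tapply(pred, toy$x1, mean)
avg_pred
```

Although the pretend model ignores x1, the averages differ across its values because x2 moves with x1. This is the sense in which the statistic shows the combined effect of a feature and its correlated companions.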
average_predicted(
X,
pred,
w = NULL,
x_name = "x",
breaks = "Sturges",
right = TRUE,
discrete_m = 13L,
outlier_iqr = 2,
seed = NULL,
...
)
X | A vector, matrix, or data.frame with features. |
pred | A numeric vector of predictions. |
w | An optional numeric vector of weights. Having observations with non-positive weight is equivalent to excluding them. |
x_name | If |
breaks | An integer, vector, or "Sturges" (the default) used to determine bin breaks of continuous features. Values outside the total bin range are placed in the outermost bins. To allow varying values of |
right | Should bins be right-closed? The default is TRUE. |
discrete_m | Numeric features with up to this number of unique values should not be binned but rather treated as discrete. The default is 13. Vectorized over |
outlier_iqr | If |
seed | Optional integer random seed used for calculating breaks: The bin range is determined without values outside quartiles +- 2 IQR using a sample of <= 9997 observations to calculate quartiles. |
... | Currently unused. |
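The interplay of breaks, right, and the outlier rule described for seed can be sketched in base R. The sketch assumes that breaks are derived with hist() and that out-of-range values are clamped before binning; the package may proceed differently, and the sampling step controlled by seed is left out.

```r
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 100)  # 100 is an outlier

# Keep only values within quartiles +- 2 IQR when determining the bin range
q <- quantile(x, probs = c(0.25, 0.75), names = FALSE)
iqr <- q[2] - q[1]
ok <- x >= q[1] - 2 * iqr & x <= q[2] + 2 * iqr

# hist() turns the Sturges bin count into "pretty" break points
br <- hist(x[ok], breaks = "Sturges", plot = FALSE)$breaks

# Values outside the total bin range are moved into the outermost bins
x_clamped <- pmin(pmax(x, min(br)), max(br))
bins <- cut(x_clamped, breaks = br, right = TRUE, include.lowest = TRUE)
table(bins)
```

With right = TRUE each bin is closed on its right edge, so the clamped outlier falls into the last bin instead of being dropped.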
The function is a convenience wrapper around feature_effects().
A list (of class "EffectData") with a data.frame per feature having columns:
bin_mid: Bin mid points. In the plots, the bars are centered around these.
bin_width: Absolute width of the bin. In the plots, these equal the bar widths.
bin_mean: For continuous features, the (possibly weighted) average feature value within bin. For discrete features equivalent to bin_mid.
N: The number of observations within bin.
weight: The weight sum within bin. When w = NULL, equivalent to N.
Different statistics, depending on the function call.
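How these columns relate to one another can be sketched for one continuous feature in base R. The data are made up, the helper `wmean` is hypothetical, and the statistic column name `pred_mean` is an assumption; the code illustrates the definitions above, not the package internals.

```r
x    <- c(0.5, 1.5, 2.5, 3.5, 3.9)  # feature values
pred <- c(1, 2, 3, 4, 6)            # predictions
w    <- c(1, 1, 2, 1, 3)            # weights
br   <- c(0, 2, 4)                  # two bins: [0, 2] and (2, 4]

bin <- cut(x, breaks = br, right = TRUE, include.lowest = TRUE)

# Weighted mean per bin, computed as sum(w * v) / sum(w)
wmean <- function(v) as.numeric(tapply(w * v, bin, sum) / tapply(w, bin, sum))

out <- data.frame(
  bin_mid   = (br[-1] + br[-length(br)]) / 2,
  bin_width = diff(br),
  bin_mean  = wmean(x),
  N         = as.numeric(table(bin)),
  weight    = as.numeric(tapply(w, bin, sum)),
  pred_mean = wmean(pred)  # assumed column name for the average prediction
)
out
```

With w = NULL, all weights would be 1 and the weight column would coincide with N.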
Use single bracket subsetting to select part of the output. Note that each data.frame contains an attribute "discrete" with the information whether the feature is discrete or continuous. This attribute might be lost when you manually modify the data.frames.
Apley, Daniel W., and Jingyu Zhu. 2020. Visualizing the Effects of Predictor Variables in Black Box Supervised Learning Models. Journal of the Royal Statistical Society Series B: Statistical Methodology, 82 (4): 1059–1086. doi:10.1111/rssb.12377.
feature_effects()
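# Fit a linear model on iris, then average its predictions per feature
fit <- lm(Sepal.Length ~ ., data = iris)
M <- average_predicted(iris[2:5], pred = predict(fit, iris), breaks = 5)
M
# Plot the average predictions
M |> plot()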