ocf | R Documentation |
Nonparametric estimator for ordered non-numeric outcomes. The estimator modifies a standard random forest splitting criterion to build a collection of forests, each estimating the conditional probability of a single class.
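In symbols, for an outcome with M ordered classes, the per-class quantities described above can be written as follows. This is only the standard identity linking class probabilities to cumulative probabilities, offered as a reading aid. The notation (p_m, M) is introduced here for illustration, and the exact estimating equations used by the package are in the reference cited on this page.

```latex
p_m(x) = \Pr(Y = m \mid X = x)
       = \Pr(Y \le m \mid X = x) - \Pr(Y \le m - 1 \mid X = x),
\qquad m = 1, \dots, M,
\qquad \sum_{m=1}^{M} p_m(x) = 1 .
```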
ocf(
Y = NULL,
X = NULL,
honesty = FALSE,
honesty.fraction = 0.5,
inference = FALSE,
alpha = 0.2,
n.trees = 2000,
mtry = ceiling(sqrt(ncol(X))),
min.node.size = 5,
max.depth = 0,
replace = FALSE,
sample.fraction = ifelse(replace, 1, 0.5),
n.threads = 1
)
Y |
Outcome vector. |
X |
Covariate matrix (no intercept). |
honesty |
Whether to grow honest forests. |
honesty.fraction |
Fraction of honest sample. Ignored if honesty = FALSE. |
inference |
Whether to extract weights and compute standard errors. Requires honesty = TRUE. Extracting the weights slows the routine considerably. |
alpha |
Controls the balance of each split. Each split leaves at least a fraction alpha of the observations in the parent node on each side of the split. |
n.trees |
Number of trees. |
mtry |
Number of covariates considered as split candidates at each node. Default is the square root of the number of covariates, rounded up. |
min.node.size |
Minimal node size. |
max.depth |
Maximal tree depth. A value of 0 corresponds to unlimited depth, 1 to "stumps" (one split per tree). |
replace |
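If TRUE, grow trees on bootstrap subsamples. Otherwise, trees are grown on random subsamples drawn without replacement. |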
sample.fraction |
Fraction of observations to sample for each tree. Default is 1 if replace = TRUE and 0.5 otherwise. |
n.threads |
Number of threads. Zero corresponds to the number of CPUs available. |
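Two of the defaults above depend on other inputs: mtry on the number of columns of X, and sample.fraction on replace. The base-R sketch below only evaluates those two default expressions as they appear in the usage signature. The matrix X_toy and the helper default_fraction are made-up names for illustration, and the package itself is not called.

```r
# Toy covariate matrix (hypothetical, for illustration only): 20 rows, 10 columns.
X_toy <- matrix(rnorm(200), nrow = 20, ncol = 10)

# Default mtry, copied from the usage signature: square root of the
# number of covariates, rounded up.
default_mtry <- ceiling(sqrt(ncol(X_toy)))

# Default sample.fraction, copied from the usage signature, evaluated
# for both settings of replace (hypothetical helper name).
default_fraction <- function(replace) ifelse(replace, 1, 0.5)

default_mtry              # 4, since sqrt(10) is about 3.16
default_fraction(TRUE)    # 1
default_fraction(FALSE)   # 0.5
```

With ten covariates, each node therefore considers four candidate covariates by default, and each tree is grown on half of the observations unless replace = TRUE.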
Object of class ocf.
Riccardo Di Francesco
Di Francesco, R. (2025). Ordered Correlation Forest. Econometric Reviews, 1–17. doi:10.1080/07474938.2024.2429596.
marginal_effects
## Generate synthetic data.
set.seed(1986)
data <- generate_ordered_data(100)
sample <- data$sample
Y <- sample$Y
X <- sample[, -1]
## Training-test split.
train_idx <- sample(seq_len(length(Y)), floor(length(Y) * 0.5))
Y_tr <- Y[train_idx]
X_tr <- X[train_idx, ]
Y_test <- Y[-train_idx]
X_test <- X[-train_idx, ]
## Fit ocf on training sample.
forests <- ocf(Y_tr, X_tr)
## ocf objects work with the generic S3 methods.
print(forests)
summary(forests)
predictions <- predict(forests, X_test)
head(predictions$probabilities)
table(Y_test, predictions$classification)
## Compute standard errors. This requires honest forests.
honest_forests <- ocf(Y_tr, X_tr, honesty = TRUE, inference = TRUE)
head(honest_forests$predictions$standard.errors)
## Marginal effects.
me <- marginal_effects(forests, eval = "atmean")
print(me)
print(me, latex = TRUE)
plot(me)
## Compute standard errors for the marginal effects. This requires honest forests.
honest_me <- marginal_effects(honest_forests, eval = "atmean", inference = TRUE)
print(honest_me, latex = TRUE)
plot(honest_me)
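To make the link between predictions$probabilities and predictions$classification in the examples more concrete, here is a base-R sketch that picks the most probable class in each row of a probability matrix. It assumes, which this page does not state, that the classification is the class with the highest predicted probability. The matrix probs is made up, and the package is not called.

```r
# Hypothetical predicted probabilities for 3 observations and 3 ordered classes.
probs <- matrix(
  c(0.6, 0.3, 0.1,
    0.2, 0.5, 0.3,
    0.1, 0.2, 0.7),
  nrow = 3, byrow = TRUE,
  dimnames = list(NULL, c("class1", "class2", "class3"))
)

# Each row should be a probability distribution over the classes.
stopifnot(all(abs(rowSums(probs) - 1) < 1e-8))

# Assumed rule: assign each observation to its most probable class.
predicted_class <- apply(probs, 1, which.max)
predicted_class  # 1 2 3
```

A cross-tabulation such as table(Y_test, predictions$classification) then compares labels of this kind with the observed outcomes.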