get_sample_weights: Given a trained forest and test data, compute the training...

View source: R/analysis_tools.R

get_sample_weights    R Documentation

Given a trained forest and test data, compute the training sample weights for each test point.

Description

During normal prediction, these weights are computed as an intermediate step towards producing estimates. This function allows the weights to be examined directly, so that they can be used as the input to a different analysis.

Usage

get_sample_weights(
  forest,
  newdata = NULL,
  estimate.uncertainty = FALSE,
  num.threads = NULL
)

Arguments

forest

The trained forest.

newdata

Points at which predictions should be made. If NULL, makes out-of-bag predictions on the training set instead (i.e., provides predictions at Xi using only trees that did not use the i-th training example).

estimate.uncertainty

If FALSE, return a single weight vector for each sample. If TRUE, return B weight vectors for each sample, one calculated on each of the B CI groups. See Details and Value.

num.threads

Number of threads used for the computation. If set to NULL, the software automatically selects an appropriate number.

Details

To estimate the uncertainty, a set of B = num.trees / ci.group.size weight vectors is produced for each sample when estimate.uncertainty = TRUE. These B weight vectors arise from B subforests (CI groups) inside the estimation routine and may be seen as a bootstrap approximation to the estimation uncertainty of the DRF estimator. As such, they can be used to build confidence intervals for functionals. For instance, for a univariate functional, one may evaluate the functional once per weight vector to obtain B estimates, from which a variance can be calculated. The usual normal approximation can then be used to construct a confidence interval for that functional. Uncertainty weights are not available for out-of-bag (OOB) prediction, i.e. when newdata = NULL.
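The procedure described above can be sketched in base R. This is only an illustration under stated assumptions: W is a randomly generated stand-in for one B x w item of the list returned with estimate.uncertainty = TRUE (it is not produced by drf), the functional is the conditional mean, the point estimate is taken as the average of the B group estimates, and the plain sample variance of those estimates is used without any additional scaling. All variable names are hypothetical.

```r
set.seed(1)

n.train <- 100   # number of training samples (w in the Value section)
B <- 20          # number of CI groups
Y <- rnorm(n.train, mean = 1)   # univariate training response

# Stand-in for one list item: a B x n.train matrix of non-negative
# weights, normalised here so that every row sums to one (an assumption).
W <- matrix(rexp(B * n.train), nrow = B)
W <- W / rowSums(W)

# One estimate of the functional (the weighted mean of Y) per weight vector.
est.b <- as.numeric(W %*% Y)

# Point estimate and variance across the B group estimates.
point.est <- mean(est.b)
var.est <- var(est.b)

# Normal-approximation 95% confidence interval.
z <- qnorm(0.975)
ci.lower <- point.est - z * sqrt(var.est)
ci.upper <- point.est + z * sqrt(var.est)
```

With real output, the same steps would be repeated for each item of the returned list, giving one interval per test point.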

Value

estimate.uncertainty=FALSE

A sparse matrix where each row represents a test sample, and each column is a sample in the training data. The value at (i, j) gives the weight of training sample j for test sample i.
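As a rough illustration of how such a weight matrix can feed a different analysis, the sketch below computes a weighted mean and a weighted median of the training response for each test point. It uses a randomly generated dense matrix W as a stand-in for the returned sparse matrix, and assumes a univariate response and rows normalised to sum to one. The helper weighted.quantile is hypothetical and not part of drf.

```r
set.seed(2)

n.train <- 50
n.test <- 3
Y <- rnorm(n.train)   # univariate training response

# Stand-in for the weight matrix: row i holds the weights of all
# training samples for test sample i (rows normalised to sum to one).
W <- matrix(rexp(n.test * n.train), nrow = n.test)
W <- W / rowSums(W)

# Weighted mean of the response for every test point.
cond.mean <- as.numeric(W %*% Y)

# Hypothetical helper: smallest y whose cumulative weight reaches prob.
weighted.quantile <- function(y, w, prob) {
  ord <- order(y)
  cum.w <- cumsum(w[ord]) / sum(w)
  y[ord][which(cum.w >= prob)[1]]
}

# Weighted median of the response for every test point.
cond.median <- apply(W, 1, function(w) weighted.quantile(Y, w, 0.5))
```

The same pattern extends to other functionals of the weighted empirical distribution of the training responses.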

estimate.uncertainty=TRUE

A list with one item per test sample, where each item is a B x w sparse matrix; B is the number of CI groups and w = nrow(Y) is the number of training samples. Each row of this matrix is a separate weight vector, so the matrix contains B weight vectors in total.

Examples

## Not run: 
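p <- 10
n <- 100
X <- matrix(2 * runif(n * p) - 1, n, p)
Y <- (X[, 1] > 0) + 2 * rnorm(n)
# Fit a forest; the response is passed as a one-column matrix.
rrf <- drf(X, matrix(Y, ncol = 1), mtry = p)
# With newdata left as NULL, out-of-bag weights on the training set are returned.
sample.weights.oob <- get_sample_weights(rrf)

n.test <- 15
X.test <- matrix(2 * runif(n.test * p) - 1, n.test, p)
# Weights of the training samples for each of the new test points.
sample.weights <- get_sample_weights(rrf, X.test)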

## End(Not run)


drf documentation built on Jan. 21, 2026, 9:06 a.m.
