WpProj | R Documentation |
This function will calculate linear projections from a set of predictions into the space of the covariates in terms of the p-Wasserstein distance.
WpProj(
X,
eta = NULL,
theta = NULL,
power = 2,
method = c("L1", "binary program", "stepwise", "simulated annealing", "L0"),
solver = c("lasso", "ecos", "lpsolve", "mosek"),
options = NULL
)
X |
An |
eta |
An |
theta |
An optional An |
power |
The power of the Wasserstein distance to use. Must be |
method |
The algorithm to calculate the Wasserstein projections. One of "L1", "binary program", "IP", "stepwise","simulated annealing", or "L0". Will default to "L1" if not provided. See details for more information. |
solver |
Which solver to use? One of "lasso", "ecos", "lpsolve", or "mosek". See details for more information |
options |
Options passed to the particular method and desired solver. See details for more information. |
The WpProj
function is a wrapper for the various Wasserstein projection methods. It is designed to be a one-stop shop for all Wasserstein projection methods. It will automatically choose the correct method and solver based on the arguments provided. It will also return a standardized output for all methods. Each method has its own set of options that can be passed to it. See the documentation for each method for more information.
For the L1 methods, see L1_method_options()
for more information. For the binary program methods, see binary_program_method_options()
for more information. For the stepwise methods, see stepwise_method_options()
for more information. For the simulated annealing methods, see simulated_annealing_method_options()
for more information.
In most cases, we recommend using the L1 methods or binary program methods. The L1 methods are the fastest and applicable to Wasserstein powers of any value greater than 1 and function as direct linear projections into the space of the covariates. The binary program methods instead preserve the coefficients of the original model if this is of interest, such as when the original model was already a linear model. The binary program will instead function as a way of turning on and off certain coefficients in a way that minimizes the Wasserstein distance between reduced and original models. Of note, we also have available an approximate binary program method using a lasso solver. This method is faster than the exact binary program method but is not guaranteed to find the optimal solution. It is recommended to use the exact binary program method if possible. See binary_program_method_options()
for more information on how to set up the approximate method as some arguments for the lasso solver should be specified. For more information on how this works, please also see the referenced paper.
The stepwise, simulated annealing, and L0 methods also select covariates like the binary program methods but they can be slower. They are presented merely for comparison purposes given they were used in the original paper.
The Wasserstein distance is a measure of distance between two probability distributions. It is defined as:
W_p(\mu,\nu) = \left(\inf_{\pi \in \Pi(\mu,\nu)} \int_{\mathbb{R}^d \times \mathbb{R}^d} \|x-y\|^p d\pi(x,y)\right)^{1/p},
where \Pi(\mu,\nu)
is the set of all joint distributions with marginals \mu
and \nu
. The Wasserstein distance is a generalization of the Euclidean distance, which is the case when p=2
. In our function we have argument power
that corresponds to the p
of the equation above. The default power
is 2.0
but any value greater than or equal to 1.0
is allowed. For more information, see the references.
The particular implementation of the Wasserstein distance is as follows. If \mu
is the original prediction from the original model, then we seek to find a new prediction \nu
that minimizes the Wasserstein distance between the two: \text{argmin}_\nu W_p(\mu,\nu)
.
object of class WpProj
, which is a list with the following slots:
call
: The call to the function
theta
: A list of the final parameter matrices for each returned model
fitted.values
: A list of the fitted values for each returned model
power
: The power of the Wasserstein distance used
method
: The method used to calculate the Wasserstein projections
solver
: The solver used to calculate the Wasserstein projections
niter
: The number of iterations used to calculate the Wasserstein projections. Not all methods return a number of iterations so this may be NULL
nzero
: The number of non zero coefficients in the final models
Dunipace, Eric and Lorenzo Trippa (2020) https://arxiv.org/abs/2012.09999.
if(rlang::is_installed("stats")) {
# note we don't generate believable data with real posteriors
# these examples are just to show how to use the function
n <- 32
p <- 10
s <- 21
# covariates and coefficients
x <- matrix( stats::rnorm( p * n ), nrow = n, ncol = p )
beta <- (1:10)/10
#outcome
y <- x %*% beta + stats::rnorm(n)
# fake posterior
post_beta <- matrix(beta, nrow=p, ncol=s) + stats::rnorm(p*s, 0, 0.1)
post_mu <- x %*% post_beta #posterior predictive distributions
# fit models
## L1 model
fit.p2 <- WpProj(X=x, eta=post_mu, power = 2.0,
method = "L1", #default
solver = "lasso" #default
)
## approximate binary program
fit.p2.bp <- WpProj(X=x, eta=post_mu, theta = post_beta, power = 2.0,
method = "binary program",
solver = "lasso" #default because approximate algorithm is faster
)
## compare performance by measuring distance from full model
dc <- distCompare(models = list("L1" = fit.p2, "BP" = fit.p2.bp))
plot(dc)
## compare performance by measuring the relative distance between a null model
## and the predictions of interest as a pseudo R^2
r2.expect <- WPR2(predictions = post_mu, projected_model = dc) # can have negative values
r2.null <- WPR2(projected_model = dc) # should be between 0 and 1
plot(r2.null)
## we can also examine how predictions change in the models for individual observations
ridgePlot(fit.p2, index = 21, minCoef = 0, maxCoef = 10)
}
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.