View source: R/view.contribution.R
view.contribution | R Documentation
Evaluate the contribution of each data view in making prediction. The function has two options. If force is set to NULL, the contribution of each data view is benchmarked against the null model. If force is set to a list of data views, the contribution is benchmarked against the model fit on this list of data views, and the function evaluates the marginal contribution of each additional data view on top of this benchmarking list of views. The function returns a table showing, for each data view, the percentage improvement in reducing error relative to the benchmarking model.
view.contribution(
x_list,
y,
family = gaussian(),
rho,
s = c("lambda.min", "lambda.1se"),
eval_data = c("train", "test"),
weights = NULL,
type.measure = c("default", "mse", "deviance", "class", "auc", "mae", "C"),
x_list_test = NULL,
test_y = NULL,
nfolds = 10,
foldid = NULL,
force = NULL,
...
)
x_list: a list of x matrices with the same number of rows (the observations).

y: the quantitative response, with length equal to the number of rows in each x matrix.

family: A description of the error distribution and link function to be used in the model. This is the result of a call to a family function. Default is stats::gaussian. (See stats::family for details on family functions.)

rho: the weight on the agreement penalty; default 0.

s: value(s) of the penalty parameter at which the fit is evaluated; either "lambda.min" (the default) or "lambda.1se".

eval_data: whether the contribution of the data views is evaluated on the training data ("train") or on held-out test data ("test"). If "test", x_list_test and test_y must be provided.

weights: observation weights; defaults to 1 per observation.

type.measure: loss to use for cross-validation. Currently five options, not all available for all models. The default is "default", which uses the canonical measure for the given family.

x_list_test: a list of x matrices for the test data; required when eval_data = "test".

test_y: the quantitative response in the test data, with length equal to the number of rows in each test x matrix.

nfolds: number of folds; default is 10. Although nfolds can be as large as the sample size (leave-one-out cross-validation), this is not recommended for large datasets. The smallest allowable value is nfolds = 3.

foldid: an optional vector of values between 1 and nfolds identifying which fold each observation belongs to; see the sketch after this argument list for an example of supplying these cross-validation controls together.

force: If NULL (the default), the contribution of each data view is benchmarked against the null model. If set to a list of data views, the marginal contribution of each remaining view is benchmarked against the model fit on this list of views.

...: Other arguments that can be passed to multiview.
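As a minimal, self-contained sketch of these cross-validation controls (not drawn from the package's own examples; the simulated data, the fold assignments, and the choices s = "lambda.1se", type.measure = "mae", and nfolds = 5 are illustrative assumptions):

# library(multiview)  # assuming view.contribution is provided by the multiview package
set.seed(1)
n <- 100
x1 <- matrix(rnorm(n * 5), n, 5)   # first data view
x2 <- matrix(rnorm(n * 5), n, 5)   # second data view
y <- x1[, 1] - x2[, 2] + rnorm(n)  # response driven by both views
fold_vec <- sample(rep_len(1:5, n))  # user-supplied fold assignments in 1..nfolds
view.contribution(x_list = list(x1 = x1, x2 = x2), y, rho = 0.3,
                  eval_data = "train", family = gaussian(),
                  s = "lambda.1se", type.measure = "mae",
                  nfolds = 5, foldid = fold_vec)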
The function returns a data frame consisting of the data view, the error metric, and the percentage improvement made by each view.
set.seed(3)
# Simulate data based on the factor model
x = matrix(rnorm(200*20), 200, 20)
z = matrix(rnorm(200*20), 200, 20)
w = matrix(rnorm(200*20), 200, 20)
U = matrix(rep(0, 200*10), 200, 10) # latent factors
for (m in seq(10)){
u = rnorm(200)
x[, m] = x[, m] + u
z[, m] = z[, m] + u
w[, m] = w[, m] + u
U[, m] = U[, m] + u}
beta_U = c(rep(2, 5),rep(-2, 5))
y = U %*% beta_U + 3 * rnorm(200)
# Split training and test sets
smp_size_train = floor(0.9 * nrow(x))
train_ind = sort(sample(seq_len(nrow(x)), size = smp_size_train))
test_ind = setdiff(seq_len(nrow(x)), train_ind)
train_X = scale(x[train_ind, ])
test_X = scale(x[test_ind, ])
train_Z <- scale(z[train_ind, ])
test_Z <- scale(z[test_ind, ])
train_W <- scale(w[train_ind, ])
test_W <- scale(w[test_ind, ])
train_y <- y[train_ind, ]
test_y <- y[test_ind, ]
foldid = sample(rep_len(1:10, dim(train_X)[1]))
# Benchmarked by the null model:
rho = 0.3
view.contribution(x_list=list(x=train_X,z=train_Z), train_y, rho = rho,
eval_data = 'train', family = gaussian())
view.contribution(x_list=list(x=train_X,z=train_Z), train_y, rho = rho,
eval_data = 'test', family = gaussian(),
x_list_test=list(x=test_X,z=test_Z), test_y=test_y)
# Force option -- benchmarked by the model trained on a specified list of data views:
view.contribution(x_list=list(x=train_X,z=train_Z,w=train_W), train_y, rho = rho,
eval_data = 'train', family = gaussian(), force=list(x=train_X))
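# As a further illustration (reusing the simulated objects above; combining the
# force benchmark with test-set evaluation is an assumption based on the argument
# list rather than an example shipped with the package):
contrib <- view.contribution(x_list = list(x = train_X, z = train_Z, w = train_W),
                             train_y, rho = rho,
                             eval_data = 'test', family = gaussian(),
                             x_list_test = list(x = test_X, z = test_Z, w = test_W),
                             test_y = test_y, force = list(x = train_X))
contrib  # one row per evaluated view: view, error metric, percentage improvement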