| mismm | R Documentation |
This function fits the MILD-SVM model, which takes a multiple-instance learning with distributions (MILD) data set and fits a modified SVM to it. The MILD-SVM methodology is based on research in progress.
## Default S3 method:
mismm(
x,
y,
bags,
instances,
cost = 1,
method = c("heuristic", "mip", "qp-heuristic"),
weights = TRUE,
control = list(kernel = "radial", sigma = if (is.vector(x)) 1 else 1/ncol(x),
nystrom_args = list(m = nrow(x), r = nrow(x), sampling = "random"), max_step = 500,
scale = TRUE, verbose = FALSE, time_limit = 60, start = FALSE),
...
)
## S3 method for class 'formula'
mismm(formula, data, ...)
## S3 method for class 'mild_df'
mismm(x, ...)
x |
A data.frame, matrix, or similar object of covariates, where each
row represents a sample. If a |
y |
A numeric, character, or factor vector of bag labels for each
instance. Must satisfy |
bags |
A vector specifying which instance belongs to each bag. Can be a string, numeric, or factor. |
instances |
A vector specifying which samples belong to each instance. Can be a string, numeric, or factor. |
cost |
The cost parameter in SVM. If |
method |
The algorithm to use in fitting (default |
weights |
named vector, or |
control |
list of additional parameters passed to the method that control computation with the following components:
|
... |
Arguments passed to or from other methods. |
formula |
A formula with specification |
data |
If |
Several fitting algorithms are available, including a version of the heuristic algorithm proposed by Andrews et al. (2003) and a novel algorithm that explicitly solves the mixed-integer programming (MIP) problem, using the gurobi package as the optimization back-end.
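The default `sigma` in `control` is `1/ncol(x)` for matrix-like input. As a rough illustration of what that parameter controls, the base R sketch below assumes the radial kernel follows the kernlab parameterization, exp(-sigma * ||u - v||^2), and assumes two instances are compared by averaging that kernel over all pairs of their samples. Neither detail is stated on this page, and `rbf_kernel()` and `instance_kernel()` are hypothetical helpers, not functions from the package.

```r
# Radial basis kernel in the kernlab-style parameterization (assumed):
# k(u, v) = exp(-sigma * ||u - v||^2)
rbf_kernel <- function(u, v, sigma) {
  exp(-sigma * sum((u - v)^2))
}

# Hypothetical helper: average pairwise kernel between the samples (rows)
# of two instances. The default sigma mirrors the documented 1/ncol(x).
instance_kernel <- function(a, b, sigma = 1 / ncol(a)) {
  vals <- outer(
    seq_len(nrow(a)), seq_len(nrow(b)),
    Vectorize(function(i, j) rbf_kernel(a[i, ], b[j, ], sigma))
  )
  mean(vals)
}

set.seed(1)
inst_a <- matrix(rnorm(20 * 3), ncol = 3)            # 20 samples, 3 features
inst_b <- matrix(rnorm(20 * 3, mean = 2), ncol = 3)  # shifted distribution

k_aa <- instance_kernel(inst_a, inst_a)  # similarity of an instance to itself
k_ab <- instance_kernel(inst_a, inst_b)  # similarity across instances
```

Under these assumptions, a larger `sigma` makes the kernel decay faster with distance, so instances drawn from shifted distributions look less similar.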
An object of class mismm. The object contains at least the
following components:
*_fit: A fit object depending on the method parameter. If method = 'heuristic', this will be a ksvm fit from the kernlab package. If
method = 'mip', this will be a gurobi_fit from a model optimization.
call_type: A character indicating which method mismm() was called
with.
x: The training data needed for computing the kernel matrix in
prediction.
features: The names of features used in training.
levels: The levels of y that are recorded for future prediction.
cost: The cost parameter from function inputs.
weights: The calculated weights on the cost parameter.
sigma: The radial basis function kernel parameter.
repr_inst: The instances from positive bags that are selected to be
most representative of the positive instances.
n_step: If method %in% c('heuristic', 'qp-heuristic'), the total
steps used in the heuristic algorithm.
useful_inst_idx: The instances that were selected to represent the bags
in the heuristic fitting.
inst_order: A character vector that is used to modify the ordering of
input data.
x_scale: If scale = TRUE, the scaling parameters for new predictions.
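To illustrate why scaling parameters need to be kept for prediction, here is a minimal base R sketch. It assumes that scaling means centering each column and dividing by its standard deviation, and that the parameters are stored as a list with `center` and `scale` elements. The actual layout of `x_scale` is not described on this page, so treat the names below as hypothetical.

```r
set.seed(2)
train_x <- matrix(rnorm(30, mean = 5, sd = 2), ncol = 3)  # training covariates
new_x   <- matrix(rnorm(9,  mean = 5, sd = 2), ncol = 3)  # data seen at prediction

# Standardize the training columns and keep the parameters
# (hypothetical layout, for illustration only)
scaled_train <- scale(train_x, center = TRUE, scale = TRUE)
x_scale <- list(
  center = attr(scaled_train, "scaled:center"),
  scale  = attr(scaled_train, "scaled:scale")
)

# Apply the training parameters to new data, rather than the new data's
# own mean and standard deviation
scaled_new <- scale(new_x, center = x_scale$center, scale = x_scale$scale)
```

Reusing the training parameters keeps new observations on the same scale as the data the kernel matrix was computed from.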
default: Method for data.frame-like objects
formula: Method for passing formula
mild_df: Method for mild_df objects
Sean Kent, Yifei Liu
Kent, S., & Yu, M. (2022). Non-convex SVM for cancer diagnosis based on morphologic features of tumor microenvironment. arXiv preprint arXiv:2206.14704.
predict.mismm() for prediction on new data.
set.seed(8)
mil_data <- generate_mild_df(nbag = 15, nsample = 20, positive_prob = 0.15,
sd_of_mean = rep(0.1, 3))
# Heuristic method
mdl1 <- mismm(mil_data)
mdl2 <- mismm(mild(bag_label, bag_name, instance_name) ~ X1 + X2 + X3, data = mil_data)
# MIP method
if (require(gurobi)) {
mdl3 <- mismm(mil_data, method = "mip", control = list(nystrom_args = list(m = 10, r = 10)))
predict(mdl3, mil_data)
}
predict(mdl1, new_data = mil_data, type = "raw", layer = "bag")
# summarize predictions at the bag layer
library(dplyr)
mil_data %>%
bind_cols(predict(mdl2, mil_data, type = "class")) %>%
bind_cols(predict(mdl2, mil_data, type = "raw")) %>%
distinct(bag_name, bag_label, .pred_class, .pred)