View source: R/mrIMLpredicts.R
mrIMLpredicts | R Documentation |
This function fits separate classification/regression models, specified in
the tidymodels framework, for each response variable in a data set. This is
the core function of mrIML.
mrIMLpredicts(
X,
X1 = NULL,
Y,
Model,
balance_data = "no",
dummy = FALSE,
prop = 0.7,
tune_grid_size = 10,
k = 10,
racing = TRUE
)
Y , X , X1 |
Data frames containing the response variables, the predictors, and the joint
response variables (i.e., the responses that are also used as predictors
when fitting a graphical network (GN) model), respectively. If |
Model |
Any model from the tidymodels package. See Examples. |
balance_data |
A character string:
|
dummy |
A logical value indicating if |
prop |
A numeric value between 0 and 1. Defines the training-testing
data proportion to be used, which defaults to |
tune_grid_size |
A numeric value that sets the grid size for
hyperparameter tuning. Larger grid sizes increase computational time. Ignored
if |
k |
A numeric value. Sets the number of folds in the cross-validation. 10-fold CV is the default. |
racing |
A logical value. If |
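To make the roles of prop and k concrete, here is a minimal base R sketch of a 70/30 train-test split followed by a 10-fold assignment of the training rows. It is an illustration of the idea only, not the package's internal code; the row count n and the object names are hypothetical.

```r
set.seed(1)

n <- 100      # hypothetical number of rows in the data
prop <- 0.7   # proportion of rows used for training
k <- 10       # number of cross-validation folds

# Randomly pick the training rows; the remaining rows form the test set.
train_idx <- sample(n, size = round(prop * n))
test_idx <- setdiff(seq_len(n), train_idx)

# Assign each training row to one of k folds of near-equal size.
fold_id <- sample(rep(seq_len(k), length.out = length(train_idx)))

length(train_idx)
length(test_idx)
table(fold_id)
```

Hyperparameters are tuned within the folds of the training set, so a smaller k (as in the Examples) mainly trades tuning stability for speed.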
mrIMLpredicts fits the supplied tidy model to each response variable in the
data frame Y. If only X (a data frame of predictors) is supplied, then
independent models are fit, i.e., the other response variables are not used as
predictors. If X1 (a data frame of all or select response variables) is
supplied, then those response variables are also used as predictors in the
response models. For example, supplying X1 means that a co-occurrence model is
fit.
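The difference between the two modes can be sketched in base R by listing the predictors each response model would see. This is a conceptual illustration on made-up data: predictors_for() is a hypothetical helper, and the assumption that a response is dropped from its own predictor set reflects the description above rather than verified package internals.

```r
# Toy data: one predictor and three binary responses (all values are made up).
X <- data.frame(env = c(0.1, 0.5, 0.9, 0.3))
Y <- data.frame(
  sp_a = c(0, 1, 1, 0),
  sp_b = c(1, 1, 0, 0),
  sp_c = c(0, 0, 1, 1)
)
X1 <- Y  # use all responses as additional predictors

# Hypothetical helper: names of the predictors used for one response.
predictors_for <- function(response, X, X1 = NULL) {
  if (is.null(X1)) {
    return(names(X))
  }
  # Assumed: the response itself is excluded so it never predicts itself.
  c(names(X), setdiff(names(X1), response))
}

predictors_for("sp_a", X)       # independent model: X only
predictors_for("sp_a", X, X1)   # co-occurrence model: X plus other responses
```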
If balance_data = "up", then themis::step_rose() is used to upsample the
dataset; however, we recommend using balance_data = "no" in most cases.
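Before considering balance_data = "up", it can help to check how imbalanced each binary response actually is. The base R sketch below does this on made-up data; the 20% cutoff is an arbitrary, hypothetical threshold chosen for illustration and is not a recommendation from the package.

```r
# Toy binary responses (made-up values): sp_a is rare, sp_b is balanced.
Y <- data.frame(
  sp_a = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 1),
  sp_b = c(0, 1, 0, 1, 0, 1, 0, 1, 0, 1)
)

# Proportion of 1s in each response column.
prevalence <- colMeans(Y)

# Hypothetical cutoff: flag responses where the rarer class is under 20%.
imbalanced <- names(prevalence)[pmin(prevalence, 1 - prevalence) < 0.2]

prevalence
imbalanced
```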
A list object with three slots:
$Model: The tidymodels object that was fit.
$Data: A list of the raw data.
$Fits: A list of the fitted models for each response variable.
library(tidymodels)
library(mrIML) # provides mrIMLpredicts()

data <- MRFcov::Bird.parasites

# Define the response variables of interest
Y <- data %>%
  select(-scale.prop.zos) %>%
  select(order(everything()))

# Define the predictors
X <- data %>%
  select(scale.prop.zos)

# Specify a random forest tidy model
model_rf <- rand_forest(
  trees = 50, # 50 trees are set for brevity. Aim to start with 1000
  mode = "classification",
  mtry = tune(),
  min_n = tune()
) %>%
  set_engine("randomForest")

# Fitting independent multi-response model -----------------------------------
MR_model_rf <- mrIMLpredicts(
  X = X,
  Y = Y,
  Model = model_rf,
  prop = 0.7,
  k = 2,
  racing = FALSE
)

# Fitting a graphical network model -----------------------------------------
# Define the dependent response variables (all in this case)
X1 <- Y

GN_model <- mrIMLpredicts(
  X = X,
  Y = Y,
  X1 = X1,
  Model = model_rf,
  prop = 0.7,
  k = 2,
  racing = FALSE
)