Model selection is the process of identifying the most relevant features from a set of candidate variables. This step is critical for building models that are accurate, interpretable, and computationally efficient while avoiding overfitting. Stepwise regression algorithms automate this process by iteratively adding or removing features based on predefined criteria, such as statistical significance (e.g., p-values), information criteria (e.g., AIC or BIC), or other performance metrics. The procedure continues until no further improvements can be made according to the chosen criterion, resulting in a final model that includes the selected features and their corresponding coefficients.
However, it is important to note that stepwise regression should never be used for statistical inference unless the variable selection process is explicitly accounted for. Without proper adjustments, the selection process invalidates statistical inference, such as p-values and confidence intervals, due to issues like multiple testing and data dredging. This limitation does not apply when stepwise regression is used for prediction, as the primary goal in predictive modeling is to maximize accuracy rather than draw causal conclusions.
StepReg simplifies model selection tasks by providing a unified programming interface. It currently supports model building for five distinct response variable types (section \@ref(regressioncategories)), four model selection strategies (section \@ref(modelselectionstrategies)) including the best subsets algorithm, and a variety of selection metrics (section \@ref(selectionmetrics)). Moreover, StepReg detects and addresses multicollinearity issues if they exist (section \@ref(multicollinearity)). The output of StepReg includes multiple tables summarizing the final model and the variable selection procedures. Additionally, StepReg offers a plot function to visualize the selection steps (section \@ref(stepregoutput)). For demonstration, the vignettes include four use cases covering distinct regression scenarios (section \@ref(usecases)). Non-programmers can access the tool through the interactive Shiny app detailed in section \@ref(shinyapp).
By combining flexibility, robustness, and ease of use, StepReg is a powerful tool for predictive modeling tasks, particularly when the goal is to identify an optimal set of features for accurate predictions. However, users should exercise caution and avoid using StepReg for statistical inference unless the variable selection process is properly accounted for.
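To make the inference caveat concrete, the following sketch uses only base R (stats::step(), not StepReg). It simulates a response that is unrelated to every candidate predictor, runs forward selection under step()'s default AIC criterion, and records how often the selected model contains at least one coefficient with a naive p-value below 0.05. The sample size, number of predictors, and number of replicates are arbitrary assumptions chosen for illustration.

```r
# Simulation sketch: naive p-values after forward selection on pure noise.
# All settings below are illustrative assumptions, not StepReg defaults.
set.seed(1)
n <- 50      # observations per simulated dataset
k <- 10      # candidate predictors, all unrelated to y
reps <- 200  # number of simulated datasets

any_significant <- logical(reps)
for (r in seq_len(reps)) {
  dat <- as.data.frame(matrix(rnorm(n * (k + 1)), nrow = n))
  names(dat) <- c("y", paste0("x", seq_len(k)))

  null_fit <- lm(y ~ 1, data = dat)
  upper <- reformulate(paste0("x", seq_len(k)))  # ~ x1 + ... + x10

  selected <- step(null_fit,
                   scope = list(lower = ~ 1, upper = upper),
                   direction = "forward", trace = 0)

  # Naive p-values of the selected predictors (intercept row dropped)
  pvals <- summary(selected)$coefficients[-1, "Pr(>|t|)"]
  any_significant[r] <- length(pvals) > 0 && any(pvals < 0.05)
}

# Share of noise-only datasets whose selected model looks "significant"
naive_rate <- mean(any_significant)
naive_rate
```

If the naive p-values were valid, a pre-specified test of a single noise predictor would reject in about 5% of datasets; the share reported here is typically much larger because the same data were used to choose the predictors.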
The following example selects an optimal linear regression model with the mtcars dataset.
library(StepReg)
data(mtcars)
formula <- mpg ~ .
res <- stepwise(formula = formula,
                data = mtcars,
                type = "linear",
                include = c("qsec"),
                strategy = "bidirection",
                metric = c("AIC"))
Breakdown of the parameters:
formula: specifies the dependent and independent variables
type: specifies the regression category; depending on your data, choose from "linear", "logit", "cox", etc.
include: specifies the variables that must be in the final model
strategy: specifies the stepwise strategy; choose from "forward", "backward", "bidirection", "subset"
metric: specifies the model fit evaluation metric; choose one or more from "AIC", "AICc", "BIC", "SL", etc.
The output consists of the final model, which can be viewed using:
res
You can further explore the results with generic functions such as summary(), coef(), and others. For example:
summary(res$bidirection$AIC)
You can also visualize the variable selection procedures with:
plot(res, strategy = "bidirection", process = "overview")
plot(res, strategy = "bidirection", process = "details")
In the plots, (+)1 refers to the original model with the intercept added, (+) indicates variables being added to the model, and (-) indicates variables being removed from the model.
Additionally, you can generate reports of various formats with:
report(res, report_name = "path_to/demo_res", format = "html")
Replace "path_to/demo_res" with the desired output file name; the suffix ".html" will be added automatically. For detailed examples and more usage, refer to sections \@ref(stepregoutput) and \@ref(usecases).
StepReg supports multiple types of regressions, including linear, logit, cox, poisson, and gamma regressions. These methods primarily vary by the type of response variable, which are summarized in the table below. Additional regression techniques can be incorporated upon user requests.
library(knitr)
library(kableExtra)
Regression <- c("linear", "logit", "cox", "poisson", "gamma")
Response <- c("continuous", "binary", "time-to-event", "count", "continuous and positively skewed")
df <- data.frame(Regression, Response)
kable(df, format = "html", caption = 'Common regression categories') %>%
  kable_styling()
Model selection aims to identify the subset of independent variables that provide the best predictive performance for the response variable. Both stepwise regression and best subsets approaches are implemented in StepReg. For stepwise regression, there are mainly three methods: Forward Selection, Backward Elimination, Bidirectional Elimination.
Strategy <- c("Forward Selection", "Backward Elimination", "Bidirectional Elimination", "Best Subsets")
Description <- c(
  "In forward selection, the algorithm starts with an empty model (no predictors) and adds in variables one by one. Each step tests the addition of every possible predictor by calculating a pre-selected metric. Add the variable (if any) whose inclusion leads to the most statistically significant fit improvement. Repeat this process until more predictors no longer lead to a statistically better fit.",
  "In backward elimination, the algorithm starts with a full model (all predictors) and deletes variables one by one. Each step tests the deletion of every possible predictor by calculating a pre-selected metric. Delete the variable (if any) whose loss leads to the most statistically significant fit improvement. Repeat this process until fewer predictors no longer lead to a statistically better fit.",
  "Bidirectional elimination is essentially a forward selection procedure combined with backward elimination at each iteration. Each iteration starts with a forward selection step that adds in predictors, followed by a round of backward elimination that removes predictors. Repeat this process until no more predictors are added or excluded.",
  "Stepwise algorithms add or delete one predictor at a time and output a single model without evaluating all candidates. Therefore, it is a relatively simple procedure that only produces one model. In contrast, the *Best Subsets* algorithm calculates all possible models and outputs the best-fitting models with one predictor, two predictors, etc., for users to choose from.")
df <- data.frame(Strategy, Description)
kable(df, format = "html", caption = 'Model selection strategy') %>%
  kable_styling()
Given computational constraints, when a dataset has a large number of predictor variables, particularly more predictors than observations, Bidirectional Elimination is typically the most advisable approach, with Forward Selection and Backward Elimination as the next options to consider. In contrast, the Best Subsets approach requires the most processing time, yet it calculates a comprehensive set of models with varying numbers of variables. In practice, users can experiment with various methods and select a final model based on the specific dataset and research objectives at hand.
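The forward selection procedure described above can be sketched in a few lines of base R. This is an illustrative re-implementation under simplifying assumptions (linear model, AIC as computed by stats::AIC(), no forced-in variables), not StepReg's internal code; forward_select() is a hypothetical helper name.

```r
# Minimal forward selection by AIC for a linear model (illustrative only).
forward_select <- function(response, candidates, data) {
  selected <- character(0)
  current <- lm(reformulate("1", response = response), data = data)

  repeat {
    remaining <- setdiff(candidates, selected)
    if (length(remaining) == 0) break

    # AIC of each model obtained by adding one remaining predictor
    trial_aic <- vapply(remaining, function(v) {
      AIC(lm(reformulate(c(selected, v), response = response), data = data))
    }, numeric(1))

    # Stop when no single addition improves on the current model
    if (min(trial_aic) >= AIC(current)) break

    selected <- c(selected, names(which.min(trial_aic)))
    current <- lm(reformulate(selected, response = response), data = data)
  }

  list(selected = selected, model = current)
}

fit <- forward_select("mpg", setdiff(names(mtcars), "mpg"), mtcars)
fit$selected        # predictors in the order they entered
AIC(fit$model)      # AIC of the final model
```

Backward elimination follows the same pattern starting from the full model and dropping one term per step, and bidirectional elimination alternates the two moves.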
Various selection metrics can be used to guide the process of adding or removing predictors from the model. These metrics help to determine the importance or significance of predictors in improving the model fit. In StepReg, selection metrics fall into two categories: Information Criteria and the Significance Level of the coefficient associated with each predictor. An Information Criterion evaluates a model's performance by balancing model fit against complexity, penalizing models with a higher number of parameters. Lower Information Criterion values indicate a better trade-off between model fit and complexity. Note that when evaluating different models, it is important to compare them within the same Information Criterion rather than across multiple criteria. For example, if you decide to use AIC, you should compare all models using AIC. This ensures consistency and fairness in model comparison, as each Information Criterion has its own scale and penalization factors. Multiple metrics have been proposed in the literature; the ones supported by StepReg are summarized below.
Importantly, given the discrepancies in the precise definitions of each metric, StepReg mirrors the formulas adopted by SAS for univariate multiple regression (UMR) except for HQ, IC(1), and IC(3/2). A subset of the UMR metrics can be easily extended to multivariate multiple regression (MMR), as indicated in the following table.
Statistic <- c(
  "${n}$",
  "${p}$",
  "${q}$",
  "$\\sigma^2$",
  "${SST}$",
  "${SSE}$",
  "$\\text{LL}$",
  "${| |}$",
  "$\\ln()$")
Meanings <- c(
  "Sample Size",
  "Number of parameters including the intercept",
  "Number of dependent variables",
  "Estimate of pure error variance from fitting the full model",
  "Total sum of squares corrected for the mean for the dependent variable, which is a numeric value for UMR and a matrix for multivariate regression",
  "Error sum of squares, which is a numeric value for UMR and a matrix for multivariate regression",
  "The natural logarithm of likelihood",
  "The determinant function",
  "The natural logarithm")
kable_styling(kable(data.frame(Statistic, Meanings), format = "html", align = 'l', escape = F, caption = 'Statistics in selection metric'))
Abbreviation <- c("", "AIC", "AICc", "BIC", "Cp", "HQ", "IC(1)", "IC(3/2)", "SBC", "SL", "adjRsq")
Definition <- c(
  "",
  "Akaike’s Information Criterion",
  "Corrected Akaike’s Information Criterion",
  "Sawa Bayesian Information Criterion",
  "Mallows’ Cp statistic",
  "Hannan and Quinn Information Criterion",
  "Information Criterion with Penalty Coefficient Set to 1",
  "Information Criterion with Penalty Coefficient Set to 3/2",
  "Schwarz Bayesian Information Criterion",
  "Significance Level (pvalue)",
  "Adjusted R-square statistic")
Formula_in_Linear <- c(
  "linear",
  "$n\\ln\\left(\\frac{|\\text{SSE}|}{n}\\right) + 2pq + n + q(q+1)$ <br>[@Hurvich_Tsai_1989; @Al-Subaihi_2002]$^1$",
  "$n\\ln\\left(\\frac{|\\text{SSE}|}{n}\\right) + \\frac{nq(n+p)}{n-p-q-1}$ <br>[@Hurvich_Tsai_1989; @Bedrick_Tsai_1994]$^2$",
  "$n\\ln\\left(\\frac{SSE}{n}\\right) + 2(p+2)o - 2o^2, o = \\frac{n\\sigma^2}{SSE}$ <br>[@Sawa_1978; @Judge_1985] <br>not available for MMR",
  "$\\frac{SSE}{\\sigma^2} + 2p - n$ <br> [@Mallows_1973; @Hocking_1976] <br>not available for MMR",
  "$n\\ln\\left(\\frac{|\\text{SSE}|}{n}\\right) + 2pq\\ln(\\ln(n))$ <br>[@Hannan_Quinn_1979; @McQuarrie_Tsai_1998; @Hurvich_Tsai_1989]",
  "$n\\ln\\left(\\frac{|\\text{SSE}|}{n}\\right) + p$ <br>[@Nelder_Wedderburn_1972; @Smith_Spiegelhalter_1980] <br>not available for MMR",
  "$n\\ln\\left(\\frac{|\\text{SSE}|}{n}\\right) + \\frac{3}{2}p$ <br>[@Smith_Spiegelhalter_1980] <br>not available for MMR",
  "$n\\ln\\left(\\frac{|\\text{SSE}|}{n}\\right) + pq \\ln(n)$ <br>[@Hurvich_Tsai_1989; @Schwarz_1978; @Judge_1985; @Al-Subaihi_2002] <br>not available for MMR",
  "$\\textit{F test}$ for UMR and $\\textit{Approximate F test}$ for MMR",
  "$1 - \\frac{(n-1)(1-R^2)}{n-p}$, <br> where $R^2=1 - \\frac{SSE}{SST}$ <br>[@Darlington_1968; @Judge_1985] <br>not available for MMR")
Formula_in_Logit_Cox_Poisson_Gamma <- c(
  "logit, cox, poisson and gamma",
  "$-2\\text{LL} + 2p$ <br>[@Darlington_1968; @Judge_1985]",
  "$-2\\text{LL} + \\frac{n(n+p)}{n-p-2}$ <br>[@Hurvich_Tsai_1989]",
  "not available",
  "not available",
  "$-2\\text{LL} + 2p\\ln(\\ln(n))$ <br>[@Hannan_Quinn_1979]",
  "$-2\\text{LL} + p$ <br>[@Nelder_Wedderburn_1972; @Smith_Spiegelhalter_1980]",
  "$-2\\text{LL} + \\frac{3}{2}p$ <br>[@Smith_Spiegelhalter_1980]",
  "$-2\\text{LL} + p\\ln(n)$ <br>[@Schwarz_1978; @Judge_1985]",
  "Forward: LRT and Rao Chi-square test (logit, poisson, gamma); LRT (cox); <br><br>Backward: Wald test",
  "not available")
df <- data.frame(Abbreviation, Definition, Formula_in_Linear, Formula_in_Logit_Cox_Poisson_Gamma)
colnames(df) <- c("Abbreviation", "Definition", "Formula", "")
kable(df, format = "html", align = "l", booktabs = TRUE, escape = F,
      caption = 'Abbreviation, Definition, and Formula of the Selection Metric for Linear, Logit, Cox, Poisson, and Gamma regression') %>%
  footnote(number = c(
    "Unsupported AIC formula (which does not affect the selection process as it only differs by constant additive and multiplicative factors):\n $AIC=n\\ln\\left(\\frac{SSE}{n}\\right) + 2p$ [@Darlington_1968; @Judge_1985]",
    "Unsupported AICc formula (which does not affect the selection process as it only differs by constant additive and multiplicative factors):\n $AICc=\\ln\\left(\\frac{SSE}{n}\\right) + 1 + \\frac{2(p+1)}{n-p-2}$ [@McQuarrie_Tsai_1998]")) %>%
  kable_styling() %>%
  column_spec(3, width = "0.5in") %>%
  column_spec(4, width = "0.4in")
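As a sanity check on the linear AIC formula in the table, the sketch below evaluates it by hand for the univariate case (q = 1) and compares it with the value returned by base R's AIC(). The helper name table_aic() is hypothetical and the two mtcars models are arbitrary choices. The two scales differ only by the constant n ln(2π), so they rank models identically.

```r
# Evaluate n*ln(SSE/n) + 2pq + n + q(q+1) for a univariate linear model (q = 1).
table_aic <- function(fit) {
  n <- nobs(fit)
  p <- length(coef(fit))          # number of parameters including the intercept
  sse <- sum(residuals(fit)^2)    # error sum of squares
  q <- 1                          # one dependent variable
  n * log(sse / n) + 2 * p * q + n + q * (q + 1)
}

small <- lm(mpg ~ wt, data = mtcars)
large <- lm(mpg ~ wt + cyl + hp, data = mtcars)

# Offset between base R's AIC() and the table formula for each model
offset_small <- AIC(small) - table_aic(small)
offset_large <- AIC(large) - table_aic(large)

c(offset_small, offset_large, nobs(small) * log(2 * pi))
```

Because the offset does not depend on the model, the difference in AIC between any two candidate models fitted to the same data is the same on both scales.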
No metric is necessarily optimal for all datasets. The choice depends on your data and research goals. We recommend using multiple metrics simultaneously, which allows the selection of the best model based on your specific needs. General guidance is summarized below.
AIC: AIC works by penalizing the inclusion of additional variables in a model. The lower the AIC, the better the model. AIC does not include sample size in penalty calculation, and it is optimal in minimizing the mean square error of predictions [@Brewer_2016].
AICc: AICc is a variant of AIC, which works better for small sample sizes, especially when numObs / numParam < 40 [@Burnham_2002].
Cp: Cp is used for linear models. It is equivalent to AIC when dealing with Gaussian linear model selection.
IC(1) and IC(3/2): IC(1) and IC(3/2) have 1 and 3/2 as penalty factors respectively, compared to 2 used by AIC. As such, IC(1) tends to return a more complex model with more variables, which may suffer from overfitting.
BIC and SBC: Both BIC and SBC are variants of the Bayesian Information Criterion. The main distinction between BIC/SBC and AIC lies in the magnitude of the penalty imposed: BIC/SBC penalize model complexity more heavily, which typically results in a simpler model [@SAS_Institute_2018; @Sawa_1978; @Hurvich_Tsai_1989; @Schwarz_1978; @Judge_1985; @Al-Subaihi_2002].
The precise definitions of these criteria can vary across the literature and in the SAS environment. Here, BIC aligns with the definition of the Sawa Bayesian Information Criterion as outlined in the SAS documentation, while SBC corresponds to the Schwarz Bayesian Information Criterion. According to Richard's post, whereas AIC often favors overly complex models, BIC/SBC prioritize smaller models. Consequently, when dealing with a limited sample size, AIC may seem preferable, whereas BIC/SBC tend to perform better with larger sample sizes.
HQ: HQ is an alternative to AIC, differing primarily in the method of penalty calculation. However, HQ has remained relatively underutilized in practice [@Burnham_2002].
adjRsq: The adjusted R-squared (adj-R²) seeks to overcome the limitation of R-squared in model selection by considering the number of predictors. It serves a similar purpose to information criteria, as both methods compare models by weighing their goodness of fit against the number of parameters. However, information criteria are typically regarded as superior in this context [@Stevens_2016].
SL: SL stands for Significance Level (P-value), embodying a distinct approach to model selection in contrast to information criteria. The SL method operates by calculating a P-value through specific hypothesis testing. Should this P-value fall below a predefined threshold, such as 0.05, one should favor the alternative hypothesis, indicating that the full model significantly outperforms the reduced model. The effectiveness of this method hinges upon the selection of the P-value threshold, wherein smaller thresholds tend to yield simpler models.
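To compare how strongly the likelihood-based criteria penalize each additional parameter, the sketch below tabulates the per-parameter penalty implied by the formulas of the form -2LL + penalty × p in the table above. The helper name penalty_per_parameter() is hypothetical; the two sample sizes are those of the mtcars (32) and creditCard (1319) datasets used in this vignette.

```r
# Per-parameter penalty for criteria of the form -2LL + penalty * p.
penalty_per_parameter <- function(n) {
  c("IC(1)"   = 1,
    "IC(3/2)" = 3 / 2,
    "AIC"     = 2,
    "HQ"      = 2 * log(log(n)),
    "SBC"     = log(n))
}

round(penalty_per_parameter(32), 3)
round(penalty_per_parameter(1319), 3)
```

The HQ and SBC penalties grow with the sample size while the others stay fixed, which is why SBC tends to select smaller models as n increases. For very small samples (n below roughly 15) the HQ penalty drops below that of AIC.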
This blog by Jim Frost gives an excellent overview of multicollinearity and when it is necessary to remove it.
Simply put, a dataset contains multicollinearity when input predictors are correlated. When multicollinearity occurs, the interpretability of predictors will be badly affected because changes in one input variable lead to changes in other input variables. Therefore, it is hard to individually estimate the relationship between each input variable and the dependent variable.
Multicollinearity can dramatically reduce the precision of the estimated regression coefficients of correlated input variables, making it hard to find the correct model. However, as Jim pointed out, “Multicollinearity affects the coefficients and p-values, but it does not influence the predictions, precision of the predictions, and the goodness-of-fit statistics. If your primary goal is to make predictions, and you don’t need to understand the role of each independent variable, you don’t need to reduce severe multicollinearity.”
In StepReg, QR matrix decomposition is performed ahead of time to detect and remove input variables causing multicollinearity.
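The idea of detecting linearly dependent predictors with a matrix decomposition can be illustrated with base R's qr(). This is a sketch on simulated data, with x3 deliberately constructed as an exact linear combination of x1 and x2; it does not reproduce StepReg's internal check, whose details are not shown here.

```r
# Detect exactly collinear columns of a design matrix with a QR decomposition.
set.seed(42)
x1 <- rnorm(20)
x2 <- rnorm(20)
X <- cbind(intercept = 1, x1 = x1, x2 = x2, x3 = x1 + x2)  # x3 is redundant

decomp <- qr(X)
decomp$rank   # numerical rank; smaller than ncol(X) when columns are dependent

# qr() moves columns it finds to be linearly dependent to the end of the pivot
dependent <- colnames(X)[decomp$pivot[-seq_len(decomp$rank)]]
dependent
```

Columns reported as dependent carry no information beyond the columns kept, so removing them leaves the fitted values unchanged while making the coefficients estimable.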
StepReg offers a suite of functions to summarize and visualize model-building results. The core function, stepwise(), generates a list of data frames detailing the feature selection process and the final model. These results can be exported into various formats, such as "xlsx", "html", or "docx", using the report() function, making it easy to share and collaborate. Additionally, the plot() function allows you to visualize and compare variable selection procedures across multiple strategies and metrics, providing further insights into the model-building process.
The stepwise() function produces optimal models based on the chosen regression strategies and metrics. You can further analyze these models using standard functions like summary(), coef(), residuals(), and fitted(), which are applicable to each model in the output (for example, res$bidirection$AIC).
StepReg lets you explore the arguments used in stepwise(), variable classes, overviews and detailed processes of model selection, and voted models. The report() function can export these results into multiple formats including "xlsx", "docx", "html", and "pptx" for enhanced usability. For example, running report() with the name "results" will generate both "results.xlsx" and "results.docx" files.
report(res, report_name = "results", format = c("xlsx", "docx"))
Below, we present various examples illustrating the application of different models tailored to specific datasets. Please note that stepwise regression should never be used for statistical inference unless the variable selection process is properly accounted for, as it can invalidate the results. However, this issue does not arise when stepwise regression is used for prediction. It is essential to select the regression model that best suits the type of response variable. For detailed guidance, refer to section \@ref(regressioncategories).
In this section, we'll demonstrate how to perform linear regression analysis using the mtcars dataset, showcasing different scenarios with varying numbers of predictors and dependent variables. We set type = "linear" to direct the function to perform linear regression.
Description of the mtcars dataset
The mtcars dataset is a classic dataset in statistics and is included in the base R installation. It was sourced from the 1974 Motor Trend US magazine and comprises 32 observations on 11 variables. Here's a brief description of the variables included:
Why choose linear regression
Linear regression is an ideal choice for analyzing the mtcars dataset because it includes continuous variables such as "mpg", "hp", or "wt", which can serve as response variables. Furthermore, the dataset exhibits potential linear relationships between the response variable and other variables.
In this example, we employ the "forward" strategy with "AIC" as the selection criterion. Additionally, we use the include argument to specify that "disp" and "cyl" must always be included in the model.
data(mtcars)
formula <- mpg ~ .
res1 <- stepwise(formula = formula,
                 data = mtcars,
                 type = "linear",
                 include = c("disp", "cyl"),
                 strategy = "forward",
                 metric = "AIC")
res1
To visualize the selection process:
plot_list <- list()
plot_list[["forward"]][["details"]] <- plot(res1, process = "details")
plot_list[["forward"]][["overview"]] <- plot(res1, process = "overview")
cowplot::plot_grid(plotlist = plot_list$forward, ncol = 1)
To exclude the intercept from the model, adjust the formula in either of the following equivalent ways:
formula <- mpg ~ . + 0
formula <- mpg ~ . - 1
To limit the model to a specific subset of predictors, adjust the formula as follows, which will only consider "cyl", "disp", "hp", "wt", "vs", and "am" as the predictors.
formula <- mpg ~ cyl + disp + hp + wt + vs + am + 0
Another way is to use the minus symbol ("-") to exclude some predictors from variable selection. For example, the following formula includes all variables except "disp", "wt", and the intercept.
formula <- mpg ~ . - 1 - disp - wt
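How R expands these formulas can be checked with the base function terms(), independently of StepReg. The sketch below confirms that "+ 0" and "- 1" both remove the intercept and lists the predictors that remain after the exclusions; the variable names are ad hoc.

```r
# Inspect how R expands the formulas shown above against the mtcars columns.
no_intercept_a <- terms(mpg ~ . + 0, data = mtcars)
no_intercept_b <- terms(mpg ~ . - 1, data = mtcars)
restricted     <- terms(mpg ~ . - 1 - disp - wt, data = mtcars)

# 0 means the model has no intercept, 1 means it has one
attr(no_intercept_a, "intercept")
attr(no_intercept_b, "intercept")

# Candidate predictors left after excluding disp and wt
attr(restricted, "term.labels")
```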
You can simultaneously provide multiple selection strategies and metrics. For example, the following code snippet employs both "forward" and "backward" strategies using metrics "AIC", "BIC", and "SL". It's worth mentioning that when "SL" is specified, you may also want to set the significance level for entry ("sle") and stay ("sls"), both of which default to 0.15.
formula <- mpg ~ .
res2 <- stepwise(formula = formula,
                 data = mtcars,
                 type = "linear",
                 strategy = c("forward", "backward"),
                 metric = c("AIC", "BIC", "SL"),
                 sle = 0.05,
                 sls = 0.05)
res2
plot_list <- setNames(
  lapply(c("forward", "backward"), function(i) {
    setNames(
      lapply(c("details", "overview"), function(j) {
        plot(res2, strategy = i, process = j)
      }),
      c("details", "overview")
    )
  }),
  c("forward", "backward")
)
cowplot::plot_grid(plotlist = plot_list$forward, ncol = 1, rel_heights = c(2, 1))
cowplot::plot_grid(plotlist = plot_list$backward, ncol = 1, rel_heights = c(2, 1))
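The role of the entry threshold sle under the "SL" metric can be illustrated with a single F test between nested linear models in base R. This is a sketch of one entry decision only, using anova() rather than StepReg, and the choice of "wt" as the current model and "cyl" as the candidate is an arbitrary assumption.

```r
# One entry decision under a significance-level rule (illustrative only).
sle <- 0.05                                  # significance level for entry

reduced <- lm(mpg ~ wt, data = mtcars)       # current model
full    <- lm(mpg ~ wt + cyl, data = mtcars) # current model plus candidate "cyl"

# Partial F test: does adding cyl significantly improve the fit?
p_entry <- anova(reduced, full)[2, "Pr(>F)"]

p_entry
p_entry < sle   # TRUE means the candidate would be allowed to enter
```

A stay decision under sls works the same way in reverse: a variable already in the model is removed when the p-value of the test for dropping it exceeds sls.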
In this scenario, there are two dependent variables, "mpg" and "drat". The model selection aims to identify the most influential predictors that affect both variables.
formula <- cbind(mpg, drat) ~ . + 0
res3 <- stepwise(formula = formula,
                 data = mtcars,
                 type = "linear",
                 strategy = "bidirection",
                 metric = c("AIC", "HQ"))
res3
plot_list <- setNames(
  lapply(c("bidirection"), function(i) {
    setNames(
      lapply(c("details", "overview"), function(j) {
        plot(res3, strategy = i, process = j)
      }),
      c("details", "overview")
    )
  }),
  c("bidirection")
)
cowplot::plot_grid(plotlist = plot_list$bidirection, ncol = 1, rel_heights = c(2, 1))
In this example, we'll showcase logistic regression using the remission dataset. By setting type = "logit", we instruct the function to perform logistic regression.
Description of the remission dataset
The remission dataset, obtained from the online course STAT501 at Penn State University, has been integrated into StepReg. It consists of 27 observations across seven variables, including a binary variable named "remiss":
Why choose logistic regression
Logistic regression effectively captures the relationship between predictors and a categorical response variable, offering insights into the probability of being assigned into specific response categories given a set of predictors. It is suitable for analyzing binary outcomes, such as the remission status ("remiss") in the remission dataset.
In this example, we employ a "forward" strategy with "AIC" as the selection criterion, while forcing the "cell" variable to be included in the model.
data(remission)
formula <- remiss ~ .
res4 <- stepwise(formula = formula,
                 data = remission,
                 type = "logit",
                 include = "cell",
                 strategy = "forward",
                 metric = "AIC")
res4
plot_list <- setNames(
  lapply(c("forward"), function(i) {
    setNames(
      lapply(c("details", "overview"), function(j) {
        plot(res4, strategy = i, process = j)
      }),
      c("details", "overview")
    )
  }),
  c("forward")
)
cowplot::plot_grid(plotlist = plot_list$forward, ncol = 1, rel_heights = c(2, 1))
In this example, we employ a "subset" strategy, using "SBC" as the selection criterion while excluding the intercept. Meanwhile, we set best_n = 3 to restrict the output to the top 3 models for each number of variables.
data(remission)
formula <- remiss ~ . + 0
res5 <- stepwise(formula = formula,
                 data = remission,
                 type = "logit",
                 strategy = "subset",
                 metric = "SBC",
                 best_n = 3)
res5
plot_list <- setNames(
  lapply(c("subset"), function(i) {
    setNames(
      lapply(c("details", "overview"), function(j) {
        plot(res5, strategy = i, process = j)
      }),
      c("details", "overview")
    )
  }),
  c("subset")
)
cowplot::plot_grid(plotlist = plot_list$subset, ncol = 1, rel_heights = c(2, 1))
Here, the 0 in the above plot means that there is no intercept in the model.
lung dataset
In this example, we'll demonstrate how to perform Cox regression analysis using the lung dataset. By setting type = "cox", we instruct the function to conduct Cox regression.
Description of the lung dataset
The lung dataset, available in the "survival" R package, includes information on survival times for 228 patients with advanced lung cancer. It comprises ten variables, among which the "status" variable codes for censoring status (1 = censored, 2 = dead), and the "time" variable denotes the patient survival time in days. To learn more about the dataset, use ?survival::lung.
Why choose Cox regression
Cox regression, also termed the Cox proportional hazards model, is specifically designed for analyzing survival data, making it well-suited for datasets like lung that include information on the time until an event (e.g., death) occurs. This method accommodates censoring and assumes proportional hazards, enhancing its applicability to medical studies involving time-to-event outcomes.
In this example, we employ a "forward" strategy with "AICc" as the selection criterion.
library(dplyr)
library(survival)
# Preprocess:
lung <- survival::lung %>%
  mutate(sex = factor(sex, levels = c(1, 2))) %>% # make sex a factor
  na.omit() # get rid of incomplete records
formula <- Surv(time, status) ~ .
res6 <- stepwise(formula = formula,
                 data = lung,
                 type = "cox",
                 strategy = "forward",
                 metric = "AICc")
res6
plot_list <- setNames(
  lapply(c("forward"), function(i) {
    setNames(
      lapply(c("details", "overview"), function(j) {
        plot(res6, strategy = i, process = j)
      }),
      c("details", "overview")
    )
  }),
  c("forward")
)
cowplot::plot_grid(plotlist = plot_list$forward, ncol = 1, rel_heights = c(2, 1))
creditCard dataset
In this example, we'll demonstrate how to perform Poisson regression analysis using the creditCard dataset. We set type = "poisson" to direct the function to perform Poisson regression.
Description of the creditCard dataset
The creditCard dataset contains credit history information for a sample of applicants for a specific type of credit card, included in the "AER" package. It encompasses 1319 observations across 12 variables, including "reports", "age", "income", among others. The "reports" variable represents the number of major derogatory reports. For detailed information, refer to ?AER::CreditCard.
Why choose Poisson regression
Poisson regression is a frequently employed method for analyzing count data, where the response variable represents the occurrences of an event within a defined time or space frame. In the context of the creditCard dataset, Poisson regression can model the count of major derogatory reports ("reports"), enabling assessment of predictors' impact on this variable.
In this example, we employ a "forward" strategy with "SL" as the selection criterion. We set the significance level for entry to 0.05 (sle = 0.05).
data(creditCard)
formula <- reports ~ .
res7 <- stepwise(formula = formula,
                 data = creditCard,
                 type = "poisson",
                 strategy = "forward",
                 metric = "SL",
                 sle = 0.05)
summary(res7$forward$SL)
plot_list <- setNames(
  lapply(c("forward"), function(i) {
    setNames(
      lapply(c("details", "overview"), function(j) {
        plot(res7, strategy = i, process = j)
      }),
      c("details", "overview")
    )
  }),
  c("forward")
)
cowplot::plot_grid(plotlist = plot_list$forward, ncol = 1, rel_heights = c(2, 1))
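For the likelihood-based families, the AIC and SBC formulas in the selection metric table (-2LL + 2p and -2LL + p ln(n)) can be reproduced by hand from a fitted glm. The sketch below does this for a Poisson model. It uses the warpbreaks dataset that ships with base R instead of creditCard, purely so that it runs without extra packages, and compares the results with base R's AIC() and BIC().

```r
# Recompute AIC and SBC for a Poisson regression from its log-likelihood.
fit <- glm(breaks ~ wool + tension, family = poisson, data = warpbreaks)

ll <- as.numeric(logLik(fit))  # maximized log-likelihood (LL)
p  <- length(coef(fit))        # number of parameters including the intercept
n  <- nobs(fit)                # sample size

aic_manual <- -2 * ll + 2 * p
sbc_manual <- -2 * ll + p * log(n)

c(manual = aic_manual, base_r = AIC(fit))
c(manual = sbc_manual, base_r = BIC(fit))
```

The agreement is exact for the Poisson family because it has no separately estimated dispersion parameter; for families that estimate one, base R counts it as an extra parameter, so the values can differ by a constant.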
We have developed an interactive Shiny application to simplify model selection tasks for non-programmers. You can access the app through the following URL:
https://junhuili1017.shinyapps.io/StepReg/
You can also launch the Shiny app on your local machine with the following code:
library(StepReg)
StepRegShinyApp()
Here is the user interface.
sessionInfo()