title: "Tutorial"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Tutorial}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
This tutorial shows how to use the vifmiqo function from the Rvifmiqo package. First, load all the required libraries. The "gurobi" package is not distributed through CRAN; it ships with the Gurobi Optimizer, and installation instructions can be found at https://www.gurobi.com/.
# Required to run the vifmiqo function
library(Rvifmiqo)
library(Matrix)
library(gurobi)

# Used for data formatting
library(fastDummies)
library(ggplot2)
library(reshape2)
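Because gurobi in particular requires a separate installation, it can be useful to confirm that everything is available before going further. The following is a minimal sketch using only base R; the package names are taken from the library() calls above, and the variable names are illustrative.

```r
# Packages used in this tutorial (names taken from the library() calls)
required <- c("Rvifmiqo", "Matrix", "gurobi",
              "fastDummies", "ggplot2", "reshape2")

# requireNamespace() returns FALSE instead of stopping when a package is absent
is_installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
not_installed <- required[!is_installed]

if (length(not_installed) > 0) {
  message("Missing packages: ", paste(not_installed, collapse = ", "))
} else {
  message("All required packages are installed.")
}
```

Any package reported as missing needs to be installed before the rest of the tutorial will run.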
For this tutorial, we will be using the "iris" dataset that is built into R.
data(iris)
head(iris)
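A quick inspection with base R shows which columns are categorical and whether any values are missing, which are the two things the preprocessing has to handle. This is an optional sketch; the helper names is_categorical and n_missing are illustrative.

```r
# Which columns are factors or character vectors (and so need dummy coding)?
is_categorical <- vapply(
  datasets::iris,
  function(col) is.factor(col) || is.character(col),
  logical(1)
)
names(which(is_categorical))

# How many missing values does each column contain?
n_missing <- colSums(is.na(datasets::iris))
n_missing
```

For iris, Species is the only categorical column and no values are missing, so the row-removal step is a safeguard rather than a necessity for this dataset.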
The vifmiqo function requires the user to separate the covariates of interest from the outcome of interest. Before we can run the function, we must remove any rows containing missing data and create dummy variables for each categorical variable. Then, we need to select an outcome and reformat the data into X and y. For this example, let's suppose we are trying to predict Petal Width using the other variables as predictors.
# Remove rows with missing values
iris <- iris[complete.cases(iris), ]

# Separate predictors and outcome
X <- iris[ , !(names(iris) == 'Petal.Width')]
y <- iris$Petal.Width

# Create dummy variables for categorical data
X <- dummy_cols(X, select_columns = c("Species"))

# Remove the original Species variable
X <- X[ , !(names(X) == 'Species')]

# See the results
head(X)
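If fastDummies is not available, base R's model.matrix can build comparable indicator columns. The sketch below is an alternative, not the approach used in this tutorial: the names X_alt and y_alt are illustrative, the indicator columns are named Speciessetosa and so on rather than Species_setosa, and whether vifmiqo accepts this object unchanged is an assumption that has not been verified here.

```r
# Drop the outcome, then expand the Species factor into indicator columns.
# The "- 1" removes the intercept so that every Species level gets a column.
predictors <- datasets::iris[, names(datasets::iris) != "Petal.Width"]
X_alt <- as.data.frame(model.matrix(~ . - 1, data = predictors))
y_alt <- datasets::iris$Petal.Width

# Three numeric predictors plus one indicator per Species level
names(X_alt)
dim(X_alt)
```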
All that's left to do is call the function and examine the results.
vifmiqo(X, y)
As we can see, the function selects only four of the six predictors. Although multicollinearity itself is hard to visualize, the pairwise linear associations in the original data are reason for concern.
# Keep only the upper triangle of the correlation matrix
get_upper_tri <- function(cormat) {
  cormat[lower.tri(cormat)] <- NA
  return(cormat)
}

cormat <- round(cor(X), 2)
upper_tri <- get_upper_tri(cormat)
melted_cormat <- melt(upper_tri, na.rm = TRUE)

# Heatmap
library(ggplot2)
ggplot(data = melted_cormat, aes(Var2, Var1, fill = value)) +
  geom_tile(color = "white") +
  scale_fill_gradient2(low = "blue", high = "red", mid = "white",
                       midpoint = 0, limits = c(-1, 1), space = "Lab",
                       name = "Pearson\nCorrelation") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, vjust = 1,
                                   size = 12, hjust = 1)) +
  coord_fixed()
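Pairwise correlations only capture relationships between two variables at a time. A complementary check is the classical variance inflation factor, VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on all the others. The sketch below computes it with base R; it illustrates the textbook definition and is not necessarily the exact quantity that vifmiqo constrains. The helper classical_vif is hypothetical, and one Species level is dropped as a reference category so that the values stay finite.

```r
# Classical VIF for column j: 1 / (1 - R^2) from regressing column j on the rest
classical_vif <- function(df) {
  vapply(names(df), function(v) {
    fit <- lm(reformulate(setdiff(names(df), v), response = v), data = df)
    1 / (1 - summary(fit)$r.squared)
  }, numeric(1))
}

# Build the design matrix and drop the intercept column. The first Species
# level acts as the reference category; keeping all three indicator columns
# would make them perfectly collinear with the intercept.
raw <- datasets::iris[, names(datasets::iris) != "Petal.Width"]
design <- model.matrix(~ ., data = raw)
predictors <- as.data.frame(design[, -1])

vifs <- classical_vif(predictors)
round(vifs, 2)
```

Predictors with large values here are the ones most nearly reproduced by a linear combination of the others, which the heatmap alone cannot show.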
The optional alpha parameter controls how much multicollinearity we wish to allow. Values of 5 and 10 are the most common, with 5 being more restrictive. In this case, changing alpha to 10 does not change the results; in a larger dataset, however, it could result in more covariates being selected.
vifmiqo(X, y, alpha = 10)
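To get a feel for what these values mean, it may help to translate them into R-squared terms. The sketch below assumes that alpha acts as an upper bound on each selected predictor's classical VIF, which is an interpretation and not something confirmed by the package documentation quoted here.

```r
# Assumption: alpha is an upper bound on each selected predictor's VIF
alpha <- c(5, 10)

# VIF_j = 1 / (1 - R_j^2) <= alpha  is equivalent to  R_j^2 <= 1 - 1 / alpha
max_r2 <- 1 - 1 / alpha

data.frame(alpha = alpha, max_r2 = max_r2)
```

Under that assumption, alpha = 5 tolerates a predictor whose variance is up to 80% explained by the other selected predictors, while alpha = 10 tolerates up to 90%.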
# Benchmark the default call (requires the bench package)
bench::mark(vifmiqo(X, y))