
collinearityR


Identify multicollinearity issues by correlation, VIF, and visualizations. This package is designed for R beginners who want to identify multicollinearity issues by applying a simple function. It automates building a long-form correlation matrix, creating a correlation heat map, and identifying highly correlated pairs of variables. A Python version of the package is also under development.

Functions

The following four functions are in the collinearityR package:

R ecosystem

The R ecosystem contains many of the tools needed to conduct linear regression. However, it does not have tools to analyze multicollinearity visually using both Pearson’s coefficient and VIF. Doing this analysis by hand also requires intermediate knowledge of R to manipulate the correlation matrix into a more suitable format. Our package allows users with less experience to conduct this analysis.
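To illustrate the kind of manual manipulation described above, here is a minimal base R sketch that turns a wide correlation matrix into long form and flags highly correlated pairs. It is not collinearityR's own code: the built-in mtcars data, the column names variable1, variable2 and correlation, and the 0.8 cutoff are all assumptions chosen for illustration.

```r
# Sketch only: base R on the built-in mtcars data; this is not collinearityR's API.
vars <- mtcars[, c("mpg", "disp", "hp", "wt")]

# Wide correlation matrix using Pearson's coefficient (the default method).
corr_wide <- cor(vars, method = "pearson")

# Keep each unordered pair once by selecting the upper triangle.
idx <- which(upper.tri(corr_wide), arr.ind = TRUE)

# Long form: one row per variable pair (column names are illustrative).
corr_long <- data.frame(
  variable1 = rownames(corr_wide)[idx[, "row"]],
  variable2 = colnames(corr_wide)[idx[, "col"]],
  correlation = corr_wide[idx]
)

# Flag highly correlated pairs; the 0.8 cutoff is an arbitrary illustration.
threshold <- 0.8
high_pairs <- corr_long[abs(corr_long$correlation) > threshold, ]
print(high_pairs)
```

Selecting only the upper triangle drops the diagonal and the mirrored duplicates, so each pair of variables appears exactly once in the long table.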

cor(): This function is part of base R. It creates a correlation matrix between variables and uses Pearson’s coefficient by default. Documentation for cor() is available by running ?cor in R.

ggplot2: This is one of the most commonly used plotting packages in R. The collinearityR package relies on ggplot2 to create heat map plots.

car: The car package is necessary to do VIF calculations. Documentation for its vif() function is available by running ?car::vif once the package is installed.
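For readers unfamiliar with VIF, the following base R sketch shows what the quantity measures. It computes VIF by hand, so it is neither collinearityR's nor car's implementation; the choice of mtcars columns and the variable names are assumptions made for illustration. The VIF of a predictor is 1 / (1 - R^2), where R^2 comes from regressing that predictor on all the others.

```r
# Sketch only: VIF computed by hand in base R; not collinearityR's or car's code.
predictors <- mtcars[, c("disp", "hp", "wt")]

# VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing
# predictor j on all the other predictors.
vif_by_hand <- sapply(names(predictors), function(name) {
  others <- setdiff(names(predictors), name)
  fit <- lm(reformulate(others, response = name), data = predictors)
  1 / (1 - summary(fit)$r.squared)
})

# Equivalent shortcut: the diagonal of the inverse correlation matrix.
vif_from_cor <- diag(solve(cor(predictors)))

print(round(vif_by_hand, 2))
```

With car installed, car::vif(lm(mpg ~ disp + hp + wt, data = mtcars)) should report the same values for this model, since every term is a single numeric predictor.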

Installation

You can install the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("UBC-MDS/collinearityR_tool")

Usage

This is a basic example which shows you how to apply this package to a data frame.



UBC-MDS/collinearityR_tool documentation built on Feb. 6, 2022, 9:41 p.m.