Almost every data analysis project involves some exploratory data analysis (EDA) and data preprocessing, and both are crucial, unavoidable steps in the workflow. A handful of EDA tasks come up again and again.
Typically these steps are followed by preprocessing such as imputation and dealing with outliers. Together, these steps can require a lot of code that is repeated across projects. To address this, we designed the R package eaziReda, which wraps that code into a small set of convenient functions so that you can carry out EDA and some simple preprocessing quickly, in just a few lines of code!
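As a rough illustration of the kind of repeated code involved, the sketch below summarizes missing values and applies mean imputation in base R. It is not the eaziReda implementation: the helper names `summarise_missing` and `impute_mean` and the toy data are assumptions made up for this example.

```r
# Base-R sketch only; this is not the eaziReda implementation.
# `summarise_missing` and `impute_mean` are hypothetical helper names.

# Count and percentage of missing values for each column.
summarise_missing <- function(df) {
  n_missing <- colSums(is.na(df))
  data.frame(
    column = names(df),
    n_missing = as.integer(n_missing),
    pct_missing = 100 * as.numeric(n_missing) / nrow(df),
    stringsAsFactors = FALSE
  )
}

# Replace missing values in numeric columns with the column mean.
# Non-numeric columns are left untouched.
impute_mean <- function(df) {
  for (col in names(df)) {
    if (is.numeric(df[[col]])) {
      missing <- is.na(df[[col]])
      df[[col]][missing] <- mean(df[[col]], na.rm = TRUE)
    }
  }
  df
}

# Toy data, made up for illustration.
toy <- data.frame(
  height = c(1.6, NA, 1.8, 1.7),
  group = c("a", "b", NA, "a"),
  stringsAsFactors = FALSE
)

missing_table <- summarise_missing(toy)
toy_imputed <- impute_mean(toy)
```

Because R copies data frames on modification, `toy` itself is unchanged here and the imputed values live in `toy_imputed`.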
You can install the development version from GitHub with:
```r
# install.packages("devtools")
devtools::install_github("https://github.com/UBC-MDS/eaziReda.git")
```
Documentation and usage examples for eaziReda can be found here.
- `missing_impute`: Takes a dataframe and generates a table listing the number and percentage of missing values for each column. It also gives the user the option of applying simple imputations to the entire dataframe in place. The imputation methods can be customized by the user.
- `outliers_detect`: Takes a vector and returns a boolean vector marking the outliers, identified by a method that the user can choose. It also gives the user the option to remove all the outliers in place.
- `corr_plot`: Takes a dataframe and a list of feature names and generates a correlation plot for those features.
- `histograms`: Takes a dataframe and a list of feature names and generates histograms for numeric features and bar plots for categorical features.
- `remove_outliers`: Removes the outliers from the given vector based on a second vector that marks the outliers' indices.

While few R packages are dedicated solely to EDA, quite a few include it as secondary functionality, and we found several that do something similar.
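To make the relationship between `outliers_detect` and `remove_outliers` in the function list more concrete, here is a minimal base-R sketch of the same idea: flag outliers as a boolean vector, then use that vector to drop them. It assumes Tukey's 1.5 × IQR rule as the detection method, which may differ from the methods eaziReda actually offers; `flag_outliers_iqr`, `drop_flagged`, and the sample values are hypothetical.

```r
# Base-R sketch only; this is not the eaziReda implementation.
# `flag_outliers_iqr` and `drop_flagged` are hypothetical helper names.

# Flag values outside [Q1 - k * IQR, Q3 + k * IQR].
# k = 1.5 is the conventional choice for Tukey's rule.
flag_outliers_iqr <- function(x, k = 1.5) {
  q <- quantile(x, probs = c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  spread <- q[2] - q[1]
  flagged <- x < q[1] - k * spread | x > q[2] + k * spread
  flagged[is.na(flagged)] <- FALSE  # treat missing values as not flagged
  flagged
}

# Drop the elements of `x` that are marked TRUE in `flagged`.
drop_flagged <- function(x, flagged) {
  stopifnot(length(x) == length(flagged))
  x[!flagged]
}

# Sample values, made up for illustration.
values <- c(10, 12, 11, 13, 12, 95)
is_outlier <- flag_outliers_iqr(values)
cleaned <- drop_flagged(values, is_outlier)
```

Keeping detection and removal as separate steps means the boolean vector can be inspected before any values are discarded.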
Please note that the eaziReda project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
We welcome and recognize all contributions.
| Core contributor          | Github.com username |
|---------------------------|---------------------|
| Vignesh Lakshmi Rajakumar | @vigneshrajakumar   |
| Dustin Andrews            | @dbandrews          |
| Arash Shamseddini         | @arashshams         |
| Yuyan Guo                 | @yuyanguo           |