Exploratory data analysis (EDA) is an important preparatory step that helps data scientists understand and clean up data sets before machine learning begins. However, this step also involves a lot of repetitive tasks. slimreda helps data scientists quickly complete the initial work of EDA and gain a preliminary understanding of the data.
slimreda focuses on unique value and missing value counts, as well as graphs such as histograms and correlation heat maps. The generated results are returned as charts or images, so users can reference them flexibly in their EDA work.
Function Specification
The package is under development and includes the following functions:
histogram : This function accepts a dataframe and builds histograms for all numeric columns which are returned as an array of chart objects.
corr_map : This function accepts a dataframe and builds a heat map for all numeric columns which is returned as a chart object.
cat_unique_count : This function accepts a dataframe and returns a table of unique value counts for all categorical columns.
miss_count : This function accepts a dataframe and returns a table of counts of missing values in all columns.
Limitations: We only consider numeric and categorical columns in our package.
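To illustrate the limitation above, the following base R sketch shows one way columns could be split into numeric, categorical, and other types before any summary is computed. It is not slimreda's actual implementation: the helper name `split_column_types` is hypothetical, and treating character and factor columns as categorical is an assumption.

```r
# Hypothetical helper (not part of slimreda): classify columns by type.
# Assumption: character and factor columns count as categorical.
split_column_types <- function(df) {
  is_num <- vapply(df, is.numeric, logical(1))
  is_cat <- vapply(df, function(x) is.character(x) || is.factor(x), logical(1))
  list(
    numeric = names(df)[is_num],
    categorical = names(df)[is_cat],
    other = names(df)[!(is_num | is_cat)]
  )
}

# Small made-up data frame with one column of each kind.
example_df <- data.frame(
  height = c(1.6, 1.8, 1.7),
  city = c("Vancouver", "Toronto", "Vancouver"),
  joined = as.Date(c("2021-01-01", "2021-06-15", "2022-03-09"))
)

col_types <- split_column_types(example_df)
col_types
```

In this sketch the date column lands in `other`, which is the kind of column the package does not claim to handle.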
You can install the released version of slimreda (after Milestone 4 is done) from CRAN with:
install.packages("slimreda")
And the development version from GitHub with:
# install.packages("devtools")
devtools::install_github("UBC-MDS/slimreda")
To import the package:
library(slimreda) ## basic example code
For each function:
histogram:
library(palmerpenguins)
library(cowplot)
hist_plots <- slimreda::histogram(penguins, c('body_mass_g', 'flipper_length_mm'))
cowplot::plot_grid(plotlist = hist_plots, nrow = 1)
miss_count:
example_miss_count <- data.frame(
  name = c(NA, NA, "Jessica"),
  age = c(NA, 21, 30),
  hobby = c("lab", "quiz", "swim")
)
output <- slimreda::miss_count(example_miss_count, ascending = TRUE)
output
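For comparison, the same missing-value counts can be cross-checked in base R. This is a minimal sketch and not how slimreda computes its result; the table layout and the names `na_counts` and `na_table` are assumptions chosen for illustration.

```r
# Same example data as above.
example_miss_count <- data.frame(
  name = c(NA, NA, "Jessica"),
  age = c(NA, 21, 30),
  hobby = c("lab", "quiz", "swim")
)

# Count NA values per column, then sort ascending by count.
na_counts <- colSums(is.na(example_miss_count))
na_table <- data.frame(
  column = names(na_counts),
  missing = as.integer(na_counts)
)
na_table <- na_table[order(na_table$missing), ]
na_table
```

Here `hobby` has no missing values, `age` has one, and `name` has two.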
cat_unique_count:
unique_cat_df <- slimreda::cat_unique_count(penguins)
knitr::kable(unique_cat_df, "simple")
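The idea behind counting unique values in categorical columns can be sketched in base R as follows. This is an illustration under assumptions, not slimreda's implementation: the data frame is made up, factor and character columns are treated as categorical, and missing values are excluded from the count.

```r
# Made-up data with two categorical columns and one numeric column.
example_df <- data.frame(
  species = c("Adelie", "Gentoo", "Adelie", "Chinstrap"),
  island = c("Biscoe", "Biscoe", "Dream", "Dream"),
  body_mass_g = c(3750, 5000, 3800, 3700),
  stringsAsFactors = TRUE
)

# Keep only categorical columns (assumption: factor or character).
is_cat <- vapply(
  example_df,
  function(x) is.factor(x) || is.character(x),
  logical(1)
)

# Count distinct non-missing values in each categorical column.
unique_counts <- vapply(
  example_df[is_cat],
  function(x) length(unique(x[!is.na(x)])),
  integer(1)
)
unique_counts
```

The numeric column `body_mass_g` is skipped, so the result covers only `species` and `island`.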
corr_map:
corr_map_plot <- slimreda::corr_map(penguins, colnames(penguins))
corr_map_plot
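A correlation heat map visualizes a correlation matrix of the numeric columns. The base R sketch below shows how such a matrix could be computed. It is only an illustration: the data frame is made up, and the choice of Pearson correlation with `use = "pairwise.complete.obs"` is an assumption, not a statement about what `corr_map` does internally.

```r
# Made-up data: y rises with x, z falls with x, group is non-numeric.
example_df <- data.frame(
  x = c(1, 2, 3, 4, 5),
  y = c(2, 4, 6, 8, 10),
  z = c(5, 4, 3, 2, 1),
  group = c("a", "b", "a", "b", "a")
)

# Select numeric columns only, since correlation is undefined for text.
numeric_cols <- names(example_df)[vapply(example_df, is.numeric, logical(1))]

# Pairwise Pearson correlations, ignoring missing values pair by pair.
corr_matrix <- cor(example_df[numeric_cols], use = "pairwise.complete.obs")
corr_matrix
```

Each cell of `corr_matrix` would correspond to one tile of a heat map, with `x` and `y` perfectly positively correlated and `x` and `z` perfectly negatively correlated in this example.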
Packages with similar functions include: DataExplorer (https://cran.r-project.org/web/packages/DataExplorer/vignettes/dataexplorer-intro.html)
Slimreda's innovation points:
Interested in contributing? Check out the contributing guidelines. Please note that this project is released with a Code of Conduct. By contributing to this project, you agree to abide by its terms.
Group 4 members:
slimreda was created by Taiwo Owoseni. It is licensed under the terms of the MIT license.
slimreda was created with the devtools package. devtools is the public face of a set of packages that support various aspects of package development.