README.md

magmavizR

Codecov test
coverage R-CMD-check pages-build-deployment

Exploratory Data Analysis is one of the key steps in a machine learning project. This package aims to make this process easy by providing R functions built on the ggplot2 package to plot four common types of plots with the magma color scheme. To maximize interpretability, the plots have defined color schemes (discrete, diverging, sequential) based on the kind of data they show.

Installation

The development version of the package can be installed from GitHub with:

# install.packages("devtools")
devtools::install_github("UBC-MDS/magmavizR")

Documentation

An interactive version of the documentation can be found here.

Usage

The magmavizR library can be loaded by using the commands below:

library("magmavizR")
penguins_data <- palmerpenguins::penguins

The four data visualization functions included in the package along with the usage are outlined below:

Boxplot

Returns a boxplot based on the data frame, a numerical feature to view the distribution of and a categorical feature to bucket data into categories. Additionally, there is a boolean option to facet the boxplots into separate charts.

boxplot(penguins_data, species, bill_length_mm, facet = TRUE)

Correlation plot

Returns a correlation plot based on the numerical features present in the data frame. Additionally, it will print the correlated numerical feature pairs along with their correlation values.

corrplot(penguins_data, print_corr = TRUE, title = "Correlation Plot")

Histogram

Returns a histogram based on the data frame and a numeric feature to plot on the x-axis. The y-axis will display the result of the following aggregating functions:

histogram(penguins_data, bill_length_mm, "..count..")

Scatterplot

Returns a scatterplot based on the data frame and two numerical feature names passed as the required inputs. There are auxiliary inputs that provide the flexibility to:

scatterplot(penguins_data, bill_length_mm, flipper_length_mm, species, "Bill and Flipper length clusters by Species", 0.5, 2.5, "Bill length (mm)", "Flipper length (mm)", "", FALSE, FALSE, TRUE)

Fit within R ecosystem

Our package will build onto the existing features of ggplot2 using the magma color scheme. It serves as an automated plotter and is a higher level implementation of it. Essentially it circumvents the need to code every single detail and allows the user to focus on the output. We came across two packages on CRAN that have a similar line of thought:

Contributing

The primary contributors to this package are:

  1. Abdul Moid Mohammed
  2. Mukund Iyer
  3. Irene Yan
  4. Rubén De la Garza Macías

We welcome new ideas and contributions. Please refer to the contributing guidelines in the CONTRIBUTING.MD file. Do note that this project is released with a Code of Conduct. By contributing to this project, you agree to abide by its terms.

License

magmavizR was created by Abdul Moid Mohammed, Mukund Iyer, Irene Yan, Rubén De la Garza Macías. It is licensed under the terms of the MIT license.

Credits

magmavizR was created using the tutorial in The Whole Game.



UBC-MDS/magmavizR documentation built on Feb. 6, 2022, 9:41 p.m.