eda: Automated exploratory data analysis


View source: R/eda.R

Description

Explores a provided data set and returns a list of plots per feature as well as a summary table.

Usage

eda(data, x = NULL, y = NULL, sample.size = 0.5, theme = 1,
  numeric.plot = "histogram", categorical.plot = "bar",
  pipeline = NULL)

Arguments

data

[required | data.frame ] Dataset to visualize.

x

[optional | character | default=NULL] Features to visualize, specified as a character vector. If NULL, all features in the dataset are used except for the target feature.

y

[optional | character | default=NULL] Target feature to visualize. If NULL, a univariate visualization is produced.

sample.size

[optional | numeric | default=0.5] Sample size used to downsample the data for faster exploration.

theme

[optional | numeric | default=1] Color theme applied to the plots. Options range from 1 to 4.

numeric.plot

[optional | character | default="histogram"] The type of plot produced for numeric features. Available options are histogram, density, boxplot and violin.

categorical.plot

[optional | character | default="bar"] The type of plot produced for categorical features. Available options are bar and stackedbar.

pipeline

[optional | list | default=NULL] Pipeline used to pre-process the data for visualization. If NULL, an exploratory pipeline is created to pre-process the data.
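
The description of sample.size above does not say whether the value is a fraction of rows or a row count; the default of 0.5 suggests a fraction. Under that assumption, downsampling before plotting could look roughly like the base R sketch below. The helper downsample_rows and its seed argument are hypothetical, introduced only for illustration, and are not part of the package.

```r
# Hypothetical helper, NOT the package's implementation.
# Assumes sample.size is a fraction of rows in (0, 1].
downsample_rows <- function(data, sample.size = 0.5, seed = 1) {
  stopifnot(is.data.frame(data), sample.size > 0, sample.size <= 1)
  set.seed(seed)                                   # reproducible sample
  n_keep <- max(1, floor(nrow(data) * sample.size))  # keep at least one row
  rows <- sample(nrow(data), n_keep)               # sample row indices without replacement
  data[rows, , drop = FALSE]
}

small <- downsample_rows(iris, sample.size = 0.5)
nrow(small)  # 75 of the 150 rows in iris
```

Sampling row indices rather than the data itself keeps all columns aligned, and drop = FALSE keeps the result a data.frame even when only one column is present.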

Value

List containing the plots, a tabular summary of the exploration and the pipeline used for data pre-processing.

Author(s)

Xander Horn

Examples

res <- eda(data = iris)
res <- eda(data = iris, y = "Species")
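
A slightly fuller call, using only the arguments and option values documented above, might look like the sketch below. It assumes the package providing eda() is already attached, as in the examples above, and skips the call otherwise. The names of the elements in the returned list are not documented here, so the sketch only lists them instead of indexing by name.

```r
# Arguments taken from the documented options; the feature names come from iris.
eda_args <- list(
  data = iris,
  x = c("Sepal.Length", "Petal.Width"),  # subset of features to plot
  y = "Species",                         # target feature
  theme = 2,                             # documented range is 1 to 4
  numeric.plot = "boxplot",              # histogram, density, boxplot or violin
  categorical.plot = "bar"               # bar or stackedbar
)

# Only run when eda() is available in the session (assumption: package attached).
if (exists("eda", mode = "function")) {
  res <- do.call(eda, eda_args)
  print(names(res))  # inspect which elements the returned list contains
}
```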

XanderHorn/lazy documentation built on Jan. 16, 2021, 6:15 p.m.