eda_report.data.frame: Reporting the information of EDA

View source: R/EDA.R

eda_report    R Documentation

Reporting the information of EDA

Description

eda_report() reports exploratory data analysis (EDA) information for an object inheriting from data.frame.

Usage

eda_report(.data, ...)

## S3 method for class 'data.frame'
eda_report(
  .data,
  target = NULL,
  output_format = c("pdf", "html"),
  output_file = NULL,
  output_dir = tempdir(),
  font_family = NULL,
  browse = TRUE,
  ...
)

Arguments

.data

a data.frame or a tbl_df.

...

arguments to be passed to methods.

target

target variable. It can be given as an unquoted name or as a character string, as in the examples.

output_format

character. Report output type. Choose either "pdf" or "html". "pdf" creates a PDF file with knitr::knit(); "html" creates an HTML file with rmarkdown::render().

output_file

character. Name of the generated file. The default is NULL; in that case the examples show the report being named EDA_Report.pdf or EDA_Report.html.

output_dir

character. Name of the directory in which the report file is generated. The default is tempdir().

font_family

character. Font family name for the figures in the PDF report.

browse

logical. Whether to open the generated report in the browser.
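
As a rough illustration of how output_format, output_file, and output_dir combine, the sketch below resolves a report path in base R. resolve_report_path() is a hypothetical helper, not part of dlookr; it assumes the vector default is resolved with match.arg() and that the default file name is EDA_Report plus the format extension, as the comments in the Examples section suggest.

```r
# Hypothetical helper (not part of dlookr): shows how a vector default such as
# c("pdf", "html") is commonly resolved with base R's match.arg().
resolve_report_path <- function(output_format = c("pdf", "html"),
                                output_file = NULL,
                                output_dir = tempdir()) {
  # With no value supplied, match.arg() picks the first choice, "pdf".
  output_format <- match.arg(output_format)

  # Assumed default name, taken from the comments in the Examples section.
  if (is.null(output_file)) {
    output_file <- paste0("EDA_Report.", output_format)
  }

  file.path(output_dir, output_file)
}

basename(resolve_report_path())          # default format
basename(resolve_report_path("html"))    # explicit format
resolve_report_path(output_file = "EDA2.pdf", output_dir = ".")
```

Because output_dir defaults to tempdir(), a report written with the defaults is removed when the R session ends; passing output_dir = "." keeps it in the working directory.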

Details

Generates a general-purpose EDA report automatically. You can choose PDF or HTML output. This feature is more useful for data with many variables than for data with only a few. For PDF output on a Korean operating system, a Korean Gothic font must be installed.

Value

No return value. This function only generates a report.
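
Because nothing is returned, one way to confirm that a report was produced is to check for the file afterwards. The following is a minimal sketch, assuming dlookr is installed and that the report is written to file.path(output_dir, output_file); the call is guarded so the snippet still runs when those assumptions do not hold.

```r
out_dir  <- tempdir()
out_file <- "EDA_heartfailure.html"

# Only attempt the report in an interactive session with dlookr available.
if (interactive() && requireNamespace("dlookr", quietly = TRUE)) {
  data("heartfailure", package = "dlookr")
  dlookr::eda_report(heartfailure, "death_event",
                     output_format = "html",
                     output_dir = out_dir,
                     output_file = out_file,
                     browse = FALSE)
}

# Assumption: the report lands at file.path(output_dir, output_file).
report_path <- file.path(out_dir, out_file)

if (file.exists(report_path)) {
  message("Report written to: ", report_path)
} else {
  message("No report found at: ", report_path)
}
```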

Reported information

The EDA process will report the following information:

  • Introduction

    • Information of Dataset

    • Information of Variables

    • About EDA Report

  • Univariate Analysis

    • Descriptive Statistics

    • Normality Test of Numerical Variables

      • Statistics and Visualization of (Sample) Data

  • Relationship Between Variables

    • Correlation Coefficient

      • Correlation Coefficient by Variable Combination

      • Correlation Plot of Numerical Variables

  • Target based Analysis

    • Grouped Descriptive Statistics

      • Grouped Numerical Variables

      • Grouped Categorical Variables

    • Grouped Relationship Between Variables

      • Grouped Correlation Coefficient

      • Grouped Correlation Plot of Numerical Variables

See vignette("EDA") for an introduction to these concepts.

Examples

if (FALSE) {
library(dlookr)  # provides eda_report() and the heartfailure data
library(dplyr)

## target variable is categorical variable ----------------------------------
# reporting the EDA information
# create pdf file. file name is EDA_Report.pdf
eda_report(heartfailure, death_event)

# create pdf file. file name is EDA_heartfailure.pdf
eda_report(heartfailure, "death_event", output_file = "EDA_heartfailure.pdf")

# create pdf file in the current directory. file name is EDA_heartfailure.pdf
# and the report is not opened in the browser
eda_report(heartfailure, "death_event", output_dir = ".", 
  output_file = "EDA_heartfailure.pdf", browse = FALSE)

# create html file. file name is EDA_Report.html
eda_report(heartfailure, "death_event", output_format = "html")

# create html file. file name is EDA_heartfailure.html
eda_report(heartfailure, death_event, output_format = "html", 
  output_file = "EDA_heartfailure.html")

## target variable is numerical variable ------------------------------------
# reporting the EDA information
eda_report(heartfailure, sodium)

# create pdf file. file name is EDA2.pdf
eda_report(heartfailure, "sodium", output_file = "EDA2.pdf")

# create html file. file name is EDA_Report.html
eda_report(heartfailure, "sodium", output_format = "html")

# create html file. file name is EDA2.html
eda_report(heartfailure, sodium, output_format = "html", output_file = "EDA2.html")

## target variable is NULL --------------------------------------------------
# reporting the EDA information
eda_report(heartfailure)

# create pdf file. file name is EDA2.pdf
eda_report(heartfailure, output_file = "EDA2.pdf")

# create html file. file name is EDA_Report.html
eda_report(heartfailure, output_format = "html")

# create html file. file name is EDA2.html
eda_report(heartfailure, output_format = "html", output_file = "EDA2.html")
}

choonghyunryu/dlookr documentation built on June 11, 2024, 9:12 a.m.