knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

abridge_df() is yet another approach to summarizing data, aimed at helping you get quickly acquainted with a new data set (mostly tabular data). It is designed to be your first contact with a new data set, providing answers, ideally at a glance, to typical questions such as:

This is of course nothing new, and there are myriad alternatives out there. In R, the possibilities range from the built-in base::summary() to full-fledged packages that produce a detailed report of the data.frame, including correlations, memory usage and more. ^[There's a useful GitHub repo, autoEDA-resources, that keeps a "list of software and papers related to automated Exploratory Data Analysis", including alternatives in R, Python and others.] But I was not happy with any of those, so I decided to write something that better fits my workflow.

So, here's an example of how abridge_df() works, followed by a list of the features I found important and built into it (many of them are of course also available in those other packages, just not all in a single place).

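# Read the NLSW88 example data set (Stata format) straight from Stata Press
nlsw88 <- haven::read_dta('http://www.stata-press.com/data/r15/nlsw88.dta')
# Summarize every column of the data frame
efun::abridge_df(nlsw88)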
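
For comparison, the built-in baseline mentioned earlier can be run with nothing but base R. This is only a minimal sketch: it uses the iris data set that ships with R (an assumption made here so the snippet runs offline, not the data used above), and it shows base R output, not abridge_df() output.

```r
# Baseline for comparison: base R's built-in summaries.
# `iris` ships with R (datasets package) and is used here for illustration only.
baseline <- summary(iris)  # quantiles for numeric columns, counts for factors
print(baseline)

# Compact overview of column types and the first few values.
str(iris)
```

Running the same two calls on nlsw88 gives a sense of what abridge_df() aims to condense into a single view.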

Features:



edalfon/efun documentation built on June 23, 2024, 4:17 a.m.