knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
library(qacr)
library(knitr)
The qacr package provides functions for descriptive statistics, data management, and data visualization. Its contents function produces a series of informational tables that give users a comprehensive overview of a dataset and of each of its quantitative and categorical variables. Graphical functions such as barcharts, histograms, and densities provide succinct visualizations of the variables in a data frame.
How can you quickly become familiar with the data in a data frame? We'll use the cars74 data frame as an example. This data frame contains information on the characteristics of 32 cars in 1974.
data(cars74)
contents()
contents(cars74)
As shown above, the contents function produces three tables: one that gives an overall summary of the data, and two that break down the quantitative and categorical variables of the dataset, respectively.
The overall table describes each variable in the data frame, listing its column position, name, type, number of unique values, number of missing values, and percentage of missing values.
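To make the columns of the overall table concrete, here is a minimal base R sketch that computes the same kinds of quantities. It is not the package's implementation: it uses the built-in mtcars data as a stand-in for cars74, and the column names (pos, varname, n_unique, and so on) are chosen here for illustration only.

```r
# Base R sketch of an "overall" summary table; not qacr's actual code.
# mtcars ships with R and stands in for cars74 here.
df <- mtcars
df$am <- factor(df$am, labels = c("automatic", "manual"))  # one categorical column

overall <- data.frame(
  pos       = seq_along(df),                                   # column position
  varname   = names(df),                                       # variable name
  type      = vapply(df, function(x) class(x)[1], character(1)),
  n_unique  = vapply(df, function(x) length(unique(x[!is.na(x)])), integer(1)),
  n_miss    = vapply(df, function(x) sum(is.na(x)), integer(1)),
  pct_miss  = vapply(df, function(x) mean(is.na(x)) * 100, numeric(1)),
  row.names = NULL
)
print(overall)
```

Each row corresponds to one variable, so a quick scan of n_miss and pct_miss shows where missing data is concentrated.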
The quantitative table provides information for each quantitative variable in the data frame, listing the variable name, the number of non-missing values, and a series of summary statistics: the mean, standard deviation, skewness, minimum, 25th percentile, median, 75th percentile, and maximum.
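The statistics in the quantitative table can be approximated with base R, as in the sketch below. This is an illustration under stated assumptions rather than the package's code: quant_summary is a hypothetical helper, mtcars stands in for cars74, and the skewness shown is the simple moment-based estimator, which may differ from the one qacr uses.

```r
# Base R sketch of per-variable summary statistics; not qacr's actual code.
quant_summary <- function(x) {
  x <- x[!is.na(x)]                      # keep non-missing values only
  n <- length(x)
  m <- mean(x)
  # Moment-based skewness (assumption: the package may use another estimator).
  skew <- (sum((x - m)^3) / n) / (sum((x - m)^2) / n)^1.5
  q <- quantile(x, c(0.25, 0.5, 0.75), names = FALSE)
  c(n = n, mean = m, sd = sd(x), skew = skew,
    min = min(x), p25 = q[1], median = q[2], p75 = q[3], max = max(x))
}

# Apply the helper to every numeric column; one row per variable.
num_vars <- mtcars[vapply(mtcars, is.numeric, logical(1))]
quant_table <- t(vapply(num_vars, quant_summary, numeric(9)))
print(round(quant_table, 2))
```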
The categorical table provides information for each categorical variable in the data frame, listing the variable name, the levels of the variable, the number of observations at each level, and the percentage of observations at each level. By default, up to 10 levels of a categorical variable are displayed (you can increase this by adding the option maxcat = #, where # is the number of levels to display).
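The counts and percentages per level, and the effect of capping the number of displayed levels, can be sketched in base R as follows. The helper level_table and its max_levels argument are hypothetical names used only for this illustration; they mimic the idea behind maxcat but are not part of qacr.

```r
# Base R sketch of a per-level frequency table; not qacr's actual code.
level_table <- function(x, max_levels = 10) {  # max_levels: hypothetical name
  counts <- table(x)                           # NA values are dropped by default
  out <- data.frame(
    level = names(counts),
    n     = as.vector(counts),
    pct   = as.vector(counts) / sum(counts) * 100
  )
  head(out, max_levels)                        # show at most max_levels rows
}

cyl <- factor(mtcars$cyl)
print(level_table(cyl))                  # all three levels
print(level_table(cyl, max_levels = 2))  # display capped at two levels

# With qacr loaded, the documented option would be passed like this:
# contents(cars74, maxcat = 15)
```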
If the dataset contains only quantitative or only categorical variables, then only the overall summary table and the relevant variable table are printed.
The df_plot function provides a single graph displaying the variables, their types, and the percent of values present (or missing).
df_plot(cars74)
The barcharts function plots the distribution of each of the categorical variables in the data frame.
barcharts(cars74)
The histograms and densities functions plot the distribution of each of the quantitative variables in the data frame.
histograms(cars74)
densities(cars74)
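For readers who want to see what a one-panel-per-variable display involves, the sketch below draws a histogram with a kernel density overlay for each numeric column using only base graphics. It is an approximation of the idea, not the package's plotting code, and it uses mtcars as a stand-in for cars74.

```r
# Base graphics sketch: one histogram per numeric variable, with a density
# curve overlaid. Not qacr's actual code.
num_vars <- mtcars[vapply(mtcars, is.numeric, logical(1))]

old_par <- par(mfrow = c(3, 4), mar = c(3, 3, 2, 1))  # 3 x 4 grid of panels
for (v in names(num_vars)) {
  x <- num_vars[[v]]
  hist(x, freq = FALSE, main = v, xlab = "", col = "grey80")  # density scale
  lines(density(x))                                           # kernel density
}
par(old_par)  # restore the previous graphics settings
```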