missingness_info: Compile missing data summary of dataset

View source: R/missingness_info.R

missingness_info    R Documentation

Compile missing data summary of dataset

Description

Generates two missing data summaries. The first, "by row", reports the frequency distribution of the number of missing values per row. The second, "by column", reports the percentage of missing values in each column.
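
The two quantities described above can be approximated in base R. The sketch below is not the package's implementation and makes no claim about the exact layout of its output; the object names (na_per_row, by_row, by_col) are illustrative only.

```r
# Illustrative base-R sketch, not the tgsify implementation.
tmp <- iris
tmp[1, 1] <- NA        # one missing value in row 1 (Sepal.Length)
tmp[2, 1:2] <- NA      # two missing values in row 2

# "By row": how many rows have 0, 1, 2, ... missing values
na_per_row <- rowSums(is.na(tmp))
by_row <- table(na_per_row)

# "By column": percentage of missing values in each column
by_col <- 100 * colMeans(is.na(tmp))
```

Here by_row counts rows by their number of missing values, and by_col holds one percentage per column.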

Usage

missingness_info(data, upper_limit = 5, max_vars = 5)

Arguments

data

A data.frame object for which a missing data report will be generated

upper_limit

A number. Truncates the right tail of the frequency distribution reported in the "by row" summary: rows with upper_limit or more missing values are pooled into the interval [upper_limit, Inf), which is displayed as "upper_limit +".
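
As a rough illustration of this kind of tail pooling, the following base-R sketch collapses every count at or above upper_limit into one category. It is an assumption about the general technique, not the package's code; the counts in na_per_row are hypothetical and the exact label format used by missingness_info may differ.

```r
# Illustrative sketch of pooling the right tail; not the package's code.
upper_limit <- 2
na_per_row <- c(0, 0, 1, 2, 3, 5)   # hypothetical missing-value counts per row

# Counts at or above upper_limit share a single "upper_limit +" label
label <- ifelse(na_per_row >= upper_limit,
                paste(upper_limit, "+"),
                as.character(na_per_row))
tail_pooled <- table(label)
```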

max_vars

A number. Limits the variables reported in the "by column" summary to the max_vars most frequently missing variables. If multiple variables have the same number of missing values, all such variables are reported, so more than max_vars variables can appear in the output.
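
One way to read the tie rule is that every variable tied with the max_vars-th most frequently missing variable is kept. The sketch below illustrates that reading in base R; it is an assumption, not the package's code, and the counts in n_missing are hypothetical.

```r
# Illustrative sketch of keeping ties at the max_vars cutoff; not the package's code.
max_vars <- 2
n_missing <- c(a = 10, b = 7, c = 7, d = 1)   # hypothetical missing counts per variable

sorted <- sort(n_missing, decreasing = TRUE)
# Missing count of the max_vars-th variable (or the last one if there are fewer)
cutoff <- sorted[[min(max_vars, length(sorted))]]
# Keep every variable at or above the cutoff, so ties are all included
reported <- sorted[sorted >= cutoff]
```

With max_vars set to 2, three variables are kept here because b and c are tied.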

Details

The output is a list with the "by row" summary and the "by column" summary.

Examples

# Copy iris and set one value to missing
tmp <- iris
tmp[1, 1] <- NA
missingness_info(tmp)

thomasgstewart/tgsify documentation built on Oct. 26, 2024, 8:15 p.m.