missingness_info: Compile missing data summary of dataset


View source: R/missingness_info.R

Description

Generates two missing data summaries. The first is "by row" which reports the distribution of missing data within each row. The second is "by column" and reports the missing data percentage within each column.

Usage

missingness_info(data, upper_limit = 5, max_vars = 5)

Arguments

data

A data.frame for which a missing data report will be generated.

upper_limit

A number. The right tail of the frequency distribution reported in the "by row" summary is collapsed into a single category covering [upper_limit, Inf), which is displayed as "upper_limit +".

max_vars

A number. Limits the variables reported in the second data frame (the "by column" summary) to the max_vars most frequently missing variables. If multiple variables have the same number of missing values, all such variables are reported, so more than max_vars variables can appear in the output.
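
The effect of upper_limit and max_vars can be illustrated with a small base R sketch. This is not the tgsify source; it is only one way the two summaries might be computed, under these assumptions: rows with upper_limit or more missing values are pooled into one labelled category, and ties at the max_vars cutoff are kept. The object names (row_label, by_row, by_col, cutoff) and the argument values are hypothetical and chosen so that both behaviours are visible.

```r
# Illustrative sketch only; not the implementation used by missingness_info().
tmp <- iris
tmp[1, 1:2] <- NA   # row 1: two missing values
tmp[2, 1:3] <- NA   # row 2: three missing values

upper_limit <- 2    # hypothetical value, small enough to trigger pooling
max_vars <- 1       # hypothetical value, small enough to show a tie

# "By row": count missing values per row, then pool the right tail
n_missing_row <- rowSums(is.na(tmp))
row_label <- ifelse(n_missing_row >= upper_limit,
                    paste0(upper_limit, "+"),
                    as.character(n_missing_row))
by_row <- table(row_label)

# "By column": percentage missing per column
n_missing_col <- colSums(is.na(tmp))
pct_missing_col <- 100 * n_missing_col / nrow(tmp)

# Keep the max_vars most frequently missing columns, plus any ties (assumed rule)
sorted <- sort(n_missing_col, decreasing = TRUE)
cutoff <- sorted[[min(max_vars, length(sorted))]]
by_col <- pct_missing_col[n_missing_col >= cutoff]

print(by_row)
print(by_col)
```

Here rows 1 and 2 both fall into the pooled "2+" category, and Sepal.Length and Sepal.Width are both kept even though max_vars is 1, because they are tied on the number of missing values. Whether the real function also drops variables with no missing values is not stated in this page, so the sketch does not assume it.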

Details

The output is a list with the "by row" summary and the "by column" summary.

Examples

tmp <- iris
tmp[1, 1] <- NA
missingness_info(tmp)

thomasgstewart/tgsify documentation built on June 18, 2020, 11:10 a.m.