na.pattern: Missing Data Pattern

na.pattern    R Documentation

Missing Data Pattern

Description

This function computes a summary of missing data patterns, i.e., the number (%) of cases with a specific missing data pattern, and optionally plots the missing data patterns.
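
To make the idea of a "missing data pattern" concrete, the following is a minimal base-R sketch of such a summary. It is not the implementation used by na.pattern; the object names ind, key, and pattern.tab are illustrative assumptions, and the built-in airquality data set is used because it also appears in the Examples.

```r
# Base-R sketch of a missing data pattern summary (not the misty implementation)
dat <- airquality

# Indicator matrix: 1 = observed, 0 = missing
ind <- 1L - is.na(dat)

# Collapse each row into a pattern label, one digit per variable
key <- apply(ind, 1, paste, collapse = "")

# Number and percentage of cases sharing each pattern
n <- table(key)
pattern.tab <- data.frame(pattern = names(n),
                          n = as.integer(n),
                          perc = round(as.integer(n) / nrow(dat) * 100, digits = 2))

# Most frequent pattern first
pattern.tab <- pattern.tab[order(-pattern.tab$n), ]
print(pattern.tab)
```

Each distinct label corresponds to one missing data pattern; the label consisting only of ones is the pattern of complete cases.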

Usage

na.pattern(..., data = NULL, order = FALSE, n.pattern = NULL, plot = FALSE,
           square = TRUE, rotate = FALSE, fill.col = c("#B61A51B3", "#006CC2B3"),
           alpha = 0.6, plot.margin = c(4, 16, 0, 4),
           legend.box.margin = c(-8, 6, 6, 6), legend.key.size = 12,
           legend.text.size = 9, saveplot = FALSE, file = "NA_Pattern.pdf",
           width = NA, height = NA, units = c("in", "cm", "mm", "px"), dpi = 600,
           digits = 2, as.na = NULL, write = NULL, append = TRUE, check = TRUE,
           output = TRUE)

Arguments

...

a matrix or data frame with incomplete data, where missing values are coded as NA. Alternatively, an expression indicating the variable names in data, e.g., na.pattern(x1, x2, x3, data = dat). Note that the operators ., +, -, ~, :, ::, and ! can also be used to select variables, see 'Details' in the df.subset function.

data

a data frame when specifying one or more variables in the argument .... Note that the argument is NULL when specifying a matrix or data frame for the argument ....

order

logical: if TRUE, variables are ordered from left to right in increasing order of missing values.

n.pattern

an integer value indicating the minimum number of cases sharing a missing data pattern to be included in the result table and the plot, e.g., specifying n.pattern = 5 excludes missing data patterns with fewer than 5 cases.

plot

logical: if TRUE, the missing data patterns are plotted.

square

logical: if TRUE (default), the plot tiles are squares to mimic the md.pattern function in the package mice.

rotate

logical: if TRUE, the variable name labels are rotated 90 degrees.

fill.col

a character vector with two elements indicating the colors for the "fill" argument. Note that the first color represents missing values and the second color represents observed values.

alpha

a numeric value between 0 and 1 for the alpha argument (default is 0.6).

plot.margin

a numeric vector indicating the plot.margin argument for the theme function.

legend.box.margin

a numeric vector indicating the legend.box.margin argument for the theme function.

legend.key.size

a numeric value indicating the legend.key.size argument (default is unit(12, "pt")) for the theme function.

legend.text.size

a numeric value indicating the legend.text argument (default is element_text(size = 9)) for the theme function.

saveplot

logical: if TRUE, the ggplot is saved.

file

a character string indicating the filename argument (default is "NA_Pattern.pdf") including the file extension for the ggsave function. Note that one of ".eps", ".ps", ".tex", ".pdf" (default), ".jpeg", ".tiff", ".png", ".bmp", ".svg" or ".wmf" needs to be specified as file extension in the file argument.

width

a numeric value indicating the width argument (default is the size of the current graphics device) for the ggsave function.

height

a numeric value indicating the height argument (default is the size of the current graphics device) for the ggsave function.

units

a character string indicating the units argument (default is "in") for the ggsave function.

dpi

a numeric value indicating the dpi argument (default is 600) for the ggsave function.

digits

an integer value indicating the number of decimal places to be used for displaying percentages.

as.na

a numeric vector indicating user-defined missing values, i.e. these values are converted to NA before conducting the analysis.

write

a character string naming a file for writing the output into either a text file with file extension ".txt" (e.g., "Output.txt") or Excel file with file extension ".xlsx" (e.g., "Output.xlsx"). If the file name does not contain any file extension, an Excel file will be written.

append

logical: if TRUE (default), output will be appended to an existing text file with extension .txt specified in write; if FALSE, an existing text file will be overwritten.

check

logical: if TRUE (default), argument specification is checked.

output

logical: if TRUE (default), output is shown.
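
The effect of the arguments as.na, order, and n.pattern can be approximated in base R as follows. This is only a sketch under stated assumptions and not the code used by na.pattern: the missing-value code -99 is hypothetical and is inserted into a copy of airquality purely for illustration, and all object names are illustrative.

```r
dat <- airquality

# as.na: convert a user-defined missing value to NA
# (-99 is a hypothetical code introduced here only for this sketch)
dat.coded <- dat
dat.coded$Wind[1:3] <- -99
dat.coded[] <- lapply(dat.coded, function(x) replace(x, x %in% -99, NA))

# order = TRUE: arrange variables from left to right
# in increasing order of the number of missing values
n.miss <- colSums(is.na(dat))
dat.ordered <- dat[, order(n.miss)]
names(dat.ordered)

# n.pattern = 5: keep only patterns shared by at least 5 cases
key <- apply(1L - is.na(dat), 1, paste, collapse = "")
n <- table(key)
keep <- names(n)[as.vector(n) >= 5]
keep
```

In airquality, the pattern in which both Ozone and Solar.R are missing occurs in only two cases, so it is the one dropped by the threshold of 5 in this sketch.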

Value

Returns an object of class misty.object, which is a list with the following entries:

call

function call

type

type of analysis

data

list with data frames, i.e., data for the data frame with variables used in the current analysis, and plotdat for the data frame used for plotting the results

args

specification of function arguments

result

result table

plot

ggplot2 object for plotting the results

pattern

a numeric vector indicating the missing data pattern for each case
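
A per-case pattern vector of this kind can be used to select or drop cases. The base-R sketch below builds a comparable vector by hand; the numbering of patterns by order of first appearance is an assumption made for this sketch and need not match the numbering returned by na.pattern.

```r
dat <- airquality

# One label per case, one digit per variable (1 = observed, 0 = missing)
key <- apply(1L - is.na(dat), 1, paste, collapse = "")

# Number the patterns by order of first appearance
# (an arbitrary choice for this sketch)
pattern <- match(key, unique(key))

# Frequency of each pattern
table(pattern)

# Drop all cases sharing the pattern of row 5 (Ozone and Solar.R both missing)
dat.sub <- dat[pattern != pattern[5], ]
nrow(dat.sub)
```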

Note

The code for plotting missing data patterns is based on the plot_pattern function in the ggmice package by Hanne Oberman.

Author(s)

Takuya Yanagida takuya.yanagida@univie.ac.at

References

Enders, C. K. (2010). Applied missing data analysis. Guilford Press.

Graham, J. W. (2009). Missing data analysis: Making it work in the real world. Annual Review of Psychology, 60, 549-576. https://doi.org/10.1146/annurev.psych.58.110405.085530

Oberman, H. (2023). ggmice: Visualizations for 'mice' with 'ggplot2'. R package version 0.1.0. https://doi.org/10.32614/CRAN.package.ggmice

van Buuren, S. (2018). Flexible imputation of missing data (2nd ed.). Chapman & Hall.

See Also

write.result, as.na, na.as, na.auxiliary, na.coverage, na.descript, na.indicator, na.prop, na.test

Examples

## Not run: 
# Example 1a: Compute a summary of missing data patterns
dat.pattern <- na.pattern(airquality)

# Example 1b: Alternative specification using the 'data' argument
dat.pattern <- na.pattern(., data = airquality)

# Example 2a: Compute and plot a summary of missing data patterns
na.pattern(airquality, plot = TRUE)

# Example 2b: Plot missing data patterns with at least 3 cases
na.pattern(airquality, plot = TRUE, n.pattern = 3)

# Example 3: Vector of missing data pattern for each case
dat.pattern$pattern

# Data frame without cases with missing data pattern 2 and 4
airquality[!dat.pattern$pattern %in% c(2, 4), ]

# Example 4a: Write results into a text file
result <- na.pattern(airquality, write = "NA_Pattern.txt")

# Example 4b: Write results into an Excel file
result <- na.pattern(airquality, write = "NA_Pattern.xlsx")

# Alternatively, write an existing result object with write.result()
result <- na.pattern(airquality, output = FALSE)
write.result(result, "NA_Pattern.xlsx")

## End(Not run)

misty documentation built on Oct. 24, 2024, 5:10 p.m.
