R/all_patterns.R

#' @title Missing data spike-in in various missing data patterns
#'
#' @description
#' \code{\link{all_patterns}} spikes in missingness using MCAR, MAR and MNAR patterns (default) and, optionally, a MAP pattern.
#'
#' @details
#' This function takes the simulated matrix and spikes in missing data points in MCAR, MAR and MNAR patterns.
#' If the user supplies an assumed pattern, \code{\link{all_patterns}} also generates
#' a MAP missingness pattern. Before calling this function, users are advised to examine
#' the missing data fractions carefully and to exclude variables with high missingness using the \code{\link{clean}} function.
#' For more information on the functions that spike in missing data in MCAR, MAR,
#' MNAR and MAP patterns, please see the functions \code{\link{MCAR}}, \code{\link{MAR}},
#' \code{\link{MNAR}} and \code{\link{MAP}}.
#'
#'
#' @param X_hat Simulated matrix with no missingness (Simulated_matrix output from the \code{\link{simulate}} function)
#' @param MD_pattern Missing data pattern in the original dataset (MD_Pattern output from the \code{\link{get_data}} function)
#' @param NA_fraction Fraction of missingness in the original dataset (Fraction_missingness output from the \code{\link{get_data}} function)
#' @param min_PDM Minimum number of observations per missing data pattern. Patterns observed in fewer rows than this are excluded from the missing data generation. Set this argument carefully: with very complicated missing data patterns the function can fail or generate erroneous missing data patterns. The default is 10, but large datasets need a higher value to avoid errors. Choose a value based on the min_PDM_thresholds output from the \code{\link{get_data}} function.
#' @param assumed_pattern Vector of missingness types, one per variable (its length must equal the number of columns in X_hat, which is also the length of the missingness fraction per variable). If this input is specified, the function also spikes in missing data points in a MAP pattern.
#'
#' @name all_patterns
#'
#' @return
#' \item{MCAR_matrix}{Matrix with MCAR pre-defined missingness pattern (default output)}
#' \item{MAR_matrix}{Matrix with MAR pre-defined missingness pattern (default output)}
#' \item{MNAR_matrix}{Matrix with MNAR pre-defined missingness pattern (default output)}
#' \item{MAP_matrix}{Matrix with MAP pre-defined missingness pattern (optional output)}
#'
#' @examples
#' cleaned <- clean(clindata_miss, missingness_coding = -9)
#' metadata <- get_data(cleaned)
#' simulated <- simulate(rownum = metadata$Rows, colnum = metadata$Columns,
#' cormat = metadata$Corr_matrix)
#'
#' miss_list <- all_patterns(simulated$Simulated_matrix,
#'                     MD_pattern = metadata$MD_Pattern,
#'                     NA_fraction = metadata$Fraction_missingness,
#'                     min_PDM = 20)
#'
#' miss_list <- all_patterns(simulated$Simulated_matrix,
#'                     MD_pattern = metadata$MD_Pattern,
#'                     NA_fraction = metadata$Fraction_missingness,
#'                     min_PDM = 10,
#'                     assumed_pattern = c('MAR', 'MCAR', 'MCAR', 'MAR',
#'                                         'MNAR', 'MCAR', 'MAR', 'MCAR',
#'                                         'MCAR', 'MAR', 'MNAR'))
#'
#' @export


### FUNCTION
all_patterns <- function(X_hat, MD_pattern, NA_fraction, min_PDM = 10, assumed_pattern = NA) {

    # MAP output is only produced when the user supplies an assumed pattern
    use_MAP <- !is.na(assumed_pattern[1])

    # Validate before any spike-in work: one missingness type per column of X_hat
    if (use_MAP && length(assumed_pattern) != ncol(X_hat))
        stop(paste("The number of columns in X_hat (", ncol(X_hat), ") and argument assumed_pattern (",
            length(assumed_pattern), ") do not match. Please double-check the arguments of the function.",
            sep = ""))

    # Default outputs: MCAR, MAR and MNAR spike-ins
    mcar_res <- MCAR(X_hat, MD_pattern = MD_pattern, NA_fraction = NA_fraction, min_PDM = min_PDM)
    mar_res <- MAR(X_hat, MD_pattern = MD_pattern, NA_fraction = NA_fraction, min_PDM = min_PDM)
    mnar_res <- MNAR(X_hat, MD_pattern = MD_pattern, NA_fraction = NA_fraction, min_PDM = min_PDM)

    out <- list(MCAR_matrix = mcar_res$MCAR_matrix, MAR_matrix = mar_res$MAR_matrix,
        MNAR_matrix = mnar_res$MNAR_matrix)

    # Optional output: MAP spike-in following the assumed pattern
    if (use_MAP) {
        map_res <- MAP(X_hat, MD_pattern = MD_pattern, NA_fraction = NA_fraction, min_PDM = min_PDM,
            assumed_pattern = assumed_pattern)
        out$MAP_matrix <- map_res$MAP_matrix
    }

    out

}
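
For readers unfamiliar with the three default mechanisms, the following base-R sketch may help. It is a simplified illustration only and does not reproduce how the MCAR, MAR and MNAR functions in this package work (they also take the original missing data pattern into account). The toy matrix `X`, the fraction `frac`, and the rule of blanking out the rows with the largest values are all assumptions made for the example.

```r
# Toy illustration of MCAR, MAR and MNAR spike-in on a two-column matrix.
# Hypothetical setup; this is NOT the missCompare implementation.
set.seed(1)
n <- 1000
X <- cbind(a = rnorm(n), b = rnorm(n))
frac <- 0.2              # assumed target fraction of missing values in column "a"
k <- round(frac * n)     # number of cells to blank out

# MCAR: rows are chosen uniformly at random, independent of any data values
X_mcar <- X
X_mcar[sample(n, k), "a"] <- NA

# MAR: missingness in "a" depends on the fully observed column "b"
# (here, the k rows with the largest b lose their value of a)
X_mar <- X
X_mar[order(X[, "b"], decreasing = TRUE)[seq_len(k)], "a"] <- NA

# MNAR: missingness in "a" depends on the value of "a" itself
# (here, the k largest values of a are removed)
X_mnar <- X
X_mnar[order(X[, "a"], decreasing = TRUE)[seq_len(k)], "a"] <- NA

# All three matrices carry the same number of missing cells in "a"
na_counts <- sapply(list(MCAR = X_mcar, MAR = X_mar, MNAR = X_mnar),
                    function(m) sum(is.na(m[, "a"])))
print(na_counts)
```

The three matrices share the same missingness fraction and differ only in which rows are affected, which is the distinction the MCAR_matrix, MAR_matrix and MNAR_matrix outputs are meant to capture.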

missCompare documentation built on Dec. 1, 2020, 9:09 a.m.