get_data: Extraction of metadata from dataframes
In missCompare: Intuitive Missing Data Imputation Framework


get_data extracts descriptive metadata from the dataframe including information on missing data

get_data(X, matrixplot_sort = TRUE, plot_transform = TRUE)

`X`	Original dataframe with samples in rows and variables in columns. The object returned by the `clean` function can also be used
`matrixplot_sort`	Boolean with default TRUE. If TRUE, the matrix plot will be sorted by missing/non-missing status. If FALSE, the original order of rows will be retained
`plot_transform`	Boolean with default TRUE. If TRUE, the matrix plot will plot all variables scaled (mean = 0, SD = 1). If FALSE, the matrix plot will show the variables on their original scale
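
The scaling described for `plot_transform = TRUE` (mean = 0, SD = 1) is ordinary z-score standardization. A minimal sketch in base R, using a hypothetical toy dataframe, is shown here. Whether `get_data` calls `scale()` internally is an assumption; the sketch only illustrates what the transformed values look like.

```r
# Toy dataframe (hypothetical data, not shipped with missCompare)
df <- data.frame(
  age = c(34, NA, 51, 29, NA, 42),
  bmi = c(22.1, 27.5, NA, 24.3, 30.2, NA)
)

# scale() centers each column on its mean and divides by its SD,
# computing both from the observed values only; NAs stay NA
scaled <- as.data.frame(scale(df))

col_means <- colMeans(scaled, na.rm = TRUE)    # approximately 0
col_sds   <- sapply(scaled, sd, na.rm = TRUE)  # 1 up to floating point error
```

Standardizing puts variables measured in different units on a common scale, so one color range can be shared across all columns of the matrix plot.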

This function uses the original dataframe and extracts descriptive metadata including dimensions, missingness fractions overall and by variable, number of missing values overall and by variable, missing data patterns, missing data correlations and missing data visualizations
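
The basic counts and fractions described above can be reproduced by hand in base R. The sketch below uses a hypothetical toy dataframe and variable names chosen for illustration; it is not the package's internal code, only a way to see what each quantity measures.

```r
# Toy dataframe (hypothetical data, not shipped with missCompare)
df <- data.frame(
  age    = c(34, NA, 51, 29, NA, 42),
  bmi    = c(22.1, 27.5, NA, 24.3, 30.2, NA),
  smoker = c(0, 1, 0, NA, 1, 0)
)

rows    <- nrow(df)                          # number of samples
columns <- ncol(df)                          # number of variables

complete_cases  <- sum(complete.cases(df))   # rows with no NA in any column
total_na        <- sum(is.na(df))            # missing values overall
na_per_variable <- colSums(is.na(df))        # missing values per column

# Fractions between 0 (nothing missing) and 1 (everything missing)
fraction_missingness  <- total_na / (rows * columns)
fraction_per_variable <- colMeans(is.na(df))
```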

`Complete_cases`	Number of complete cases (samples with no missing data in any columns)
`Rows`	Total number of rows (samples) in the dataframe
`Columns`	Total number of columns (variables) in the dataframe
`Corr_matrix`	Correlation matrix of all variables. The correlation matrix contains Pearson correlation coefficients based on pairwise correlations between variable pairs
`Fraction_missingness`	Total fraction of missingness expressed as a number between 0 and 1, where 1 means 100% of data is missing and 0 means there are no missing values
`Fraction_missingness_per_variable`	Fraction of missingness per variable. A (named) numeric vector whose length equals the number of columns. Each value is a number between 0 and 1, where 1 means 100% of data is missing and 0 means there are no missing values
`Total_NA`	Total number of missing values in the dataframe
`NA_per_variable`	Number of missing values per variable in the dataframe. A (named) numeric vector whose length equals the number of columns
`MD_Pattern`	Missing data pattern calculated using `mice::md.pattern` (see `md.pattern` in the mice package)
`NA_Correlations`	Correlation matrix of variables vs. variables converted to boolean based on missingness status (yes/no). Point-biserial correlation coefficients for variable pairs are obtained using complete observations in the respective variable pairs. Higher correlation coefficients can indicate a MAR (missing at random) missingness pattern
`NA_Correlation_plot`	Plot based on NA_Correlations
`min_PDM_thresholds`	Small dataframe offering clues on how to set min_PDM thresholds in the next steps of the pipeline. The first column contains min_PDM thresholds, and the second column contains the percentages that would be retained by setting min_PDM to the respective values. These values are percentages of the rows that have at least one missing value (complete observations are excluded), so a value of e.g. 80% means that the most common missing data patterns, covering 80% of the rows with missing data, are represented in the simulation step
`Vars_above_half`	Character vector of variable names with missingness higher than 50%
`Matrix_plot`	Matrix plot where missing values are colored gray and available values are colored based on value range
`Cluster_plot`	Cluster plot of co-missingness. Variables with shared missingness patterns will branch closer to the bottom of the plot, while variables with no shared patterns will be joined by branches high in the plot
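
Two of the returned elements, `NA_Correlations` and `min_PDM_thresholds`, are easier to interpret with a small worked example. The base R sketch below rebuilds comparable quantities on a hypothetical toy dataframe. It rests on two assumptions not confirmed by this page: that the point-biserial coefficients are Pearson correlations between observed values and 0/1 missingness indicators, and that min_PDM is the minimum number of rows a missing data pattern must have to be kept. The object names are illustrative only.

```r
# Toy dataframe (hypothetical data, not shipped with missCompare)
df <- data.frame(
  age    = c(34, NA, 51, 29, NA, 42, 60, NA),
  bmi    = c(22.1, 27.5, NA, 24.3, 30.2, NA, 26.0, 23.4),
  smoker = c(0, 1, 0, NA, 1, 0, 1, 1)
)

# Missingness indicators: 1 = missing, 0 = observed
na_ind <- is.na(df) * 1

# Rows: observed values; columns: missingness indicators.
# Pearson correlation with a binary variable is the point-biserial coefficient.
# The diagonal is NA: where a variable is observed, its own indicator is constant.
na_cor <- suppressWarnings(cor(df, na_ind, use = "pairwise.complete.obs"))

# One string per row describing its missingness pattern, e.g. "100"
pattern    <- apply(na_ind, 1, paste, collapse = "")
incomplete <- pattern[rowSums(na_ind) > 0]   # drop complete observations
pattern_counts <- table(incomplete)

# Percentage of incomplete rows retained when only patterns with at least
# `t` rows are kept (assumed meaning of min_PDM)
thresholds <- 1:3
retained <- sapply(thresholds, function(t) {
  100 * sum(pattern_counts[pattern_counts >= t]) / length(incomplete)
})
pdm_table <- data.frame(min_PDM = thresholds, retained_percent = retained)
```

In this toy data the three patterns cover 3, 2 and 1 of the 6 incomplete rows, so raising the threshold from 1 to 3 drops the retained share from 100% to 50%.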


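library(missCompare)
# Clean the example data, treating -9 as the missing data code
cleaned <- clean(clindata_miss, missingness_coding = -9)
metadata <- get_data(cleaned)
# Same call, but keep the original row order in the matrix plot
metadata <- get_data(cleaned, matrixplot_sort = FALSE)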

missCompare documentation built on Dec. 1, 2020, 9:09 a.m.
