This function summarises the NAs in a given matrix or data.frame, either feature-wise (by column) or sample-wise (by row). It can also draw a barplot and/or a histogram of these statistics.
d | A data.frame or matrix whose NAs should be summarised. (Mandatory)
hist | logical. Should the function plot a histogram? Default is FALSE. (Optional)
summary | logical. Should the function return the result data.frame? Default is TRUE. (Optional)
byrow | logical. Should the function work row-wise? Default is FALSE. (Optional)
barplot | logical. Should the function plot a barplot? Default is TRUE. (Optional)
This function provides a quick and easy way to see how many missing values (NA) exist in a data.frame or matrix. It is designed to make data exploration easier, since missing values are among the most problematic issues in later stages of analysis.
The function provides a data.frame (if the summary argument is set to TRUE) containing the column or row index, name, number_of_NAs and ratio_of_NA. If the function does not find any NA, it returns NULL, so the result can be checked with is.null().
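The shape of that result can be illustrated with a minimal base R sketch. This is not the package's own code: the helper name `na_summary_sketch` is hypothetical, and it assumes that only columns or rows with at least one NA are listed and that the ratio is expressed as a fraction of the items in that column or row.

```r
# Hypothetical helper (not the package's implementation): summarise NAs
# per column, or per row when byrow = TRUE.
na_summary_sketch <- function(d, byrow = FALSE) {
  na_mask <- is.na(d)  # logical matrix with the same shape as d
  counts <- if (byrow) rowSums(na_mask) else colSums(na_mask)
  total <- if (byrow) ncol(d) else nrow(d)

  # Mirror the documented behaviour: NULL when no NA is found.
  if (all(counts == 0)) return(NULL)

  labels <- if (byrow) rownames(d) else colnames(d)
  if (is.null(labels)) labels <- rep(NA_character_, length(counts))

  # Assumption: keep only the columns/rows that contain at least one NA.
  keep <- unname(which(counts > 0))
  data.frame(
    index = keep,
    name = labels[keep],
    number_of_NAs = unname(counts[keep]),
    ratio_of_NA = unname(counts[keep]) / total,  # assumption: a fraction, not a percentage
    stringsAsFactors = FALSE
  )
}

toy <- data.frame(a = c(1, NA, 3, NA),
                  b = c("x", "y", "z", "w"),
                  c = c(NA, 2, 3, 4))
res <- na_summary_sketch(toy)
print(res)

# A data set without NAs yields NULL, which is.null() can detect.
is.null(na_summary_sketch(data.frame(x = 1:3)))
```

Returning NULL rather than an empty data.frame lets calling code branch on `is.null()` before doing any further work with the summary.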
The barplot generated by this function shows the names of the columns or rows that contain NAs, together with their NA ratio relative to the total number of items in that row or column. The bars are also colored by their NA ratio:
* Gray for 10 or less
* Yellow for >10
* Orange for >30
* Red for >50
The plot also has horizontal lines at 10
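The color thresholds above can be sketched with base R's `cut()`. The function name `na_bar_colors` is hypothetical, and the sketch assumes the ratio is given on a 0 to 100 scale; the package may compute its colors differently.

```r
# Hypothetical helper: map NA ratios (assumed to be on a 0-100 scale)
# to the bar colors described above.
na_bar_colors <- function(ratio_percent) {
  as.character(cut(
    ratio_percent,
    breaks = c(-Inf, 10, 30, 50, Inf),  # (-Inf,10], (10,30], (30,50], (50,Inf)
    labels = c("gray", "yellow", "orange", "red"),
    right = TRUE                        # right-closed, so exactly 10 stays gray
  ))
}

bar_cols <- na_bar_colors(c(5, 10, 25, 30, 31, 50, 75))
print(bar_cols)
```

The resulting character vector can be passed to the `col` argument of `barplot()`.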
The histogram generated by this function is meant to give an overview of how NAs are distributed in the input data. It includes all columns or rows, regardless of whether they contain NA values. This plot is most useful for a small number of rows or columns.
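A comparable overview can be sketched in base R. The sketch assumes the histogram is built from the NA count per column; the package may bin ratios or use different breaks.

```r
# Toy data: column a has two NAs, b has none, c has one.
toy <- data.frame(a = c(1, NA, 3, NA),
                  b = c(1, 2, 3, 4),
                  c = c(NA, 2, 3, 4))

# Count NAs in every column, including columns without any NA.
na_per_column <- colSums(is.na(toy))

# One bin per possible NA count (0 up to the number of rows).
# plot = FALSE only computes the bins; plot(h) would draw them.
h <- hist(na_per_column,
          breaks = seq(-0.5, nrow(toy) + 0.5, by = 1),
          plot = FALSE)
print(h$counts)
```

Calling `plot(h, main = "NAs per column", xlab = "Number of NAs")` draws the histogram from the precomputed bins.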
Mehrad Mahmoudian
pin.na
is.na