deleteBogusRows | R Documentation |
Deletes rows (cases) that are mostly missing. When data is imported from other sources, noise rows often appear at the bottom of the input. Any row that is missing on more than, say, 90% of its values is probably useless information. We invented this to deal with the problem that MS Excel users often include a marginal note at the bottom of a spreadsheet.
deleteBogusRows(dframe, pm = 0.9, drop = FALSE, verbose = TRUE, n = 25)
dframe |
A data frame or matrix |
pm |
Default 0.9: the "proportion of missing data" to be tolerated in a row. |
drop |
Default FALSE: if the data frame result is reduced to one row, should R's default drop behavior "demote" it to a column vector? |
verbose |
Default TRUE. Should a report be printed summarizing the information to be deleted? |
n |
Default 25: limit on number of values to print in verbose diagnostic output. If set to NULL or NA, then all of the column values will be printed for the bogus rows. |
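The role of pm and drop can be illustrated with a minimal base R sketch. This is not the package's source code: the helper name dropMostlyMissingRows is hypothetical, and the sketch assumes that a row is removed only when its proportion of missing values is strictly greater than pm.

```r
# Hypothetical helper for illustration only (not the deleteBogusRows source).
dropMostlyMissingRows <- function(x, pm = 0.9, drop = FALSE) {
    # is.na() on a matrix or data frame returns a logical matrix, so
    # rowMeans() gives the proportion of missing cells in each row
    propMissing <- rowMeans(is.na(x))
    # Assumption: rows at exactly pm are tolerated, only rows above it go
    bogus <- propMissing > pm
    # drop = FALSE keeps the two-dimensional shape even if one row remains
    x[!bogus, , drop = drop]
}

m <- matrix(1:20, nrow = 4)   # 4 rows, 5 columns
m[4, 2:5] <- NA               # last row: 4 of 5 cells missing (0.8)
kept90 <- dropMostlyMissingRows(m, pm = 0.9)  # 0.8 is tolerated
kept50 <- dropMostlyMissingRows(m, pm = 0.5)  # 0.8 exceeds 0.5
```

Under these assumptions kept90 retains all four rows, while kept50 loses the last one.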
A data frame with the bogus rows removed, returned invisibly.
Paul Johnson <pauljohn@ku.edu>
# A 10 x 100 matrix of random normal values
mymat <- matrix(rnorm(10*100), nrow = 10, ncol = 100,
                dimnames = list(1:10, paste0("x", 1:100)))
# Append a "bogus" row: one value followed by 99 NAs
mymat <- rbind(mymat, c(32, rep(NA, 99)))
mymat2 <- deleteBogusRows(mymat)
# Same data as a data frame, with a factor column added
mydf <- as.data.frame(mymat)
mydf$someFactor <- factor(sample(c("A", "B"), size = NROW(mydf), replace = TRUE))
mydf2 <- deleteBogusRows(mydf, n = "all")
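Before calling the function, it may help to inspect the per-row missingness directly. The following base R check rebuilds the example matrix and shows why the appended row falls above the default pm of 0.9. The seed value is arbitrary, and the variable name propMissing is introduced here only for illustration.

```r
set.seed(1234)  # arbitrary seed, only so the random values are reproducible
mymat <- matrix(rnorm(10*100), nrow = 10, ncol = 100,
                dimnames = list(1:10, paste0("x", 1:100)))
mymat <- rbind(mymat, c(32, rep(NA, 99)))

# Proportion of missing values in each row
propMissing <- rowMeans(is.na(mymat))
# Row 11 has 99 of 100 values missing; the first ten rows have none
propMissing[11]
# Positions of rows whose missingness exceeds 0.9
bogusRows <- unname(which(propMissing > 0.9))
```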