Calls is.outlier on each of the data columns (cols) of the provided data.frame (x) and returns the data frame with NA in place of the outliers.
outlierRemoveDataset(x, mcut = 6.2, by = NA, cols)
x: A data.frame with sample data, metadata, etc.
mcut: Number of MADs (median absolute deviations) a data point needs to be from the median to be considered an outlier. Default is 6.2.
by: Column name to group data by for outlier removal (e.g. by line, by run, etc.). If not provided, outliers are identified across the whole dataset.
cols: Vector of column numbers or names in x to remove outliers from.
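The mcut rule can be sketched as below. This is only an illustration, not the package's is.outlier: the helper name mad_to_na is hypothetical, and using the unscaled MAD (constant = 1) is an assumption, since this page does not state which scaling is used.

```r
# Hypothetical helper (not part of the package): set values lying more than
# `mcut` MADs from the median to NA and return the vector.
mad_to_na <- function(v, mcut = 6.2) {
  med <- median(v, na.rm = TRUE)
  # constant = 1 gives the raw, unscaled MAD; this choice is an assumption
  spread <- mad(v, center = med, constant = 1, na.rm = TRUE)
  v[!is.na(v) & abs(v - med) > mcut * spread] <- NA
  v
}

v <- c(1, 2, 3, 4, 5, 100)
cleaned <- mad_to_na(v)
print(cleaned)
```

Here the median is 3.5 and the unscaled MAD is 1.5, so the cutoff is 6.2 * 1.5 = 9.3 and only the value 100 is replaced by NA.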
Returns data frame in the same format as input, but with outliers in each of the specified columns changed to NA.
Greg Ziegler
set.seed(1)
x <- rnorm(100)
y <- rnorm(100)
z <- rnorm(100)
x <- c(-10, x, 10)
y <- c(-20, y, 20)
z <- c(-30, z, 30)
df <- data.frame(id=sample(LETTERS[1:5],length(x),replace=TRUE),x,y,z)
#By entire dataset
dfOR <- outlierRemoveDataset(df,6.2,by=NA,c("x","y","z"))
summary(dfOR)
#Look for outliers within groups
dfOR <- outlierRemoveDataset(df,6.2,by="id",c("x","y","z"))
summary(dfOR)
## The function is currently defined as
function (x, mcut = 6.2, by = NA, cols)
{
    for (i in cols) {
        if (is.na(by)) {
            x[, i] <- is.outlier(x[, i], mcut)
        }
        else {
            for (j in unique(x[, by])) {
                if (is.na(j)) {
                    x[is.na(x[, by]), i] <- is.outlier(x[is.na(x[, by]), i], mcut)
                }
                else {
                    x[x[, by] == j & !(is.na(x[, by])), i] <- is.outlier(x[x[, by] == j & !(is.na(x[, by])), i], mcut)
                }
            }
        }
    }
    return(x)
}
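The effect of the by argument can be illustrated with a small self-contained sketch. It does not call the package: flag_na is a hypothetical stand-in for is.outlier (assuming an unscaled MAD), and base R's ave is used in place of the explicit loop over groups. It also assumes the grouping column has no missing values, whereas the definition above treats NA as a group of its own.

```r
# Hypothetical stand-in for is.outlier (assumption: unscaled MAD, constant = 1)
flag_na <- function(v, mcut = 6.2) {
  med <- median(v, na.rm = TRUE)
  spread <- mad(v, center = med, constant = 1, na.rm = TRUE)
  v[!is.na(v) & abs(v - med) > mcut * spread] <- NA
  v
}

toy <- data.frame(
  id = rep(c("A", "B"), each = 5),
  v  = c(1, 2, 3, 4, 50, 101, 102, 103, 104, 105)
)

# Whole dataset: 50 sits between the two clusters and is not flagged
whole <- flag_na(toy$v)

# Within groups: 50 is far from the rest of group A and becomes NA
grouped <- ave(toy$v, toy$id, FUN = flag_na)

print(whole)
print(grouped)
```

With these toy values, the whole-dataset pass leaves every value in place, while the grouped pass replaces only the fifth value with NA. Grouping matters when the groups differ in location or spread.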