filter_outliers: Filter lowly abundant features

View source: R/filter_outliers.R

filter_outliers    R Documentation

Description

Filters out lowly abundant features. By default, all numeric columns are used. Missing values are always treated as outliers.

Usage

filter_outliers(data, target = NULL, percent = 1, k = 1.5, lower_limit = NULL)

Arguments

data

Data to filter features from.

target

Columns to base the filtering on; supports tidyselect syntax.

percent

A feature gets filtered out if it is lowly abundant or missing in percent columns.

k

Parameter for the lower limit of Tukey's fence; any value below that limit is considered an outlier.

lower_limit

A user-defined lower limit below which a measurement is considered an outlier.
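
To make the role of k concrete, here is a minimal base-R sketch of the textbook lower Tukey fence, Q1 - k * IQR. The helper name tukey_lower_fence and the toy vector x are hypothetical and not part of PaiR; the package may compute its fence differently (for example, pooled across columns or on transformed data), so treat this as an illustration only.

```r
# Hypothetical helper (not part of PaiR): textbook lower Tukey fence,
# Q1 - k * IQR, computed with R's default quantile type.
tukey_lower_fence <- function(x, k = 1.5) {
  q <- stats::quantile(x, probs = c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  q[1] - k * (q[2] - q[1])
}

# Toy measurements with one low value and one missing value.
x <- c(1, 10, 11, 12, 13, 14, 15, NA)
fence <- tukey_lower_fence(x)  # Q1 = 10.5, Q3 = 13.5, so the fence is 6

# Missing values count as outliers, in line with the Description above.
is_outlier <- is.na(x) | x < fence
```

A larger k pushes the fence lower and flags fewer values. Supplying lower_limit instead bypasses the fence altogether.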

Value

data with outliers removed

Examples

# Since Tukey's fences are not ideal for raw proteomics data, one could use,
# e.g., the tenth percentile as an indicator of low abundance
filter_outliers(yeast, lower_limit = stats::quantile(yeast[-1], .1, na.rm = TRUE))

# We recommend normalizing the data before filtering outliers with Tukey's fences.
# This ensures that no peptides are considered outliers merely because a set of
# samples has, on average, lower quantification, and that the lower fence is
# not smaller than the smallest value in the dataset
yeast <- psrn(yeast, "identifier")
filter_outliers(yeast, -1, 1, 1.5)

PhilipBerg/PaiR documentation built on March 18, 2022, noon