View source: R/outlier-filtering.R
outlier_filter | R Documentation |
Filter out outliers in metadata by using appropriate outlier tests.
outlier_filter(
  metadata,
  pcr_id_col = pcr_id_column(),
  outlier_test = c(outliers_by_pool_fragments),
  outlier_test_outputs = NULL,
  combination_logic = c("AND"),
  negate = FALSE,
  report_path = default_report_path(),
  ...
)
metadata |
The metadata data frame |
pcr_id_col |
The name of the pcr identifier column |
outlier_test |
One or more outlier tests. Must be functions,
either from |
outlier_test_outputs |
|
combination_logic |
One or more logical operators ("AND", "OR", "XOR", "NAND", "NOR", "XNOR"). See details. |
negate |
If |
report_path |
The path where the report file should be saved.
Can be a folder or |
... |
Additional named arguments passed to |
The outlier filtering functions are structured in a modular fashion. There are two kinds of functions:
Outlier tests - Functions that perform some kind of calculation based on their inputs and flag rows of the metadata
Outlier filter - A function that takes one or more outlier tests, combines all the flags with a given logic and filters out rows that are flagged as outliers
This function acts as the filter. It can either take one or more outlier tests as functions and call them through the argument outlier_test, or it can directly take the outputs produced by individual tests in the argument outlier_test_outputs; if both are provided, the second one has priority. The second method offers a bit more freedom, since single tests can be run independently and intermediate results can be saved and examined in more detail. If more than one test is to be performed, the argument combination_logic tells the function how to combine the flags: you can specify one logical operator or more than one, provided the number of operators is compatible with the number of tests.
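To make the idea of combining flags concrete, here is a minimal base R sketch. It is not the package's implementation: combine_flags() is a hypothetical helper, the three flag vectors are made up, and the sketch assumes one operator per adjacent pair of tests, applied left to right.

```r
# Hypothetical helper, not part of ISAnalytics: combines two logical
# flag vectors with one of the operators accepted by combination_logic.
combine_flags <- function(a, b, op = "AND") {
  switch(op,
    AND  = a & b,
    OR   = a | b,
    XOR  = xor(a, b),
    NAND = !(a & b),
    NOR  = !(a | b),
    XNOR = !xor(a, b),
    stop("Unknown operator: ", op)
  )
}

# Made-up flags produced by three imaginary tests on four samples
flags <- list(
  test1 = c(TRUE, TRUE, FALSE, FALSE),
  test2 = c(TRUE, FALSE, TRUE, FALSE),
  test3 = c(FALSE, FALSE, TRUE, TRUE)
)

# Assumption: one operator per adjacent pair of tests, left to right,
# i.e. (test1 AND test2) OR test3
logic <- c("AND", "OR")

combined <- flags[[1]]
for (i in seq_along(logic)) {
  combined <- combine_flags(combined, flags[[i + 1]], logic[i])
}
combined
```

In this sketch a sample ends up flagged when the first two tests agree on flagging it, or when the third test flags it on its own.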
You have the freedom to provide your own functions as outlier tests. For this purpose, the functions provided must follow these guidelines:
Must take as input the whole metadata df
Must return a df containing AT LEAST the pcr_id_col and a logical column "to_remove" that contains the flag
The pcr_id_col must contain all the values originally present in the metadata df
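As an illustration of these guidelines, the following base R sketch defines a custom test. Everything in it is an assumption made for the example: the function name outliers_by_sd, the column names CompleteAmplificationID and RAW_READS, the toy data and the standard-deviation rule are not taken from the package.

```r
# Hypothetical custom test: flags samples whose value in a numeric column
# lies more than `n_sd` standard deviations away from the mean.
# Function name, column names and defaults are illustrative assumptions.
outliers_by_sd <- function(metadata,
                           pcr_id_col = "CompleteAmplificationID",
                           value_col = "RAW_READS",
                           n_sd = 2) {
  values <- metadata[[value_col]]
  z <- abs(values - mean(values, na.rm = TRUE)) /
    stats::sd(values, na.rm = TRUE)
  # Keep every original pcr id so no sample is lost in the output
  flagged <- metadata[, pcr_id_col, drop = FALSE]
  # Missing values cannot be evaluated, so they are not flagged
  flagged$to_remove <- !is.na(z) & z > n_sd
  flagged
}

# Toy metadata: the last sample has a much higher read count
toy_meta <- data.frame(
  CompleteAmplificationID = paste0("ID", 1:8),
  RAW_READS = c(100, 110, 95, 105, 98, 102, 99, 1000)
)
res <- outliers_by_sd(toy_meta)
res
```

The returned data frame has one row per original pcr id plus the logical "to_remove" column, which is the minimum the guidelines ask for. Whether and how the filter forwards extra arguments such as value_col to a custom test is not shown here.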
A data frame of metadata with the same number of rows as the input or fewer
Other Data cleaning and pre-processing: aggregate_metadata(), aggregate_values_by_key(), compute_near_integrations(), default_meta_agg(), outliers_by_pool_fragments(), purity_filter(), realign_after_collisions(), remove_collisions(), threshold_filter()
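library(ISAnalytics)
# Load the example association file shipped with the package
data("association_file", package = "ISAnalytics")
# Filter outliers with the default test; `key` is an additional named argument
filtered_af <- outlier_filter(association_file,
  key = "BARCODE_MUX",
  report_path = NULL
)
head(filtered_af)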