summariseOverrep | R Documentation |
Summarise the Overrepresented sequences found in one or more QC files
summariseOverrep(x, ...)
## S4 method for signature 'FastpData'
summariseOverrep(x, step = c("Before", "After"), min_count = 0, ...)
## S4 method for signature 'FastpDataList'
summariseOverrep(
  x,
  min_count = 0,
  step = c("Before", "After"),
  vals = c("count", "rate"),
  fn = c("mean", "sum", "max"),
  by = c("reads", "sequence"),
  ...
)
## S4 method for signature 'FastqcDataList'
summariseOverrep(
  x,
  min_count = 0,
  vals = c("Count", "Percentage"),
  fn = c("mean", "sum", "max"),
  pattern = ".*",
  ...
)
## S4 method for signature 'FastqcData'
summariseOverrep(
  x,
  min_count = 0,
  vals = c("Count", "Percentage"),
  fn = c("mean", "sum", "max"),
  pattern = ".*",
  by = "Filename",
  ...
)
x | An object of a suitable class |
... | Not used |
step | Can be 'Before', 'After' or both, to obtain data from the Before_filtering or After_filtering modules |
min_count | Filter out sequences with counts less than this value, both before and after filtering |
vals | Values to use for creating summaries across multiple files. For FastpDataList objects these can be "count" and/or "rate", whilst for FastqcDataList objects these can be "Count" and/or "Percentage" |
fn | Functions to use when summarising values across multiple files |
by | Character vector of columns to summarise by. See dplyr::summarise |
pattern | Regular expression used to filter the Possible_Source column |
This function prepares a summary of all over-represented sequences as reported by either fastp or FastQC.
A tibble
Tibble columns will vary between Fastp*, FastqcDataList and FastqcData objects. Calling this function on list-type objects will attempt to summarise the presence of each over-represented sequence across all files.
In particular, FastqcData objects will provide the requested summary statistics across all sequences within a file.
## For operations on a FastpData object
f <- system.file("extdata/fastp.json.gz", package = "ngsReports")
fp <- FastpData(f)
summariseOverrep(fp, min_count = 100)
## Applying the function to a FastqcDataList
packageDir <- system.file("extdata", package = "ngsReports")
fl <- list.files(packageDir, pattern = "fastqc.zip", full.names = TRUE)
fdl <- FastqcDataList(fl)
summariseOverrep(fdl)
# An alternative viewpoint can be obtained using
fdl |> lapply(summariseOverrep) |> dplyr::bind_rows()
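The examples above rely on the default arguments. As a minimal sketch of how the remaining arguments from the usage section might be combined, the block below uses the same example files shipped with ngsReports. The specific argument values are illustrative assumptions: in particular, "No Hit" is assumed to be one of the labels present in the Possible_Source column of these files, and the result may be empty if it is not.

```r
library(ngsReports)

## Load the example FastQC reports shipped with the package
packageDir <- system.file("extdata", package = "ngsReports")
fl <- list.files(packageDir, pattern = "fastqc.zip", full.names = TRUE)
fdl <- FastqcDataList(fl)

## Summarise only the Percentage values, taking the maximum across files
ovr_max <- summariseOverrep(fdl, vals = "Percentage", fn = "max")

## Restrict the summary to sequences whose Possible_Source matches a
## regular expression ("No Hit" is an assumed label, see the note above)
ovr_nohit <- summariseOverrep(fdl, pattern = "No Hit")

## For fastp output, request only the After_filtering module
f <- system.file("extdata/fastp.json.gz", package = "ngsReports")
fp <- FastpData(f)
ovr_after <- summariseOverrep(fp, step = "After", min_count = 100)
```

Choosing a single value for vals and fn should keep the resulting tibble narrow, which can be convenient when the summary is passed on to a table or plot.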