importNgsLogs: Import Various NGS-related log files

View source: R/importNgsLogs.R

importNgsLogs    R Documentation

Import Various NGS-related log files

Description

[Maturing] Imports NGS-related log files such as those generated from stderr.

Usage

importNgsLogs(x, type = "auto", which, stripPaths = TRUE)

Arguments

x

character. Vector of filenames. All log files must be of the same type. Duplicate file paths will be silently ignored.
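Because duplicate paths are dropped without a message, it can help to check for them before importing. The following is a minimal base R sketch; the file names are hypothetical and used only for illustration.

```r
# Hypothetical log paths; the first and third are identical
logs <- c("run1/bowtie.log", "run2/bowtie.log", "run1/bowtie.log")

# Report any repeated paths before they are silently ignored
dups <- unique(logs[duplicated(logs)])
if (length(dups) > 0) {
  message("Duplicate paths: ", paste(dups, collapse = ", "))
}

# Keep each path once, preserving the original order
logs <- unique(logs)
```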

type

character. The type of file being imported. Can be one of bowtie, bowtie2, hisat2, star, flagstat, featureCounts, duplicationMetrics, cutadapt, umitoolsDedup, macs2Callpeak, adapterRemoval, rnaseqcMetrics, quast, salmonLibFormatCounts, salmonMetaInfo or busco. Defaults to type = "auto" which will automatically detect the file type for all implemented types.

which

Which element of the parsed object to return. Only used when type is duplicationMetrics, cutadapt or adapterRemoval, and ignored for all other file types. See Details for possible values. To return all elements, set this value to 'all'.

stripPaths

logical(1). Remove paths from the Filename column.
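As a rough illustration of what stripping paths implies, the sketch below assumes the behaviour is comparable to base R's basename(); the package may implement it differently. The paths are hypothetical. Under that assumption, logs with the same file name in different directories would no longer be distinguishable, in which case stripPaths = FALSE may be the safer choice.

```r
# Hypothetical paths: same file name in two different run directories
paths <- c("run1/bowtie.log", "run2/bowtie.log")

# Assumption: stripping paths leaves only the file name, as basename() does
stripped <- basename(paths)

# TRUE when stripped names collide, suggesting stripPaths = FALSE
collision <- anyDuplicated(stripped) > 0
```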

Details

Imports one or more log files as output by tools such as: bowtie, bowtie2, featureCounts, Hisat2, STAR, salmon, picard MarkDuplicates, cutadapt, flagstat, macs2Callpeak, Adapter Removal, trimmomatic, rnaseqcMetrics, quast or busco. Setting type = "auto" detects the log type by parsing the file.

The featureCounts log file corresponds to the counts.out.summary, not the main counts.out file.

Whilst most log files return a single tibble, some are more complex with multiple modules.

adapterRemoval can return one of four modules (which = 1:4). When calling by name, the possible values are sequences, settings, statistics or distribution. Partial matching is implemented.

cutadapt can return one of five modules (which = 1:5). When calling by name, the possible modules are summary, adapter1, adapter2, adapter3 or overview. Note that adapter2 and adapter3 may be missing from these files, depending on the nature of your data. If cutadapt log files were generated using report=minimal, all supplied log files must be of this format and no modules can be returned.

duplicationMetrics will return either the metrics or the histogram. These can be requested by setting which to 1 or 2, or by naming either module.
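The module selection described above can be sketched in base R. resolveWhich() is a hypothetical helper written for illustration only and is not part of ngsReports; it assumes partial matching behaves like pmatch(), which may differ from the package internals.

```r
# Module names as listed in the Details section
modules <- list(
  adapterRemoval = c("sequences", "settings", "statistics", "distribution"),
  cutadapt = c("summary", "adapter1", "adapter2", "adapter3", "overview"),
  duplicationMetrics = c("metrics", "histogram")
)

# Hypothetical helper: turn a `which` value into module name(s)
resolveWhich <- function(which, type) {
  opts <- modules[[type]]
  if (is.numeric(which)) return(opts[which])       # select by position
  if (identical(which, "all")) return(opts)        # every module
  opts[pmatch(which, opts)]                        # unique partial match, else NA
}

resolveWhich("stat", "adapterRemoval")   # "statistics"
resolveWhich(2, "duplicationMetrics")    # "histogram"
resolveWhich("adapter", "cutadapt")      # NA, since three modules match
```

In this sketch an ambiguous prefix gives NA rather than a guess, so a name such as "adapter" needs to be extended to "adapter1" before it identifies a single module.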

Value

A tibble. Column names are broadly similar to the text in supplied files, but have been modified for easier handling under R naming conventions.

Examples

f <- c("bowtiePE.txt", "bowtieSE.txt")
bowtieLogs <- system.file("extdata", f, package = "ngsReports")
df <- importNgsLogs(bowtieLogs, type = "bowtie")
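A variation on the example above, relying on the default type = "auto" and keeping full paths in the Filename column. This is a sketch that assumes ngsReports is installed; the guard simply skips the import otherwise.

```r
df <- NULL
if (requireNamespace("ngsReports", quietly = TRUE)) {
  # Same packaged bowtie logs as in the example above
  f <- c("bowtiePE.txt", "bowtieSE.txt")
  bowtieLogs <- system.file("extdata", f, package = "ngsReports")

  # Let the function detect the log type, and keep the directory names
  df <- ngsReports::importNgsLogs(bowtieLogs, type = "auto", stripPaths = FALSE)
  print(df$Filename)
}
```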


steveped/ngsReports documentation built on April 2, 2024, 5:10 p.m.