loadLogs: loadLogs


Description

Load mapping statistics from log files

Usage

loadLogs(source, multiplex, summary, pipeline)

Arguments

source

Whether to load the data from log files in the current directory ('logs') or from the summary file of Moirai ('moirai').

multiplex

Optional. Path to a ‘multiplex’ file.

summary

Optional. Path to a ‘summary’ file.

pipeline

Optional. Version string identifying the pipeline used to process the data.

Details

Loads mapping counts and other statistics produced during processing.

With source='logs', loadLogs will load data from every file ending in ‘.log’ in the working directory. These files are expected to contain tab-separated triples: first the name of the mapping statistic (such as extracted, mapped or rdna), then the sample identifier, and then the number of reads. loadLogs will crash or produce incorrect output if the files do not contain triples, if the sample identifiers do not match across the files, or if the first word of a triple appears in multiple files.
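The triple format described above can be illustrated with a minimal sketch in base R. It is not the code used by loadLogs; the log content, the sample identifiers (sampleA, sampleB) and the variable names are hypothetical and only show how such triples map to one row per sample.

```r
# Hypothetical log content: statistic name, sample identifier, read count,
# separated by tabs (one triple per line).
log_text <- paste(
  "extracted\tsampleA\t1000",
  "extracted\tsampleB\t2000",
  "mapped\tsampleA\t800",
  "mapped\tsampleB\t1500",
  sep = "\n")

# Parse the triples into a long-format data frame.
triples <- read.table(text = log_text, sep = "\t",
                      col.names = c("stat", "samplename", "reads"),
                      stringsAsFactors = FALSE)

# Reshape to one row per sample and one column per statistic.
wide <- reshape(triples, idvar = "samplename", timevar = "stat",
                direction = "wide")
names(wide) <- sub("^reads\\.", "", names(wide))
rownames(wide) <- NULL
wide
```

A line with fewer or more than three fields would make read.table fail or misalign columns, which is consistent with the warning above about files that do not contain triples.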

With source='moirai', loadLogs will load data from a summary file and a multiplex file. When their paths are not given by the multiplex and summary arguments, the files are searched for at fixed locations in the PROCESSED_DATA directory using the LIBRARY variable.

With source='moirai', loadLogs will recognise the ‘nano-fluidigm’ or the ‘nanoCAGE2’ Moirai users, or fail. For the ‘nano-fluidigm’ user, the samples are sorted by number and associated with sorted well names, from A01, A02, ..., to H11 and H12.
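The association between numerically sorted samples and sorted well names can be sketched as follows. This is an illustration under assumptions, not the package's implementation: the sample identifiers are hypothetical, and a 96-well layout (rows A to H, columns 01 to 12) is inferred from the A01 to H12 range mentioned above.

```r
# Well names in sorted order: A01, A02, ..., A12, B01, ..., H12.
wells <- sprintf("%s%02d", rep(LETTERS[1:8], each = 12), rep(1:12, times = 8))

# Hypothetical numeric sample identifiers stored as text. They must be
# sorted as numbers: a plain character sort would place "10" before "2".
samples <- c("10", "2", "1")
sorted  <- samples[order(as.numeric(samples))]

# Pair each sorted sample with the well at the same rank.
data.frame(samplename = sorted, well = wells[seq_along(sorted)])
```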

Value

Returns a data frame with one row per sample, and the following columns (if the corresponding data is available).

  1. samplename Sample identifier (factor)

  2. extracted Number of extracted reads

  3. tagdust Number of reads containing oligonucleotide artefacts

  4. spikes Number of reads overlapping with the reference spike sequences

  5. rdna Number of reads overlapping with the reference ribosomal DNA sequences

  6. mapped Number of reads aligned to the reference genome

See Also

hierarchAnnot, mapStats

Examples

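library(magrittr)  # provides the %>% pipe used below

# Assign the result so that it can be used in the next step.
libs <- loadLogs( "moirai"
        , summary = system.file("extdata/summary.txt", package="smallCAGEqc")
        , multiplex = system.file("extdata/samplename_to_sampleid.txt", package="smallCAGEqc")
        , pipeline="OP-WORKFLOW-CAGEscan-short-reads-v2.0")

# Group samples by the first character that follows the "Run?_" prefix.
libs$group <- libs$samplename %>% sub("Run._", "", .) %>% substr(1,1) %>% factor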

charles-plessy/smallCAGEqc documentation built on May 13, 2019, 3:31 p.m.