filterBadSeqs: Performs quality checks, then filters reads for quality
In nixstix/RNASeqAnalysis: Analysis of RNA-Seq data: QC filtering, expression analysis, differential expression pipeline


The function trims poor-quality bases and unknown bases from the ends of the sequences. Any reads which are too short, or contain any unknown bases (N), are removed from the file.

filterBadSeqs(dataFile, minlength = 30, Phred = 25, blockSize = 1e+08, readerBlockSize = 1e+05, mc.cores = 1)

`dataFile`	An R data frame with the data to be processed. The R object has a standard format, and must contain the following columns: File, PE, Sample, Replicate, FilteredFile. More information about the format is available at `datafileTemplate`.
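`Phred`	An integer which specifies the Phred quality score threshold (scores are stored as ASCII characters in the fastq file). Bases with a quality score lower than this threshold are treated as poor quality, and the read is trimmed from the point where at least 2 such bases fall within a window of 5. Default score is 25.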
`blockSize`	An integer which specifies the number of reads to be read at a time when processing. Default is 1e8.
`mc.cores`	The number of cores to use when parallelising. Default is 1 (i.e. no parallelisation).
`minlength`	An integer which specifies the minimum length for a read. Reads shorter than this length will be discarded. Default is 30 nucleotides.
`readerBlockSize`	An integer which specifies the number of bytes (characters) to be read at one time. A smaller `readerBlockSize` reduces memory requirements, but is less efficient. Default is 1e5.
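
As an illustration, a `dataFile` with the required columns could be built as shown below. This is a minimal sketch: the file names, sample labels, and the encoding of the PE column (a logical flag here) are assumptions made for the example and are not taken from the package; `datafileTemplate` documents the real format.

```r
# Hypothetical input table; every value here is made up for illustration.
dataFile <- data.frame(
  File         = c("sampleA_lane1.fastq", "sampleA_lane2.fastq", "sampleB_lane1.fastq"),
  PE           = c(FALSE, FALSE, FALSE),  # assumed encoding: paired-end flag
  Sample       = c("A", "A", "B"),
  Replicate    = c(1, 1, 1),
  FilteredFile = c("sampleA_filtered.fastq", "sampleA_filtered.fastq", "sampleB_filtered.fastq"),
  stringsAsFactors = FALSE
)

# Check that every required column is present before calling the function.
required <- c("File", "PE", "Sample", "Replicate", "FilteredFile")
missingCols <- setdiff(required, names(dataFile))
stopifnot(length(missingCols) == 0)

# With the package loaded, the call would follow the usage line above:
# results <- filterBadSeqs(dataFile, minlength = 30, Phred = 25, mc.cores = 1)
```

Checking the column names up front gives a clearer error than a failure partway through a long filtering run.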

The function should be run from the working directory that contains all the fastq files.

filterBadSeqs iterates over each file specified in "dataFile", and filters and trims the reads for quality. It does so by processing the reads in each fastq file one chunk at a time. The size of the chunks is determined by the "blockSize" and "readerBlockSize" parameters. More information about how this is done is available in the ShortRead package. For each chunk of reads:

* it removes any trailing or leading N's from each sequence,

* it removes any reads which still contain N's,

* it trims the trailing end when it finds a minimum of 2 poor-quality bases in a window of 5. The threshold for poor quality is determined by the parameter "Phred", where the Phred score is logarithmically related to the probability of errors at each base,

* it removes any reads shorter than a minimum length (this is specified by the "minlength" parameter).
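
The four steps above can be sketched in base R on a single read. This is a simplified illustration and not the package's implementation, which operates on ShortRead objects. The helper name `filterRead`, the numeric quality vector, and the choice of a window centred on each base (two bases either side, truncated at the ends of the read) are assumptions made for the example.

```r
# Hypothetical helper: apply the four filtering steps to one read.
# `read` is a string of bases, `qual` a numeric vector of Phred scores.
# Returns the trimmed read, or NULL if the read is discarded.
filterRead <- function(read, qual, phred = 25, minlength = 30) {
  bases <- strsplit(read, "")[[1]]
  stopifnot(length(bases) == length(qual))

  # Step 1: remove leading and trailing N's.
  notN <- which(bases != "N")
  if (length(notN) == 0) return(NULL)
  span <- min(notN):max(notN)
  bases <- bases[span]
  qual <- qual[span]

  # Step 2: discard reads that still contain an N.
  if (any(bases == "N")) return(NULL)

  # Step 3: trim the tail from the first base whose window of 5
  # (assumed centred on the base) holds at least 2 low-quality bases.
  len <- length(bases)
  keepTo <- len
  low <- qual < phred
  for (i in seq_len(len)) {
    window <- max(1, i - 2):min(len, i + 2)
    if (sum(low[window]) >= 2) {
      keepTo <- i - 1
      break
    }
  }
  bases <- bases[seq_len(keepTo)]

  # Step 4: discard reads shorter than the minimum length.
  if (length(bases) < minlength) return(NULL)
  paste(bases, collapse = "")
}

# Two low-quality bases at positions 7 and 8 trigger trimming.
filterRead("ACGTACGTAC", c(40, 40, 40, 40, 40, 40, 10, 10, 40, 40), minlength = 5)
# returns "ACGTA"
```

Because the window is centred, trimming starts one base before the first low-quality base in this example; a left-aligned window would cut at a different position.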

The function produces a new set of filtered fastq files. The user must specify the output file for each input file in the "FilteredFile" column of the data file. The user may specify the same output file for multiple input files; new output is then appended to the existing file, so that reads from a sample which has been run on different lanes end up in a single file. A new R object (QualityFilterResults) is created, which contains pointers to the input and output fastq files, as well as a summary of how many reads have been trimmed or removed.
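
Since several inputs may share one output file, it can help to list which inputs feed each output before running the filter. The sketch below does this with base R; the table contents are hypothetical and only the two relevant columns are shown.

```r
# Hypothetical table: sample A was sequenced on two lanes.
laneTable <- data.frame(
  File         = c("A_lane1.fastq", "A_lane2.fastq", "B_lane1.fastq"),
  FilteredFile = c("A_filtered.fastq", "A_filtered.fastq", "B_filtered.fastq"),
  stringsAsFactors = FALSE
)

# Group the input files by the output file they would be appended to.
inputsByOutput <- split(laneTable$File, laneTable$FilteredFile)
print(inputsByOutput)
```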

The function returns a data frame summarising, for each file, how many sequences have been trimmed or removed.

See https://en.wikipedia.org/wiki/Phred_quality_score for more about quality scores.
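
To make the logarithmic relationship concrete, the sketch below converts a Phred score Q to an error probability using P = 10^(-Q/10), and to its ASCII quality character. The helper names are hypothetical, and the Phred+33 offset is an assumption: it is the common encoding in current fastq files, but the package documentation does not state which encoding it expects.

```r
# Error probability for a Phred score: P = 10^(-Q / 10).
phredToProb <- function(q) 10^(-q / 10)

# Quality character for a Phred score, assuming Phred+33 encoding.
phredToChar <- function(q) intToUtf8(q + 33)

phredToProb(25)  # about 0.00316, i.e. roughly 1 error in 316 bases
phredToChar(25)  # ":"
```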

See the ShortRead package for more information about blockSize (n) and readerBlockSize.
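
The idea of processing a fastq file in chunks can be illustrated without ShortRead. The base R sketch below reads a fixed number of reads (4 lines each) per iteration; it is only an analogy for the role of blockSize, presumably passed on as `n` to ShortRead's FastqStreamer, and does not model readerBlockSize. The function name and the toy file are assumptions made for the example.

```r
# Hypothetical helper: stream a fastq file in chunks of `blockSize` reads
# and return the number of reads seen in each chunk.
countReadsInChunks <- function(path, blockSize = 2) {
  con <- file(path, open = "r")
  on.exit(close(con))
  chunkSizes <- integer(0)
  repeat {
    # Each fastq record spans 4 lines: header, sequence, "+", qualities.
    lines <- readLines(con, n = 4 * blockSize)
    if (length(lines) == 0) break
    chunkSizes <- c(chunkSizes, length(lines) %/% 4L)
  }
  chunkSizes
}

# Write a tiny three-read fastq file to a temporary location and stream it.
tmp <- tempfile(fileext = ".fastq")
writeLines(c("@r1", "ACGT", "+", "IIII",
             "@r2", "ACGT", "+", "IIII",
             "@r3", "ACGT", "+", "IIII"), tmp)
countReadsInChunks(tmp, blockSize = 2)  # chunks of 2 and 1 reads
```

Only one chunk is held in memory at a time, which is why a smaller block size lowers memory use at the cost of more iterations.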

nixstix/RNASeqAnalysis documentation built on May 23, 2019, 7:06 p.m.
