qExportWig: QuasR wig file export
In QuasR: Quantify and Annotate Short Reads in R


View source: R/qExportWig.R

Create a fixed-step wig file from the alignments in the genomic bam files of the ‘QuasR’ project.

qExportWig(proj, file=NULL, collapseBySample=TRUE, binsize=100L,
           shift=0L, strand=c("*","+","-"), scaling=TRUE,
           tracknames=NULL, log2p1=FALSE,
           colors=c("#1B9E77", "#D95F02", "#7570B3", "#E7298A",
                    "#66A61E", "#E6AB02", "#A6761D", "#666666"),
           includeSecondary=TRUE,
           mapqMin=0L, mapqMax=255L, absIsizeMin=NULL, absIsizeMax=NULL,
           createBigWig=FALSE,
           useRead=c("any","first","last"),
           pairedAsSingle=FALSE)

`proj`	A `qProject` object as returned by `qAlign`.
`file`	A character vector with the name(s) for the wig or bigWig file(s) to be generated. Either `NULL` or a vector of the same length as the number of bam files (for `collapseBySample=FALSE`) or the number of unique sample names (for `collapseBySample=TRUE`) in `proj`. If `NULL`, the wig or bigWig file names are generated from the names of the genomic bam files or unique sample names with an added “.wig.gz” or “.bw” extension.
`collapseBySample`	If `TRUE`, genomic bam files with identical sample name will be combined (summed) into a single track.
`binsize`	A numerical value defining the bin and step size for the wig or bigWig file(s). `binsize` will be coerced to `integer()`.
`shift`	Either a vector or a scalar value defining the read shift (e.g. half of fragment length, see ‘Details’). If `length(shift)>1`, the length must match the number of bam files in `proj`, and the i-th sample will be converted to wig or bigWig using the value in `shift[i]`. `shift` will be coerced to `integer()`. For paired-end alignments, `shift` will be ignored, and a warning will be issued if it is set to a non-zero value (see ‘Details’).
`strand`	Only count alignments of `strand`. The default (“*”) will count all alignments.
`scaling`	If `TRUE` or a numerical value, the output values in the wig or bigWig file(s) will be linearly scaled by the total number of aligned reads per sample to improve comparability (see ‘Details’).
`tracknames`	A character vector with the names of the tracks to appear in the track header. If `NULL`, the sample names in `proj` will be used.
`log2p1`	If `TRUE`, the number of alignments `x` per bin will be transformed using the formula `log2(x+1)`.
`colors`	A character vector with R color names to be used for the tracks.
`includeSecondary`	If `TRUE` (the default), include alignments with the secondary bit (0x0100) set in the `FLAG`.
`mapqMin`	Minimal mapping quality of alignments to be included (mapping quality must be greater than or equal to `mapqMin`). Valid values are between 0 and 255. The default (0) will include all alignments.
`mapqMax`	Maximal mapping quality of alignments to be included (mapping quality must be less than or equal to `mapqMax`). Valid values are between 0 and 255. The default (255) will include all alignments.
`absIsizeMin`	For paired-end experiments, minimal absolute insert size (TLEN field in SAM Spec v1.4) of alignments to be included. Valid values are greater than 0 or `NULL` (default), which will not apply any minimum insert size filtering.
`absIsizeMax`	For paired-end experiments, maximal absolute insert size (TLEN field in SAM Spec v1.4) of alignments to be included. Valid values are greater than 0 or `NULL` (default), which will not apply any maximum insert size filtering.
`createBigWig`	If `TRUE`, first a temporary wig file will be created and then converted to BigWig format (file extension “.bw”) using the `wigToBigWig` function from package rtracklayer.
`useRead`	For paired-end experiments, selects the read mate whose alignments should be counted, one of: `any` (default), count all alignments; `first`, count only alignments from the first read; `last`, count only alignments from the last read. For single-read alignments, this argument will be ignored. For paired-end alignments, setting this argument to a value different from the default (`any`) will cause `qExportWig` not to use fragment mid-points automatically, but to treat the selected read as if it came from a single-read experiment (see ‘Details’).
`pairedAsSingle`	If `TRUE`, treat paired-end data as single-read data, which means that instead of calculating fragment mid-points for each read pair, the 5'-ends of the reads are used. This is useful, for example, when analyzing paired-end DNase-seq or ATAC-seq data, in which the read starts are informative for chromatin accessibility.
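The `log2p1` transformation described in the argument list above can be illustrated on a few made-up bin counts. This is only a sketch of the formula `log2(x+1)` using base R, not the code that `qExportWig` runs internally; the vector `x` is invented for illustration.

```r
# Made-up alignment counts for five consecutive bins (illustrative only)
x <- c(0, 1, 3, 7, 1023)

# log2(x + 1): adding 1 keeps empty bins defined (they map to 0)
y <- log2(x + 1)
print(y)
```

The added pseudocount means that a bin without alignments yields 0 rather than `-Inf`, which would not be a usable value in a wig track.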

qExportWig() uses the genome bam files in proj as input to create wig or bigWig files with the number of alignments (pairs) per window of binsize nucleotides. By default (collapseBySample=TRUE), one file per unique sample will be created. If collapseBySample=FALSE, one file per genomic bam file will be created. See http://genome.ucsc.edu/goldenPath/help/wiggle.html for the definition of the wig format, and http://genome.ucsc.edu/goldenPath/help/bigWig.html for the definition of the bigWig format.

The genome is tiled with sequential windows of length binsize, and alignments in the bam file are assigned to these windows as follows. Single-read alignments are assigned according to their 5'-end coordinate shifted by shift towards the 3'-end (assuming that the 5'-end is the leftmost coordinate for plus-strand alignments, and the rightmost coordinate for minus-strand alignments). Paired-end alignments are assigned according to the base in the middle between the leftmost and rightmost coordinates of the aligned pair of reads. Each pair of reads is counted only once, and alignments that are not properly paired are ignored. If useRead is set to select only the first or last read in a paired-end experiment, the selected reads will be treated like reads from a single-read experiment. Secondary alignments can be excluded by setting includeSecondary=FALSE. In paired-end experiments, absIsizeMin and absIsizeMax can be used to select alignments based on their insert size (TLEN field in SAM Spec v1.4).
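The assignment rules above can be sketched in a few lines of base R. The helpers `binSingleReads` and `binReadPairs` are hypothetical and not part of QuasR; the 1-based bin numbering and the integer rounding of the mid-point are assumptions made for this illustration, and the coordinates are invented.

```r
# Hypothetical helper (not part of QuasR): bin index of single-read
# alignments, based on the 5'-end shifted towards the 3'-end.
binSingleReads <- function(start, end, strand, binsize = 100L, shift = 0L) {
  # 5'-end: leftmost base on the plus strand, rightmost base on the minus strand
  fivePrime <- ifelse(strand == "+", start, end)
  # shifting towards the 3'-end moves right on "+" and left on "-"
  pos <- ifelse(strand == "+", fivePrime + shift, fivePrime - shift)
  # assumed convention: bin 1 covers positions 1..binsize
  (pos - 1L) %/% binsize + 1L
}

# Hypothetical helper: bin index of read pairs, based on the fragment mid-point
binReadPairs <- function(leftmost, rightmost, binsize = 100L) {
  mid <- (leftmost + rightmost) %/% 2L  # assumed rounding: integer division
  (mid - 1L) %/% binsize + 1L
}

# Three made-up single-read alignments
start  <- c(10L, 150L, 290L)
end    <- c(45L, 185L, 325L)
strand <- c("+", "-", "+")

bins   <- binSingleReads(start, end, strand, binsize = 100L, shift = 50L)
counts <- tabulate(bins, nbins = 4L)  # alignments per bin for bins 1..4
print(counts)

# One made-up read pair spanning positions 100..299
pairBin <- binReadPairs(100L, 299L, binsize = 100L)
```

With `shift = 50L`, the plus-strand reads move to positions 60 and 340 and the minus-strand read moves to position 135, so each read is counted in the bin that contains its shifted position.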

For scaling=TRUE, the number of alignments per bin n for sample i is linearly scaled to the mean total number of alignments over all samples in proj according to: n_s = n / N[i] * mean(N), where n_s is the scaled number of alignments in the bin and N is a vector with the total number of alignments for each sample. Alternatively, if scaling is set to a positive numerical value s, this value is used instead of mean(N), and values are scaled according to: n_s = n / N[i] * s.
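As a worked example of the two scaling formulas, the following sketch applies them to invented numbers. The sample names, totals in `N`, bin count `n` and target value `s` are all made up for illustration; only the formulas themselves come from the text above.

```r
# Made-up total alignment counts for two samples
N <- c(sampleA = 2e6, sampleB = 4e6)

n <- 30          # made-up raw count in one bin of sampleA
i <- "sampleA"

# scaling=TRUE: scale to the mean library size over all samples
n_s <- n / N[[i]] * mean(N)      # 30 / 2e6 * 3e6

# scaling=s: scale to a fixed target library size instead of mean(N)
s <- 1e6
n_s_fixed <- n / N[[i]] * s      # 30 / 2e6 * 1e6

print(c(n_s, n_s_fixed))
```

Here sampleA has fewer alignments than the average sample, so its bin values are scaled up with scaling=TRUE, whereas a fixed s of one million gives values on a per-million-alignments scale.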

mapqMin and mapqMax allow alignments to be selected based on their mapping qualities. Both can take integer values between 0 and 255. Mapping qualities are equal to -10 log10 Pr(mapping position is wrong), rounded to the nearest integer. A value of 255 indicates that the mapping quality is not available.
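The relation between mapping quality and error probability can be made concrete with a short sketch. The function names `mapqFromP` and `pFromMapq` are hypothetical helpers written for this illustration and are not part of QuasR.

```r
# Hypothetical helper: mapping quality from the probability that the
# mapping position is wrong
mapqFromP <- function(pWrong) round(-10 * log10(pWrong))

# Hypothetical helper: inverse relation, error probability from mapping quality
pFromMapq <- function(q) 10^(-q / 10)

q <- mapqFromP(c(0.1, 0.01, 0.001))
print(q)

p <- pFromMapq(c(10, 20, 30))
print(p)
```

Under this relation, setting mapqMin to 20, for example, would keep only alignments with an estimated error probability of at most about 1 in 100, apart from alignments reported with the special value 255.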

If createBigWig=FALSE and file ends with ‘.gz’, the resulting wig file will be compressed using gzip and is suitable for uploading as a custom track to your favorite genome browser (e.g. UCSC or Ensembl).
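To show what a gzip-compressed fixed-step wig file looks like, here is a minimal sketch that writes one track with base R. The function `writeFixedStepWig` is a hypothetical helper, not the QuasR implementation, and the exact header that `qExportWig` writes may differ; the line layout follows the UCSC wiggle format description referenced above.

```r
# Hypothetical helper (not part of QuasR): write per-bin values for one
# chromosome as a fixed-step wig track, gzip-compressed if file ends in ".gz"
writeFixedStepWig <- function(counts, chrom, binsize, file, name = "example") {
  con <- if (grepl("\\.gz$", file)) gzfile(file, "w") else file(file, "w")
  on.exit(close(con))
  writeLines(c(
    sprintf('track type=wiggle_0 name="%s"', name),
    # declaration line: bins start at position 1; step and span equal binsize
    sprintf("fixedStep chrom=%s start=1 step=%d span=%d",
            chrom, binsize, binsize),
    as.character(counts)          # one value per bin
  ), con)
  invisible(file)
}

# Made-up counts for four bins of 100 nucleotides on "chr1"
out <- writeFixedStepWig(c(1, 1, 0, 1), "chr1", 100L,
                         tempfile(fileext = ".wig.gz"))
wigLines <- readLines(out)        # reading auto-detects the gzip compression
print(wigLines)
```

Because the step and span are declared once per block, a fixed-step track only needs one value per bin, which keeps the file compact for genome-wide profiles.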

The file name(s) of the generated wig or bigWig file(s), returned invisibly.

Anita Lerch, Dimos Gaidatzis and Michael Stadler

qProject, qAlign, wigToBigWig

# copy example data to current working directory
file.copy(system.file(package="QuasR", "extdata"), ".", recursive=TRUE)

# create alignments
sampleFile <- "extdata/samples_chip_single.txt"
genomeFile <- "extdata/hg19sub.fa"
proj <- qAlign(sampleFile, genomeFile)

# export wiggle file
qExportWig(proj, binsize=100L, shift=0L, scaling=TRUE)