ScoreMatrix-methods: Get base-pair score for bases in each window

Description Usage Arguments Value Note See Also Examples

Description

The funcion produces a base-pair resolution matrix of scores for given equal width windows of interest. The returned matrix can be used to draw meta profiles or heatmap of read coverage or wig track-like data. The windows argument can be a predefined region around transcription start sites or other regions of interest that have equal lengths The function removes all window that fall off the Rle object - have the start coordinate < 1 or end coordinate > length(Rle) The function takes the intersection of names in the Rle and GRanges objects. On Windows OS the function will give an error if the target is a file in .bigWig format.

Usage

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
ScoreMatrix(target, windows, strand.aware = FALSE, weight.col = NULL,
  is.noCovNA = FALSE, type = "auto", rpm = FALSE, unique = FALSE,
  extend = 0, param = NULL, bam.paired.end = FALSE, library.size = NULL)

\S4method{ScoreMatrix}{RleList,GRanges}(target,windows,strand.aware)

\S4method{ScoreMatrix}{GRanges,GRanges}(target, windows, strand.aware,
                                                weight.col, is.noCovNA)

\S4method{ScoreMatrix}{character,GRanges}(target, windows, strand.aware,
                                                  weight.col=NULL,is.noCovNA=FALSE, 
                                                  type='auto', rpm=FALSE,
                                                  unique=FALSE, extend=0, param=NULL, 
                                                  bam.paired.end=FALSE,
                                                  library.size=NULL)

Arguments

target

RleList , GRanges, a BAM file or a BigWig to be overlapped with ranges in windows

windows

GRanges object that contains the windows of interest. It could be promoters, CpG islands, exons, introns. However the sizes of windows have to be equal.

strand.aware

If TRUE (default: FALSE), the strands of the windows will be taken into account in the resulting ScoreMatrix. If the strand of a window is -, the values of the bins for that window will be reversed

weight.col

if the object is GRanges object a numeric column in meta data part can be used as weights. This is particularly useful when genomic regions have scores other than their coverage values, such as percent methylation, conservation scores, GC content, etc.

is.noCovNA

(Default:FALSE) if TRUE,and if 'target' is a GRanges object with 'weight.col' provided, the bases that are uncovered will be preserved as NA in the returned object. This useful for situations where you can not have coverage all over the genome, such as CpG methylation values.

type

(Default:"auto") if target is a character vector of file paths, then type designates the type of the corresponding files (bam or bigWig).

rpm

boolean telling whether to normalize the coverage to per milion reads. FALSE by default. See library.size.

unique

boolean which tells the function to remove duplicated reads based on chr, start, end and strand

extend

numeric which tells the function to extend the reads to width=extend

param

ScanBamParam object

bam.paired.end

boolean indicating whether given BAM file contains paired-end reads (default:FALSE). Paired-reads will be treated as fragments.

library.size

numeric indicating total number of mapped reads in a BAM file (rpm has to be set to TRUE). If is not given (default: NULL) then library size is calculated using the Rsamtools idxstatsBam function: sum(idxstatsBam(target)$mapped).

Value

returns a ScoreMatrix object

Note

We assume that a paired-end BAM file contains reads with unique ids and we remove both mates of reads if they are repeated. Due to the fact that ScoreMatrix uses the GenomicAlignments:readGAlignmentPairs function to read paired-end BAM files a duplication of reads occurs when mates of one pair map into two different windows.

Strands of reads in a paired-end BAM are inferred depending on strand of first alignment from the pair. This is a default setting in the GenomicAlignments:readGAlignmentPairs function (see a strandMode argument). This mode should be used when the paired-end data was generated using one of the following stranded protocols: Directional Illumina (Ligation), Standard SOLiD.

See Also

ScoreMatrixBin

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
# When target is GRanges
data(cage)
data(promoters)
scores1=ScoreMatrix(target=cage,windows=promoters,strand.aware=TRUE,
                                weight.col="tpm")                                
                                 

# When target is RleList
library(GenomicRanges)
covs = coverage(cage)
scores2 = ScoreMatrix(target=covs,windows=promoters,strand.aware=TRUE)   
scores2 

# When target is a bam file
bam.file = system.file('unitTests/test.bam', package='genomation')
windows = GRanges(rep(c(1,2),each=2), IRanges(rep(c(1,2), times=2), width=5))
scores3 = ScoreMatrix(target=bam.file,windows=windows, type='bam') 
scores3

genomation documentation built on Nov. 8, 2020, 5:21 p.m.