scoreMergedBins: Average log odd scores over bins being merged into a single...

Description Usage Arguments Details Value Note Author(s) See Also Examples

Description

Sum, normalize the read counts, and average the logOdd score over the bins being merged into a single enriced region.

Usage

1
scoreMergedBins(findOverlapsHits, unmergedGRAll, mergedGRAll)

Arguments

findOverlapsHits

Output from findOverlaps as two columns indices with the first column containing the indices for unmerged GRanges and the second column the indices of the matched merged GRanges.

unmergedGRAll

GRanges before merging.

mergedGRAll

GRanges after merging.

Details

The consecutive RIP-bins predicted by the Viterbi function (See nbh_vit) are merged into a single RIP region. An aggregate RIPScore as the averaged RIPScores over the associated merged bins is assigned to each merged RIP region. In the RIPSeeker workflow, the averaged RIPScore then becomes the representative score for the region and subject to significance test carried out in seekRIP.

Value

A merged GRanges each with scores including summed read count, averaged log odd scores, and FPK (fragment per kilobase of region length). The latter score represent a normalized read count.

Note

This function is expected to be called only from logScoreWithoutControl and logScoreWithControl.

Author(s)

Yue Li

See Also

seekRIP, computeLogOdd logScoreWithControl, logScoreWithoutControl

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
if(interactive()) { # see example in seekRIP
# Retrieve system files
extdata.dir <- system.file("extdata", package="RIPSeeker") 

bamFiles <- list.files(extdata.dir, ".bam$", recursive=TRUE, full.names=TRUE)

bamFiles <- grep("PRC2", bamFiles, value=TRUE)

# Parameters setting
binSize <- 1e5					# use a large fixed bin size for demo only
multicore <- FALSE			# use multicore
strandType <- "-"				# set strand type to minus strand

################ run main function for HMM inference on all chromosomes ################
mainSeekOutputRIP <- mainSeek(bamFiles=
    grep(pattern="SRR039214", bamFiles, value=TRUE, invert=TRUE),
		binSize=binSize, strandType=strandType, 		
		reverseComplement=TRUE, genomeBuild="mm9",
		uniqueHit = TRUE, assignMultihits = TRUE, 
		rerunWithDisambiguatedMultihits = TRUE,				
		multicore=multicore, silentMain=FALSE, verbose=TRUE)
		

nbhGRRIP <- mainSeekOutputRIP$nbhGRList$chrX

logOddScore <- computeLogOdd(nbhGRRIP)

values(nbhGRRIP) <- cbind(as.data.frame(values(nbhGRRIP)), logOddScore)
	
enrichIdx <- which(values(nbhGRRIP)$viterbi_state == 2)

unmergedRIP <- nbhGRRIP[enrichIdx]	
	
mergedRIP <- reduce(unmergedRIP, min.gapwidth = median(width(unmergedRIP) ))
	
overlapIdx <- findOverlaps(mergedRIP, unmergedRIP)

# a list with query hits as names and subject hits as items
findOverlapsHits <- split(overlapIdx, queryHits(overlapIdx))

# get the score for the first merged region
x <- scoreMergedBins(findOverlapsHits[[1]], unmergedRIP, mergedRIP)

# get scores for all of the merged regions
mergedRIPList <- lapply(split(overlapIdx, queryHits(overlapIdx)),
			
			scoreMergedBins, unmergedRIP, mergedRIP)
	
names(mergedRIPList) <- NULL
	
mergedRIP <- do.call(c, mergedRIPList)

# logOddScore is the averaged logOddScore across merged bins
mergedRIP
}

gorillayue/RIPSeeker documentation built on May 17, 2019, 7:59 a.m.