GetCoverageMatrix: Get coverage matrix for genes which are highly expressed in...


Description

Get the coverage matrix for genes that are highly expressed in every file listed in CoverageDFList. Only rows with gene data from all files are included in the output (i.e., the intersection of the gene lists from each file); only genes that have counts in every file are "merged".
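The intersection behaviour described above can be illustrated with a small base R sketch. This is not the package's implementation. The toy data frames and the column names `gene` and `coverage` are assumptions made purely for illustration, and the real elements of CoverageDFList may be laid out differently.

```r
# Hypothetical per-file coverage tables; `gene` and `coverage` are
# assumed column names, not necessarily those used by the package.
cov_list <- list(
  data.frame(gene = c("A", "B", "C"), coverage = c(10, 20, 30),
             stringsAsFactors = FALSE),
  data.frame(gene = c("B", "C", "D"), coverage = c(21, 31, 41),
             stringsAsFactors = FALSE),
  data.frame(gene = c("C", "B", "E"), coverage = c(32, 22, 52),
             stringsAsFactors = FALSE)
)

# Genes present in every table (intersection of the per-file gene lists).
shared_genes <- Reduce(intersect, lapply(cov_list, function(df) df$gene))

# One column per file, one row per shared gene; match() aligns rows by
# gene name so the order of genes within each file does not matter.
cov_matrix <- do.call(cbind, lapply(cov_list, function(df) {
  df$coverage[match(shared_genes, df$gene)]
}))
rownames(cov_matrix) <- shared_genes
colnames(cov_matrix) <- paste0("file", seq_along(cov_list))
```

In this toy input only genes B and C appear in all three tables, so genes A, D and E are dropped and the result has two rows and three columns.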

Usage

GetCoverageMatrix(CoverageDFList)

Arguments

CoverageDFList

List of coverage counts, one element per bedgraph coverage file (*.bedgraph.gz).

Value

A coverage matrix for genes that are highly expressed in every file listed in CoverageDFList.

Author(s)

Nathaniel J. Madrid, Jason Byars

Examples

library(rtracklayer)  # provides import.gff() and import.bedGraph()

refseq <- import.gff("RefSeq_hg19_exons_nodups_021114g1k.gff")
refseq$gene <- sub("_exon_[0-9]+", "", refseq$gene_id)
refseq$width <- width(refseq)
txlength <- tapply(refseq$width, refseq$gene, sum)

# Get file paths of aligned bedgraph files
BedGraphFiles <- dir(path="/home/ubuntu/Projects/RNAseqPackage/bedgraphs", full.names=TRUE)

# coverage calculated w/ bedtools genomecov -bga -split
BedgraphList <- lapply(BedGraphFiles[1:3], function(bg) import.bedGraph(bg))
BedGraphRefSeqOverlapsList <- lapply(BedgraphList, function(bg) GetBedGraphRefSeqOverlaps(bg, refseq))
BedGraphRefSeqOverlapsDFList <- lapply(BedGraphRefSeqOverlapsList, function(bg) as.data.frame(bg@elementMetadata@listData))
# Copy the width of each overlapping range into the matching data frame
for (i in seq_along(BedGraphRefSeqOverlapsDFList)) {
  BedGraphRefSeqOverlapsDFList[[i]]$width <- width(BedGraphRefSeqOverlapsList[[i]])
}
CoverageDFList <- lapply(BedGraphRefSeqOverlapsDFList, function(bg) GetTranscriptomeCoverage(bg, txlength))
avgReadDepthPerGene <- unlist(lapply(CoverageDFList, function(df) mean(df$score)))
mmdOverTranscriptome <- unlist(lapply(CoverageDFList, function(df) sum(df$TotalCoverage) / sum(df$TranscriptLengths)))

# Only rows with gene data from all files are included in the output (i.e. intersection of gene 
# list from each file). Only the genes that have counts in every file will be "merged".
CoverageMatrix <- GetCoverageMatrix(CoverageDFList)

njmadrid/RNAseqQuality documentation built on May 20, 2019, 3:32 p.m.