scaleTxCoverages: Scale transcript coverages by abundance estimate
In csoneson/jcc: Calculate JCC Scores

View source: R/scaleTxCoverages.R

scaleTxCoverages

R Documentation

Scale transcript coverages by abundance estimate

Description

This function takes a list of predicted transcript coverage profiles (from predictTxCoverage) and corresponding transcript abundance estimates (from, e.g., Salmon, kallisto, StringTie, RSEM, or any other transcript abundance estimation software), and scales the coverage profiles by the abundance estimates in order to determine the predicted number of reads covering each position in the transcript. In addition, it extracts the predicted number of reads spanning each junction in each transcript, and sums up these for each annotated junction, across all transcripts containing the junction.

Usage

scaleTxCoverages(txCoverageProfiles, txQuants, tx2gene, strandSpecific,
  methodName, verbose = FALSE)

Arguments

`txCoverageProfiles`	A `list` of predicted transcript coverage profiles (output from `predictTxCoverage`).
`txQuants`	A `data.frame` with estimated transcript abundances. Must contain columns named `transcript`, `TPM` and `count`, all other columns will be ignored.
`tx2gene`	A `data.frame` with transcript-to-gene mapping. Must have at least two columns, named `tx` and `gene`.
`strandSpecific`	Logical, whether or not the data is strand-specific. Determines whether or not the strand of a junction should be taken into account when the coverages are summed up across transcripts.
`methodName`	Name to use for the abundance estimation method in the output table.
`verbose`	Logical, whether to print progress messages.

Value

A list with two elements:

junctionPredCovs:: A tibble with the predicted coverage for each junction, scaled by the estimated transcript abundances and summarized across all transcripts.
txQuants:: A tibble with estimated transcript abundances, including information about whether or not the coverage profile could be predicted ('covOK') or if a uniform coverage was imposed ('covError' or 'covNA').

Author(s)

Charlotte Soneson

References

Soneson C, Love MI, Patro R, Hussain S, Malhotra D, Robinson MD: A junction coverage compatibility score to quantify the reliability of transcript abundance estimates and annotation catalogs. bioRxiv doi:10.1101/378539 (2018)

Examples

## Not run: 
gtf <- system.file("extdata/Homo_sapiens.GRCh38.90.chr22.gtf.gz",
                   package = "jcc")
bam <- system.file("extdata/reads.chr22.bam", package = "jcc")
biasMod <- fitAlpineBiasModel(gtf = gtf, bam = bam,
                              organism = "Homo_sapiens",
                              genome = Hsapiens, genomeVersion = "GRCh38",
                              version = 90, minLength = 230,
                              maxLength = 7000, minCount = 10,
                              maxCount = 10000, subsample = TRUE,
                              nbrSubsample = 30, seed = 1, minSize = NULL,
                              maxSize = 220, verbose = TRUE)
tx2gene <- readRDS(system.file("extdata/tx2gene.sub.rds", package = "jcc"))
predCovProfiles <- predictTxCoverage(biasModel = biasMod$biasModel,
                                     exonsByTx = biasMod$exonsByTx,
                                     bam = bam, tx2gene = tx2gene,
                                     genome = Hsapiens,
                                     genes = c("ENSG00000070371",
                                               "ENSG00000093010"),
                                     nCores = 1, verbose = TRUE)
txQuants <- readRDS(system.file("extdata/quant.sub.rds", package = "jcc"))
txsc <- scaleTxCoverages(txCoverageProfiles = predCovProfiles,
                         txQuants = txQuants, tx2gene = tx2gene,
                         strandSpecific = TRUE, methodName = "Salmon",
                         verbose = TRUE)

## End(Not run)

csoneson/jcc documentation built on Dec. 25, 2024, 2:32 a.m.