calcTransEff: Calculate translation efficiency

View source: R/expression.R

calcTransEff    R Documentation

Calculate translation efficiency

Description

Calculate translation efficiency for a list of ORF ranges, given a Ribo-seq and an RNA-seq sample. RPKM values are first calculated for both samples; the translation efficiency is then calculated as log2(riboRPKM + pseudoCount) - log2(rnaRPKM + pseudoCount), where pseudoCount is a small value that keeps the log2 transformation from producing infinite values when an RPKM is zero.
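The calculation described above can be sketched in base R. This is a minimal illustration, not the package's code: it assumes the standard RPKM definition (reads per kilobase of trimmed ORF per million reads in the library), and the helper names rpkm and teLog2 are hypothetical.

```r
# Assumption: standard RPKM, i.e. count / (length in kb) / (library size in millions).
rpkm <- function(count, lenTrimmed, libSize) {
  count * 1e9 / (lenTrimmed * libSize)
}

# Translation efficiency exactly as written in the Description.
teLog2 <- function(rpkmRibo, rpkmRNA, pseudoCount = 0.001) {
  log2(rpkmRibo + pseudoCount) - log2(rpkmRNA + pseudoCount)
}

# One ORF of trimmed length 1000 with illustrative counts and library sizes.
rpkmRibo <- rpkm(count = 200, lenTrimmed = 1000, libSize = 1e6)
rpkmRNA  <- rpkm(count = 100, lenTrimmed = 1000, libSize = 2e6)
te <- teLog2(rpkmRibo, rpkmRNA)
```

Here the Ribo-seq RPKM is four times the RNA-seq RPKM, so te is close to 2 on the log2 scale.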

Usage

calcTransEff(
  riboBam,
  rnaBam,
  orfGRL,
  riboLibSize = length(riboBam),
  rnaLibSize = length(rnaBam),
  trimStart = 6,
  trimEnd = 6,
  ignoreStrand = TRUE,
  pseudoCount = 0.001
)

Arguments

riboBam

A GRanges or GAlignments object of reads. Ribo-seq reads should already be size-selected and shifted; see the shiftReads function for how to shift reads. For each read, only the 5'-most position is used. (Required).

rnaBam

A GRanges or GAlignments object of reads. RNA-seq reads do not need to be shifted or size-selected. For each read, only the 5'-most position is used. (Required).

orfGRL

A GRangesList object of ORFs. We recommend assigning a unique name to each ORF using names(orfGRL). The following modifications are applied to orfGRL:
1. If names(orfGRL) is NULL, the elements are renamed "orf_1", "orf_2", etc.
2. Strands marked as "*" are replaced with "+".
3. Elements spanning multiple chromosomes or strands (one ORF on several chromosomes or on different strands) are removed.
4. Elements whose ORF length is not divisible by 3 are removed.
5. MOST IMPORTANTLY, the ranges of each ORF are sorted. If the ORF is on the positive strand, they are sorted by coordinates (seqnames, start, end) in ascending order; otherwise, they are sorted by coordinates (seqnames, end, start) in descending order. The purpose is to achieve the same behavior as the cdsBy function in the GenomicFeatures package.
(Required).
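Modifications 2 to 5 can be illustrated on a plain data.frame of exons. This is only a sketch of the rules as stated; the function itself works on a GRangesList, and the exon table and the normalizeOrf helper are hypothetical.

```r
# Hypothetical exon table: one row per exon, grouped by ORF id.
exons <- data.frame(
  orf    = c("a", "a", "b", "b", "c"),
  chr    = c("chr1", "chr1", "chr1", "chr1", "chr2"),
  start  = c(101, 1, 500, 300, 10),
  end    = c(130, 30, 529, 329, 19),
  strand = c("+", "+", "-", "-", "*"),
  stringsAsFactors = FALSE
)

# Modification 2: "*" strands are replaced with "+".
exons$strand[exons$strand == "*"] <- "+"

normalizeOrf <- function(df) {
  # Modification 3: drop ORFs on several chromosomes or strands.
  if (length(unique(df$chr)) > 1 || length(unique(df$strand)) > 1) return(NULL)
  # Modification 4: drop ORFs whose total length is not divisible by 3.
  if (sum(df$end - df$start + 1) %% 3 != 0) return(NULL)
  # Modification 5: ascending on "+", descending (end, start) on "-".
  if (df$strand[1] == "+") {
    df[order(df$start, df$end), ]
  } else {
    df[order(df$end, df$start, decreasing = TRUE), ]
  }
}

kept <- Filter(Negate(is.null), lapply(split(exons, exons$orf), normalizeOrf))
```

In this toy input, ORF "c" has length 10 and is dropped, while the exons of "a" and "b" end up in 5' to 3' order.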

riboLibSize

A positive numeric value giving the library size of the Ribo-seq sample. By default, the number of reads in the riboBam object is used. (Default: length(riboBam)).

rnaLibSize

A positive numeric value giving the library size of the RNA-seq sample. By default, the number of reads in the rnaBam object is used. (Default: length(rnaBam)).

trimStart

A non-negative numeric value giving the number of bases to trim from the ORF start. (Default: 6).

trimEnd

A non-negative numeric value giving the number of bases to trim from the ORF end. (Default: 6).
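As a rough illustration of how the two trimming arguments shrink the region used for counting, the sketch below takes the trimmed length to be the ORF length minus both trims. That formula is an assumption based on the orfLenTrimmed column described under Value; how ORFs shorter than the total trim are handled is not stated here. The trimmedLength helper is hypothetical.

```r
# Assumption: trimmed length = ORF length - trimStart - trimEnd.
trimmedLength <- function(orfLen, trimStart = 6, trimEnd = 6) {
  stopifnot(all(trimStart >= 0), all(trimEnd >= 0))
  orfLen - trimStart - trimEnd
}

# Three ORFs of length 300, 90 and 15 bases with the default trims.
lens <- trimmedLength(c(300, 90, 15))
```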

ignoreStrand

A logical value indicating whether strand is ignored when counting reads. If TRUE, reads are counted for an ORF regardless of strand; if FALSE, reads and ORFs must be on the same strand. (Default: TRUE).
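The effect of ignoreStrand, together with the rule that only the 5'-most position of each read is used, can be sketched with plain vectors. This is an illustration under simplifying assumptions (a single-exon ORF, reads on one chromosome); the countReads helper is hypothetical and does not reproduce the package's counting code.

```r
# Toy reads on one chromosome.
reads <- data.frame(
  start  = c(10, 40, 95, 120),
  end    = c(39, 69, 124, 149),
  strand = c("+", "-", "+", "-"),
  stringsAsFactors = FALSE
)

# 5'-most position: start for "+" reads, end for "-" reads.
reads$fivePrime <- ifelse(reads$strand == "+", reads$start, reads$end)

# Count reads whose 5' position falls inside a single-exon ORF.
countReads <- function(reads, orfStart, orfEnd, orfStrand, ignoreStrand = TRUE) {
  inside <- reads$fivePrime >= orfStart & reads$fivePrime <= orfEnd
  if (!ignoreStrand) inside <- inside & reads$strand == orfStrand
  sum(inside)
}

nAny  <- countReads(reads, 1, 100, "+", ignoreStrand = TRUE)
nSame <- countReads(reads, 1, 100, "+", ignoreStrand = FALSE)
```

With ignoreStrand = TRUE, the minus-strand read whose 5' end is at position 69 is counted; with ignoreStrand = FALSE, it is not.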

pseudoCount

A non-negative numeric value added to each calculated RPKM before the log2 transformation, so that an RPKM of zero does not produce an infinite value. (Default: 1e-03).
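A short base R check shows why the pseudocount matters. The RPKM values are made up for illustration.

```r
# Three ORFs: no Ribo-seq signal, no signal at all, no RNA-seq signal.
rpkmRibo <- c(0, 0, 8)
rpkmRNA  <- c(4, 0, 0)

# Without a pseudocount the log2 ratio is -Inf, NaN and Inf.
raw <- log2(rpkmRibo) - log2(rpkmRNA)

# With the default pseudocount every value is finite.
pseudoCount <- 0.001
adj <- log2(rpkmRibo + pseudoCount) - log2(rpkmRNA + pseudoCount)
```

Note that an ORF with zero RPKM in both samples gets a value of 0 after adjustment, so filtering on read counts may still be worthwhile before interpreting the results.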

Value

A data.frame with 7 columns, specified below:
1. Column 1 is the ORF ID (orfId), either user-specified in orfGRL or internally generated.
2. Column 2 is the trimmed ORF length (orfLenTrimmed).
3. Columns 3 and 4 are the read counts for the Ribo- and RNA-seq samples (countRibo and countRNA) in the trimmed ORF region, respectively.
4. Columns 5 and 6 are the RPKM values for the Ribo- and RNA-seq samples (rpkmRibo and rpkmRNA), respectively.
5. Column 7 is the calculated translation efficiency (log2 transformed).
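One way to work with a result of this shape is sketched below on a mock data.frame. All values are made up, and the name of column 7 is not given above, so "te" is a placeholder and the column is accessed by position. The minimum count of 10 is an arbitrary choice for illustration.

```r
# Mock result using the documented column names; values are illustrative only.
res <- data.frame(
  orfId         = c("orf_1", "orf_2", "orf_3"),
  orfLenTrimmed = c(300, 600, 150),
  countRibo     = c(120, 3, 40),
  countRNA      = c(60, 2, 80),
  rpkmRibo      = c(400, 5, 266.67),
  rpkmRNA       = c(200, 3.33, 533.33),
  te            = c(1, 0.585, -1),  # placeholder name for column 7
  stringsAsFactors = FALSE
)

# Keep ORFs with at least 10 reads in both samples (arbitrary threshold).
filtered <- res[res$countRibo >= 10 & res$countRNA >= 10, ]

# Rank the remaining ORFs by translation efficiency, highest first.
ranked <- filtered[order(filtered[[7]], decreasing = TRUE), ]
```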


nzhang89/RiboSeeker documentation built on April 15, 2022, 10:18 a.m.