transcriptToProtein: Map transcript-relative coordinates to amino acid residues of...
In jotsetung/ensembldb: Utilities to create and use Ensembl-based annotation databases

transcriptToProtein

R Documentation

Map transcript-relative coordinates to amino acid residues of the encoded protein

Description

transcriptToProtein maps within-transcript coordinates to the corresponding coordinates within the encoded protein sequence. The provided coordinates must lie within the coding region of the transcript (excluding the stop codon), but they must be given relative to the first nucleotide of the transcript (which includes the 5' UTR). Positions relative to the CDS of a transcript (e.g. /PKP2 c.1643delg/) must first be converted to transcript-relative coordinates, which can be done with the cdsToTranscript() function.
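
The relation between CDS-relative and transcript-relative positions can be sketched with plain arithmetic. The helper `cds_to_tx_pos` below is hypothetical and not part of ensembldb (use cdsToTranscript() for real data); it assumes the length of the 5' UTR is already known. The value 691 is inferred from the examples on this page, where the CDS of ENST00000381578 starts at transcript position 692.

```r
## Hypothetical helper (not part of ensembldb): shift a CDS-relative
## position by the length of the 5' UTR to obtain the position relative
## to the first nucleotide of the transcript.
cds_to_tx_pos <- function(cds_pos, utr5_len) {
    stopifnot(all(cds_pos >= 1), utr5_len >= 0)
    cds_pos + utr5_len
}

## Assumed 5' UTR length of 691 nt, i.e. the CDS starts at transcript
## position 692.
cds_to_tx_pos(c(1, 2, 3, 4), utr5_len = 691)
## 692 693 694 695
```

Both coordinate systems count nucleotides of the spliced transcript, so the conversion is a constant offset and introns do not need to be considered at this step.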

Usage

transcriptToProtein(
  x,
  db,
  id = "name",
  proteins = NA,
  exons = NA,
  transcripts = NA
)

Arguments

`x`	`IRanges` with the coordinates within the transcript. Coordinates are counted from the start of the transcript (including the 5' UTR). The Ensembl IDs of the corresponding transcripts have to be provided either as `names` of the `IRanges`, or in one of its metadata columns.
`db`	`EnsDb` object.
`id`	`character(1)` specifying where the transcript identifier can be found. Has to be either `"name"` or one of `colnames(mcols(x))`.
`proteins`	`DFrame` object generated by `proteins()`.
`exons`	`CompressedGRangesList` object generated by `exonsBy()` with `by = "tx"`.
`transcripts`	`GRanges` object generated by `transcripts()`.

Details

Transcript-relative coordinates are mapped to the amino acid residues they encode. As an example, positions within the transcript that correspond to nucleotides 1 to 3 in the CDS are mapped to the first position in the protein sequence (see examples for more details).
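
The codon arithmetic described above can be illustrated with a minimal base R sketch. The function `tx_pos_to_residue` is hypothetical and is not how ensembldb implements the mapping; it assumes the first and last transcript positions of the CDS are known. The CDS start of 692 is taken from the examples on this page, while the CDS end of 700 is an arbitrary placeholder, not the real value for any transcript.

```r
## Hypothetical sketch (not the ensembldb implementation): map
## transcript-relative positions to 1-based residue numbers.
tx_pos_to_residue <- function(tx_pos, cds_start, cds_end) {
    ## Only positions inside the coding region can be mapped
    in_cds <- tx_pos >= cds_start & tx_pos <= cds_end
    ## Three consecutive CDS nucleotides encode one residue
    residue <- ceiling((tx_pos - cds_start + 1) / 3)
    ## Mimic the documented convention of -1 for unmappable positions
    ifelse(in_cds, residue, -1)
}

## cds_end = 700 is a placeholder chosen for illustration only
tx_pos_to_residue(c(692, 693, 694, 695, 1), cds_start = 692, cds_end = 700)
## 1 1 1 2 -1
```

The first three positions fall into the first codon and therefore share residue 1, the fourth starts the second codon, and position 1 lies in the 5' UTR and is reported as -1.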

Value

IRanges with the same length (and order) as the input IRanges x. Each element provides the coordinates within the protein sequence, with names being the (Ensembl) IDs of the protein. The original transcript ID and the transcript-relative coordinates are provided as metadata columns. The metadata column "cds_ok" indicates whether the length of the transcript's CDS matches the length of the encoded protein. An IRanges with a start coordinate of -1 is returned for transcript coordinates that cannot be mapped to protein-relative coordinates (either no transcript was found for the provided ID, the transcript does not encode a protein, or the provided coordinates are not within the coding region of the transcript).

Author(s)

Johannes Rainer

Examples


library(EnsDb.Hsapiens.v86)
## Restrict all further queries to chromosome X to speed up the examples
edbx <- filter(EnsDb.Hsapiens.v86, filter = ~ seq_name == "X")

## Define an IRanges with the positions of the first 2 nucleotides of the
## coding region for the transcript ENST00000381578
txpos <- IRanges(start = 692, width = 2, names = "ENST00000381578")

## Map these to the corresponding residues in the protein sequence
## The protein-relative coordinates are returned as an `IRanges` object,
## with the original, transcript-relative coordinates provided in metadata
## columns tx_start and tx_end
transcriptToProtein(txpos, edbx)

## We can also map multiple ranges. Note that for any of the 3 nucleotides
## encoding the same amino acid the position of this residue in the
## protein sequence is returned. To illustrate this we map below each of the
## first 4 nucleotides of the CDS to the corresponding position within the
## protein.
txpos <- IRanges(start = c(692, 693, 694, 695),
    width = rep(1, 4), names = rep("ENST00000381578", 4))
transcriptToProtein(txpos, edbx)

## If the mapping fails, an IRanges with negative start position is returned.
## Mapping can fail (as below) because the ID is not known.
transcriptToProtein(IRanges(1, 1, names = "unknown"), edbx)

## Or because the provided coordinates are not within the CDS
transcriptToProtein(IRanges(1, 1, names = "ENST00000381578"), edbx)

## The protein, exon and transcript annotations can also be loaded once up
## front and passed to the function. This allows calling it from parallel
## processes.

proteins <- proteins(edbx)
exons <- exonsBy(edbx, by = "tx")
transcripts <- transcripts(edbx)

## Additional arguments to IRanges() (here `info`) are stored as metadata
## columns
txpos <- IRanges(start = c(692, 693, 694, 695),
    width = rep(1, 4),
    names = c(rep("ENST00000381578", 2), rep("ENST00000486554", 2)),
    info = "test")

transcriptToProtein(txpos, edbx, proteins = proteins, exons = exons,
    transcripts = transcripts)

jotsetung/ensembldb documentation built on Aug. 21, 2024, 11:23 a.m.
