makeDomainsFromExons: Calculate genomic coordinates for amino acid positions of...


View source: R/HLP_makeDomainsFromExons.R

Description

Use genomic exon coordinates and Uniprot protein domain data to determine genomic coordinates of protein domains for plotting in genomic context.

Usage

makeDomainsFromExons(
  ID,
  biomaRt,
  uniprot_domains.gff,
  suffix.outputFilname = ".txt"
)

Arguments

ID

character with either Ensembl transcript ID, Ensembl gene ID or gene symbol.

biomaRt

biomaRt object used to obtain exon coordinates.

uniprot_domains.gff

character with file path to the Uniprot gff export. In Uniprot, select the desired features (e.g. PTM/Processing and Family & Domains) and export the basket in gff format.

suffix.outputFilname

character appended to the output filename in addition to the name of the transcript used.
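As an illustration of the input expected in uniprot_domains.gff, the following minimal sketch reads the protein length and the amino acid positions of the features from a few gff lines. The example lines, the accession P00000 and the assumed header layout ("##sequence-region", accession, start, end on the 2nd comment line) are assumptions made for this sketch and are not taken from the package source.

```r
# Hypothetical lines mimicking a Uniprot gff export (not real data).
gff_lines <- c(
  "##gff-version 3",
  "##sequence-region P00000 1 120",
  "P00000\tUniProtKB\tDomain\t10\t50\t.\t.\t.\tNote=Example domain"
)

# Assumption: the 2nd comment line has the form
# "##sequence-region <accession> <start> <end>", so field 4 is the protein length.
header_fields <- strsplit(gff_lines[2], " +")[[1]]
protein_length <- as.integer(header_fields[4])

# Keep only the feature rows (drop comment lines) and parse them as tab-separated text.
feature_lines <- gff_lines[!grepl("^#", gff_lines)]
domains <- read.delim(text = feature_lines, header = FALSE, stringsAsFactors = FALSE)

# Columns 3 to 5 of a gff row hold the feature type and its start and end position.
domains <- domains[, c(3, 4, 5)]
names(domains) <- c("type", "aa_start", "aa_end")
```

For a real export, the lines would be read with readLines() from the file path instead of being defined inline.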

Details

Genomic exon coordinates for the desired transcript are downloaded from biomaRt. If a gene is selected, the canonical transcript is determined as the transcript with the longest coding sequence. Protein data is expected in gff format from Uniprot, giving the amino acid positions for each domain. The total protein length is read from the 2nd comment line of the gff file. The exon bp coordinates of the selected transcript are used to calculate the corresponding bp coordinates for each protein domain based on its amino acid positions. The stored result table may be edited later to modify the suggested plotting parameters. The name of the gene transcript used is added to the output filename.
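The conversion from amino acid positions to bp coordinates can be sketched as follows. This is a minimal illustration under stated assumptions, not the code of makeDomainsFromExons: the helper names cds_to_genomic and domain_to_genomic are hypothetical, the coding part of each exon is assumed to be given in transcript order, and amino acid p is assumed to cover coding bases 3p-2 to 3p.

```r
# Map a 1-based position within the coding sequence to a genomic coordinate.
# cds_start / cds_end: genomic coordinates of the coding part of each exon
# (cds_start <= cds_end), listed in transcript order; strand is 1 or -1.
cds_to_genomic <- function(cds_pos, cds_start, cds_end, strand = 1) {
  seg_len <- cds_end - cds_start + 1
  cum_len <- cumsum(seg_len)
  stopifnot(cds_pos >= 1, cds_pos <= sum(seg_len))
  i <- which(cds_pos <= cum_len)[1]                  # coding segment containing the position
  offset <- cds_pos - (cum_len[i] - seg_len[i]) - 1  # 0-based offset within that segment
  if (strand == 1) cds_start[i] + offset else cds_end[i] - offset
}

# Genomic start and end of a domain given by its amino acid positions.
domain_to_genomic <- function(aa_start, aa_end, cds_start, cds_end, strand = 1) {
  g1 <- cds_to_genomic(3 * aa_start - 2, cds_start, cds_end, strand)  # first base of first codon
  g2 <- cds_to_genomic(3 * aa_end, cds_start, cds_end, strand)        # last base of last codon
  c(start = min(g1, g2), end = max(g1, g2))
}

# Hypothetical forward-strand transcript with two coding segments (made-up coordinates).
fwd <- domain_to_genomic(5, 15, cds_start = c(101, 301), cds_end = c(130, 360), strand = 1)

# The same segments on the reverse strand, again listed in transcript order.
rev_res <- domain_to_genomic(5, 15, cds_start = c(301, 101), cds_end = c(360, 130), strand = -1)
```

A domain whose codons lie in different exons yields a genomic range that spans the intervening intron, which is usually acceptable for plotting in genomic context.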

Value

data frame with exon data downloaded from biomaRt. The resulting domain table is stored as a side effect in the filepath given by uniprot_domains.gff.

Author(s)

Frank Ruehle


frankRuehle/systemsbio documentation built on Sept. 14, 2020, 1:18 a.m.