proteinLocsToGenomic: Obtaining the genomic coordinates for a list of protein...

In geno2proteo: Finding the DNA and Protein Sequences of Any Genomic or Proteomic Loci

View source: R/proteinLocsToGenomic.R

proteinLocsToGenomic

R Documentation

Obtaining the genomic coordinates for a list of protein sections

Description

The function takes a list of protein sections, each identified by the ENSEMBL ID of its protein, and tries to find the genomic coordinates of those sections.
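
As background for the mapping described above, the following is a minimal sketch of the standard codon arithmetic that relates an amino acid range to an offset range within a coding sequence. It is not the package's implementation. It assumes 1-based protein coordinates and a coding sequence that starts in frame at the first codon, and `aaToCdsRange` is a hypothetical helper name. Translating these offsets onto a chromosome additionally depends on the exon structure and strand, which this sketch does not attempt.

```r
# Illustrative arithmetic only; not how geno2proteo is implemented.
# aaToCdsRange is a hypothetical helper, not part of the package.
aaToCdsRange <- function(aaStart, aaEnd) {
    stopifnot(aaStart >= 1, aaEnd >= aaStart)
    data.frame(
        cdsStart = 3 * (aaStart - 1) + 1,  # first base of the first codon
        cdsEnd   = 3 * aaEnd               # last base of the last codon
    )
}

# Two sections: amino acids 1 to 5 and 10 to 12
aaToCdsRange(aaStart = c(1, 10), aaEnd = c(5, 12))
```

Each amino acid occupies three consecutive bases, so a section of n amino acids always spans 3n bases of coding sequence, even when those bases are split across several exons on the genome.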

Usage

proteinLocsToGenomic(inputLoci, CDSaaFile)

Arguments

inputLoci

A data frame containing the protein sections to be mapped. The 1st column must be the ENSEMBL ID of either the protein or the transcript encoding the protein (or the equivalent of an ENSEMBL ID if you have created your own gene annotation GTF file). Use only one of the two ID types (protein ID or transcript ID) within a single function call; the two cannot be mixed. The 2nd and 3rd columns give the coordinates of the first and last amino acids of the section along the protein sequence. Any other columns are optional and are not used by the function.

CDSaaFile

The data file generated by the package's function generatingCDSaaFile. It contains the genomic locations, DNA sequences and protein sequences of all coding regions in the specific genome used in your analysis.
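
The layout required for inputLoci can be sketched as below. This is an illustration under stated assumptions, not part of the package: the identifiers and coordinates are made up, the column names are arbitrary (the description above refers to columns by position only), and `checkInputLoci` is a hypothetical helper. The prefix test assumes the usual human ENSEMBL conventions (ENST for transcripts, ENSP for proteins) and would need adjusting for other species or for a custom GTF file.

```r
# Hypothetical identifiers and coordinates, for illustration only.
inputLoci <- data.frame(
    id      = c("ENST00000000001", "ENST00000000002"),  # column 1: one ID type only
    aaStart = c(10, 25),                                # column 2: first amino acid
    aaEnd   = c(40, 60),                                # column 3: last amino acid
    stringsAsFactors = FALSE
)

# checkInputLoci is a hypothetical helper, not part of geno2proteo.
checkInputLoci <- function(loci) {
    stopifnot(is.data.frame(loci), ncol(loci) >= 3)
    # Assumption: human ENSEMBL prefixes (ENST = transcript, ENSP = protein).
    isTranscript <- grepl("^ENST", loci[[1]])
    isProtein    <- grepl("^ENSP", loci[[1]])
    if (any(isTranscript) && any(isProtein)) {
        stop("Column 1 mixes transcript and protein IDs; use one type per call.")
    }
    # Sanity check (an assumption): a section should not end before it starts.
    if (any(loci[[3]] < loci[[2]])) {
        stop("Column 3 (last amino acid) is smaller than column 2 (first).")
    }
    invisible(TRUE)
}

checkInputLoci(inputLoci)
```

Running a check like this before the call makes the "one ID type per call" rule explicit, rather than relying on the function to report a mixed input.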

Value

The function returns a data frame containing the original protein locations specified in the input, preceded by six added columns that give the corresponding genomic coordinates of the protein sections:

The 1st, 2nd, 3rd and 4th columns give the chromosome name, the coordinates of the start and end positions, and the strand in the chromosome, which specify the genomic locus corresponding to the protein section.
The 5th and 6th columns give the first and last coding exons in the given transcript which correspond to the given protein section.

Author(s)

Yaoyong Li

Examples


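    library(geno2proteo)

    # Locate the example data files shipped with the package
    dataFolder <- system.file("extdata", package = "geno2proteo")
    inputFile_loci <- file.path(dataFolder,
        "transId_pfamDomainStartEnd_chr16_Zdomains_22examples.txt")
    CDSaaFile <- file.path(dataFolder,
        "Homo_sapiens.GRCh37.74_chromosome16_35Mlong.gtf.gz_AAseq.txt.gz")

    # Read the tab-separated protein sections into a data frame
    inputLoci <- read.table(inputFile_loci, sep = "\t", stringsAsFactors = FALSE)

    # Map the protein sections to their genomic coordinates
    genomicLoci <- proteinLocsToGenomic(inputLoci = inputLoci, CDSaaFile = CDSaaFile)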

geno2proteo documentation built on June 13, 2022, 5:08 p.m.
