View source: R/readUniProtExport.R
| readUniProtExport | R Documentation | 
This function reads and imports protein-ID conversion results from UniProt.
To obtain such a file, first copy/paste your query IDs into the UniProt 'Retrieve/ID mapping' field called '1. Provide your identifiers' (or upload them as a file), then verify '2. Select options'.
In the typical case of 'enst000xxx' IDs you may leave the default settings, i.e. 'Ensembl Transcript' as input and 'UniProt KB' as output. Then 'Submit' your search and retrieve the results via
'Download'; you need to specify the 'Tab-separated' format! If you download as 'Compressed', you need to decompress the .gz file before running readUniProtExport.
In addition, a file with UCSC annotation (Ensrnot accessions and chromosomal locations, obtained using readUCSCtable) can be integrated.
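The decompression step mentioned above can be done outside R or, as a minimal sketch, with base R alone. The helper name `gunzipToFile`, the file name and the toy table content below are hypothetical and only serve as illustration; they are not part of wrProteo.

```r
# Minimal sketch (base R only): decompress a gzipped tab-separated download.
# 'gunzipToFile' is a hypothetical helper, not a function of wrProteo.
gunzipToFile <- function(gzPath, outPath = sub("\\.gz$", "", gzPath)) {
  stopifnot(file.exists(gzPath), outPath != gzPath)  # avoid overwriting the input
  con <- gzfile(gzPath, open = "rt")                 # open compressed file as text
  on.exit(close(con))
  txt <- readLines(con)
  writeLines(txt, outPath)                           # write uncompressed copy
  invisible(outPath)
}

# Self-contained demonstration using a temporary file with made-up toy content
tmpGz <- file.path(tempdir(), "uniprotDemo.tab.gz")
con <- gzfile(tmpGz, open = "wt")
writeLines(c("EnsID\tEntry", "ENST_toy1\tP_toy1"), con)
close(con)

tabFile <- gunzipToFile(tmpGz)
read.delim(tabFile)     # the uncompressed file reads as a regular tab-separated table
```

The resulting uncompressed file name is what would then be passed on as `UniProtFileNa`.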
readUniProtExport(
  UniProtFileNa,
  deUcsc = NULL,
  targRegion = NULL,
  useUniPrCol = NULL,
  silent = FALSE,
  debug = FALSE,
  callFrom = NULL
)
| UniProtFileNa | (character) name (and path) of file exported from UniProt (tabulated text file including headers) | 
| deUcsc | (data.frame) object produced by readUCSCtable | 
| targRegion | (character or list) optional marking of chromosomal locations to be part of a given chromosomal target region, may be given as character like  | 
| useUniPrCol | (character) optional declaration of which columns from the UniProt exported file should be used/imported (default 'EnsID','Entry','Entry.name','Status','Protein.names','Gene.names','Length'). | 
| silent | (logical) suppress messages | 
| debug | (logical) display additional messages for debugging | 
| callFrom | (character) allow easier tracking of message(s) produced | 
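The examples on this page pass `targRegion` either as a list (chromosome name plus `pos` with start and end) or as a single character string such as "chr11:1-135,086,622". As an illustration of how the two notations relate, the following sketch converts the character form into the list form. The helper `parseTargRegion` is hypothetical and is not claimed to be how readUniProtExport parses this argument internally.

```r
# Hypothetical helper: split "chr:start-end" (thousands separators allowed)
# into a list with the chromosome name and a numeric start/end vector.
parseTargRegion <- function(x) {
  stopifnot(is.character(x), length(x) == 1)
  parts <- strsplit(x, ":", fixed = TRUE)[[1]]
  if (length(parts) != 2) stop("expected format 'chr:start-end'")
  noCommas <- gsub(",", "", parts[2], fixed = TRUE)      # drop thousands separators
  pos <- strsplit(noCommas, "-", fixed = TRUE)[[1]]
  if (length(pos) != 2) stop("expected format 'chr:start-end'")
  list(parts[1], pos = as.numeric(pos))
}

reg <- parseTargRegion("chr11:1-135,086,622")
str(reg)
```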
In a typical use case, chromosomal location annotation is first extracted from UCSC for the species of interest and imported to R using  readUCSCtable . 
However, the tables provided by UCSC don't contain UniProt IDs. Thus, an additional (batch) conversion step is needed. 
For this reason readUCSCtable allows writing a file with Ensembl transcript IDs, which can be converted to UniProt IDs on the UniProt website. 
Then, UniProt annotation (downloaded as tab-separated) can be imported and combined with the genomic annotation using this function.
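Conceptually, combining the two annotation sources is a join on the Ensembl transcript ID. The following base R sketch illustrates that idea with made-up toy tables. It assumes that `EnsID` (UniProt side) and `Ensrnot` (UCSC side) hold the matching identifiers; how readUniProtExport performs the combination internally is not shown here and may differ.

```r
# Toy UniProt conversion result (IDs are invented placeholders)
uniProtToy <- data.frame(
  EnsID = c("ENST_toy1", "ENST_toy2", "ENST_toy3"),
  Entry = c("P_toy1", "P_toy2", "P_toy3"),
  stringsAsFactors = FALSE)

# Toy genomic annotation (values invented for illustration)
ucscToy <- data.frame(
  Ensrnot = c("ENST_toy1", "ENST_toy3"),
  chr = c("chr11", "chr11"),
  start = c(1000, 5000),
  end = c(2000, 6000),
  stringsAsFactors = FALSE)

# Left join: keep all UniProt entries, add genomic location where available
combined <- merge(uniProtToy, ucscToy, by.x = "EnsID", by.y = "Ensrnot", all.x = TRUE)
combined
```

Entries without a matching genomic annotation keep `NA` in the location columns, which makes missing matches easy to spot.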
This function returns a data.frame with columns $EnsID, $Entry, $Entry.name, $Status, $Protein.names, $Gene.names, $Length; if deUcsc is integrated, the following columns are added: $chr, $type, $start, $end, $score, $strand, $Ensrnot, $avPos.
readUCSCtable
path1 <- system.file("extdata",package="wrProteo")
deUniProtFi <- file.path(path1,"deUniProt_hg38chr11extr.tab")
deUniPr1a <- readUniProtExport(deUniProtFi) 
str(deUniPr1a)
## Workflow starting with UCSC annotation (gtf) files :
gtfFi <- file.path(path1,"UCSC_hg38_chr11extr.gtf.gz")
UcscAnnot1 <- readUCSCtable(gtfFi)
## Results of conversion at UniProt are already available (file "deUniProt_hg38chr11extr.tab")
myTargRegion <- list("chr1", pos=c(198110001,198570000))
myTargRegion2 <- "chr11:1-135,086,622"      # the character format works equally well
deUniPr1 <- readUniProtExport(deUniProtFi,deUcsc=UcscAnnot1,
  targRegion=myTargRegion)
## Now UniProt IDs and genomic locations are both available :
str(deUniPr1)