This method generates the identifier mapping file required by the methods formatSIFfile and formatSTRINGPPI.
getMappingFile(sprotFile, output, tremblFile="", taxonId="")
## S4 method for signature 'character,character'
getMappingFile(sprotFile, output, tremblFile="", taxonId="")
sprotFile | Input: File downloaded from the UniProt database (UniProtKB/Swiss-Prot) (character(1)).
output | Output file (character(1)).
tremblFile | Input: File downloaded from the UniProt database (UniProtKB/TrEMBL) (character(1)).
taxonId | NCBI taxonomy species identifier (character(1)).
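To make the taxonId argument concrete: in the UniProtKB flat-file (.dat) format, each entry carries an OX line such as `OX   NCBI_TaxID=9606;` naming its NCBI taxonomy identifier. The sketch below counts the entries in a .dat file that belong to one taxon. The helper name countTaxonEntries is hypothetical and is not part of cisPath; it illustrates what the identifier refers to and makes no claim about how getMappingFile filters entries internally.

```r
# Hypothetical helper (not part of cisPath): count entries in a UniProtKB
# flat file whose OX line names the given NCBI taxonomy identifier.
# Reads the whole file into memory, so it is only suited to small files.
countTaxonEntries <- function(datFile, taxonId) {
  lines <- readLines(datFile, warn = FALSE)
  oxLines <- lines[startsWith(lines, "OX   ")]
  # Require a non-digit (or end of line) after the identifier so that
  # "9606" does not also match "96060".
  pattern <- paste0("NCBI_TaxID=", taxonId, "([^0-9]|$)")
  sum(grepl(pattern, oxLines))
}

# Usage with the small example file bundled with cisPath, if it is installed.
if (requireNamespace("cisPath", quietly = TRUE)) {
  sprotFile <- system.file("extdata", "uniprot_sprot_human10.dat",
                           package = "cisPath")
  if (nzchar(sprotFile)) {
    print(countTaxonEntries(sprotFile, "9606"))
  }
}
```

A check like this can help confirm that a downloaded file actually contains entries for the species of interest before the mapping file is generated.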
UniProtKB/Swiss-Prot: fully annotated curated entries.
UniProtKB/TrEMBL: computer-generated entries enriched with automated classification and annotation.
sprotFile is mandatory, while tremblFile is optional.
If users only want to process the reviewed proteins from the UniProt database, tremblFile should be omitted.
All species:
ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/complete/uniprot_sprot.dat.gz
ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/complete/uniprot_trembl.dat.gz
Taxonomic divisions:
ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/taxonomic_divisions/
uniprot_sprot_archaea.dat.gz and uniprot_trembl_archaea.dat.gz contain all archaea entries.
uniprot_sprot_bacteria.dat.gz and uniprot_trembl_bacteria.dat.gz contain all bacteria entries.
uniprot_sprot_fungi.dat.gz and uniprot_trembl_fungi.dat.gz contain all fungi entries.
uniprot_sprot_human.dat.gz and uniprot_trembl_human.dat.gz contain all human entries.
uniprot_sprot_invertebrates.dat.gz and uniprot_trembl_invertebrates.dat.gz contain all invertebrate entries.
uniprot_sprot_mammals.dat.gz and uniprot_trembl_mammals.dat.gz contain all mammalian entries except human and rodent entries.
uniprot_sprot_plants.dat.gz and uniprot_trembl_plants.dat.gz contain all plant entries.
uniprot_sprot_rodents.dat.gz and uniprot_trembl_rodents.dat.gz contain all rodent entries.
uniprot_sprot_vertebrates.dat.gz and uniprot_trembl_vertebrates.dat.gz contain all vertebrate entries except mammals.
uniprot_sprot_viruses.dat.gz and uniprot_trembl_viruses.dat.gz contain all viral entries.
We suggest you take a look at the README file before you download these files.
If you make use of these files, please cite the UniProt database.
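The file names listed above follow a regular pattern, so the two download URLs for a division can be built from its name. The following is a minimal sketch under that assumption; the helper name uniprotDivisionUrls is hypothetical and is not provided by cisPath, and the division names are taken from the list above.

```r
# Hypothetical helper (not part of cisPath): build the Swiss-Prot and TrEMBL
# download URLs for one taxonomic division listed above.
uniprotDivisionUrls <- function(division) {
  divisions <- c("archaea", "bacteria", "fungi", "human", "invertebrates",
                 "mammals", "plants", "rodents", "vertebrates", "viruses")
  # match.arg() stops with an error for names that are not in the list.
  division <- match.arg(division, divisions)
  base <- paste0("ftp://ftp.uniprot.org/pub/databases/uniprot/",
                 "current_release/knowledgebase/taxonomic_divisions/")
  c(sprot  = paste0(base, "uniprot_sprot_", division, ".dat.gz"),
    trembl = paste0(base, "uniprot_trembl_", division, ".dat.gz"))
}

urls <- uniprotDivisionUrls("human")
print(urls)
```

The returned URLs can be passed to download.file; the TrEMBL entry can simply be left unused when only reviewed proteins are needed.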
The output file contains the identifier mapping information required by the methods formatSIFfile and formatSTRINGPPI.
Each line contains both the Ensembl Genomes Protein identifier and the Swiss-Prot accession number for a given protein.
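A quick way to sanity-check the result is to look at the first few lines of the output file. The sketch below does only that; it makes no assumption about the column layout or delimiter, which this page does not specify. The helper name peekMappingFile is hypothetical and is not part of cisPath.

```r
# Hypothetical helper (not part of cisPath): report the total number of lines
# in a mapping file and return its first n lines unchanged.
peekMappingFile <- function(path, n = 5L) {
  stopifnot(file.exists(path))
  allLines <- readLines(path, warn = FALSE)
  message(length(allLines), " line(s) in ", basename(path))
  head(allLines, n)
}

# Usage, assuming `output` is the path that was passed to getMappingFile().
output <- file.path(tempdir(), "mappingFile.txt")
if (file.exists(output)) {
  print(peekMappingFile(output))
}
```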
UniProt Consortium and others. (2012) Reorganizing the protein space at the Universal Protein Resource (UniProt). Nucleic Acids Res 40, D71-D75.
cisPath, formatSTRINGPPI, formatSIFfile, combinePPI.
library(cisPath)
sprotFile <- system.file("extdata", "uniprot_sprot_human10.dat", package="cisPath")
output <- file.path(tempdir(), "mappingFile.txt")
getMappingFile(sprotFile, output, taxonId="9606")
## Not run:
if (!requireNamespace("BiocManager", quietly=TRUE))
    install.packages("BiocManager")
BiocManager::install("R.utils")
library(R.utils)
outputDir <- file.path(getwd(), "cisPath_test")
dir.create(outputDir, showWarnings=FALSE, recursive=TRUE)
# Download protein information file for humans only from UniProt (decompressed:~246M)
destfile <- file.path(outputDir, "uniprot_sprot_human.dat.gz")
cat("Downloading...\n")
download.file("ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/taxonomic_divisions/uniprot_sprot_human.dat.gz", destfile)
gunzip(destfile, overwrite=TRUE, remove=FALSE)
destfile <- file.path(outputDir, "uniprot_trembl_human.dat.gz")
cat("Downloading...\n")
download.file("ftp://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/taxonomic_divisions/uniprot_trembl_human.dat.gz", destfile)
gunzip(destfile, overwrite=TRUE, remove=FALSE)
# Generate identifier mapping file
sprotFile <- file.path(outputDir, "uniprot_sprot_human.dat")
tremblFile <- file.path(outputDir, "uniprot_trembl_human.dat")
mappingFile <- file.path(outputDir, "mappingFile.txt")
getMappingFile(sprotFile, output=mappingFile, tremblFile=tremblFile)
## End(Not run)