This method formats the protein-protein interaction (PPI) file downloaded from the STRING database.
formatSTRINGPPI(input, mappingFile, taxonId, output, minScore=700)
## S4 method for signature 'character,character,character,character'
formatSTRINGPPI(input, mappingFile, taxonId, output, minScore=700)
input: File downloaded from the STRING database (character(1)).
mappingFile: Identifier mapping file (character(1)).
taxonId: NCBI taxonomy species identifier (character(1)).
output: Output file (character(1)).
minScore: Minimum STRING score; interactions with a score below this value are filtered out (integer(1)).
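The effect of minScore can be illustrated with a minimal sketch in base R. It assumes the STRING links file has three space-separated columns (protein1, protein2, combined_score); the protein identifiers and scores below are invented for illustration, and this is not the code cisPath itself runs.

```r
# Toy stand-in for a STRING links file. The three-column layout is an
# assumption about the download format; identifiers and scores are made up.
linksText <- "protein1 protein2 combined_score
9606.protA 9606.protB 950
9606.protA 9606.protC 420
9606.protD 9606.protE 700"

links <- read.table(text = linksText, header = TRUE, sep = " ",
                    stringsAsFactors = FALSE)

minScore <- 700
# Drop interactions scoring less than minScore; a score equal to minScore stays
kept <- links[links$combined_score >= minScore, ]
print(kept)
```

With minScore set to 700, only the pair scoring 420 is dropped.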
The input file is downloaded from the STRING database (http://string-db.org/).
Its URL is http://string-db.org/newstring_download/protein.links.v9.1.txt.gz.
To determine the taxonId parameter, consult http://string-db.org/newstring_download/species.v9.1.txt.
See http://string-db.org/newstring_cgi/show_download_page.pl for more details.
If you make use of this file, please cite the STRING database.
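The Examples on this page download a per-species file whose name starts with the taxonId. As a sketch, that URL can be assembled from the identifier; stringLinksUrl is a hypothetical helper, not part of cisPath, and the URL pattern is taken only from the Examples and may not apply to other STRING releases.

```r
# Hypothetical helper (not a cisPath function): builds the per-species
# download URL following the pattern used in the Examples on this page.
stringLinksUrl <- function(taxonId, version = "v9.1") {
  stopifnot(is.character(taxonId), length(taxonId) == 1L)
  fileName <- sprintf("%s.protein.links.%s.txt.gz", taxonId, version)
  paste0("http://string-db.org/newstring_download/protein.links.",
         version, "/", fileName)
}

# Homo sapiens has the NCBI taxonomy identifier 9606
humanUrl <- stringLinksUrl("9606")
print(humanUrl)
```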
Each line of the output file contains Swiss-Prot accession numbers and gene names for two interacting proteins.
An edge value is estimated for each link between two interacting proteins.
This value is defined as max(1, log(1000 - STRING_SCORE, 100)).
This may be treated as the “cost” while determining the shortest paths between proteins.
Advanced users can edit the file and change this value for each edge.
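The edge value formula can be tried out directly in R. The sketch below assumes that log(x, 100) in the formula denotes the base-100 logarithm, as in R's log(x, base); edgeValue is a hypothetical helper written for illustration and is not exported by cisPath.

```r
# Hypothetical helper, not a cisPath function. Assumes log(x, 100) in the
# documented formula is the base-100 logarithm, as in R's log(x, base).
edgeValue <- function(score) {
  pmax(1, log(1000 - score, base = 100))
}

scores <- c(700, 800, 900, 950)
edges <- data.frame(score = scores, edge = edgeValue(scores))
print(edges)
```

Under this assumption, every score of 900 or more maps to the minimum cost of 1, while a score of 700 gives roughly 1.24, so higher-confidence interactions are cheaper to traverse when searching for shortest paths.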
Szklarczyk, D. et al. (2011) The STRING database in 2011: functional interaction networks of proteins, globally integrated and scored. Nucleic Acids Res, 39, D561-D568.
Franceschini, A. et al. (2013) STRING v9.1: protein-protein interaction networks, with increased coverage and integration. Nucleic Acids Res, 41, D808-D815.
UniProt Consortium (2012) Reorganizing the protein space at the Universal Protein Resource (UniProt). Nucleic Acids Res, 40, D71-D75.
cisPath, getMappingFile, formatPINAPPI, formatSIFfile, formatiRefIndex, combinePPI.
library(cisPath)
# Generate the identifier mapping file
input <- system.file("extdata", "uniprot_sprot_human10.dat", package="cisPath")
mappingFile <- file.path(tempdir(), "mappingFile.txt")
getMappingFile(input, output=mappingFile, taxonId="9606")
# Format the file downloaded from the STRING database
output <- file.path(tempdir(), "STRINGPPI.txt")
fileFromSTRING <- system.file("extdata", "protein.links.txt", package="cisPath")
formatSTRINGPPI(fileFromSTRING, mappingFile, "9606", output, 700)
## Not run:
if (!requireNamespace("BiocManager", quietly=TRUE))
    install.packages("BiocManager")
BiocManager::install("R.utils")
library(R.utils)
outputDir <- file.path(getwd(), "cisPath_test")
dir.create(outputDir, showWarnings=FALSE, recursive=TRUE)
# Generate the identifier mapping file
fileFromUniProt <- file.path(outputDir, "uniprot_sprot_human.dat")
mappingFile <- file.path(outputDir, "mappingFile.txt")
getMappingFile(fileFromUniProt, output=mappingFile)
# Download STRING PPI for Homo sapiens (compressed:~27M, decompressed:~213M)
destfile <- file.path(outputDir, "9606.protein.links.v9.1.txt.gz")
cat("Downloading...\n")
download.file("http://string-db.org/newstring_download/protein.links.v9.1/9606.protein.links.v9.1.txt.gz", destfile)
cat("Uncompressing...\n")
gunzip(destfile, overwrite=TRUE, remove=FALSE)
# Format STRING PPI
fileFromSTRING <- file.path(outputDir, "9606.protein.links.v9.1.txt")
STRINGPPI <- file.path(outputDir, "STRINGPPI.txt")
formatSTRINGPPI(fileFromSTRING, mappingFile, "9606", output=STRINGPPI, minScore=700)
## End(Not run)