formatSTRINGPPI: Format PPI file downloaded from the STRING database

Description

This method formats a protein-protein interaction (PPI) file downloaded from the STRING database.

Usage

formatSTRINGPPI(input, mappingFile, taxonId, output, minScore=700)
## S4 method for signature 'character,character,character,character'
formatSTRINGPPI(input, mappingFile, taxonId, output, minScore=700)

Arguments

input

File downloaded from the STRING database (character(1)).

mappingFile

Identifier mapping file (character(1)).
Generate this file with method getMappingFile.

taxonId

NCBI taxonomy species identifier (character(1)).
Process only data for this species.
Examples:
9606: Homo sapiens
4932: Saccharomyces cerevisiae
6239: Caenorhabditis elegans
7227: Drosophila melanogaster
10090: Mus musculus
10116: Rattus norvegicus

output

Output file (character(1)).

minScore

Discard interactions with a STRING score below this value (integer(1)).
The recommended default, 700, keeps only high-confidence interactions.

Details

The input file is downloaded from the STRING database (http://string-db.org/). The URL of this file is http://string-db.org/newstring_download/protein.links.v9.1.txt.gz. To look up the value for the parameter taxonId, see http://string-db.org/newstring_download/species.v9.1.txt. For more details, see http://string-db.org/newstring_cgi/show_download_page.pl.
If you make use of this file, please cite the STRING database.
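
To illustrate what the taxonId and minScore arguments select, here is a minimal base R sketch. It assumes the STRING links layout: space-separated columns protein1, protein2, and combined_score, with protein identifiers prefixed by the taxon ID. The rows and scores are invented for illustration, and this is not the code cisPath runs internally.

```r
# Illustrative rows in the assumed STRING links layout; pairs and scores
# are made up for this example.
lines <- c(
    "protein1 protein2 combined_score",
    "9606.ENSP00000000233 9606.ENSP00000263025 912",
    "9606.ENSP00000000233 9606.ENSP00000356737 455",
    "10090.ENSMUSP00000000001 10090.ENSMUSP00000000153 870"
)
links <- read.table(text=lines, header=TRUE, stringsAsFactors=FALSE)

taxonId <- "9606"
minScore <- 700

# Keep rows where both proteins belong to the requested species and the
# score is at least minScore.
prefix <- paste0(taxonId, ".")
keep <- startsWith(links$protein1, prefix) &
    startsWith(links$protein2, prefix) &
    links$combined_score >= minScore
filtered <- links[keep, ]
print(filtered)
```

Only the first row survives: the second falls below minScore and the third belongs to a different species.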

Value

Each line of the output file contains the Swiss-Prot accession numbers and gene names of two interacting proteins, together with an edge value estimated for the link between them. This value is defined as max(1,log(1000-STRING_SCORE,100)), where the logarithm is taken to base 100. It may be treated as the “cost” when determining the shortest paths between proteins. Advanced users can edit the file and change this value for each edge.
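
The edge value formula can be checked numerically. The following is a minimal sketch in base R; edgeCost is a hypothetical helper introduced here for illustration and is not part of cisPath.

```r
# Hypothetical helper (not a cisPath function): edge value for a STRING score,
# following max(1, log(1000 - STRING_SCORE, 100)) with a base-100 logarithm.
edgeCost <- function(stringScore) {
    pmax(1, log(1000 - stringScore, base=100))
}

scores <- c(700, 800, 900, 950, 999)
costs <- edgeCost(scores)
print(data.frame(score=scores, cost=costs))
```

With the default minScore of 700, the values range from 1 to about 1.24. Scores of 900 and above all give the minimum cost of 1, and lower scores give progressively higher costs.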

References

Szklarczyk,D. et al. (2011) The STRING database in 2011: functional interaction networks of proteins, globally integrated and scored. Nucleic Acids Res, 39, D561-D568.

Franceschini,A. et al. (2013) STRING v9.1: protein-protein interaction networks, with increased coverage and integration. Nucleic Acids Res, 41, D808-D815.

UniProt Consortium and others. (2012) Reorganizing the protein space at the Universal Protein Resource (UniProt). Nucleic Acids Res, 40, D71-D75.

See Also

cisPath, getMappingFile, formatPINAPPI, formatSIFfile, formatiRefIndex, combinePPI.

Examples

    library(cisPath)
    
    # Generate the identifier mapping file 
    input <- system.file("extdata", "uniprot_sprot_human10.dat", package="cisPath")
    mappingFile <- file.path(tempdir(), "mappingFile.txt")
    getMappingFile(input, output=mappingFile, taxonId="9606")
    
    # Format the file downloaded from STRING database
    output <- file.path(tempdir(), "STRINGPPI.txt")
    fileFromSTRING <- system.file("extdata", "protein.links.txt", package="cisPath")
    formatSTRINGPPI(fileFromSTRING, mappingFile, "9606", output, 700)
    
## Not run: 
    if (!requireNamespace("BiocManager", quietly=TRUE))
        install.packages("BiocManager")
    BiocManager::install("R.utils")
    library(R.utils)
    
    outputDir <- file.path(getwd(), "cisPath_test")
    dir.create(outputDir, showWarnings=FALSE, recursive=TRUE)
    
    # Generate the identifier mapping file 
    fileFromUniProt <- file.path(outputDir, "uniprot_sprot_human.dat")
    mappingFile <- file.path(outputDir, "mappingFile.txt")
    getMappingFile(fileFromUniProt, output=mappingFile)
    
    # Download STRING PPI for Homo sapiens (compressed:~27M, decompressed:~213M)
    destfile <- file.path(outputDir, "9606.protein.links.v9.1.txt.gz")
    cat("Downloading...\n")
    download.file("http://string-db.org/newstring_download/protein.links.v9.1/9606.protein.links.v9.1.txt.gz", destfile)
    cat("Uncompressing...\n")
    gunzip(destfile, overwrite=TRUE, remove=FALSE)
    
    # Format STRING PPI
    fileFromSTRING <- file.path(outputDir, "9606.protein.links.v9.1.txt")
    STRINGPPI <- file.path(outputDir, "STRINGPPI.txt")
    formatSTRINGPPI(fileFromSTRING, mappingFile, "9606", output=STRINGPPI, 700)
    
## End(Not run)    

cisPath documentation built on Nov. 8, 2020, 7:15 p.m.