importNet.STRING: Import network data from a STRING database file.


View source: R/importNet.STRING.R

Description

importNet.STRING imports network edges from a STRING database file, selects the highest-confidence edges, maps ENSP IDs to HGNC gene symbols, and returns a weighted, directed igraph graph object of the rete gG type.

Usage

importNet.STRING(fName, net = "combined_score", cutoffType = "xN", val,
  taxID = "9606", dropUnmapped = TRUE, silent = FALSE, writeLog = TRUE)

Arguments

fName

Filename of a STRING protein.links.detailed.v10.txt file.

net

The requested network. This must be a string that exists in the header. The default is "combined_score".

cutoffType

One of "xN", "xQ" or "xS" (see Selecting edges).

val

A number, quantile or score appropriate to the requested cutoff type.

taxID

The NCBI tax ID prefix of the protein1 and protein2 IDs. Defaults to "9606" (Homo sapiens).

dropUnmapped

Controls whether to drop records in which at least one interactor could not be mapped to an HGNC gene symbol. TRUE by default.

silent

Controls whether output to console should be suppressed. FALSE by default.

writeLog

Controls whether writing the result to the global logfile is enabled. TRUE by default.

Value

A weighted, directed, simple igraph graph that is a rete gG object.

Selecting edges

STRING scores are confidence estimates (probabilities) multiplied by 1000 and rounded to an integer. The function retrieves the highest-scoring edges according to one of three cutoff types. Type "xN" (default: 10000) retrieves the val highest-scoring edges. Type "xQ" (default: 0.9) retrieves the edges with scores greater than the val quantile. Type "xS" (default: 950) retrieves all edges with scores greater than or equal to val. Values other than these defaults are passed in the parameter val. To read all edges, use the default cutoff type "xN" with val = Inf.
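The three cutoff types can be illustrated on a plain vector of scores. This is a minimal base R sketch, not the package's implementation: selectByCutoff is a hypothetical helper, the scores are invented, and details such as tie handling in the real function may differ.

```r
# Hypothetical helper (not part of rete): returns the indices of the
# scores selected under each of the three cutoff types.
selectByCutoff <- function(scores, cutoffType = "xN", val) {
  if (cutoffType == "xN") {
    if (missing(val)) val <- 10000            # documented default
    n <- min(val, length(scores))             # val = Inf selects everything
    sel <- order(scores, decreasing = TRUE)[seq_len(n)]
  } else if (cutoffType == "xQ") {
    if (missing(val)) val <- 0.9              # documented default
    sel <- which(scores > quantile(scores, probs = val, names = FALSE))
  } else if (cutoffType == "xS") {
    if (missing(val)) val <- 950              # documented default
    sel <- which(scores >= val)
  } else {
    stop("cutoffType must be one of \"xN\", \"xQ\" or \"xS\"")
  }
  sort(sel)
}

scores <- c(999, 400, 950, 120, 870, 951)     # invented example scores

top2   <- selectByCutoff(scores, "xN", 2)     # the two highest scores
aboveQ <- selectByCutoff(scores, "xQ", 0.5)   # scores above the median
atS    <- selectByCutoff(scores, "xS")        # scores >= 950
allE   <- selectByCutoff(scores, "xN", Inf)   # every edge
```

Note that "xQ" uses a strict comparison while "xS" includes scores equal to the cutoff, following the wording above.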

Networks

STRING "protein.links.detailed.v10.txt" files contain several protein networks: neighborhood, fusion, cooccurence, coexpression, experimental, database, textmining, and combined_score. However, this function is not restricted to these types: it reads the one network whose column name is passed in the function's net parameter. This allows users to define their own column.
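Selecting a network by column name might look like the following base R sketch. The two data rows and their scores are invented for illustration, the file is assumed to be whitespace-delimited with a header row, and the actual function may read the file differently.

```r
# Invented two-record excerpt in the layout of a
# protein.links.detailed file (whitespace-delimited, with header).
txt <- "protein1 protein2 neighborhood fusion cooccurence coexpression experimental database textmining combined_score
9606.ENSP00000000233 9606.ENSP00000003084 0 0 0 0 0 0 150 150
9606.ENSP00000000233 9606.ENSP00000003100 0 0 0 0 0 0 215 215"

dat <- read.table(text = txt, header = TRUE, stringsAsFactors = FALSE)

net <- "textmining"                  # requested network column
stopifnot(net %in% colnames(dat))    # the name must exist in the header

# Keep the two interactor columns and the requested score column only.
edges <- dat[, c("protein1", "protein2", net)]
```

A user-defined column added to the file would be selected the same way, by passing its header name.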

Tax ID

The taxID parameter is used as a sanity check on the file contents. Currently, only the protein IDs of the first data record are checked for being prefixed with the tax ID. During processing, the leading digits and the single period that prefix each ENSP ID are removed.
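The prefix check and its removal can be sketched with base R string functions. The IDs below are illustrative, and the regular expression is an assumption about how "all numbers and one period" could be matched, not the package's own code.

```r
taxID <- "9606"
ids <- c("9606.ENSP00000000233", "9606.ENSP00000003084")  # illustrative IDs

# Sanity check on the first record only: is it prefixed with "<taxID>."?
hasPrefix <- startsWith(ids[1], paste0(taxID, "."))

# Remove the leading digits and the one period that follows them.
stripped <- sub("^[0-9]+\\.", "", ids)
```

Because only the first record is checked, a file that mixes species would pass the check, and its other prefixes would still be stripped by the pattern.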

Console and log

Progress is summarized to the console unless silent is TRUE, and results are written to the log file unless writeLog is FALSE.

The gG object

The function returns a rete gG object: a weighted, directed, simple igraph graph in which HGNC gene symbols are the vertex names and the edge attribute $weight holds the network scores. Metadata are stored as object attributes: $type: "gG"; $version: the gG object version; $UUID: the UUID with which detailed process information can be retrieved from the log file using findUUID.


hyginn/ekplektoR documentation built on May 17, 2017, 12:08 a.m.