IMMAN: Interlog protein network reconstruction by Mapping and Mining...

View source: R/IMMAN.R

Description

A function for reconstructing an Interlog Protein Network (IPN) integrated from the Protein-Protein Interaction Networks (PPINs) of different species. Users can overlay different PPINs to mine the conserved network common to diverse species. The function retrieves IPNs with different degrees of conservation to support protein function prediction and PPIN analysis.

Usage

IMMAN(
  ProteinLists,
  fileNames = NULL,
  Species_IDs,
  identityU,
  substitutionMatrix,
  gapOpening,
  gapExtension,
  BestHit,
  coverage,
  NetworkShrinkage,
  score_threshold,
  STRINGversion,
  InputDirectory = getwd()
)

Arguments

ProteinLists

a list in which each element contains the protein names of one species as a character vector. If NULL, the names of the protein-list files must be given in the fileNames parameter.

fileNames

a character vector containing the names of the text files that hold the protein list for each species. Each species' protein list must be a single column, without header or row names, in a separate ".txt" file. The fileNames argument should include at least two text file names, one per species, with proteins given as UniProt accession IDs.
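
As a minimal sketch of the file layout described above, the following base R code writes two single-column protein lists to ".txt" files with no header and no row names, then reads one back. The accession values and file names are illustrative assumptions, not data shipped with the package.

```r
# Hypothetical accession values, used only to illustrate the file layout.
fly_ids  <- c("P10987", "Q9V3J1", "P02299")
worm_ids <- c("P34472", "Q09591")

# Write to a temporary directory so the sketch leaves the working directory alone.
input_dir <- tempdir()
fly_file  <- file.path(input_dir, "fly_proteins.txt")   # hypothetical file name
worm_file <- file.path(input_dir, "worm_proteins.txt")  # hypothetical file name

# One accession per line: no header, no row names, no quotes.
write.table(fly_ids, fly_file, quote = FALSE, row.names = FALSE, col.names = FALSE)
write.table(worm_ids, worm_file, quote = FALSE, row.names = FALSE, col.names = FALSE)

# Reading a file back yields a single column named V1.
fly_back <- read.table(fly_file, header = FALSE, stringsAsFactors = FALSE)
print(fly_back$V1)
```

Files laid out this way match the format fileNames expects; ProteinLists would then be set to NULL, as noted in its description.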

Species_IDs

a numeric vector; the taxonomy ID of each organism whose protein list is provided in fileNames

identityU

numeric; threshold for selecting proteins whose alignment score is greater than or equal to identityU

substitutionMatrix

a scoring substitution matrix to be used for alignment setting.

gapOpening

numeric; indicating the cost for opening a gap in the alignment

gapExtension

numeric; the incremental cost incurred along the length of a gap in the alignment

BestHit

logical; if TRUE, a protein pair from two different species is selected as a reciprocal best hit in the sequence similarity analysis; if FALSE, it is selected as a nonreciprocal best hit
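
To illustrate the distinction, the sketch below derives nonreciprocal and reciprocal best hits from a small similarity matrix. The scores and protein labels are made up, and this is not the package's internal implementation, only one way the concept could be expressed in base R.

```r
# Made-up similarity scores: rows are species A proteins, columns are species B proteins.
scores <- matrix(
  c(90, 20, 10,
    85, 30, 15,
    10, 25, 70),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("A1", "A2", "A3"), c("B1", "B2", "B3"))
)

# Best hit in B for each A protein, and best hit in A for each B protein.
best_in_B <- colnames(scores)[apply(scores, 1, which.max)]
best_in_A <- rownames(scores)[apply(scores, 2, which.max)]
names(best_in_B) <- rownames(scores)
names(best_in_A) <- colnames(scores)

# Nonreciprocal best hits: every A protein paired with its top-scoring B protein.
one_way <- data.frame(A = names(best_in_B), B = unname(best_in_B),
                      stringsAsFactors = FALSE)

# Reciprocal best hits: keep a pair only if B's best hit points back to the same A.
is_reciprocal <- best_in_A[one_way$B] == one_way$A
reciprocal <- one_way[is_reciprocal, ]

print(one_way)
print(reciprocal)
```

In this toy matrix, A2's best hit is B1, but B1's best hit is A1, so the pair (A2, B1) appears only in the nonreciprocal result.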

coverage

Number of connected protein pairs in each pair of Ortholog Protein Sets (OPSs), termed "coverage", required to reconstruct an edge between that OPS pair in the Interlog Protein Network (IPN)
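
The idea can be sketched in base R as follows. This assumes that each OPS holds one protein per species and that an IPN edge is kept when the count of connected protein pairs is at least the coverage value; both points are assumptions made for illustration, and the OPS table, edge lists, and helper function are hypothetical rather than taken from the package.

```r
# Hypothetical OPS table: each row is an OPS, each column a species.
ops <- data.frame(fly  = c("F1", "F2", "F3"),
                  worm = c("W1", "W2", "W3"),
                  stringsAsFactors = FALSE)

# Hypothetical PPIN edge lists, one per species.
ppin <- list(
  fly  = data.frame(from = c("F1", "F2"), to = c("F2", "F3"), stringsAsFactors = FALSE),
  worm = data.frame(from = "W1", to = "W2", stringsAsFactors = FALSE)
)

# TRUE if proteins a and b are linked in an undirected edge list.
has_edge <- function(edges, a, b) {
  any((edges$from == a & edges$to == b) | (edges$from == b & edges$to == a))
}

coverage <- 2  # assumed meaning: keep OPS pairs connected in at least this many species

# For every OPS pair, count the species in which the corresponding proteins interact.
pairs <- t(combn(nrow(ops), 2))
counts <- apply(pairs, 1, function(p) {
  sum(vapply(names(ppin), function(sp) {
    has_edge(ppin[[sp]], ops[p[1], sp], ops[p[2], sp])
  }, logical(1)))
})

ipn_edges <- data.frame(ops1 = pairs[, 1], ops2 = pairs[, 2], connected = counts)
ipn_edges <- ipn_edges[ipn_edges$connected >= coverage, ]
print(ipn_edges)
```

With coverage set to 2, only the pair of OPS 1 and OPS 2 survives in this toy data, because it is the only pair whose proteins interact in both species; lowering coverage to 1 would also keep the pair of OPS 2 and OPS 3.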

NetworkShrinkage

logical; if TRUE, OPSs that are similar to each other are merged.

score_threshold

numeric; STRINGdb score threshold for selecting protein-protein interactions (PPIs) from the STRING database

STRINGversion

character; the version of the STRING database in which the program should look up PPI scores.

InputDirectory

Defaults to getwd(). Set this parameter to indicate where the files downloaded from STRING should be saved.

Value

a list containing the following elements:

IPNEdges : data.frame; Edges of the resulting interlog protein network.

IPNNodes : data.frame; Nodes of the resulting interlog protein network. Each node represents an OPS, which is a set of ortholog proteins.

Network : list; Retrieved PPINs of each input species.

maps : list; data.frames mapping each STRING_id in the database to its corresponding UNIPROT_AC. There is one data.frame per species.

IPN : an igraph object representing the interlog protein network.

Author(s)

Minoo Ashtiani, Payman Nickchi, Abdollah Safari, Mehdi Mirzaie, Mohieddin Jafari

See Also

pairwiseAlignment

Examples

data(FruitFly)
data(Celegance)

subFruitFly <- as.character(FruitFly$V1)[1:10]
subCelegance <- as.character(Celegance$V1)[1:10]

ProteinLists = list(subFruitFly, subCelegance)

List1_Species_ID = 7227  # taxonomy ID FruitFly
List2_Species_ID = 6239  # taxonomy ID Celegance

Species_IDs  = c(List1_Species_ID, List2_Species_ID)

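identityU = 30                     # alignment score cutoff for selecting proteins
substitutionMatrix = "BLOSUM62"    # scoring matrix used for the alignment
gapOpening = -8                    # cost for opening a gap
gapExtension = -8                  # incremental cost along the length of a gap
NetworkShrinkage = FALSE           # do not merge similar OPSs
coverage = 1                       # connected protein pairs needed per OPS pair
BestHit = TRUE                     # use reciprocal best hits
score_threshold = 400              # STRING score for PPI selection
STRINGversion = "11"               # STRING database version to query

# Run IMMAN with the parameters defined above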
output = IMMAN(ProteinLists, fileNames=NULL, Species_IDs,
              identityU, substitutionMatrix,
              gapOpening, gapExtension, BestHit,
              coverage, NetworkShrinkage,
              score_threshold, STRINGversion,
              InputDirectory = getwd())

output$IPNEdges
output$IPNNodes
output$Networks
output$Networks[[1]]
output$maps
output$maps[[2]]

IMMAN documentation built on Nov. 8, 2020, 7:35 p.m.