iJRFNet_parallel: Derive importance scores for a subset of target genes for...


Description

This function computes importance scores for a subset of target genes, so that the full set of targets can be split across several jobs run in parallel.

Usage

iJRFNet_parallel(X, W=NULL, ntree=NULL, mtry=NULL, model=NULL, 
                genes.name, ptm.name=NULL,parallel)

Arguments

X

List object containing expression data for each class, X=list(x_1,x_2, ... ), where x_j is a (p x n_j) matrix with rows corresponding to genes and columns to samples. Rows must be the same across objects, while the number of samples can vary. Missing values are not allowed. If model="ptmJRF", the first object of the list must contain the expression of post-translational modification variables. Only in this case may the number of variables in the first object differ from that of the other objects. Rows of X[[1]] do not need to be ordered in a specific way.

W

(p x p) Optional symmetric matrix containing sampling scores. When omitted, the standard JRF algorithm without the weighted sampling scheme is run. Element (i,j) contains the interaction score between gene i and gene j. Scores must be non-negative. A larger sampling score corresponds to a higher likelihood of gene i interacting with gene j. Columns and rows of W must be in the same order as the rows of x_j. Sampling scores W are computed from one source of prior data, such as protein-protein interactions.

ntree

Numeric value: number of trees. If omitted, 1000 trees are considered.

mtry

Numeric value: number of predictors to be sampled at each node. If omitted, mtry is set to the square root of the number of variables.

model

Variable indicating which iJRFNet model needs to be run. Takes values in c("iJRF", "iRafNet","ptmJRF")

genes.name

Vector of gene names. The order must match the rows of x_j.

ptm.name

List of post-translational modification variables in the protein domain. This list must be ordered as the rows of X[[1]]. Required only when model="ptmJRF".

parallel

Vector containing two elements, c(num.job,num.targets). The first element is the index of the job to run: the target genes are divided into J jobs, each covering a fixed number of target genes. The second element is the number of target genes considered in each job.
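
As an illustration of how parallel=c(num.job,num.targets) might map onto target genes, the sketch below assumes that targets are split into consecutive blocks following the order of genes.name. That layout is an assumption suggested by the way the examples concatenate results; it is not a statement about the package internals. The helper job_targets is hypothetical and is not part of iJRF.

```r
# Hypothetical helper (not part of the iJRF package): indices of the target
# genes handled by one job, ASSUMING consecutive blocks of size num.targets.
job_targets <- function(p, num.job, num.targets) {
  first <- (num.job - 1) * num.targets + 1
  if (first > p) return(integer(0))        # job index beyond the last target
  last <- min(num.job * num.targets, p)    # the final block may be shorter
  seq(first, last)
}

p <- 5                          # number of genes, as in the examples
num.targets <- 2                # targets per job
J <- ceiling(p / num.targets)   # number of jobs needed to cover all targets
blocks <- lapply(seq_len(J), function(k) job_targets(p, k, num.targets))
```

Under this assumption, p=5 genes with num.targets=2 need J=3 jobs, and the last job covers a single target.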

Value

List object containing:

num.par

Integer. Index of the parallel batch that was run.

model

Variable indicating which iJRFNet model was run. Takes values in c("iJRF", "iRafNet","ptmJRF")

importance

A matrix containing importance scores. When model="iRafNet", this is a two-dimensional matrix (p x num.targets), where num.targets is the number of targets considered in this parallel batch and p is the number of genes. When model="iJRF" or model="ptmJRF", this is a three-dimensional array of importance scores (p x num.targets x C), where num.targets is the number of targets considered in each batch, p is the total number of genes/proteins and C is the number of classes.
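
To make the shape of the importance element concrete, the sketch below ranks the highest-scoring gene pairs per class from a (p x p x C) array. The array here is simulated with runif as a stand-in for combined output, the helper top_edges is hypothetical, and reading rows as predictors and columns as targets is an assumption based on the (p x num.targets) layout described above.

```r
set.seed(1)
p <- 5; C <- 2
genes.name <- paste0("G", seq_len(p))

# Simulated stand-in for a combined (p x p x C) importance array
imp <- array(runif(p * p * C), c(p, p, C),
             dimnames = list(genes.name, genes.name, NULL))

# Hypothetical helper: the n.top highest-scoring (predictor, target) pairs
top_edges <- function(imp.mat, n.top = 3) {
  diag(imp.mat) <- NA    # ignore self-interactions
  ord <- order(imp.mat, decreasing = TRUE, na.last = NA)[seq_len(n.top)]
  idx <- arrayInd(ord, dim(imp.mat))       # linear index -> (row, column)
  data.frame(predictor = rownames(imp.mat)[idx[, 1]],
             target    = colnames(imp.mat)[idx[, 2]],
             score     = imp.mat[ord])
}

# One ranked table of candidate edges per class
edges.by.class <- lapply(seq_len(C), function(cl) top_edges(imp[, , cl]))
```

For model="iRafNet" the same helper could be applied directly to the two-dimensional matrix once all batches have been bound together.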

References

Petralia, F., Song, W.M., Tu, Z. and Wang, P. (2016). New method for joint network analysis reveals common and different coexpression patterns among genes and proteins in breast cancer. Journal of Proteome Research, 15(3), pp. 743-754.

A. Liaw and M. Wiener (2002). Classification and Regression by randomForest. R News 2, 18–22.

Examples

 library(iJRF)            # package providing iJRFNet_parallel

 # --- Generate data sets
 nclasses<-2              # number of data sets / classes
 n1<-n2<-20               # sample size for each data set
 p<-5                     # number of variables (genes)
 genes.name<-paste("G",seq(1,p),sep="")   # gene names
 
 data1<-matrix(rnorm(p*n1),p,n1)       # generate data1 (p x n1)
 data2<-matrix(rnorm(p*n2),p,n2)       # generate data2 (p x n2)
 
 # --- Run multiple jobs and combine them for each function
  
   # -- function iJRF
   out.new<-array(0,c(p,p,nclasses))   # p x p x nclasses array of importance scores
   n.var<-0                            # number of target columns filled so far
   for (k in 1:3){                     # 3 jobs of up to 2 targets cover p=5 genes
      out<-iJRFNet_parallel(X=list(data1,data2),genes.name=genes.name,
                            model="iJRF",parallel=c(k,2))
      
      n.target<-dim(out$importance)[2]
      for (cl in 1:nclasses) {
         out.new[,seq(n.var+1,n.var+n.target),cl]<-out$importance[,,cl]
      }
      n.var<-n.var+n.target
   }

   # -- function iRafNet
   W<-abs(matrix(rnorm(p*p),p,p))    # generate weights for interactions
   W<-(W+t(W))/2                     # make W symmetric, as the W argument requires
   for (k in 1:3){ 
      out<-iJRFNet_parallel(X=list(data1),W=W,genes.name=genes.name,
                            model="iRafNet",parallel=c(k,2))
      print(dim(out$importance))
      if (k==1) out.new<-out$importance
      if (k>1)  out.new<-cbind(out.new,out$importance)   # append every later job
   }

   # -- function ptmJRF
   genes.name<-paste("G",seq(1,p),sep="")   # gene names
   ptm.name<-c("G1","G2","G3","G3","G4","G5","G1")   # gene of each PTM variable
   p.ptm<-length(ptm.name)
 
   data1<-matrix(rnorm(p.ptm*n1),p.ptm,n1)   # generate PTM data (p.ptm x n1)
   data2<-matrix(rnorm(p*n2),p,n2)           # generate global proteomics data (p x n2)
 
   out.new<-array(0,c(p,p,nclasses))   # p x p x nclasses array of importance scores
   n.var<-0
   for (k in 1:3){ 
      out<-iJRFNet_parallel(X=list(data1,data2),genes.name=genes.name,
                            ptm.name=ptm.name,model="ptmJRF",parallel=c(k,2))
      
      n.target<-dim(out$importance)[2]
      for (cl in 1:nclasses) {
         out.new[,seq(n.var+1,n.var+n.target),cl]<-out$importance[,,cl]
      }
      n.var<-n.var+n.target
   }
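
The W argument is described as a symmetric, non-negative (p x p) matrix derived from prior data such as protein-protein interactions. A minimal sketch of building such a matrix from a list of interacting pairs follows. The data frame ppi, the baseline score of 1 and the boosted score of 2 are all illustrative assumptions, not values recommended by the package.

```r
p <- 5
genes.name <- paste("G", seq(1, p), sep = "")

# Hypothetical prior: pairs of genes reported to interact (illustrative only)
ppi <- data.frame(a = c("G1", "G2", "G4"),
                  b = c("G2", "G3", "G5"))

# Baseline sampling score of 1 for every pair (assumed value)
W <- matrix(1, p, p, dimnames = list(genes.name, genes.name))

# Raise the score for pairs supported by the prior (assumed value of 2),
# filling both (i,j) and (j,i) so that W stays symmetric
i <- match(ppi$a, genes.name)
j <- match(ppi$b, genes.name)
W[cbind(i, j)] <- 2
W[cbind(j, i)] <- 2
```

A matrix built this way has rows and columns in the order of genes.name and could be passed as W with model="iRafNet".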

  

petraf01/iJRF documentation built on Dec. 22, 2021, 7:46 a.m.