iJRF: Joint Random Forest for the simultaneous estimation of...


Description

Algorithm for the simultaneous estimation of multiple related networks. Some of the functions used are modified versions of functions from the R package randomForest (A. Liaw and M. Wiener, 2002).

Usage

iJRF(X, W=NULL, ntree=NULL, mtry=NULL,genes.name)

Arguments

X

List object containing expression data for each class, X=list(x_1,x_2, ... ) where x_j is a (p x n_j) matrix with rows corresponding to genes and columns to samples. Missing values are not allowed.

W

(p x p) Optional symmetric matrix of sampling scores. When omitted, the standard JRF algorithm without the weighted sampling scheme is used. Element (i,j) contains the sampling score for the interaction between gene i and gene j. Scores must be non-negative; a larger score corresponds to a higher likelihood that gene i interacts with gene j. Rows and columns of W must be in the same order as the rows (genes) of each x_j in X. Sampling scores are derived from prior data such as protein-protein interactions.

ntree

Numeric value: number of trees. If omitted, 1000 trees are considered.

mtry

Numeric value: number of predictors to be sampled at each node. If omitted, mtry is set to the square root of the number of variables.

genes.name

Vector of gene names. The order must match the rows of x_j.
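
The constraints on X, W and genes.name described above can be checked before calling iJRF. The following is a minimal sketch in base R, not part of the package: the dimensions, the 0/1 prior matrix `ppi`, and the rule that turns it into sampling scores are all assumptions chosen for illustration.

```r
set.seed(1)                               # reproducible illustration only
p <- 5                                    # genes (rows of every x_j)
n <- c(20, 15)                            # samples per class (the n_j)
genes.name <- paste0("G", seq_len(p))

# X: one (p x n_j) matrix per class, rows = genes, columns = samples
X <- lapply(n, function(nj) matrix(rnorm(p * nj), nrow = p, ncol = nj))

# Hypothetical prior: 0/1 indicator of known protein-protein interactions
ppi <- matrix(0, p, p, dimnames = list(genes.name, genes.name))
ppi["G1", "G2"] <- 1
ppi["G3", "G5"] <- 1
ppi <- pmax(ppi, t(ppi))                  # symmetrize: (i,j) equals (j,i)

# Assumed scoring rule (not from the package): baseline score 1 for every
# pair, raised to 5 where the prior supports an interaction
W <- 1 + 4 * ppi

# Checks that mirror the documented requirements
stopifnot(
  all(vapply(X, nrow, integer(1)) == p),  # every class has p gene rows
  !any(vapply(X, anyNA, logical(1))),     # missing values are not allowed
  isSymmetric(W), all(W >= 0),            # W symmetric and non-negative
  identical(rownames(W), genes.name)      # W ordered like genes.name
)
```

Naming the rows and columns of the prior matrix with genes.name makes it easier to see that W and the rows of each x_j share one gene order.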

Value

A matrix with I rows and C + 2 columns where I is the total number of gene-gene interactions and C is the number of classes. The first two columns contain gene names for each interaction while the remaining columns contain importance scores for different classes.
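
As an illustration of the layout described above, the sketch below builds a mock result by hand and ranks interactions within one class. The column names, the values, and the use of a data frame are hypothetical; the object actually returned by iJRF may be stored differently (for example as a character matrix), which is why the score columns are converted explicitly.

```r
# Mock result with the documented layout: two gene-name columns followed by
# one importance column per class. Names and values are hypothetical.
out <- data.frame(
  gene1 = c("G1", "G1", "G2"),
  gene2 = c("G2", "G3", "G3"),
  importance1 = c(0.80, 0.10, 0.35),
  importance2 = c(0.20, 0.15, 0.90),
  stringsAsFactors = FALSE
)

n_classes <- ncol(out) - 2                # C = total columns minus 2

# Extract the score columns and force numeric storage
scores <- as.matrix(out[, -(1:2), drop = FALSE])
storage.mode(scores) <- "double"

# Rank interactions within class 1, strongest first
ranked1 <- out[order(scores[, 1], decreasing = TRUE), 1:2]
```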

References

Petralia, F., Song, W.M., Tu, Z. and Wang, P. (2016). New method for joint network analysis reveals common and different coexpression patterns among genes and proteins in breast cancer. Journal of Proteome Research, 15(3), pp. 743-754.

A. Liaw and M. Wiener (2002). Classification and Regression by randomForest. R News 2, 18–22.

Examples

 library(iJRF)

 # --- Generate data sets
 nclasses <- 2                                   # number of data sets / classes
 n1 <- n2 <- 20                                  # sample size for each data set
 p <- 5                                          # number of variables (genes)
 genes.name <- paste("G", seq(1, p), sep = "")   # gene names
 W <- abs(matrix(rnorm(p * p), p, p))            # generate non-negative sampling scores
 W <- (W + t(W)) / 2                             # make W symmetric, as the W argument requires

 data1 <- matrix(rnorm(p * n1), p, n1)           # generate data1 (p x n1)
 data2 <- matrix(rnorm(p * n2), p, n2)           # generate data2 (p x n2)

 # --- Standardize each gene (row) to mean 0 and variance 1
 data1 <- t(apply(data1, 1, function(x) { (x - mean(x)) / sd(x) } ))
 data2 <- t(apply(data2, 1, function(x) { (x - mean(x)) / sd(x) } ))

 # --- Run iJRF and obtain importance scores of interactions for each class
 out <- iJRF(X = list(data1, data2), W = W, mtry = round(sqrt(p - 1)),
             ntree = 1000, genes.name = genes.name)

petraf01/iJRF documentation built on Dec. 22, 2021, 7:46 a.m.