generateRFM: Generation of a random forest of cluster abundances

View source: R/models.R

generateRFM    R Documentation

Generation of a random forest of cluster abundances

Description

This function generates a random forest model relating the values associated with samples to the abundance of each cluster. The generated model can then be used to predict biological outcomes in the context of survival/progression studies.

Usage

generateRFM(Results, variable, status, use.percentages = FALSE,
  clusters = NULL, ntree = 1000, ...)

Arguments

Results

a 'Results' object

variable

a numerical named vector providing the correspondence between sample names and specific phenotypes (or NA values to infer the phenotypes)

status

a numerical named vector providing the correspondence between sample names and specific status (or NA values to infer the status)

use.percentages

a logical specifying whether the computations should be performed on percentages

clusters

a character vector specifying the names of the clusters used to compute the model (all clusters are selected by default)

ntree

a numerical value specifying the number of trees to generate

...

further parameters passed to the R randomForest method

Details

The 'clusters' parameter provides the names of the clusters to include in the model.

The named vector containing the value associated with each sample is provided to the 'variable' parameter. To infer unknown values, set the entries for the corresponding samples to NA. The function will then infer, based on the computed model, the values associated with these samples.
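As an illustration of the NA convention described above, the following base R sketch builds a 'variable' vector and a matching 'status' vector. The sample names and values are hypothetical, and the status coding (1 for an observed event, 0 for a censored one) is an assumption that the source does not confirm.

```r
# Hypothetical sample names; in practice they must match the
# sample names stored in the 'Results' object.
samples <- c("sample_A", "sample_B", "sample_C", "sample_D")

# Values known for three samples; NA marks the sample to infer.
variable <- c(12.5, 30.0, NA, 8.2)
names(variable) <- samples

# Status vector carrying the same names.
# Assumed coding: 1 = event observed, 0 = censored, NA = to infer.
status <- c(1, 0, NA, 1)
names(status) <- samples

# Samples whose value the model would be asked to infer.
to.infer <- names(variable)[is.na(variable)]
print(to.infer)
```

Keeping both vectors named with the same sample names avoids relying on element order when they are matched to the samples.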

The function uses the randomForestSRC package to compute a random forest model using all clusters as variables of the model. Several visualisations based on the "ggfortify" package are returned to help identify the most important variables, their minimum depth, the out-of-bag (OOB) error rate, and the variable predictions.
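The kind of model described above can be sketched directly with randomForestSRC on simulated data. This is a minimal sketch, not the code of generateRFM: it assumes the model is a random survival forest fitted with rfsrc() on a Surv(variable, status) response, which the source does not show. The data frame 'toy' and its cluster columns are invented for illustration.

```r
# Run only if randomForestSRC is installed.
if (requireNamespace("randomForestSRC", quietly = TRUE)) {
  set.seed(42)
  n <- 40

  # Simulated data: one row per sample, one column per cluster abundance.
  toy <- data.frame(
    variable  = rexp(n, rate = 0.1),   # hypothetical time-like value
    status    = rbinom(n, 1, 0.7),     # assumed coding: 1 = event, 0 = censored
    cluster_1 = rpois(n, 50),
    cluster_2 = rpois(n, 20),
    cluster_3 = rpois(n, 5)
  )

  # Random survival forest using all cluster columns as predictors.
  model <- randomForestSRC::rfsrc(Surv(variable, status) ~ .,
                                  data = toy, ntree = 100,
                                  importance = TRUE)

  # Variable importance, one value per cluster column.
  print(model$importance)
}
```

Because the abundances here are random, the importance values carry no biological meaning; the sketch only shows the shape of the input and of the fitted object.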

Value

a list of 4 elements: the random forest model object as provided by the R randomForest function ('model' element), a named vector of predicted values ('variable.predictions' element), the representation of cluster coefficients ('plot.vimp' element), and the representation of sample prediction values ('plot.samples' element)
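The returned list might be inspected as sketched below. The element names come from the description above and the call follows the Usage section. The objects 'my.results', 'my.variable' and 'my.status' are hypothetical placeholders for an existing 'Results' object and its two named vectors, so the call is skipped when they or the package are unavailable.

```r
# Element names as listed in the Value section.
expected.elements <- c("model", "variable.predictions",
                       "plot.vimp", "plot.samples")

# Hypothetical inputs: none of these objects is created here.
inputs <- c("my.results", "my.variable", "my.status")
inputs.available <- all(vapply(inputs, exists, logical(1)))

if (requireNamespace("SPADEVizR", quietly = TRUE) && inputs.available) {
  rfm <- SPADEVizR::generateRFM(my.results,
                                variable = my.variable,
                                status   = my.status,
                                ntree    = 500)

  # Check that the documented elements are present.
  print(expected.elements %in% names(rfm))

  # Predicted values, including those inferred for NA entries.
  print(rfm$variable.predictions)

  # Display the two plots.
  print(rfm$plot.vimp)
  print(rfm$plot.samples)
}
```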


tchitchek-lab/SPADEVizR documentation built on Jan. 27, 2024, 8:58 p.m.