generateGLM: Generation of a generalized linear model of cluster abundances

View source: R/models.R

generateGLM    R Documentation

Generation of a generalized linear model of cluster abundances

Description

This function generates a generalized linear model between a provided vector of values associated with samples and the abundance of each cluster.

Usage

generateGLM(Results, variable, use.percentages = FALSE,
  clusters = NULL, th.pvalue = 1, show.error = FALSE,
  verbose = FALSE, ...)

Arguments

Results

a 'Results' object

variable

a named numeric vector providing the correspondence between sample names and specific phenotypes (use NA for samples whose phenotype is to be inferred)
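As an illustration of the expected shape of this argument, the following minimal sketch builds such a vector. The sample names and phenotype values are hypothetical and only serve the example; real names must match those stored in the 'Results' object.

```r
# Hypothetical sample names and phenotype values, for illustration only.
variable <- c(sample_A = 1.2, sample_B = 3.4, sample_C = 2.8,
              sample_D = NA, sample_E = NA)

# Samples with a known phenotype are the ones available to fit the model.
known <- names(variable)[!is.na(variable)]

# Samples set to NA are the ones whose phenotype is to be inferred.
to_infer <- names(variable)[is.na(variable)]
```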

use.percentages

a logical specifying whether the computations should be performed on percentages

clusters

a character vector specifying the names of the clusters used to compute the linear model (all clusters are selected by default)

th.pvalue

a numeric between 0 and 1 specifying the maximal p-value of each term in the returned model

show.error

a logical indicating whether error bars should be used to display the standard deviations of the coefficients

verbose

a logical indicating if debug messages must be displayed

...

further arguments passed to the R glm function

Details

The 'clusters' parameter provides the names of the clusters to include in the model.

The number of clusters allowed in the model can vary depending on the number of values to infer. Please refer to the documentation of the R glm and predict.glm functions for more details.
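One plausible reading of this constraint, shown here as a sketch with base R glm on simulated data rather than with SPADEVizR itself, is that every NA in 'variable' removes one sample from the fit, and a model cannot estimate more coefficients than it has observations. The cluster names c1 to c6 and all values are made up for the example.

```r
set.seed(1)

# Five samples with a known value and six hypothetical clusters.
known <- as.data.frame(matrix(rnorm(5 * 6), nrow = 5,
                              dimnames = list(NULL, paste0("c", 1:6))))
known$variable <- rnorm(5)

# Model with an intercept and all six clusters as terms.
fit <- glm(variable ~ ., data = known)

# With 5 observations, at most 5 coefficients (intercept included) can be
# estimated; the remaining ones are reported as NA.
n_not_estimated <- sum(is.na(coef(fit)))
```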

First, the function computes the glm model using all clusters as terms of the model. Then, the model is iteratively regenerated, discarding at each step the term with the highest p-value if that p-value exceeds the 'th.pvalue' threshold. In this way, the model can correctly fit the data while remaining parsimonious. By default, the 'th.pvalue' parameter is set to 1 in order to include all terms in the model. If no term has a p-value below the threshold, both the returned model and the predictions are set to NULL.
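The elimination procedure described above can be sketched with base R as follows. This is not the package's actual implementation: the helper backward_glm, the cluster names c1 to c4, and the simulated data are all assumptions made for the example, and a Gaussian family (the glm default) is used.

```r
set.seed(42)

# Simulated abundances of four hypothetical clusters in 30 samples.
n <- 30
dat <- data.frame(c1 = rnorm(n), c2 = rnorm(n), c3 = rnorm(n), c4 = rnorm(n))
# Simulated phenotype that truly depends on c1 and c2 only.
dat$variable <- 2 * dat$c1 - 1.5 * dat$c2 + rnorm(n, sd = 0.5)
# The last five samples have an unknown phenotype, to be inferred.
dat$variable[26:30] <- NA

# Hypothetical helper: refit the model, dropping the least significant
# term at each round until every remaining term passes the threshold.
backward_glm <- function(data, response, term_names, th.pvalue = 1, ...) {
  while (length(term_names) > 0) {
    # Rows with an NA response are dropped by the default na.action.
    fit <- glm(reformulate(term_names, response = response), data = data, ...)
    coefs <- summary(fit)$coefficients
    # Column 4 of the coefficient table holds the p-values.
    pvals <- setNames(coefs[term_names, 4], term_names)
    worst <- which.max(pvals)
    if (pvals[[worst]] <= th.pvalue) {
      return(fit)  # every remaining term passes the threshold
    }
    term_names <- term_names[-worst]
  }
  NULL  # no term passed the threshold
}

fit <- backward_glm(dat, "variable", c("c1", "c2", "c3", "c4"),
                    th.pvalue = 0.05)
kept <- attr(terms(fit), "term.labels")

# Predict the phenotype of the samples whose value was NA.
to_infer <- is.na(dat$variable)
predictions <- predict(fit, newdata = dat[to_infer, ], type = "response")
```

With the default th.pvalue = 1 the loop returns after the first fit, so all terms are kept; with a threshold that no term can meet, the helper returns NULL, mirroring the behaviour described above.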

Value

a list of 5 elements corresponding to: a generalized linear model object as provided by the R glm function ('model' element), a named vector of predicted values ('variable.predictions' element), a named vector of predicted cluster abundance coefficients ('cluster.coefficients' element), the representation of cluster coefficients ('plot.cluster' element), and the representation of sample prediction values ('plot.samples' element).


tchitchek-lab/SPADEVizR documentation built on Jan. 27, 2024, 8:58 p.m.