simulation: Simulated dataset

simulation    R Documentation

Simulated dataset

Description

This dataset contains the parameters used to study the finite case behavior of the source set algorithm, as described in Salviato et al. (2019).

Usage

data(simulation)

Format

A list that contains the true parameters (mu, the vector of means, and S, the covariance matrix) of two multivariate normal distributions in two different experimental conditions (condition1, the reference condition, and condition2, the perturbed condition), together with the underlying graphical structure G (graph). Six different perturbations are considered; see below.

The differences between the two conditions are driven by:

  • a node that is a separator within the graph (simulation$condition2$`5`)

  • a node that is contained in only one clique of the graph (simulation$condition2$`10`)

The intensity of the artificial perturbation is:

  • mild (simulation$condition2$`10`$`1.2`)

  • moderate (simulation$condition2$`10`$`1.6`)

  • strong (simulation$condition2$`10`$`2`)
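The access pattern above can be sketched as follows. This is an illustrative sketch, not code from the package: get_perturbed and draw_sample are hypothetical helpers, and it is an assumption that node `5` carries the same three intensity entries shown for node `10` and that each parameter set exposes mu and S as described under Format. The block only touches the real data when SourceSet is installed.

```r
# Two driver nodes times three intensities give the six perturbations.
grid <- expand.grid(node = c("5", "10"),
                    intensity = c("1.2", "1.6", "2"),
                    stringsAsFactors = FALSE)

# Hypothetical helper (not part of SourceSet): fetch one perturbed
# parameter set by driver node and intensity.
get_perturbed <- function(sim, node, intensity) {
  sim$condition2[[node]][[intensity]]
}

# Hypothetical helper: draw n observations from a parameter set.
# ASSUMPTION: the set has elements `mu` and `S`, as in the Format section.
draw_sample <- function(param, n) {
  MASS::mvrnorm(n, mu = param$mu, Sigma = param$S)
}

if (requireNamespace("SourceSet", quietly = TRUE)) {
  data("simulation", package = "SourceSet")
  # ASSUMPTION: node `5` has the same intensity entries as node `10`.
  for (i in seq_len(nrow(grid))) {
    p <- get_perturbed(simulation, grid$node[i], grid$intensity[i])
    cat(grid$node[i], grid$intensity[i], ":",
        paste(names(p), collapse = ", "), "\n")
  }
}
```

Indexing with [[node]][[intensity]] is equivalent to the backtick form used above and is convenient when the node and intensity are held in variables.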

Details

The starting parameters of the reference condition are obtained from the Acute Lymphocytic Leukemia (ALL) dataset by randomly selecting a gene set of the same cardinality as the order of the graph G. These are then modified to obtain the parameters of the perturbed condition. Formally, starting from the parameters of the reference group, the procedure acts on means and variances so that the conditional distribution of the variables on which it does not directly intervene remains unchanged across the two conditions. The intervention nevertheless alters the joint distribution as a whole, which creates the propagation effect. See Salviato et al. (2016) for more details.
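The idea of the paragraph above can be illustrated on a two-variable Gaussian. The sketch below is a toy construction under stated assumptions, not the simPATHy procedure itself: the numeric values are invented, lambda is an illustrative intensity, and scaling both the mean and the variance of X1 by lambda is one arbitrary choice of intervention. It perturbs the marginal of X1, keeps the conditional distribution of X2 given X1 fixed, and shows that the marginal of X2 still changes.

```r
# Reference parameters for (X1, X2); toy values, not taken from the ALL data.
mu <- c(1, 2)
S  <- matrix(c(1.0, 0.6,
               0.6, 1.5), nrow = 2, byrow = TRUE)

# Conditional law of X2 given X1 = x: mean a + b * x, variance v.
b <- S[1, 2] / S[1, 1]
a <- mu[2] - b * mu[1]
v <- S[2, 2] - S[1, 2]^2 / S[1, 1]

# Intervene on the marginal of X1 only (ASSUMPTION: illustrative scaling).
lambda  <- 1.6
mu1_new <- lambda * mu[1]
s11_new <- lambda * S[1, 1]

# Rebuild the joint parameters while holding a, b and v fixed.
mu_new <- c(mu1_new, a + b * mu1_new)
S_new  <- matrix(c(s11_new,     b * s11_new,
                   b * s11_new, v + b^2 * s11_new), nrow = 2, byrow = TRUE)

# Conditional parameters recomputed from the perturbed joint distribution.
b_new <- S_new[1, 2] / S_new[1, 1]
a_new <- mu_new[2] - b_new * mu_new[1]
v_new <- S_new[2, 2] - S_new[1, 2]^2 / S_new[1, 1]

# The conditional distribution of X2 given X1 is unchanged.
stopifnot(isTRUE(all.equal(c(a, b, v), c(a_new, b_new, v_new))))

# The marginal of X2 has nevertheless moved: the propagation effect.
print(c(mean_before = mu[2], mean_after = mu_new[2]))
print(c(var_before = S[2, 2], var_after = S_new[2, 2]))
```

Although X2 is never modified directly, its marginal mean and variance differ between the two conditions, which is why a test on marginal distributions alone cannot separate the source of a perturbation from the variables it propagates to.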

References

Chiaretti, S. et al. (2005). Gene expression profiles of B-lineage adult acute lymphocytic leukemia reveal genetic patterns that identify lineage derivation and distinct mechanisms of transformation. Clinical Cancer Research, 11(20), 7209–7219.

Salviato, E. et al. (2016). simPATHy: a new method for simulating data from perturbed biological pathways. Bioinformatics, 33(3), 456–457.

Salviato, E. et al. (2019). SourceSet: a graphical model approach to identify primary genes in perturbed biological pathways. PLoS Computational Biology, 15(10), e1007357.

See Also

simPATHy, ALL


SourceSet documentation built on Nov. 21, 2022, 5:06 p.m.