stability.sim: Stability analysis of the mutagenetic trees mixture model


View source: R/StabilityFunctions.R

Description

The function performs stability analysis at different levels of the mutagenetic trees mixture model: GPS values, encoded probability distribution, and tree topologies. Each analysis reports the values of different similarity measures together with their corresponding p-values.

Usage

stability.sim(no.trees = 3, no.events = 9, prob = c(0.2, 0.8),
              no.draws = 300, no.rands = 100, no.sim = 1)

Arguments

no.trees

An integer larger than 2 giving the number of tree components of the mixture models considered in the stability analysis. The default value is 3.

no.events

An integer larger than 0 giving the number of genetic events of the mixture models considered in the stability analysis. The default value is 9.

prob

A numeric vector of length 2 specifying the boundaries for the edge weights of the randomly generated trees. The first component of the vector (the lower boundary) must be smaller than the second component (the upper boundary). The default value is c(0.2, 0.8).

no.draws

An integer larger than 0 giving the size of the data sample drawn from the random models used for learning the mixture models. The default value is 300.

no.rands

An integer larger than 0 specifying the number of random models used for calculating the p-values. The default value is 100.

no.sim

An integer larger than 0 specifying the number of iterations used for the waiting time simulations (a part of the GPS calculation). The default value is 1.

Details

The stability analysis is performed by first drawing a true mixture model uniformly at random from the model space and then drawing a data sample from it. A mutagenetic trees mixture model is then fitted to the drawn sample. The quality of each feature derived from the fitted model is assessed by comparing it with the quality of the corresponding feature of a sufficient number of random mixture models sampled uniformly from the model space. The p-value is the percentage of cases in which the true model is closer to a random model than to the fitted model.
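The p-value described above can be illustrated with a minimal base R sketch. It rests on several assumptions: the features are plain numeric vectors, the Euclidean distance serves as the similarity measure, and the random models are replaced by uniformly drawn stand-in vectors. The helper `empirical_pvalue` is hypothetical and is not a function of the Rtreemix package.

```r
# Hypothetical helper (not part of Rtreemix): empirical p-value as the
# fraction of random feature vectors that lie closer to the true
# feature vector than the fitted feature vector does.
empirical_pvalue <- function(true_vec, fitted_vec, random_vecs,
                             dist_fun = function(a, b) sqrt(sum((a - b)^2))) {
  # Distance between the true and the fitted feature vector.
  d_fit <- dist_fun(true_vec, fitted_vec)
  # Distance from the true vector to each random vector (one per row).
  d_rand <- apply(random_vecs, 1, function(r) dist_fun(true_vec, r))
  list(similarity = d_fit, p.value = mean(d_rand < d_fit))
}

set.seed(1)
true_vec    <- c(0.10, 0.40, 0.30, 0.20)
fitted_vec  <- c(0.12, 0.38, 0.31, 0.19)           # assumed fit, close to the truth
random_vecs <- matrix(runif(100 * 4), nrow = 100)  # stand-ins for 100 random models
res <- empirical_pvalue(true_vec, fitted_vec, random_vecs)
print(res)
```

A small p-value in this sketch means that few random vectors come closer to the true vector than the fitted one does, which is the sense in which the fitted feature is considered stable.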

Value

comp1

Results from the stability analysis of the GPS values derived from a fitted mixture model. A matrix with 4 columns and no.sim rows. The first two columns give the similarity values and their corresponding p-values when the Euclidean distance is used as a similarity measure for comparing the respective GPS vectors. The last two columns give the same results, but with the rank correlation distance used as a similarity measure.

comp2

Results from the stability analysis of the probability distributions induced by a fitted mixture model. A matrix with 6 columns and no.sim rows. Each pair of columns gives the values of the comparisons between the true and the fitted probability distributions and their corresponding p-values, when using, in turn, the cosine distance, the L1 distance, and the Kullback-Leibler divergence as similarity measures.

comp3

Results from the stability analysis of the topologies of the tree components of a fitted mixture model. A matrix with 2 columns and no.sim rows that give the values of the comparison of the topologies between the true and the corresponding fitted model and their p-values. The similarity measure used is based on the number of different edges.

comp4

Similar to comp3, except that the similarity measure for comparing the tree topologies includes, in addition to the number of distinct edges, the L1 distances of the level vectors of events. See get.tree.levels.

comp5

A matrix where the columns correspond to the true GPS vector from each simulation iteration. The matrix has no.sim columns and no.draws rows.

comp6

Same as comp5, but the matrix contains the fitted GPS values from each simulation iteration.

comp7

A list where each component corresponds to the true model generated in each simulation iteration. The length of the list is no.sim.

comp8

Same as comp7, but the list contains the fitted models.
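The similarity measures named for comp1 and comp2 above can be written down in a few lines of base R. This is only a sketch using common textbook definitions; the exact definitions used inside the package may differ (for example, in how the rank correlation is turned into a distance), and all function names below are hypothetical.

```r
# Hypothetical textbook versions of the similarity measures; the
# definitions used inside Rtreemix may differ in detail.

# Measures for comparing two GPS vectors.
euclidean_dist <- function(x, y) sqrt(sum((x - y)^2))
rank_cor_dist  <- function(x, y) 1 - cor(x, y, method = "spearman")

# Measures for comparing two probability distributions p and q.
cosine_dist <- function(p, q) 1 - sum(p * q) / (sqrt(sum(p^2)) * sqrt(sum(q^2)))
l1_dist     <- function(p, q) sum(abs(p - q))
kl_div      <- function(p, q) {
  keep <- p > 0  # terms with p == 0 contribute nothing to the sum
  sum(p[keep] * log(p[keep] / q[keep]))
}

# Toy GPS vectors and toy distributions (made-up numbers).
gps_true   <- c(1.2, 3.4, 2.1, 5.0)
gps_fitted <- c(1.0, 3.9, 2.5, 4.6)
p_true     <- c(0.5, 0.3, 0.2)
p_fitted   <- c(0.4, 0.4, 0.2)

print(c(euclidean = euclidean_dist(gps_true, gps_fitted),
        rank.cor  = rank_cor_dist(gps_true, gps_fitted)))
print(c(cosine = cosine_dist(p_true, p_fitted),
        L1     = l1_dist(p_true, p_fitted),
        KL     = kl_div(p_true, p_fitted)))
```

All five functions return 0 for identical inputs, so smaller values indicate a fitted model that is closer to the true one.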

Note

The stability simulation examples are time-consuming. They are commented out because of the time restrictions of the package check. To try out the code, copy it and uncomment it.

Author(s)

Jasmina Bogojeska

References

Learning multiple evolutionary pathways from cross-sectional data, N. Beerenwinkel et al.; Estimating cancer survival and clinical outcome based on genetic tumor progression scores, J. Rahnenführer et al.

See Also

RtreemixData-class, RtreemixModel-class, RtreemixGPS-class, RtreemixStats-class, fit-methods, gps-methods, distribution-methods, generate-methods, sim-methods, L1.dist, Pval.dist, comp.models, comp.trees, get.tree.levels, kullback.leibler

Examples

## Stability analysis - a toy example
#stability.sim(no.trees = 3, no.rands = 5, no.sim = 4, no.draws = 300)

Rtreemix documentation built on Nov. 8, 2020, 5:57 p.m.