POM: Obtain Lines of Descent and Paths of the Maximum and their...

View source: R/LOD_POM.R

Description

Compute Lines of Descent (LOD) and Path of the Maximum (POM) for a single simulation or a set of simulations (from oncoSimulPop).

diversityPOM and diversityLOD return Shannon's diversity (entropy) of the POMs and LODs, respectively, of a set of simulations (computing these from a single simulation makes no sense).

Usage

POM(x)
LOD(x)
diversityPOM(lpom)
diversityLOD(llod)

Arguments

x

An object of class oncosimulpop (version >= 2, so simulations that use the old poset specification will not work) or of class oncosimul2 (a single simulation). For LOD, simulations must have been run with keepPhylog = TRUE.

lpom

A list of POMs, as returned from POM on an object of class oncosimulpop.

llod

A list of LODs, as returned from LOD on an object of class oncosimulpop.

...

Other arguments passed to methods (currently ignored).

Details

Lines of Descent (LOD) and Path of the Maximum (POM) were defined in Szendro et al. (2013) and I follow those definitions here as closely as possible, as applied to a process in continuous time with sampling at user-specified periods.

For POM, the results can depend strongly on how often we sample and keep samples (i.e., the sampleEvery and keepEvery arguments to oncoSimulIndiv and oncoSimulPop), since the POM is computed from the values stored in the pops.by.time matrix. This also explains why it is generally meaningless to use POM on oncoSimulSample runs: these only keep the very last sample.
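
The dependence on sampling frequency can be illustrated with a small base-R sketch. This is not the package's code: toy_pom is a hypothetical helper, the toy matrix uses named genotype columns for readability (an assumption about layout made only for this example), and collapsing consecutive repeats of the same genotype is also an assumption of the sketch.

```r
# Toy stand-in for a pops.by.time-like matrix: the first column is the
# sampling time, the remaining columns are population sizes per genotype.
# (Column names and values are made up for this illustration.)
pbt <- cbind(time = 0:5,
             WT   = c(100, 90, 40, 10,  5,  5),
             A    = c(  0, 10, 60, 20, 50,  5),
             AB   = c(  0,  0,  0, 70, 45, 90))

# Hypothetical helper: genotype with the largest subpopulation at each
# sampling time, with consecutive repeats collapsed (sketch assumption).
toy_pom <- function(pbt) {
  sizes <- pbt[, -1, drop = FALSE]
  winner <- colnames(sizes)[apply(sizes, 1, which.max)]
  rle(winner)$values
}

pom_all  <- toy_pom(pbt)                 # every sample kept
pom_thin <- toy_pom(pbt[c(1, 3, 5), ])   # only every second sample kept
print(pom_all)
print(pom_thin)
```

With every sample kept, the sketch visits WT, A, AB, A, AB; keeping only every second sample yields WT, A, so the genotype AB never appears in the path.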

For LOD, my implementation is not identical to the definition given on p. 572 of Szendro et al. (2013). First, in case this is useful, for each simulation I keep all the paths that "(...) arrive at the most populated genotype at the final time" (first paragraph on p. 572 of Szendro et al.), whereas they keep only one (see the second column of p. 572). I do, however, also provide a single LOD for each run: the first path to arrive at the genotype that eventually becomes the most populated genotype at the final time (in this sense, it agrees with the LOD of Szendro et al.). Second, in contrast to what Szendro et al. apparently do ("A given genotype may undergo several episodes of colonization and extinction that are stored by the algorithm, and the last episode before the colonization of the final state is used to construct the step."), I do not check that this genotype (the one that will become the most populated at the final time) does not become extinct before the final colonization. So other paths (all of them in all_paths) could be the actual colonizers of the most populated genotype (with no extinction before the final colonization).
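
The idea of keeping all the paths that arrive at the most populated genotype can be sketched in base R on a toy parent-child genealogy. This is only an illustration under stated assumptions: the package returns igraph vertex sequences, whereas here the edge table and the helper all_paths_to are hypothetical, and the genealogy is assumed to be acyclic. Choosing the single LOD would additionally need arrival times, which this sketch does not model.

```r
# Toy genealogy: each row says that genotype `child` arose from `parent`.
# (Made-up data; genotype "A, B" is the most populated at the final time.)
edges <- data.frame(parent = c("Root", "Root", "A", "B"),
                    child  = c("A", "B", "A, B", "A, B"),
                    stringsAsFactors = FALSE)

# Hypothetical helper: enumerate every path from `from` to `to` by
# following parent -> child edges recursively (assumes no cycles).
all_paths_to <- function(edges, from, to) {
  if (from == to) return(list(to))
  out <- list()
  for (nxt in edges$child[edges$parent == from]) {
    for (p in all_paths_to(edges, nxt, to)) {
      out[[length(out) + 1]] <- c(from, p)
    }
  }
  out
}

paths <- all_paths_to(edges, "Root", "A, B")
print(paths)
```

In this toy case two paths reach "A, B": one through A and one through B.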

Value

For POM, either a character vector (if x is a single simulation) or a list of character vectors (if x is a population of simulations). Each character vector is the ordered set of genotypes that contain the largest subpopulation at the times of sampling.

For LOD, if x is a single simulation, a two-element list. The first element, all_paths, contains all paths to the maximum. The second, lod_single, contains the single LOD that is closest in meaning to the original definition of Szendro et al. (see "Details"). If x is a list (population) of simulations, then a list where each element is a two-element list, as just explained. All the lists contain objects of class "igraph.vs" (an igraph vertex sequence: see vertex_attr).

For diversityLOD and diversityPOM, a single-element vector with Shannon's diversity (entropy) of the lod_single paths (for diversityLOD) or of the POMs (for diversityPOM).
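
As a rough illustration of what Shannon's diversity of a set of paths means, the following base-R sketch treats each distinct path as a category and computes the entropy of the path frequencies. It is not the package's implementation: toy_path_entropy is a hypothetical helper, and both the use of the natural logarithm and the identification of a path by pasting its genotypes together are assumptions of this sketch.

```r
# Hypothetical helper: Shannon entropy of a list of paths, where each
# path is a character vector of genotypes. Two paths count as the same
# category only if they visit the same genotypes in the same order.
toy_path_entropy <- function(paths) {
  keys <- vapply(paths, paste, character(1), collapse = "_")
  p <- as.numeric(table(keys)) / length(keys)
  -sum(p * log(p))   # natural logarithm (sketch assumption)
}

# Four made-up paths: one appears twice, the other two once each,
# so the path frequencies are 0.5, 0.25 and 0.25.
toy_paths <- list(c("WT", "A", "AB"),
                  c("WT", "A", "AB"),
                  c("WT", "B", "AB"),
                  c("WT", "A"))
h <- toy_path_entropy(toy_paths)
print(h)
```

If every simulation follows the same path, there is a single category with frequency 1 and the entropy is 0; the more evenly the runs spread over different paths, the larger the value.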

Author(s)

Ramon Diaz-Uriarte

References

Szendro, I. G., Franke, J., Visser, J. A. G. M. de, & Krug, J. (2013). Predictability of evolution depends nonmonotonically on population size. Proceedings of the National Academy of Sciences, 110(2), 571-576. https://doi.org/10.1073/pnas.1213613110

See Also

oncoSimulPop, oncoSimulIndiv

Examples

library(OncoSimulR)

######## Using a poset for pancreatic cancer from Gerstung et al.
###      (s and sh are made up for the example; only the structure
###       and names come from Gerstung et al.)

pancr <- allFitnessEffects(data.frame(parent = c("Root", rep("KRAS", 4), "SMAD4", "CDNK2A", 
                                          "TP53", "TP53", "MLL3"),
                                      child = c("KRAS","SMAD4", "CDNK2A", 
                                          "TP53", "MLL3",
                                          rep("PXDN", 3), rep("TGFBR2", 2)),
                                      s = 0.05,
                                      sh = -0.3,
                                      typeDep = "MN"))


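## A single simulation and a population of 8 simulations.
## keepPhylog = TRUE is required by LOD.
pancr1 <- oncoSimulIndiv(pancr, model = "Exp", keepPhylog = TRUE)
pancr8 <- oncoSimulPop(8, pancr, model = "Exp", keepPhylog = TRUE,
                       mc.cores = 2)

## POM and LOD of the single simulation
POM(pancr1)
LOD(pancr1)

## POM and LOD of each simulation in the population
POM(pancr8)
LOD(pancr8)

## Shannon's diversity of the POMs and LODs across the 8 simulations
diversityPOM(POM(pancr8))
diversityLOD(LOD(pancr8))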

Bioconductor-mirror/OncoSimulR documentation built on May 31, 2017, 9:37 p.m.