library(TRONCO) data(aCML) data(crc_maf) data(crc_gistic) data(crc_plain) gene.hypotheses = c('KRAS', 'NRAS', 'IDH1', 'IDH2', 'TET2', 'SF3B1', 'ASXL1') alterations = events.selection(as.alterations(aCML), filter.freq = .05) aCML.clean = events.selection(aCML, filter.in.names=c(as.genes(alterations), gene.hypotheses)) aCML.clean = annotate.description(aCML.clean, 'CAPRI - Bionformatics aCML data (selected events)') aCML.hypo = hypothesis.add(aCML.clean, 'NRAS xor KRAS', XOR('NRAS', 'KRAS')) aCML.hypo = hypothesis.add(aCML.hypo, 'SF3B1 xor ASXL1', XOR('SF3B1', XOR('ASXL1')), '*') as.events(aCML.hypo, genes = 'TET2') aCML.hypo = hypothesis.add(aCML.hypo, 'TET2 xor IDH2', XOR('TET2', 'IDH2'), '*') aCML.hypo = hypothesis.add(aCML.hypo, 'TET2 or IDH2', OR('TET2', 'IDH2'), '*') aCML.hypo = hypothesis.add.homologous(aCML.hypo) aCML.hypo = annotate.description(aCML.hypo, '') aCML.clean = annotate.description(aCML.clean, '') model.capri = tronco.capri(aCML.hypo, boot.seed = 12345, nboot = 5) model.capri = annotate.description(model.capri, 'CAPRI - aCML') model.caprese = tronco.caprese(aCML.clean) model.caprese = annotate.description(model.caprese, 'CAPRESE - aCML') model.edmonds = tronco.edmonds(aCML.clean, nboot = 5, boot.seed = 12345) model.edmonds = annotate.description(model.edmonds, 'MST Edmonds - aCML') model.gabow = tronco.gabow(aCML.clean, nboot = 5, boot.seed = 12345) model.gabow = annotate.description(model.gabow, 'MST Gabow - aCML') model.chowliu = tronco.chowliu(aCML.clean, nboot = 5, boot.seed = 12345) model.chowliu = annotate.description(model.chowliu, 'MST Chow Liu - aCML') model.prim = tronco.prim(aCML.clean, nboot = 5, boot.seed = 12345) model.prim = annotate.description(model.prim, 'MST Prim - aCML data')
TRONCO provides functions to plot a model, access information about the probabilities used to extract it from data, and two types of confidence measures: those used to infer the model, and those computed a posteriori from it.
Function view
provides updated information about a model if this is available.
view(model.capri)
We can plot a model by using function \Rfunction{tronco.plot}. Here, we plot the aCML model inferred by CAPRI with BIC and AIC as a regolarizator. We set some parameters to get a nice plot (scaling etc.), and distinguish the edges detected by the two regularization techniques. The confidence of each edge is shown in terms of temporal priority and probability raising (selective advantage scores) and hypergeometric testing (statistical relevance of the dataset of input). Events are annotated as in the oncoprint, edge p-values above a minium threshold (default 0.05) are red.
tronco.plot(model.capri, fontsize = 12, scale.nodes = 0.6, confidence = c('tp', 'pr', 'hg'), height.logic = 0.25, legend.cex = 0.35, pathways = list(priors = gene.hypotheses), label.edge.size = 10)
We can also make a multiplot with this function, which in this case we do by showing the models inferred by the other algorithms based on Minimum Spanning Trees.
par(mfrow = c(2,2)) tronco.plot(model.caprese, fontsize = 22, scale.nodes = 0.6, legend = FALSE) tronco.plot(model.edmonds, fontsize = 22, scale.nodes = 0.6, legend = FALSE) tronco.plot(model.chowliu, fontsize = 22, scale.nodes = 0.6, legend.cex = .7) tronco.plot(model.prim, fontsize = 22, scale.nodes = 0.6, legend = FALSE)
We can visualize a summary of the parameters used for the reconstruction, test if an object has a model or delete it (which shall be done to retrieve the original dataset).
as.data.frame(as.parameters(model.capri)) has.model(model.capri) dataset = delete.model(model.capri)
A set of functions can be used to visualize the content of object which contains the reconstructed model. For instance, we can access the adjacency matrix of a model by using as.adj.matrix
which will return a matrix for each one of the regularizators used -- in this case because CAPRI was run with both BIC/AIC.
str(as.adj.matrix(model.capri))
Every model is inferred by estimating the empirical marginal, joint and conditional probabilities for all the events, from input data. These in some cases are estimated by a bootstrap procedure (see the algorithms implemented). TRONCO has functions to extract such table, that could be in turn printed by using external functions for, e.g., heatmap visualization (see below for an example via the pheatmap
package). We show these functions working with the CAPRI model; in this case the tables are the same for both BIC/AIC structures as they are computed before performing penalized likelihood-fit. The marginal P(x) for x an event in the dataset are obtained by as.marginal.probs
.
marginal.prob = as.marginal.probs(model.capri) head(marginal.prob$capri_bic)
Similarly, the joint P(x,y) for every pair of events in the dataset is given by as.joint.probs
.
joint.prob = as.joint.probs(model.capri, models='capri_bic') joint.prob$capri_bic[1:3, 1:3]
And as.conditional.probs
finally gives the conditional P(x|y) for every edge in the dataset.
conditional.prob = as.conditional.probs(model.capri, models='capri_bic') head(conditional.prob$capri_bic)
Confidence scores can be accessed by function \Rfunction{as.confidence}, which takes as parameter the type of confidence measure that one wants to access to. This will work for either confidence measures assessed before reconstructing the model -- if available --, or afterwards.
str(as.confidence(model.capri, conf = c('tp', 'pr', 'hg')))
Other functions visualize tables summarizing the statistics for each edge in the model, For instance, if one uses function as.selective.advantage.relations
the p-values for temporal priority, probability raising and hypergeometric testing, as well as other information about each edge can be accessed, e.g., the number of observations for the upstream and the downstream events.
as.selective.advantage.relations(model.capri)
TRONCO provides three different strategies to perform bootstrap and assess confidence of each edge in terms of a score in the range [0, 100], where 100 is the highest confidence). Non-parametric (default) and statistical bootstrap strategies are available, and can be executed by calling function tronco.bootstrap
with type
parameter set appropriately. This function is parallel, and parameter cores.ratio
(default 1) can be used to percentage of available cores that shall be used to compute the scores.
Parameter nboot
controls the number of bootstrap iterations.
model.boot = tronco.bootstrap(model.capri, nboot = 3, cores.ratio = 0) model.boot = tronco.bootstrap(model.boot, nboot = 3, cores.ratio = 0, type = 'statistical')
Bootstrap scores can be annotated to the tronco.plot
output by setting them via the confidence parameter confidence=c('npb', 'sb')
. In this case edge thickness will be proportional to the non-parametric npb
) scores -- the last to appear in the confidence
parameter.
tronco.plot(model.boot, fontsize = 12, scale.nodes = .6, confidence=c('sb', 'npb'), height.logic = 0.25, legend.cex = .35, pathways = list(priors= gene.hypotheses), label.edge.size=10)
Bootstrap scores can extracted or visualized even with other TRONCO functions. For instance, we can accessall scores via as.bootstrap.scores
, which resembles function as.selective.advantage.relations
and will display the scores per edge. Notice that even function view
gives an update output by mentioning the available bootstrap scores.
as.bootstrap.scores(model.boot) view(model.boot)
If we want to access a matrix with the scores and visualize that in a heatmap we can use for instance the pheatmap
function of TRONCO. In this case we need to use also function keysToNames
to translate internal TRONCO keys to mnemonic names in the plot
pheatmap(keysToNames(model.boot, as.confidence(model.boot, conf = 'sb')$sb$capri_aic) * 100, main = 'Statistical bootstrap scores for AIC model', fontsize_row = 6, fontsize_col = 6, display_numbers = TRUE, number_format = "%d" )
TRONCO implements k-fold cross-validation routines (from the bnlearn package) to provide estimates of the following statistics:
tronco.kfold.eloss
) of a whole model ? i.e., the negated expected log-likelihood of the test set for the Bayesian network fitted from the training set.tronco.kfold.prederr
) for a single node x and its parents set X -- i.e., how precisely we can predict the values of x by using only the information present in its local distribution, via X.tronco.kfold.posterr
) for a single node x and one of its parent node y in X -- i.e., the values of x are predicted using only the information present in y by likelihood weighting and Bayesian posterior estimates.By default, a 10 repetitions from 10-fold cross-validation experiment are perfomed, for all the models which are found inside a TRONCO object -- in this case 2, one for CAPRI with BIC and one for CAPRI with AIC.
model.boot = tronco.kfold.eloss(model.boot) model.boot = tronco.kfold.prederr(model.boot, runs = 2, cores.ratio = 0) model.boot = tronco.kfold.posterr(model.boot, runs = 2, cores.ratio = 0)
These results can be visualized in terms of summary tables, as for the other confidence scores.
as.kfold.eloss(model.boot) as.kfold.prederr(model.boot) as.kfold.posterr(model.boot)
Notice that these can be combined to create a nice table with all these statistics -- we make here the example of a table with all the BIC statistics. This format can be readily exported to external spreadsheets for further visualization.
tabular = function(obj, M){ tab = Reduce( function(...) merge(..., all = TRUE), list(as.selective.advantage.relations(obj, models = M), as.bootstrap.scores(obj, models = M), as.kfold.prederr(obj, models = M), as.kfold.posterr(obj,models = M))) # merge reverses first with second column tab = tab[, c(2,1,3:ncol(tab))] tab = tab[order(tab[, paste(M, '.NONPAR.BOOT', sep='')], na.last = TRUE, decreasing = TRUE), ] return(tab) } head(tabular(model.boot, 'capri_bic'))
We finally show the plot of the model with the confidences by cross-validation.
tronco.plot(model.boot, fontsize = 12, scale.nodes = .6, confidence=c('npb', 'eloss', 'prederr', 'posterr'), height.logic = 0.25, legend.cex = .35, pathways = list(priors= gene.hypotheses), label.edge.size=10)
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.