
Constructing a principal graph

A principal graph with a given topology is constructed by specifying appropriate initial conditions and appropriate growth/shrink grammars. This can be done via the computeElasticPrincipalGraph function.

Specific wrapper functions are also provided to build commonly encountered topologies (computeElasticPrincipalCurve, computeElasticPrincipalTree, computeElasticPrincipalCircle) with minimal required inputs. In all of these functions, it is necessary to specify a numeric matrix with the data points (X) and the number of nodes of the principal graph (NumNodes). The behavior of the algorithm can be controlled via a set of optional parameters.
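Because X must be a numeric matrix, data stored in a data frame needs to be converted first. The following is a minimal base R sketch of that preparation step; the built-in iris dataset is used purely as a stand-in and is not one of the package's example datasets.

```r
# Stand-in data: keep only the numeric columns of the built-in iris data frame.
# (Illustrative only; iris is not an ElPiGraph.R example dataset.)
df <- iris[, sapply(iris, is.numeric)]

# Convert to a numeric matrix with one row per data point.
X <- as.matrix(df)

# Basic sanity checks before passing X to a construction function.
stopifnot(is.matrix(X), is.numeric(X))
dim(X)
```

The resulting X could then be supplied as the X argument of any of the functions named above.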

Examples of principal curves

The function computeElasticPrincipalCurve constructs a principal curve on the data. For example, to construct a principal curve with 60 nodes on the example dataset curve_data, it is sufficient to write

library("ElPiGraph.R")
CurveEPG <- computeElasticPrincipalCurve(X = curve_data, NumNodes = 60)

A principal tree can be constructed via the computeElasticPrincipalTree function. For example, to construct a principal tree with 60 nodes on the example dataset tree_data, it is sufficient to write

TreeEPG <- computeElasticPrincipalTree(X = tree_data, NumNodes = 60, Lambda = .03, Mu = .01)

Finally, a principal circle can be constructed via the computeElasticPrincipalCircle function. For example, to construct a principal circle with 40 nodes on the example dataset circle_data, it is sufficient to write

CircleEPG <- computeElasticPrincipalCircle(X = circle_data, NumNodes = 40)

All of these functions will return a list of length 1, with all the information associated with the graph.

Using bootstrapping

All of the functions provided to build principal graphs allow a bootstrapped construction. To enable it, it is sufficient to modify the parameters nReps and ProbPoint. nReps indicates the number of repetitions and ProbPoint indicates the probability of including a point in each repetition. When nReps is larger than 1, a final consensus principal graph will be constructed using the nodes of the graphs derived in each repetition.
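The role of ProbPoint can be illustrated with a small base R sketch of the subsampling idea. It assumes that each point is kept independently with probability ProbPoint in each repetition; the package's internal sampling may differ, and the data here are a hypothetical stand-in.

```r
set.seed(42)

# Stand-in data: 200 points in 2 dimensions (hypothetical, for illustration).
X <- matrix(rnorm(400), ncol = 2)

nReps <- 5        # number of repetitions
ProbPoint <- 0.6  # probability of keeping each point in a repetition

# For each repetition, keep each row independently with probability ProbPoint.
# This is an assumed sampling scheme, not the package's own code.
Subsamples <- lapply(seq_len(nReps), function(i) {
  keep <- runif(nrow(X)) < ProbPoint
  X[keep, , drop = FALSE]
})

# Number of points retained in each repetition (roughly ProbPoint * nrow(X)).
sapply(Subsamples, nrow)
```

Under this scheme each repetition sees a different random subset of the data, which is what makes the resulting graphs vary from one repetition to the next.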

As an example, let us perform bootstrapping on the circle data. We will also suppress the plots for now.

set.seed(42)
CircleEPG.Boot <- computeElasticPrincipalCircle(X = circle_data, NumNodes = 40, nReps = 50, ProbPoint = .6,
                                                drawAccuracyComplexity = FALSE, drawEnergy = FALSE, drawPCAView = FALSE)

CircleEPG.Boot will be a list with 51 elements: the 50 bootstrapped circles and the final consensus one.
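Given that layout (replicates first, consensus last), the two parts can be separated with ordinary list indexing. The sketch below uses a hypothetical stand-in list, Boot, with a made-up Label field, so that it runs without the package; the same indexing would apply to a list shaped like CircleEPG.Boot.

```r
# Stand-in for a bootstrapped result: 50 replicates followed by one consensus entry.
# (Boot and its Label field are hypothetical, used only to show the indexing.)
Boot <- c(lapply(1:50, function(i) list(Label = paste("replicate", i))),
          list(list(Label = "consensus")))

Replicates <- head(Boot, -1)        # every element except the last
Consensus  <- Boot[[length(Boot)]]  # the last element

length(Replicates)
Consensus$Label
```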

Plotting data with principal graphs

The ElPiGraph.R package provides different functions to explore how the principal graph approximates the data. The main function is PlotPG, which can show how the principal graph fits the data in different ways.

To plot the principal tree previously constructed we can type

PlotPG(X = tree_data, TargetPG = TreeEPG[[1]], Main = "A tree")

The main plot reports different features, including the percentage of variance explained relative to the nodes of the principal graph (PG var), the percentage of variance explained relative to the data points (Data var), the fraction of variance of the data explained by the nodes of the principal graph (FVE) and the fraction of variance of the data explained by the projection of the points on the principal graph (FVEP). In this example the nodes of the principal graph have been used to compute PCA and rotate the space (the Do_PCA parameter is TRUE by default); this can be seen from the "EpG PC" label of the axes.
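To make the FVE quantity more concrete, here is a minimal base R sketch. It assumes that FVE is one minus the ratio between the mean squared distance of each point to its nearest node and the total variance of the data; the exact computation used by the package may differ. The data and the node positions are hypothetical stand-ins, not output of ElPiGraph.R.

```r
set.seed(1)

# Hypothetical stand-in data and node positions (not produced by ElPiGraph.R).
X <- matrix(rnorm(600), ncol = 2)                 # 300 data points in 2 dimensions
Nodes <- X[sample(nrow(X), 10), , drop = FALSE]   # 10 "nodes" picked from the data

# Squared Euclidean distance from every point (rows) to every node (columns).
D2 <- outer(rowSums(X^2), rowSums(Nodes^2), "+") - 2 * X %*% t(Nodes)
D2 <- pmax(D2, 0)  # guard against tiny negative values from rounding

# Mean squared distance of each point to its nearest node.
MSE <- mean(apply(D2, 1, min))

# Total variance: mean squared distance of the points from their centroid.
TSS <- mean(rowSums(sweep(X, 2, colMeans(X))^2))

# Fraction of variance explained by the nodes (assumed definition).
FVE <- 1 - MSE / TSS
FVE
```

Under this definition, a value close to 1 means the nodes sit close to the data points, while a value near 0 means they explain little more than the centroid alone.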

To include additional dimensions in the plot, it is sufficient to specify them with the DimToPlot parameter, e.g.,

PlotPG(X = tree_data, TargetPG = TreeEPG[[1]], Main = "A tree", DimToPlot = 1:3)

We can also visualize the results of the bootstrapped construction by using the BootPG parameter:

PlotPG(X = circle_data, TargetPG = CircleEPG.Boot[[length(CircleEPG.Boot)]],
       BootPG = CircleEPG.Boot[1:(length(CircleEPG.Boot)-1)],
       Main = "A bootstrapped circle", DimToPlot = 1:2)


Albluca/ElPiGraph.R documentation built on May 28, 2019, 11:02 a.m.