GADAG_Run: Run GADAG

Description Usage Arguments Details Value Author(s) See Also Examples

View source: R/GADAG_Run.R

Description

Function to run GADAG, an algorithm that aims at inferring large sparse directed acyclic graphs based on an observation sample X, by minimizing the penalized negative log-likelihood with a convex program embedded in a genetic algorithm.

Usage

1
2
3
4
GADAG_Run(X, lambda, threshold = 0.1, GADAG.control = list(n.gen = 100,
  tol.Shannon = 1e-06, max.eval = 10000, pop.size = 10, p.xo = 0.25, p.mut =
  0.05), grad.control = list(tol.obj.inner = 1e-06, max.ite.inner = 50),
  ncores = 1, print.level = 0, return.level = 0)

Arguments

X

Design matrix, with samples (n) in rows and variables (p) in columns.

lambda

Parameter of penalization (>0).

threshold

Thresholding value for the estimated edges.

GADAG.control

A list containing parameters for controlling GADAG (termination conditions and inherent parameters of the Genetic Algortihm). Some parameters (n.gen, max.eval and pop.size) are particularly critical for reducing the computational time.

  • n.gen maximal number of population generations (>0),

  • pop.size initial population size for the genetic algorithm (>0),

  • max.eval overall maximal number of calls of the evaluation function (>0, should be of the order of n.gen*pop.size),

  • tol.Shannon threshold for the Shannon entropy (>0),

  • p.xo crossover probability of the genetic algorithm (between 0 and 1),

  • p.mut mutation probability of the genetic algorithm (between 0 and 1).

grad.control

A list containing the parameters for controlling the inner optimization, i.e. the gradient descent.

  • tol.obj.inner tolerance (>0),

  • max.ite.inner maximum number of iterations (>0).

ncores

Number of cores (>0, depending on your computer).

print.level

0 no print, 1 some info on the genetic algorithm behaviour are printed.

return.level

0 only best solution is returned, 1 evolution of the current best solution and statistics on the population fitness values are also returned.

Details

This function returns as a primary output G.best, the adjacency matrix of the inferred graph. This matrix is computed thanks to its decomposition (P.best, T.best).

The values of the inputs n.gen, max.eval and pop.size largely influence the algorithm inference capability, but also its computational cost. As a rule-of-thumb, we recommend setting pop.size between 1 to 10 times the number of nodes, and n.gen between 10 to 100 times pop.size. tol.Shannon may be decreased in case of premature stop. The other parameters should only be modified with care.

Value

A list with the following elements:

Author(s)

Magali Champion, Victor Picheny and Matthieu Vignes

See Also

GADAG, GADAG_Run, GADAG_Analyze.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
 #############################################################
 # Loading toy data
 #############################################################
 data(toy_data)
 # toy_data is a list of two matrices corresponding to a "star"
 # DAG (node 1 activates all other nodes):
 # - toy_data$X is a 100x10 design matrix
 # - toy_data$G is the 10x10 adjacency matrix (ground trough)

 #############################################################
 # Running GADAG
 #############################################################
 # Simple run, with only the penalty term specified
 GADAG_results <- GADAG_Run(X=toy_data$X, lambda=0.1)
 print(GADAG_results$G.best) # optimal adjacency matrix graph

 # Expensive run with many evaluations if we refine the
 # termination conditions
 ## Not run: 
 n.gen <- 1e10 # we allow a very large number of iterations
 tol.Shannon <- 1e-10 # the entropy of Shannon of the population
                      # has to be very small
 pop.size <- 5*ncol(toy_data$G) # this is usually a good
                                # population size
 max.eval <- n.gen * pop.size # maximal number of nested
                              # evaluation
 GADAG_results <- GADAG_Run(X=toy_data$X, lambda=0.1,
      GADAG.control=list(n.gen=n.gen, tol.Shannon=tol.Shannon,
                         pop.size = pop.size, max.eval=max.eval))
 print(GADAG_results$G.best) # optimal adjacency matrix graph
 
## End(Not run)

 # Expensive run if we also increase the population size
 ## Not run: 
 pop.size <- 10*ncol(toy_data$G)
 GADAG_results <- GADAG_Run(X=toy_data$X, lambda=0.1,
      GADAG.control=list(pop.size=pop.size))
 print(GADAG_results$G.best) # optimal adjacency matrix graph
 
## End(Not run)

 # You can have more information about the evolution of the
 # algorithm by turning return.level on
 ## Not run: 
 return.level <- 1
 GADAG_results <- GADAG_Run(X=toy_data$X, lambda=0.1, return.level = return.level)
 print(GADAG_results$f.best.evol) # this shows the evolution of the fitness
                                  # across the iterations
 
## End(Not run)

GADAG documentation built on May 2, 2019, 3:25 p.m.