Description Usage Arguments Details Value Author(s) See Also Examples
Function to run GADAG, an algorithm that aims at inferring large sparse directed acyclic graphs based on an observation sample X, by minimizing the penalized negative log-likelihood with a convex program embedded in a genetic algorithm.
1 2 3 4 |
X |
Design matrix, with samples (n) in rows and variables (p) in columns. |
lambda |
Parameter of penalization (>0). |
threshold |
Thresholding value for the estimated edges. |
GADAG.control |
A list containing parameters for controlling GADAG (termination conditions and inherent parameters of the Genetic Algortihm). Some parameters (n.gen, max.eval and pop.size) are particularly critical for reducing the computational time.
|
grad.control |
A list containing the parameters for controlling the inner optimization, i.e. the gradient descent.
|
ncores |
Number of cores (>0, depending on your computer). |
print.level |
0 no print, 1 some info on the genetic algorithm behaviour are printed. |
return.level |
0 only best solution is returned, 1 evolution of the current best solution and statistics on the population fitness values are also returned. |
This function returns as a primary output G.best
, the adjacency matrix of the inferred graph. This matrix is computed thanks
to its decomposition (P.best
, T.best
).
The values of the inputs n.gen
, max.eval
and pop.size
largely influence the algorithm inference capability,
but also its computational cost. As a rule-of-thumb, we recommend setting pop.size
between 1 to 10 times the number of nodes,
and n.gen
between 10 to 100 times pop.size
. tol.Shannon
may be decreased in case of premature stop. The other
parameters should only be modified with care.
A list with the following elements:
f.best
Best fitness value.
P.best
Best node order (vector of length p).
T.best
Corresponding best edges values (vector of length p).
G.best
Best graph (matrix form).
f.best.evol
Evolution of the best fitness value across the iterations (if return.level=1).
P.best.evol
Evolution of the best node order across the iterations (if return.level=1).
T.best.evol
Evolution of the best edges values across the iterations (if return.level=1).
fmin.evol
Evolution of the minimal fitness value of the population across the iterations (if return.level=1).
fmean.evol
Evolution of the averaged fitness value of the population across the iterations (if return.level=1).
fp10.evol
Evolution of the quantiles of the fitness value across the iterations (if return.level=1).
fp90.evol
Evolution of the quantiles of the fitness value across the iterations (if return.level=1).
Shannon.evol
Evolution of the Shannon entropy of the population across the iterations (if return.level=1).
Magali Champion, Victor Picheny and Matthieu Vignes
GADAG
, GADAG_Run
, GADAG_Analyze
.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 | #############################################################
# Loading toy data
#############################################################
data(toy_data)
# toy_data is a list of two matrices corresponding to a "star"
# DAG (node 1 activates all other nodes):
# - toy_data$X is a 100x10 design matrix
# - toy_data$G is the 10x10 adjacency matrix (ground trough)
#############################################################
# Running GADAG
#############################################################
# Simple run, with only the penalty term specified
GADAG_results <- GADAG_Run(X=toy_data$X, lambda=0.1)
print(GADAG_results$G.best) # optimal adjacency matrix graph
# Expensive run with many evaluations if we refine the
# termination conditions
## Not run:
n.gen <- 1e10 # we allow a very large number of iterations
tol.Shannon <- 1e-10 # the entropy of Shannon of the population
# has to be very small
pop.size <- 5*ncol(toy_data$G) # this is usually a good
# population size
max.eval <- n.gen * pop.size # maximal number of nested
# evaluation
GADAG_results <- GADAG_Run(X=toy_data$X, lambda=0.1,
GADAG.control=list(n.gen=n.gen, tol.Shannon=tol.Shannon,
pop.size = pop.size, max.eval=max.eval))
print(GADAG_results$G.best) # optimal adjacency matrix graph
## End(Not run)
# Expensive run if we also increase the population size
## Not run:
pop.size <- 10*ncol(toy_data$G)
GADAG_results <- GADAG_Run(X=toy_data$X, lambda=0.1,
GADAG.control=list(pop.size=pop.size))
print(GADAG_results$G.best) # optimal adjacency matrix graph
## End(Not run)
# You can have more information about the evolution of the
# algorithm by turning return.level on
## Not run:
return.level <- 1
GADAG_results <- GADAG_Run(X=toy_data$X, lambda=0.1, return.level = return.level)
print(GADAG_results$f.best.evol) # this shows the evolution of the fitness
# across the iterations
## End(Not run)
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.