HEMDAG: Hierarchical Ensemble Methods for Directed Acyclic Graphs

Documented in Do.TPR.DAG Do.TPR.DAG.holdout TPR.DAG

##******************##
## TPR-DAG VARIANTS ##
##******************##
#' @name TPR-DAG-variants
#' @seealso \code{\link{GPAV}}, \code{\link{HTD-DAG}}
#' @title TPR-DAG Ensemble Variants
#' @description Function gathering the true-path-rule-based hierarchical learning ensemble algorithms and its variants. 
#' In their more general form the \code{TPR-DAG} algorithms adopt a two step learning strategy:
#' \enumerate{
#'  \item in the first step they compute a \emph{per-level bottom-up} visit from the leaves to the root to propagate positive predictions 
#'  across the hierarchy;
#'  \item in the second step they compute a \emph{per-level top-down} visit from the root to the leaves in order to assure the hierarchical 
#'  consistency of the predictions;
#' }
#' @details The \emph{vanilla} \code{TPR-DAG} adopts a per-level bottom-up traversal of the DAG to correct the flat predictions \eqn{\hat{y}_i}:
#' \deqn{
#'  \bar{y}_i := \frac{1}{1 + |\phi_i|} (\hat{y}_i + \sum_{j \in \phi_i} \bar{y}_j)
#' }
#' where \eqn{\phi_i} are the positive children of \eqn{i}.
#' Different strategies to select the positive children \eqn{\phi_i} can be applied:
#' \enumerate{
#'  \item \strong{Threshold-Free} strategy: the positive nodes are those children that can increment the score of the node \eqn{i}, that is those nodes 
#'  that achieve a score higher than that of their parents:
#'  \deqn{
#'      \phi_i := \{ j \in child(i) | \bar{y}_j > \hat{y}_i \}
#'  }
#'  \item \strong{Threshold} strategy: the positive children are selected on the basis of a threshold that can be selected in two different ways:
#'  \enumerate{
#'      \item for each node a constant threshold \eqn{\bar{t}} is a priori selected:
#'      \deqn{
#'          \phi_i := \{ j \in child(i) | \bar{y}_j > \bar{t} \}
#'      }
#'      For instance if the predictions represent probabilities it could be meaningful to a priori select \eqn{\bar{t}=0.5}.
#'      \item the threshold is selected to maximize some performance metric \eqn{\mathcal{M}} estimated on the training data, as for instance
#'      the F-score or the AUPRC. In other words the threshold is selected to maximize some measure of accuracy of the predictions 
#'      \eqn{\mathcal{M}(j,t)} on the training data for the class \eqn{j} with respect to the threshold \eqn{t}. 
#'      The corresponding set of positives \eqn{\forall i \in V} is:
#'      \deqn{
#'          \phi_i := \{ j \in child(i) | \bar{y}_j > t_j^*,  t_j^* = \arg \max_{t} \mathcal{M}(j,t) \}
#'      }
#'      For instance \eqn{t_j^*} can be selected from a set of \eqn{t \in (0,1)} through internal cross-validation techniques.
#'  }
#' }
#' The weighted \code{TPR-DAG} version can be designed by adding a weight \eqn{w \in [0,1]} to balance between the 
#' contribution of the node \eqn{i} and that of its positive children \eqn{\phi}, through their convex combination:
#' \deqn{
#'  \bar{y}_i := w \hat{y}_i + \frac{(1 - w)}{|\phi_i|} \sum_{j \in \phi_i} \bar{y}_j
#' }
#' If \eqn{w=1} no weight is attributed to the children and the \code{TPR-DAG} reduces to the \code{HTD-DAG} algorithm, since in this
#' way only the prediction for node \eqn{i} is used in the bottom-up step of the algorithm. If \eqn{w=0} only the predictors 
#' associated to the children nodes vote to predict node \eqn{i}. In the intermediate cases we attribute more importance to the predictor for the
#' node \eqn{i} or to its children depending on the values of \eqn{w}.
#'
#' The contribution of the descendants of a given node decays exponentially with their distance from the node itself. To enhance the 
#' contribution of the most specific nodes to the overall decision of the ensemble we designed a novel variant that we named \code{DESCENS}. 
#' The novelty of \code{DESCENS} consists in strongly considering the contribution of all the descendants of each node instead of 
#' only that of its children. Therefore \code{DESCENS} predictions are more influenced by the information embedded in the leaves nodes, 
#' that are the classes containing the most informative and meaningful information from a biological and medical standpoint. 
#' For the choice of the ``positive'' descendants we use the same strategies adopted for the selection of the ``positive'' 
#' children shown above. Furthermore, we designed a variant specific only for \code{DESCENS}, that we named \code{DESCENS}-\eqn{\tau}.
#' The \code{DESCENS}-\eqn{\tau} variants balances the contribution between the ``positives'' children of a node \eqn{i} 
#' and that of its ``positives'' descendants excluding its children by adding a weight \eqn{\tau \in [0,1]}:
#' \deqn{
#' \bar{y}_i := \frac{\tau}{ 1 +|\phi_i|} ( \hat{y}_i + \sum_{j \in \phi_i} \bar{y}_j ) + \frac{1-\tau}{1+|\delta_i|} ( \hat{y}_i + \sum_{j\in \delta_i} \bar{y}_j )
#' }
#' where \eqn{\phi_i} are the ``positive'' children of \eqn{i} and \eqn{\delta_i=\Delta_i \setminus \phi_i} the descendants of \eqn{i} without its children. 
#' If \eqn{\tau=1} we consider only the contribution of the ``positive'' children of \eqn{i}; if \eqn{\tau=0} only the descendants that are not
#' children contribute to the score, while for intermediate values of \eqn{\tau} we can balance the contribution of \eqn{\phi_i} and 
#' \eqn{\delta_i} positive nodes.
#'
#' Simply by replacing the \code{HTD} (\code{\link{HTD-DAG}}) top-down step with the \code{GPAV} approach (\code{\link{GPAV}}) we can design the
#' \code{TPR-DAG} variant \code{ISO-TPR}. The most important feature of \code{ISO-TPR} is that it maintains the hierarchical constraints by
#' construction and selects the closest solution (in the least square sense) to the bottom-up predictions that obeys the true path rule.
#' Obviously, any aforementioned strategy for the selection of ``positive'' children or descendants can be applied before executing the \code{GPAV} correction.
#' @param S a named flat scores matrix with examples on rows and classes on columns.
#' @param g a graph of class \code{graphNEL}. It represents the hierarchy of the classes.
#' @param root name of the class that it is on the top-level of the hierarchy (\code{def. root="00"}).
#' @param positive choice of the \emph{positive} nodes to be considered in the bottom-up strategy. Can be one of the following values:
#' \itemize{
#'  \item \code{children} (\code{def.}): for each node are considered its positive children;
#'  \item \code{descendants}: for each node are considered its positive descendants;
#' }
#' @param bottomup strategy to enhance the flat predictions by propagating the positive predictions from leaves to root. 
#' It can be one of the following values:
#' \itemize{
#'  \item \code{threshold.free} (\code{def.}): positive nodes are selected on the basis of the \code{threshold.free} strategy (\code{def.});
#'  \item \code{threshold}: positive nodes are selected on the basis of the \code{threshold} strategy;
#'  \item \code{weighted.threshold.free}: positive nodes are selected on the basis of the \code{weighted.threshold.free} strategy;
#'  \item \code{weighted.threshold}: positive nodes are selected on the basis of the \code{weighted.threshold} strategy;
#'  \item \code{tau}: positive nodes are selected on the basis of the \code{tau} strategy. 
#'  NOTE: \code{tau} is only a \code{DESCENS} variants. If you use \code{tau} strategy you must set the parameter \code{positive=descendants};
#' }
#' @param topdown strategy to make the scores hierarchy-consistent. It can be one of the following values:
#' \itemize{
#'  \item \code{HTD} (\code{def.}): \code{HTD-DAG} strategy is applied (\code{\link{HTD-DAG}});
#'  \item \code{GPAV}: \code{GPAV} strategy is applied (\code{\link{GPAV}});
#' }
#' @param t threshold for the choice of positive nodes (\code{def. t=0}). Set \code{t} only for the variants that requiring 
#' a threshold for the selection of the positive nodes, otherwise set \code{t} to zero.
#' @param w weight to balance between the contribution of the node \eqn{i} and that of its positive nodes. Set \code{w} only for the
#' \emph{weighted} variants, otherwise set \code{w} to zero.
#' @param W vector of weight relative to a single example. If the vector \code{W} is not specified (by \code{def. W=NULL}), \code{W} is a unitary 
#' vector of the same length of the columns' number of the flat scores matrix (root node included). Set \code{W} only if \code{topdown=GPAV}.
#' @param parallel boolean value:
#' \itemize{
#'  \item \code{TRUE}: execute the parallel implementation of GPAV (\code{\link{GPAV.parallel}});
#'  \item \code{FALSE} (def.): execute the sequential implementation of GPAV (\code{\link{GPAV.over.examples}});
#' }
#' Use \code{parallel} if and only if \code{topdown=GPAV}; otherwise set \code{parallel=FALSE}.
#' @param ncores number of cores to use for parallel execution (\code{def. 8}). Set \code{ncores=1} if \code{parallel=FALSE}, 
#' otherwise set \code{ncores} to the desired number of cores.
#' Use \code{ncores} if and only if \code{topdown=GPAV}; otherwise set \code{parallel=1}.
#' @return a named matrix with the scores of the classes corrected according to the chosen algorithm.
#' @export 
#' @examples
#' data(graph);
#' data(scores);
#' data(labels);
#' root <- root.node(g);
#' S.hier <- TPR.DAG(S, g, root, positive="children", bottomup="threshold.free", topdown="HTD", 
#' t=0, w=0, W=NULL, parallel=FALSE, ncores=1);
TPR.DAG <- function(S, g, root="00", positive="children", bottomup="threshold.free", topdown="HTD",
    t=0, w=0, W=NULL, parallel=FALSE, ncores=1){
    
    ## Setting Check
    if(positive!="children" && positive!="descendants" || bottomup!="threshold" && bottomup!="threshold.free" && 
        bottomup!="weighted.threshold" && bottomup!="weighted.threshold.free" && bottomup!="tau" || topdown!="HTD" && topdown!="GPAV")
        stop("TPR-DAG: positive or bottomup or topdown value misspelled", call.=FALSE);
    if(positive=="children" && bottomup=="tau")
        stop("TPR-DAG: tau is a descendants variants. Please set positive to descendants", call.=FALSE);
    if(bottomup=="threshold" || bottomup=="tau")
        w <- 0;
    if(bottomup=="threshold.free"){
        t <- 0;
        w <- 0;
    }
    if(bottomup=="weighted.threshold.free")
        t <-0;
    if(t==1 || w==1)
        warning("TPR-DAG: when t or w is equal to 1, TPR-DAG is reduced to HTD-DAG", call.=FALSE);  
    if(topdown=="GPAV" && parallel==TRUE && ncores<2)
        warning("GPAV: set ncores greater than 2 to exploit the GPAV parallel version", call.=FALSE);
    if(topdown=="GPAV" && parallel==FALSE && ncores>=2)
        warning("TPR-DAG: no GPAV parallel version is running, but ncores is higher or equal to 2.", " Set 'ncores' to 1 to run the sequential version or set 'parallel' to TRUE to run the parallel version", call.=FALSE);
    if(topdown=="HTD" && (parallel==TRUE || ncores>=2))
        warning("TPR-DAG: does not exist a parallel version of HTD. The parameters 'parallel' and 'ncores' do not effect on 'HTD'", ". Set 'parallel' to FALSE and/or 'ncores' to 1 to avoid this warning message", call.=FALSE);   

    ## add root node to S if it does not exist
    if(!(root %in% colnames(S))){
        max.score <- max(S);
        z <- rep(max.score,nrow(S));
        S <- cbind(z,S);
        colnames(S)[1] <- root;
    }
    ## check consistency between nodes of g and classes of S
    class.check <- ncol(S)!=numNodes(g);
    if(class.check)
        stop("TPR-DAG: mismatch between the number of nodes of the graph and the number of class of the scores matrix", call.=FALSE);

    ## computing graph levels
    levels <- graph.levels(g,root);

    ## bottom-up visit: positive children selection
    if(positive=="children"){
        chd.bup <- get.children.bottom.up(g,levels);
        for(i in 1:length(chd.bup)){
            if(length(chd.bup[[i]])!=0){
                parent <- S[,names(chd.bup[i])];
                children <- as.matrix(S[,chd.bup[[i]]]);
                # colnames(children) <- chd.bup[[i]]
                for(j in 1:length(parent)){
                    if(bottomup=="threshold"){
                        child.set <- children[j,] > t;    # positive children selection
                        child.pos <- children[j,][child.set];
                        parent[j] <- (parent[j] + sum(child.pos))/(1+length(child.pos));  # flat scores correction
                    }else if(bottomup=="threshold.free"){
                        child.set <- children[j,] > parent[j]; # positive children selection
                        child.pos <- children[j,][child.set];
                        parent[j] <- (parent[j] + sum(child.pos))/(1+length(child.pos));  # flat score correction
                    }else if(bottomup=="weighted.threshold.free"){
                        child.set <- children[j,] > parent[j];    # positive children selection
                        child.pos <- children[j,][child.set];
                        if(length(child.pos)!=0){
                            parent[j] <- w*parent[j] + (1-w)*sum(child.pos)/length(child.pos);  # flat score correction
                        }
                    }else if(bottomup=="weighted.threshold"){
                        child.set <- children[j,] > t;    # positive children selection
                        child.pos <- children[j,][child.set];
                        if(length(child.pos)!=0){
                            parent[j] <- w*parent[j] + (1-w)*sum(child.pos)/length(child.pos);  # flat score prediction
                        }
                    }
                }   
                S[,names(chd.bup[i])] <- parent;
            }
        }
    ## bottom-up visit: positive descendants selection
    }else if(positive=="descendants"){
        if(bottomup=="tau")
            chd.bup <- get.children.bottom.up(g,levels);
        desc.bup <- build.descendants.bottom.up(g,levels);
        nodes <- names(desc.bup);
        for(i in 1:length(desc.bup)){
            if(length(desc.bup[[i]])!=1){
                node.curr <- nodes[i];
                parent <- S[,names(desc.bup[i])];
                tmp <- setdiff(desc.bup[[i]],node.curr);
                if(bottomup=="tau"){
                    delta <- setdiff(tmp, chd.bup[[i]]);  # descendants without children 
                    children <- as.matrix(S[,chd.bup[[i]]]);    # genes considering children node 
                    desc <-  as.matrix(S[,delta]);      # genes considering descendants nodes without children
                }else{
                    desc <- as.matrix(S[,tmp]);
                }
                for(j in 1:length(parent)){
                    if(bottomup=="threshold"){
                        desc.set <- desc[j,] > t;    # positive descendants selection
                        desc.pos <- desc[j,][desc.set];
                        parent[j] <- (parent[j] + sum(desc.pos))/(1+length(desc.pos));   # flat scores correction
                    }else if(bottomup=="threshold.free"){
                        desc.set <- desc[j,] > parent[j];   # positive descendants selection
                        desc.pos <- desc[j,][desc.set];
                        parent[j] <- (parent[j] + sum(desc.pos))/(1+length(desc.pos));   # flat scores correction
                    }else if(bottomup=="weighted.threshold.free"){
                        desc.set <- desc[j,] > parent[j];
                        desc.pos <- desc[j,][desc.set];
                        if(length(desc.pos)!=0){
                            parent[j] <- w*parent[j] + (1-w)*sum(desc.pos)/length(desc.pos);  # flat scores correction
                        }
                    }else if(bottomup=="weighted.threshold"){
                        desc.set <- desc[j,] > t;
                        desc.pos <- desc[j,][desc.set];
                        if(length(desc.pos)!=0){
                            parent[j] <- w*parent[j] + (1-w)*sum(desc.pos)/length(desc.pos);  # flat scores correction
                        }
                    }else if(bottomup=="tau"){
                        desc.set <- desc[j,] > parent[j];           # positive descendants (without children) selection
                        desc.pos <- desc[j,][desc.set];
                        child.set <- children[j,] > parent[j];      # positive children selection
                        child.pos <- children[j,][child.set];
                        parent[j] <- t * ((parent[j] + sum(child.pos))/(1+length(child.pos))) + (1-t) * ((parent[j] + sum(desc.pos))/(1+length(desc.pos)));
                    }
                    S[,names(desc.bup[i])] <- parent;
                }
            }
        }
    }
    # top-down visit
    if(topdown=="HTD"){
        par.tod <- get.parents.top.down(g,levels,root);
        for(i in 1:length(par.tod)){
            child <- S[,names(par.tod[i])];
            parents <- as.matrix(S[,par.tod[[i]]]);
            # colnames(parents) <- par.tod[[i]]
            # Note: the version with an apply and an ifelse statement is slower ...
            for(j in 1:length(child)){
                x <- min(parents[j,]);
                if(x < child[j]){
                    child[j] <- x;    # hierarchical correction
                }
            }
            S[,names(par.tod[i])] <- child;
        }
    }else if(topdown=="GPAV"){
        if(parallel){
            S <- GPAV.parallel(S, g, W=W, ncores=ncores);
        }else{
            S <- GPAV.over.examples(S, W=W, g);
        }
    } 
    return(S);
}


#' @name TPR-DAG-cross-validation
#' @title TPR-DAG cross-validation experiments
#' @seealso \code{\link{TPR-DAG-variants}}
#' @description High level function to correct the computed scores in a hierarchy according to the chosen ensemble algorithm. 
#' @details The parametric hierarchical ensemble variants are cross-validated by maximizing in according to the metric
#' chosen in the parameter \code{metric}, that is F-measure (\code{\link{Multilabel.F.measure}}) or AUPRC (\code{\link{AUPRC}}).
#' @details The function checks if the number of classes between the flat scores matrix and the annotations matrix mismatched.
#' If so, the number of terms of the annotations matrix is shrunk to the number of terms of the flat scores matrix and
#' the corresponding subgraph is computed as well. N.B.: it is supposed that all the nodes of the subgraph are accessible from the root.
#' @details We excluded the predictions of the root node in computing all the performances, since it is a \emph{dummy} node added 
#' to the ontology for practical reasons (e.g. some graph-based software may require a single root node to work). However, the root node scores 
#' are stored in the hierarchical scores matrix.
#' @param threshold range of threshold values to be tested in order to find the best threshold (\code{def:} \code{from:0.1}, 
#' \code{to:0.9}, \code{by:0.1}).
#' The denser the range is, the higher the probability to find the best threshold is, but obviously the execution time will be higher.
#' Set this parameter only for the \emph{thresholded} variants; for the \emph{threshold-free} variants, 
#' the parameter \code{threshold} is automatically set to zero.
#' @param weight range of weight values to be tested in order to find the best weight (\code{def:} \code{from:0.1}, \code{to:0.9}, \code{by:0.1}).
#' The denser the range is, the higher the probability to find the best threshold is, but obviously the execution time will be higher.
#' Set this parameter only for the \emph{weighted} variants; for the \emph{weight-free} variants,
#' the parameter \code{weight} is automatically set to zero.
#' @param kk number of folds of the cross validation (\code{def: kk=5}) on which tuning the parameters \code{threshold}, \code{weight} and 
#' \code{tau} of the parametric variants of the hierarchical ensemble algorithms. For the non-parametric variants
#' (i.e. if \code{bottomup = threshold.free}), the parameter \code{kk} is automatically set to zero. 
#' @param folds number of folds of the cross validation on which computing the performance metrics averaged across folds (\code{def. 5}).
#' If \code{folds=NULL}, the performance metrics are computed one-shot, otherwise the performance metrics are averaged across folds.
#' If \code{compute.performance} is set to \code{FALSE}, \code{folds} is automatically set to \code{NULL}.
#' @param seed initialization seed for the random generator to create folds (\code{def. 23}). If \code{NULL} folds are generated without seed 
#' initialization. The parameter \code{seed} controls both the parameter \code{kk} and the parameter \code{folds}.
#' If \code{compute.performance} is set to \code{FALSE} and \code{bottomup} is set to \code{threshold.free}, then 
#' \code{seed} is automatically set to \code{NULL}.
#' @param norm boolean value: should the flat scores matrix be normalized?
#' \itemize{
#' \item \code{TRUE} (\code{def.}): the flat scores matrix has been already normalized in according to a normalization method;  
#' \item \code{FALSE}: the flat scores matrix has not been normalized yet. See the parameter \code{norm.type} to set the on the fly 
#'  normalization method to apply among those possible;
#' }
#' @param norm.type can be one of the following three values: 
#'  \enumerate{
#'  \item \code{NULL} (\code{def.}): set \code{norm.type} to \code{NULL} if and only if the parameter \code{norm} is set to \code{TRUE};
#'  \item \code{MaxNorm}: each score is divided for the maximum of each class;
#'  \item \code{Qnorm}: quantile normalization. \pkg{preprocessCore} package is used;
#'  }
#' @param positive choice of the \emph{positive} nodes to be considered in the bottom-up strategy. Can be one of the following values:
#' \itemize{
#'  \item \code{children} (\code{def.}): for each node are considered its positive children;
#'  \item \code{descendants}: for each node are considered its positive descendants;
#' }
#' @param bottomup strategy to enhance the flat predictions by propagating the positive predictions from leaves to root. 
#' It can be one of the following values:
#' \itemize{
#'  \item \code{threshold.free} (\code{def.}): positive nodes are selected on the basis of the \code{threshold.free} strategy (\code{def.});
#'  \item \code{threshold}: positive nodes are selected on the basis of the \code{threshold} strategy;
#'  \item \code{weighted.threshold.free}: positive nodes are selected on the basis of the \code{weighted.threshold.free} strategy;
#'  \item \code{weighted.threshold}: positive nodes are selected on the basis of the \code{weighted.threshold} strategy;
#'  \item \code{tau}: positive nodes are selected on the basis of the \code{tau} strategy;
#'  NOTE: \code{tau} is only a \code{DESCENS} variants. If you use \code{tau} strategy you must set the parameter \code{positive=descendants};
#' }
#' @param topdown strategy to make the scores hierarchy-consistent. It can be one of the following values:
#' \itemize{
#'  \item \code{HTD} (\code{def.}): \code{HTD-DAG} strategy is applied (\code{\link{HTD-DAG}});
#'  \item \code{GPAV}: \code{GPAV} strategy is applied (\code{\link{GPAV}});
#' }
#' @param W vector of weight relative to a single example. If the vector \code{W} is not specified (by \code{def. W=NULL}), \code{W} is a unitary 
#' vector of the same length of the columns' number of the flat scores matrix (root node included). Set \code{W} only if \code{topdown=GPAV}.
#' @param parallel boolean value:
#' \itemize{
#'  \item \code{TRUE}: execute the parallel implementation of GPAV (\code{\link{GPAV.parallel}});
#'  \item \code{FALSE} (\code{def.}): execute the sequential implementation of GPAV (\code{\link{GPAV.over.examples}});
#' }
#' Use \code{parallel} if and only if \code{topdown=GPAV}; otherwise set \code{parallel=FALSE}.
#' @param ncores number of cores to use for parallel execution (\code{def. 8}). Set \code{ncores=1} if \code{parallel=FALSE}, 
#' otherwise set \code{ncores} to the desired number of cores. Use \code{ncores} only if \code{topdown=GPAV}; otherwise set \code{parallel=1}.
#' @param n.round number of rounding digits to be applied to the hierarchical scores matrix (\code{def. 3}). It is used for choosing 
#' the best threshold on the basis of the best F-measure.
#' If \code{compute.performance} is set to \code{FALSE} and \code{bottomup} is set to \code{threshold.free}, then 
#' \code{n.round} is automatically set to \code{NULL}.
#' @param f.criterion character. Type of F-measure to be used to select the best F-measure. Two possibilities:
#' \enumerate{
#' \item \code{F} (def.): corresponds to the harmonic mean between the average precision and recall;
#' \item \code{avF}: corresponds to the per-example \code{F-score} averaged across all the examples;
#' }
#' If \code{compute.performance} is set to \code{FALSE} and \code{bottomup} is set to \code{threshold.free}, then 
#' \code{f.criterion} is automatically set to \code{NULL}.
#' @param recall.levels a vector with the desired recall levels (\code{def:} \code{from:0.1}, \code{to:0.9}, \code{by:0.1}) to compute the 
#' Precision at fixed Recall level (PXR). If \code{compute.performance=FALSE} the parameter \code{recall.levels} is automatically set to \code{NULL}.
#' @param compute.performance boolean value: should the flat and hierarchical performance (\code{AUPRC}, \code{AUROC}, \code{PXR}, 
#' \code{multilabel F-score}) be returned?  
#' \itemize{
#' \item \code{FALSE} (\code{def.}): performance are not computed and just the hierarchical scores matrix is returned;
#' \item \code{TRUE}: both performance and hierarchical scores matrix are returned;
#' }
#' @param metric a string character specifying the performance metric on which to maximize the parametric ensemble variant. 
#' It can be one of the following values:
#' \enumerate{
#' \item \code{PRC}: the parametric ensemble variant is maximized on the basis of AUPRC (\code{\link{AUPRC}});
#' \item \code{FMAX}: the parametric ensemble variant is maximized on the basis of Fmax (\code{\link{Multilabel.F.measure}};
#' \item \code{NULL}: on the \code{threshold.free} variant none parameter optimization is needed, since the variant is non-parametric.
#' So, if \code{bottomup=threshold.free} set \code{metric=NULL} (\code{def.});
#' }
#' @param flat.file name of the file containing the flat scores matrix to be normalized or already normalized (without rda extension).
#' @param ann.file name of the file containing the label matrix of the examples (without rda extension).
#' @param dag.file name of the file containing the graph that represents the hierarchy of the classes (without rda extension).
#' @param flat.dir relative path where flat scores matrix is stored.
#' @param ann.dir relative path where annotation matrix is stored.
#' @param dag.dir relative path where graph is stored.
#' @param hierScore.dir relative path where the hierarchical scores matrix must be stored.
#' @param perf.dir relative path where the performance measures must be stored. If \code{compute.performance=FALSE}, 
#' the parameter \code{perf.dir} is automatically set to \code{NULL}.
#' @return Two \code{rda} files stored in the respective output directories:
#' \enumerate{
#'  \item \code{Hierarchical Scores Results}: a matrix with examples on rows and classes on columns representing the computed hierarchical scores 
#'  for each example and for each considered class. It is stored in the \code{hierScore.dir} directory;
#'  \item \code{Performance Measures}: \emph{flat} and \emph{hierarchical} performance results:
#'  \enumerate{
#'      \item AUPRC results computed though \code{AUPRC.single.over.classes} (\code{\link{AUPRC}});
#'      \item AUROC results computed through \code{AUROC.single.over.classes} (\code{\link{AUROC}}); 
#'      \item PXR results computed though \code{precision.at.given.recall.levels.over.classes} (\code{\link{PXR}});
#'      \item FMM results computed though \code{compute.Fmeasure.multilabel} (\code{\link{FMM}}); 
#' }}
#' It is stored in the \code{perf.dir} directory.
#' @export
#' @examples
#' data(graph);
#' data(scores);
#' data(labels);
#' tmpdir <- paste0(tempdir(),"/");
#' save(g, file=paste0(tmpdir,"graph.rda"));
#' save(L, file=paste0(tmpdir,"labels.rda"));
#' save(S, file=paste0(tmpdir,"scores.rda"));
#' dag.dir <- flat.dir <- ann.dir <- tmpdir;
#' hierScore.dir <- perf.dir <- tmpdir;
#' dag.file <- "graph";
#' flat.file <- "scores";
#' ann.file <- "labels";
#' threshold <- weight <- 0;
#' norm.type <- "MaxNorm";
#' positive <- "children";
#' bottomup <- "threshold.free";
#' topdown <- "HTD";
#' recall.levels <- seq(from=0.25, to=1, by=0.25);
#' Do.TPR.DAG(threshold=threshold, weight=weight, kk=NULL, folds=NULL, seed=NULL, 
#' norm=FALSE, norm.type=norm.type, positive=positive, bottomup=bottomup, topdown=topdown, 
#' W=NULL, compute.performance=TRUE, parallel=FALSE, ncores=1, n.round=3, f.criterion="F", 
#' metric=NULL, recall.levels=recall.levels, flat.file=flat.file, ann.file=ann.file, 
#' dag.file=dag.file, flat.dir=flat.dir, ann.dir=ann.dir, dag.dir=dag.dir, 
#' hierScore.dir=hierScore.dir, perf.dir=perf.dir);
Do.TPR.DAG <- function(threshold=seq(from=0.1, to=0.9, by=0.1), weight=seq(from=0.1, to=0.9, by=0.1), 
    kk=5, folds=5, seed=23, norm=TRUE, norm.type=NULL, positive="children", bottomup="threshold.free", 
    topdown="HTD", W=NULL, parallel=FALSE, ncores=1, n.round=3, f.criterion="F", metric=NULL, 
    recall.levels=seq(from=0.1, to=1, by=0.1), compute.performance=FALSE, flat.file=flat.file, 
    ann.file=ann.file, dag.file=dag.file, flat.dir=flat.dir, ann.dir=ann.dir, dag.dir=dag.dir, 
    hierScore.dir=hierScore.dir, perf.dir=perf.dir){
    
    ## Setting Check
    if(positive!="children" && positive!="descendants" || bottomup!="threshold" && bottomup!="threshold.free" && bottomup!="weighted.threshold" && bottomup!="weighted.threshold.free" && bottomup!="tau" || topdown!="HTD" && topdown!="GPAV")
        stop("TPR-DAG: positive or bottomup or topdown value misspelled", call.=FALSE);
    if(positive=="children" && bottomup=="tau")
        stop("TPR-DAG: tau is a descendants variants. Please set positive to descendants", call.=FALSE);
    if(bottomup=="threshold" || bottomup=="tau")
        weight <- 0;
    if(bottomup=="threshold.free"){
        threshold <- 0; 
        weight <- 0;
    }
    if(bottomup=="weighted.threshold.free")
        threshold <- 0;
    if(norm==FALSE && is.null(norm.type))
        stop("TPR-DAG: If norm is set to FALSE, you need to specify a normalization method among those available", call.=FALSE);
    if(norm==TRUE && !is.null(norm.type))
        warning("TPR-DAG: If norm is set to TRUE, the input flat matrix is already normalized.", paste0(" Set norm.type to NULL and not to '", norm.type, "' to avoid this warning message"), call.=FALSE);
    if((is.null(kk) || kk<=1) && bottomup!="threshold.free")
        stop("TPR-DAG: Smallest number of folds to define test and training set is 2. Set kk larger or equal to 2", call.=FALSE);
    if(!is.null(kk) && bottomup=="threshold.free")
        kk <- NULL; 
    if(f.criterion!="F" && f.criterion!="avF" && compute.performance==TRUE)
        stop("TPR-DAG: value of parameter 'f.criterion' misspelled");   
    if(metric!="FMAX" && metric!="PRC" && !is.null(metric))
        stop("TPR-DAG: value of parameter 'metric' misspelled", call.=FALSE);
    if(is.null(metric) && bottomup!="threshold.free")
        stop(paste0("TPR-DAG: metric cannot be NULL. The bottom-up approach '", bottomup, "' is parametric"), ". Please select the metric on which maximize according to those available", call.=FALSE); 
    if(!is.null(metric) && bottomup=="threshold.free") 
        warning("TPR-DAG: the bottom-up approach 'threshold.free' is non-parametric.", paste0(" The chosen parameter metric '", metric, "' does not effect on 'threshold.free' bottom-up approach"), ". Set metric to NULL to avoid this warning message", call.=FALSE);
    if(is.null(seed) && bottomup!="threshold.free")
        warning("TPR-DAG: folds are generate without seed initialization", call.=FALSE);
    if(compute.performance==FALSE && (!is.null(recall.levels) || !is.null(perf.dir) || !is.null(folds))){
        perf.dir <- NULL;
        recall.levels <- NULL;
        folds <- NULL;
    }
    if(compute.performance==FALSE && bottomup=="threshold.free" && (!is.null(seed) || !is.null(n.round) || !is.null(f.criterion))){
        seed <- NULL;
        n.round <- NULL;
        f.criterion <- NULL;
    }

    ## loading dag
    dag.path <- paste0(dag.dir, dag.file,".rda");
    g <- get(load(dag.path));
    root <- root.node(g);

    ## loading annotation matrix
    ann.path <- paste0(ann.dir, ann.file,".rda");
    ann <- get(load(ann.path));
    if(root %in% colnames(ann))
        ann <- ann[,-which(colnames(ann)==root)];

    ## loading flat matrix
    flat.path <- paste0(flat.dir, flat.file,".rda");
    if(norm){
        S <- get(load(flat.path));
        if(root %in% colnames(S)){
            classes.names <- colnames(S)
            root.scores <- S[,which(colnames(S)==root)];
            S <- S[,-which(colnames(S)==root)];
        }
    }else{
        S <- get(load(flat.path));
        S <- scores.normalization(norm.type=norm.type, S);
        cat(norm.type, "NORMALIZATION: DONE", "\n");
        if(root %in% colnames(S)){
            classes.names <- colnames(S)
            root.scores <- S[,which(colnames(S)==root)];
            S <- S[,-which(colnames(S)==root)];
        }
    }

    ## check if |flat matrix classes| = |annotation matrix classes| 
    ## if not the classes of annotation matrix are shrunk to those of flat matrix
    class.check <- ncol(S)!=ncol(ann);
    if(class.check){
        ann <- ann[,colnames(S)];
        nd <- c(root, colnames(S));
        g <- do.subgraph(nd, g, edgemode="directed");
    }

    ## Compute FLAT PRC, AUC, PXR (average and per class) and FMM (average and per-example) one-shoot or cross-validated 
    if(compute.performance){
        PRC.flat <- AUPRC.single.over.classes(ann, S, folds=folds, seed=seed);
        AUC.flat <- AUROC.single.over.classes(ann, S, folds=folds, seed=seed);
        PXR.flat <- precision.at.given.recall.levels.over.classes(ann, S, folds=folds, seed=seed, recall.levels=recall.levels);
        FMM.flat <- compute.Fmeasure.multilabel(ann, S, n.round=n.round, f.criterion=f.criterion, verbose=FALSE, b.per.example=TRUE, folds=folds, seed=seed);
        cat("FLAT PERFORMANCE: DONE", "\n");
    }

    ## Hierarchical Correction 
    if(bottomup=="threshold.free"){
        if(exists("root.scores")){
            S <- cbind(root.scores, S);
            colnames(S)[1] <- root;
            S <- S[,classes.names];
        }
        S <- TPR.DAG(S, g, root=root, positive=positive, bottomup=bottomup, topdown=topdown, t=0, w=0, W=W, parallel=parallel, ncores=ncores);
        S.hier <- S; 
        cat("HIERARCHICAL CORRECTION: DONE", "\n");
        if(compute.performance){
            if(root %in% colnames(S))
                S <- S[,-which(colnames(S)==root)];
            ## Compute HIER PRC, AUC, PXR (average and per class) and FMM (average and per-example) one-shoot or cross-validated
            PRC.hier <- AUPRC.single.over.classes(ann, S, folds=folds, seed=seed);
            AUC.hier <- AUROC.single.over.classes(ann, S, folds=folds, seed=seed);
            PXR.hier <- precision.at.given.recall.levels.over.classes(ann, S, folds=folds, seed=seed, recall.levels=recall.levels);
            FMM.hier <- compute.Fmeasure.multilabel(ann, S, n.round=n.round, f.criterion=f.criterion, verbose=FALSE, b.per.example=TRUE, folds=folds, seed=seed);
            cat("HIERARCHICAL PERFORMANCE: DONE", "\n");
        }
        rm(S);
    }else{
        if(exists("root.scores")){
            S <- cbind(root.scores, S);
            colnames(S)[1] <- root;
            S <- S[,classes.names];
        }
        ## Let's start k-fold crossing validation for choosing best threshold and weight maximizing on the selected metric
        testIndex <- do.unstratified.cv.data(S, kk=kk, seed=seed); 
        S.hier <- c(); # variable to host the k-assembled sub-matrix  
        # training.top <- vector(mode="list", length=kk); ## for check
        for(k in 1:kk){
            ## training test
            training <- S[-testIndex[[k]],];
            target.training <- ann[-testIndex[[k]],];           
            ## test set
            test <- S[testIndex[[k]],]; 
            target.test <- ann[testIndex[[k]],];
            ## metric initialization        
            top.metric <- 0;
            bestT <- 0;
            bestW <- 0;
            for(t in threshold){
                for(w in weight){
                    pred.training <- TPR.DAG(training, g, root=root, positive=positive, bottomup=bottomup, topdown=topdown, w=w, t=t, W=W, parallel=parallel, ncores=ncores);
                    if(metric=="FMAX"){
                        if(root %in% colnames(pred.training))
                            pred.training <- pred.training[,-which(colnames(pred.training)==root)];
                        training.metric <- find.best.f(target.training, pred.training, n.round=n.round, f.criterion=f.criterion, verbose=FALSE, b.per.example=FALSE)[["F"]];
                    }else{
                        if(root %in% colnames(pred.training))
                            pred.training <- pred.training[,-which(colnames(pred.training)==root)];
                        training.metric <- AUPRC.single.over.classes(target.training, pred.training, folds=NULL, seed=NULL)$average;
                    }
                    if(training.metric > top.metric){
                        top.metric <- training.metric;
                        bestT <- t;
                        bestW <- w;
                        # training.top[[k]] <- c(metric=top.metric, best.thres=bestT, best.weight=bestW);
                        if(bottomup=="threshold" || bottomup=="tau"){
                            cat("training fold:", k, paste0("top ", metric," avg found:"), top.metric, "best threshold:", bestT, sep="\t", "\n");
                        }else if(bottomup=="weighted.threshold.free"){
                            cat("training fold:", k, paste0("top ", metric," avg found:"), top.metric, "best weight:", bestW, sep="\t", "\n");
                        }
                        else{
                            cat("training fold:", k, paste0("top ", metric," avg found:"), top.metric, "best threshold:",bestT, "best weight:", bestW, sep="\t", "\n");
                        }
                    }
                }
            }
            ## test set 
            pred.test <- TPR.DAG(test, g, root=root, positive=positive, bottomup=bottomup, topdown=topdown, t=bestT, w=bestW, W=W, parallel=parallel, ncores=ncores);
            ## assembling the hierarchical scores of each k sub-matrix
            S.hier <- rbind(S.hier, pred.test); 
        }
        ## put the rows (i.e. genes) of assembled k sub-matrix in the same order of the beginning matrix
        S.hier <- S.hier[rownames(S),];
        cat("HIERARCHICAL CORRECTION: DONE", "\n");
        if(compute.performance){
            S <- S.hier; ## trick: recycle S to store hierarchical scores matrix with root node
            if(root %in% colnames(S.hier))
                S.hier <- S.hier[,-which(colnames(S.hier)==root)];
            ## Compute HIER PRC, AUC, PXR (average and per class) and FMM (average and per-example) one-shoot or cross-validated
            PRC.hier <- AUPRC.single.over.classes(ann, S.hier, folds=folds, seed=seed);
            AUC.hier <- AUROC.single.over.classes(ann, S.hier, folds=folds, seed=seed);
            PXR.hier <- precision.at.given.recall.levels.over.classes(ann, S.hier, folds=folds, seed=seed, recall.levels=recall.levels);
            FMM.hier <- compute.Fmeasure.multilabel(ann, S.hier, n.round=n.round, f.criterion=f.criterion, verbose=FALSE, b.per.example=TRUE, folds=folds, seed=seed);
            S.hier <- S;
            cat("HIERARCHICAL PERFORMANCE: DONE", "\n");
        }    
        ## remove no longer useful variables..
        rm(S, testIndex, pred.test, test, training, target.test, target.training); gc();
    }

    ## Storing Results 
    if(positive=="children" && bottomup=="threshold.free" && topdown=="HTD")
        meth.name <- "tprTF";
    if(positive=="children" && bottomup=="threshold" && topdown=="HTD")
        meth.name <- "tprT";
    if(positive=="children" && bottomup=="weighted.threshold.free" && topdown=="HTD")
        meth.name <- "tprW";
    if(positive=="children" && bottomup=="weighted.threshold" && topdown=="HTD")
        meth.name <- "tprWT";
    if(positive=="descendants" && bottomup=="threshold.free" && topdown=="HTD")
        meth.name <- "descensTF";
    if(positive=="descendants" && bottomup=="threshold" && topdown=="HTD")
        meth.name <- "descensT";
    if(positive=="descendants" && bottomup=="weighted.threshold.free" && topdown=="HTD")
        meth.name <- "descensW";
    if(positive=="descendants" && bottomup=="weighted.threshold" && topdown=="HTD")
        meth.name <- "descensWT";
    if(positive=="descendants" && bottomup=="tau" && topdown=="HTD")
        meth.name <- "descensTAU";

    if(positive=="children" && bottomup=="threshold.free" && topdown=="GPAV")
        meth.name <- "ISOtprTF";
    if(positive=="children" && bottomup=="threshold" && topdown=="GPAV")
        meth.name <- "ISOtprT";
    if(positive=="children" && bottomup=="weighted.threshold.free" && topdown=="GPAV")
        meth.name <- "ISOtprW";
    if(positive=="children" && bottomup=="weighted.threshold" && topdown=="GPAV")
        meth.name <- "ISOtprWT";
    if(positive=="descendants" && bottomup=="threshold.free" && topdown=="GPAV")
        meth.name <- "ISOdescensTF";
    if(positive=="descendants" && bottomup=="threshold" && topdown=="GPAV")
        meth.name <- "ISOdescensT";
    if(positive=="descendants" && bottomup=="weighted.threshold.free" && topdown=="GPAV")
        meth.name <- "ISOdescensW";
    if(positive=="descendants" && bottomup=="weighted.threshold" && topdown=="GPAV")
        meth.name <- "ISOdescensWT";
    if(positive=="descendants" && bottomup=="tau" && topdown=="GPAV")
        meth.name <- "ISOdescensTAU";

    if(norm){
        save(S.hier, file=paste0(hierScore.dir, flat.file, ".hierScores.",meth.name,".rda"), compress=TRUE);
        if(compute.performance){
            save(PRC.flat, PRC.hier, AUC.flat, AUC.hier, PXR.flat, PXR.hier, FMM.flat, FMM.hier, file=paste0(perf.dir, "PerfMeas.", flat.file, ".hierScores.",meth.name,".rda"), compress=TRUE);
        }
    }else{
        save(S.hier, file=paste0(hierScore.dir, norm.type,".", flat.file, ".hierScores.",meth.name,".rda"), compress=TRUE);
        if(compute.performance){
            save(PRC.flat, PRC.hier, AUC.flat, AUC.hier, PXR.flat, PXR.hier, FMM.flat, FMM.hier, file=paste0(perf.dir, "PerfMeas.", norm.type, ".", flat.file, ".hierScores.",meth.name,".rda"), compress=TRUE);
        }
    }
}

#' @name TPR-DAG-holdout
#' @title TPR-DAG holdout experiments
#' @seealso \code{\link{TPR-DAG-variants}}
#' @description High level function to correct the computed scores in a hierarchy according to the chosen ensemble algorithm. 
#' @details The parametric hierarchical ensemble variants are cross-validated by maximizing in according to the metric
#' chosen in the parameter \code{metric}, that is F-measure (\code{\link{Multilabel.F.measure}}) or AUPRC (\code{\link{AUPRC}}).
#' @details The function checks if the number of classes between the flat scores matrix and the annotations matrix mismatched.
#' If so, the number of terms of the annotations matrix is shrunk to the number of terms of the flat scores matrix and
#' the corresponding subgraph is computed as well. N.B.: it is supposed that all the nodes of the subgraph are accessible from the root.
#' @details We excluded the predictions of the root node in computing all the performances, since it is a \emph{dummy} node added 
#' to the ontology for practical reasons (e.g. some graph-based software may require a single root node to work). However, the root node scores 
#' are stored in the hierarchical scores matrix.
#' @param threshold range of threshold values to be tested in order to find the best threshold (\code{def:} \code{from:0.1}, 
#' \code{to:0.9}, \code{by:0.1}).
#' The denser the range is, the higher the probability to find the best threshold is, but obviously the execution time will be higher.
#' Set this parameter only for the \emph{thresholded} variants; for the \emph{threshold-free} variants, 
#' the parameter \code{threshold} is automatically set to zero.
#' @param weight range of weight values to be tested in order to find the best weight (\code{def:} \code{from:0.1}, \code{to:0.9}, \code{by:0.1}).
#' The denser the range is, the higher the probability to find the best threshold is, but obviously the execution time will be higher.
#' Set this parameter only for the \emph{weighted} variants; for the \emph{weight-free} variants,
#' the parameter \code{weight} is automatically set to zero.
#' @param kk number of folds of the cross validation (\code{def: kk=5}) on which tuning the parameters \code{threshold}, \code{weight} and 
#' \code{tau} of the parametric variants of the hierarchical ensemble algorithms. For the non-parametric variants
#' (i.e. if \code{bottomup = threshold.free}), the parameter \code{kk} is automatically set to zero.
#' @param folds number of folds of the cross validation on which computing the performance metrics averaged across folds (\code{def. 5}).
#' If \code{folds=NULL}, the performance metrics are computed one-shot, otherwise the performance metrics are averaged across folds.
#' If \code{compute.performance} is set to \code{FALSE}, \code{folds} is automatically set to \code{NULL}.
#' @param seed initialization seed for the random generator to create folds (\code{def. 23}). If \code{NULL} folds are generated without seed 
#' initialization. The parameter \code{seed} controls both the parameter \code{kk} and the parameter \code{folds}.
#' If \code{compute.performance} is set to \code{FALSE} and \code{bottomup} is set to \code{threshold.free}, then 
#' \code{seed} is automatically set to \code{NULL}.
#' @param norm boolean value: should the flat scores matrix be normalized?
#' \itemize{
#' \item \code{TRUE} (\code{def.}): the flat scores matrix has been already normalized in according to a normalization method;  
#' \item \code{FALSE}: the flat scores matrix has not been normalized yet. See the parameter \code{norm.type} to set the on the fly 
#'  normalization method to apply among those possible;
#' }
#' @param norm.type can be one of the following three values: 
#'  \enumerate{
#'  \item \code{NULL} (\code{def.}): set \code{norm.type} to \code{NULL} if and only if the parameter \code{norm} is set to \code{TRUE};
#'  \item \code{MaxNorm}: each score is divided for the maximum of each class;
#'  \item \code{Qnorm}: quantile normalization. \pkg{preprocessCore} package is used;
#'  }
#' @param positive choice of the \emph{positive} nodes to be considered in the bottom-up strategy. Can be one of the following values:
#' \itemize{
#'  \item \code{children} (\code{def.}): for each node are considered its positive children;
#'  \item \code{descendants}: for each node are considered its positive descendants;
#' }
#' @param bottomup strategy to enhance the flat predictions by propagating the positive predictions from leaves to root. 
#' It can be one of the following values:
#' \itemize{
#'  \item \code{threshold.free} (\code{def.}): positive nodes are selected on the basis of the \code{threshold.free} strategy (\code{def.});
#'  \item \code{threshold}: positive nodes are selected on the basis of the \code{threshold} strategy;
#'  \item \code{weighted.threshold.free}: positive nodes are selected on the basis of the \code{weighted.threshold.free} strategy;
#'  \item \code{weighted.threshold}: positive nodes are selected on the basis of the \code{weighted.threshold} strategy;
#'  \item \code{tau}: positive nodes are selected on the basis of the \code{tau} strategy; 
#'  NOTE: \code{tau} is only a \code{DESCENS} variants. If you use \code{tau} strategy you must set the parameter \code{positive=descendants};
#' }
#' @param topdown strategy to make the scores hierarchy-consistent. It can be one of the following values:
#' \itemize{
#'  \item \code{HTD} (\code{def.}): \code{HTD-DAG} strategy is applied (\code{\link{HTD-DAG}});
#'  \item \code{GPAV}: \code{GPAV} strategy is applied (\code{\link{GPAV}});
#' }
#' @param W vector of weight relative to a single example. If the vector \code{W} is not specified (by \code{def.} \code{W=NULL}), \code{W} is a unitary 
#' vector of the same length of the columns' number of the flat scores matrix (root node included). Set \code{W} only if \code{topdown=GPAV}.
#' @param parallel boolean value:
#' \itemize{
#'  \item \code{TRUE}: execute the parallel implementation of GPAV (\code{\link{GPAV.parallel}});
#'  \item \code{FALSE} (\code{def.}): execute the sequential implementation of GPAV (\code{\link{GPAV.over.examples}});
#' }
#' Use \code{parallel} if and only if \code{topdown=GPAV}; otherwise set \code{parallel=FALSE}.
#' @param ncores number of cores to use for parallel execution (\code{def. 8}). Set \code{ncores=1} if \code{parallel=FALSE}, 
#' otherwise set \code{ncores} to the desired number of cores.
#' Use \code{ncores} if and only if \code{topdown=GPAV}; otherwise set \code{parallel=1}.
#' @param recall.levels a vector with the desired recall levels (\code{def:} \code{from:0.1}, \code{to:0.9}, \code{by:0.1}) to compute the 
#' Precision at fixed Recall level (PXR). If \code{compute.performance=FALSE} then \code{recall.levels} is automatically set to \code{NULL}.
#' @param n.round number of rounding digits to be applied to the hierarchical scores matrix (\code{def. 3}). It is used for choosing 
#' the best threshold on the basis of the best F-measure.
#' If \code{compute.performance} is set to \code{FALSE} and \code{bottomup} is set to \code{threshold.free}, then 
#' \code{n.round} is automatically set to \code{NULL}.
#' @param f.criterion character. Type of F-measure to be used to select the best F-measure. Two possibilities:
#' \enumerate{
#' \item \code{F} (def.): corresponds to the harmonic mean between the average precision and recall;
#' \item \code{avF}: corresponds to the per-example \code{F-score} averaged across all the examples;
#' }
#' If \code{compute.performance} is set to \code{FALSE} and \code{bottomup} is set to \code{threshold.free}, then 
#' \code{f.criterion} is automatically set to \code{NULL}.
#' @param compute.performance boolean value: should the flat and hierarchical performance (\code{AUPRC}, \code{AUROC}, \code{PXR}, 
#' \code{multilabel F-score}) be returned?  
#' \itemize{
#' \item \code{FALSE}: performance are not computed and just the hierarchical scores matrix is returned;
#' \item \code{TRUE} (\code{def.}): both performance and hierarchical scores matrix are returned;
#' }
#' @param metric a string character specifying the performance metric on which to maximize the parametric ensemble variant. 
#' It can be one of the following values:
#' \enumerate{
#' \item \code{PRC}: the parametric ensemble variant is maximized on the basis of AUPRC (\code{\link{AUPRC}});
#' \item \code{FMAX}: the parametric ensemble variant is maximized on the basis of Fmax (\code{\link{Multilabel.F.measure}});
#' \item \code{NULL}: on the \code{threshold.free} variant none parameter optimization is needed, since the variant is non-parametric.
#' So, if \code{bottomup=threshold.free} set \code{metric=NULL} (\code{def.});
#' }
#' @param flat.file name of the file containing the flat scores matrix to be normalized or already normalized (without rda extension).
#' @param ann.file name of the file containing the label matrix of the examples (without rda extension).
#' @param dag.file name of the file containing the graph that represents the hierarchy of the classes (without rda extension).
#' @param ind.test.set name of the file containing a vector of integer numbers corresponding to the indices of the elements (rows) of scores 
#' matrix to be used in the test set. 
#' @param ind.dir relative path to folder where \code{ind.test.set} is stored.
#' @param flat.dir relative path where flat scores matrix is stored.
#' @param ann.dir relative path where annotation matrix is stored.
#' @param dag.dir relative path where graph is stored.
#' @param hierScore.dir relative path where the hierarchical scores matrix must be stored.
#' @param perf.dir relative path where the performance measures must be stored. If \code{compute.performance=FALSE}, 
#' the parameter \code{perf.dir} is automatically set to \code{NULL}.
#' @return Two \code{rda} files stored in the respective output directories:
#' \enumerate{
#'  \item \code{Hierarchical Scores Results}: a matrix with examples on rows and classes on columns representing the computed hierarchical scores 
#'  for each example and for each considered class. It is stored in the \code{hierScore.dir} directory;
#'  \item \code{Performance Measures}: \emph{flat} and \emph{hierarchical} performace results:
#'  \enumerate{
#'      \item AUPRC results computed though \code{AUPRC.single.over.classes} (\code{\link{AUPRC}});
#'      \item AUROC results computed through \code{AUROC.single.over.classes} (\code{\link{AUROC}}); 
#'      \item PXR results computed though \code{precision.at.given.recall.levels.over.classes} (\code{\link{PXR}});
#'      \item FMM results computed though \code{compute.Fmeasure.multilabel} (\code{\link{FMM}}); 
#' }}
#' It is stored in the \code{perf.dir} directory.
#' @export
#' @examples
#' data(graph);
#' data(scores);
#' data(labels);
#' data(test.index);
#' tmpdir <- paste0(tempdir(),"/");
#' save(g, file=paste0(tmpdir,"graph.rda"));
#' save(L, file=paste0(tmpdir,"labels.rda"));
#' save(S, file=paste0(tmpdir,"scores.rda"));
#' save(test.index, file=paste0(tmpdir,"test.index.rda"));
#' ind.dir <- dag.dir <- flat.dir <- ann.dir <- tmpdir;
#' hierScore.dir <- perf.dir <- tmpdir;
#' ind.test.set <- "test.index";
#' dag.file <- "graph";
#' flat.file <- "scores";
#' ann.file <- "labels";
#' threshold <- weight <- 0;
#' norm.type <- "MaxNorm";
#' positive <- "children";
#' bottomup <- "threshold.free";
#' topdown <- "HTD";
#' recall.levels <- seq(from=0.25, to=1, by=0.25);
#' Do.TPR.DAG.holdout(threshold=threshold, weight=weight, kk=NULL, folds=NULL, seed=NULL, norm=FALSE, 
#' norm.type=norm.type, positive=positive, bottomup=bottomup, topdown=topdown, W=NULL, 
#' compute.performance=TRUE, parallel=FALSE, ncores=1, recall.levels=recall.levels, n.round=3, 
#' f.criterion="F", metric=NULL, flat.file=flat.file, ann.file=ann.file, dag.file=dag.file, 
#' ind.test.set=ind.test.set, ind.dir=ind.dir, flat.dir=flat.dir, ann.dir=ann.dir, dag.dir=dag.dir, 
#' hierScore.dir=hierScore.dir, perf.dir=perf.dir);
Do.TPR.DAG.holdout <- function(threshold=seq(from=0.1, to=0.9, by=0.1), weight=seq(from=0.1, to=1, by=0.1), kk=5, 
    folds=5, seed=23, norm=TRUE, norm.type=NULL, positive="children", bottomup="threshold.free", topdown="HTD",
    W=NULL, parallel=FALSE, ncores=1, recall.levels=seq(from=0.1, to=1, by=0.1), n.round=3, f.criterion="F", metric=NULL,
    compute.performance=FALSE, flat.file=flat.file, ann.file=ann.file, dag.file=dag.file, ind.test.set=ind.test.set, 
    ind.dir=ind.dir, flat.dir=flat.dir, ann.dir=ann.dir, dag.dir=dag.dir, hierScore.dir=hierScore.dir, perf.dir=perf.dir){

    ## Setting Check
    if(positive!="children" && positive!="descendants" || bottomup!="threshold" && bottomup!="threshold.free" && bottomup!="weighted.threshold" && bottomup!="weighted.threshold.free" && bottomup!="tau" || topdown!="HTD" && topdown!="GPAV")
        stop("TPR-DAG: positive or bottomup or topdown value misspelled", call.=FALSE);
    if(positive=="children" && bottomup=="tau")
        stop("TPR-DAG: tau is a descendants variants. Please set positive to descendants", call.=FALSE);
    if(bottomup=="threshold" || bottomup=="tau")
        weight<-0;
    if(bottomup=="threshold.free"){
        threshold <- 0; 
        weight <- 0;
    }
    if(bottomup=="weighted.threshold.free")
        threshold<-0;
    if(norm==FALSE && is.null(norm.type))
        stop("TPR-DAG: If norm is set to FALSE, you need to specify a normalization method among those available", call.=FALSE);
    if(f.criterion!="F" && f.criterion!="avF" && compute.performance==TRUE)
        stop("TPR-DAG: value of parameter 'f.criterion' misspelled", call.=FALSE);  
    if(metric!="FMAX" && metric!="PRC" && !is.null(metric))
        stop("TPR-DAG: value of parameter 'metric' misspelled", call.=FALSE);
    if(norm==TRUE && !is.null(norm.type))
        warning("TPR-DAG: If norm is set to TRUE, the input flat matrix is already normalized.", paste0(" Set norm.type to NULL and not to '", norm.type, "' to avoid this warning message"), call.=FALSE);
    if((is.null(kk) || kk<=1) && bottomup!="threshold.free")
        stop("TPR-DAG: Smallest number of folds to define test and training set is 2. Set kk larger or equal to 2", call.=FALSE);
    if(!is.null(kk) && bottomup=="threshold.free")
        kk <- NULL;         
    if(is.null(metric) && bottomup!="threshold.free")
        stop(paste0("TPR-DAG: metric cannot be NULL. The bottom-up approach '", bottomup, "' is parametric"), ". Please select the metric on which maximize according to those available", call.=FALSE); 
    if(!is.null(metric) && bottomup=="threshold.free") 
        warning("TPR-DAG: the bottom-up approach 'threshold.free' is non-parametric.", paste0(" The chosen parameter metric '", metric, "' does not effect on 'threshold.free' bottom-up approach"), ". Set metric to NULL to avoid this warning message", call.=FALSE);
    if(is.null(seed) && bottomup!="threshold.free")
        warning("TPR-DAG: folds are generate without seed initialization", call.=FALSE);
    if(compute.performance==FALSE && (!is.null(recall.levels) || !is.null(perf.dir) || !is.null(folds))){
        perf.dir <- NULL;
        recall.levels <- NULL;
        folds <- NULL;
    }
    if(compute.performance==FALSE && bottomup=="threshold.free" && (!is.null(seed) || !is.null(n.round) || !is.null(f.criterion))){
        seed <- NULL;
        n.round <- NULL;
        f.criterion <- NULL;
    }
            
    ## Loading Data
    ## loading examples indices of the test set
    ind.set <- paste0(ind.dir, ind.test.set, ".rda");
    ind.test <- get(load(ind.set));

    ## loading dag
    dag.path <- paste0(dag.dir, dag.file,".rda");
    g <- get(load(dag.path));
    root <- root.node(g);

    ## loading annotation matrix
    ann.path <- paste0(ann.dir, ann.file,".rda");
    ann <- get(load(ann.path));
    if(root %in% colnames(ann))
        ann <- ann[,-which(colnames(ann)==root)];

    ## loading flat matrix
    flat.path <- paste0(flat.dir, flat.file,".rda");
    if(norm){
        S <- get(load(flat.path));
        if(root %in% colnames(S)){
            classes.names <- colnames(S)
            root.scores <- S[,which(colnames(S)==root)];
            S <- S[,-which(colnames(S)==root)];
        }
    }else{
        S <- get(load(flat.path));
        S <- scores.normalization(norm.type=norm.type, S);
        cat(norm.type, "NORMALIZATION: DONE", "\n");
        if(root %in% colnames(S)){
            classes.names <- colnames(S)
            root.scores <- S[,which(colnames(S)==root)];
            S <- S[,-which(colnames(S)==root)];
        }
    }

    ## check if |flat matrix classes| = |annotation matrix classes| 
    ## if not the classes of annotation matrix are shrunk to those of flat matrix
    class.check <- ncol(S)!=ncol(ann);
    if(class.check){
        ann <- ann[,colnames(S)];
        nd <- c(root, colnames(S));
        g <- do.subgraph(nd, g, edgemode="directed");
    }

    ## scores flat matrix are shrunk to test and training test respectively
    S.test <- S[ind.test,];
    S.training <- S[-ind.test,];

    ## annotation table are shrunk to test and training test respectively.
    ann.test <- ann[ind.test,];
    ann.training <- ann[-ind.test,];

    ## Compute FLAT PRC, AUC, PXR (average and per class) and FMM (average and per-example) one-shoot or cross-validated 
    if(compute.performance){
        PRC.flat <- AUPRC.single.over.classes(ann.test, S.test, folds=folds, seed=seed);
        AUC.flat <- AUROC.single.over.classes(ann.test, S.test, folds=folds, seed=seed);
        PXR.flat <- precision.at.given.recall.levels.over.classes(ann.test, S.test, folds=folds, seed=seed, recall.levels=recall.levels);
        FMM.flat <- compute.Fmeasure.multilabel(ann.test, S.test, n.round=n.round, f.criterion=f.criterion, verbose=FALSE, b.per.example=TRUE, folds=folds, seed=seed);
        cat("FLAT PERFORMANCE: DONE", "\n");
    }

    ## Hierarchical Correction 
    if(bottomup=="threshold.free"){
        if(exists("root.scores")){
            S.test <- cbind(root.scores[ind.test], S.test);
            colnames(S.test)[1] <- root;
            S.test <- S.test[,classes.names];
        }
        S.test <- TPR.DAG(S.test, g, root=root, positive=positive, bottomup=bottomup, topdown=topdown, t=0, w=0, W=W, parallel=parallel, ncores=ncores);
        S.hier <- S.test;
        cat("HIERARCHICAL CORRECTION: DONE", "\n");
        ## Compute HIER PRC, AUC, PXR (average and per class) and FMM (average and per-example) one-shoot or cross-validated 
        if(compute.performance){
            if(root %in% colnames(S.test))
                S.test <- S.test[,-which(colnames(S.test)==root)];
            PRC.hier <- AUPRC.single.over.classes(ann.test, S.test, folds=folds, seed=seed);
            AUC.hier <- AUROC.single.over.classes(ann.test, S.test, folds=folds, seed=seed);
            PXR.hier <- precision.at.given.recall.levels.over.classes(ann.test, S.test, folds=folds, seed=seed, recall.levels=recall.levels);
            FMM.hier <- compute.Fmeasure.multilabel(ann.test, S.test, n.round=n.round, f.criterion=f.criterion, verbose=FALSE, b.per.example=TRUE, folds=folds, seed=seed);
            cat("HIERARCHICAL PERFORMANCE: DONE", "\n");
        }
        rm(S, S.test);
    }else{
        if(exists("root.scores")){
            S.training <- cbind(root.scores[ind.test], S.training);
            colnames(S.training)[1] <- root;
            S.training <- S.training[,classes.names];

            S.test <- cbind(root.scores[ind.test], S.test);
            colnames(S.test)[1] <- root;
            S.test <- S.test[,classes.names];
        }
        ## Let's start k-fold crossing validation for choosing best threshold and weight maximizing on the selected metric
        foldIndex <- do.unstratified.cv.data(S.training, kk=kk, seed=seed);
        # training.top <- vector(mode="list", length=kk); ## for check
        for(k in 1:kk){
            ## training and test set
            training <- S.training[foldIndex[[k]],];
            target.training <- ann.training[foldIndex[[k]],];
            ## metric initialization        
            top.metric <- 0;
            bestT <- 0;
            bestW <- 0;
            for(t in threshold){
                for(w in weight){                    
                    pred.training <- TPR.DAG(training, g, root=root, positive=positive, bottomup=bottomup, topdown=topdown, w=w, t=t, W=W, parallel=parallel, ncores=ncores);
                    if(metric=="FMAX"){
                        if(root %in% colnames(pred.training))
                            pred.training <- pred.training[,-which(colnames(pred.training)==root)];
                        training.metric <- find.best.f(target.training, pred.training, n.round=n.round, f.criterion=f.criterion, verbose=FALSE, b.per.example=FALSE)[["F"]];
                    }else{
                        if(root %in% colnames(pred.training))
                            pred.training <- pred.training[,-which(colnames(pred.training)==root)];
                        training.metric <- AUPRC.single.over.classes(target.training, pred.training, folds=NULL, seed=NULL)$average;     
                    }
                    if(training.metric > top.metric){
                        top.metric <- training.metric;
                        bestT <- t;
                        bestW <- w;
                        # training.top[[k]] <- c(metric=top.metric, best.thres=bestT, best.weight=bestW);
                        if(bottomup=="threshold" || bottomup=="tau"){
                            cat("training fold:", k, paste0("top ", metric," avg found:"), top.metric, "best threshold:", bestT, sep="\t", "\n");
                        }else if(bottomup=="weighted.threshold.free"){
                            cat("training fold:", k, paste0("top ", metric," avg found:"), top.metric, "best weight:", bestW, sep="\t", "\n");
                        }
                        else{
                            cat("training fold:", k, paste0("top ", metric," avg found:"), top.metric, "best threshold:",bestT, "best weight:", bestW, sep="\t", "\n");
                        }
                    }
                }
            }
        }
        S.test <- TPR.DAG(S.test, g, root=root, positive=positive, bottomup=bottomup, topdown=topdown, t=bestT, w=bestW, W=W, parallel=parallel, ncores=ncores);
        S.hier <- S.test;
        cat("HIERARCHICAL CORRECTION: DONE", "\n");
        ## Compute HIER PRC, AUC, PXR (average and per class) and FMM (average and per-example) one-shoot or cross-validated 
        if(compute.performance){
            if(root %in% colnames(S.test))
                S.test <- S.test[,-which(colnames(S.test)==root)];
            PRC.hier <- AUPRC.single.over.classes(ann.test, S.test, folds=folds, seed=seed);
            AUC.hier <- AUROC.single.over.classes(ann.test, S.test, folds=folds, seed=seed);
            PXR.hier <- precision.at.given.recall.levels.over.classes(ann.test, S.test, folds=folds, seed=seed, recall.levels=recall.levels);
            FMM.hier <- compute.Fmeasure.multilabel(ann.test, S.test, n.round=n.round, f.criterion=f.criterion, verbose=FALSE, b.per.example=TRUE, folds=folds, seed=seed);
            cat("HIERARCHICAL PERFORMANCE: DONE", "\n");
        }
        rm(S, S.test, S.training, training);
    }
    ## Storing Results 
    if(positive=="children" && bottomup=="threshold.free" && topdown=="HTD")
        meth.name <- "tprTF";
    if(positive=="children" && bottomup=="threshold" && topdown=="HTD")
        meth.name <- "tprT";
    if(positive=="children" && bottomup=="weighted.threshold.free" && topdown=="HTD")
        meth.name <- "tprW";
    if(positive=="children" && bottomup=="weighted.threshold" && topdown=="HTD")
        meth.name <- "tprWT";

    if(positive=="descendants" && bottomup=="threshold.free" && topdown=="HTD")
        meth.name <- "descensTF";
    if(positive=="descendants" && bottomup=="threshold" && topdown=="HTD")
        meth.name <- "descensT";
    if(positive=="descendants" && bottomup=="weighted.threshold.free" && topdown=="HTD")
        meth.name <- "descensW";
    if(positive=="descendants" && bottomup=="weighted.threshold" && topdown=="HTD")
        meth.name <- "descensWT";
    if(positive=="descendants" && bottomup=="tau" && topdown=="HTD")
        meth.name <- "descensTAU";

    if(positive=="children" && bottomup=="threshold.free" && topdown=="GPAV")
        meth.name <- "ISOtprTF";
    if(positive=="children" && bottomup=="threshold" && topdown=="GPAV")
        meth.name <- "ISOtprT";
    if(positive=="children" && bottomup=="weighted.threshold.free" && topdown=="GPAV")
        meth.name <- "ISOtprW";
    if(positive=="children" && bottomup=="weighted.threshold" && topdown=="GPAV")
        meth.name <- "ISOtprWT";
    
    if(positive=="descendants" && bottomup=="threshold.free" && topdown=="GPAV")
        meth.name <- "ISOdescensTF";
    if(positive=="descendants" && bottomup=="threshold" && topdown=="GPAV")
        meth.name <- "ISOdescensT";
    if(positive=="descendants" && bottomup=="weighted.threshold.free" && topdown=="GPAV")
        meth.name <- "ISOdescensW";
    if(positive=="descendants" && bottomup=="weighted.threshold" && topdown=="GPAV")
        meth.name <- "ISOdescensWT";
    if(positive=="descendants" && bottomup=="tau" && topdown=="GPAV")
        meth.name <- "ISOdescensTAU";

    if(norm){
        save(S.hier, file=paste0(hierScore.dir, flat.file, ".hierScores.",meth.name,".rda"), compress=TRUE);
        if(compute.performance){
            save(PRC.flat, PRC.hier, AUC.flat, AUC.hier, PXR.flat, PXR.hier, FMM.flat, FMM.hier, file=paste0(perf.dir, "PerfMeas.", flat.file, ".hierScores.",meth.name,".rda"), compress=TRUE);
        }
    }else{
        save(S.hier, file=paste0(hierScore.dir, norm.type,".", flat.file, ".hierScores.",meth.name,".rda"), compress=TRUE);
        if(compute.performance){
            save(PRC.flat, PRC.hier, AUC.flat, AUC.hier, PXR.flat, PXR.hier, FMM.flat, FMM.hier, file=paste0(perf.dir, "PerfMeas.", norm.type, ".", flat.file, ".hierScores.",meth.name,".rda"), compress=TRUE);
        }
    }
}