Nothing
#' Tools for information-based feature selection and scoring
#'
#' Praznik is a collection of tools for information theory-based feature selection and scoring.
#'
#' The first part of the package consists of efficient implementations of several popular information filters, feature selection approaches based on greedy optimisation of certain feature inclusion criterion.
#' In a nutshell, an algorithm of this class requires an information system \eqn{(X,Y)} and a predefined number of features to selected \eqn{k}, and works like this.
#' To start, it estimates mutual information between each feature and the decision, find a feature with maximal such score and stores it as a first on a list of selected features, \eqn{S}.
#' Then, it estimates a value of a certain criterion function \eqn{J(X,Y,S)} for each feature \eqn{X}; this function also depends on \eqn{Y} and the set of already selected features \eqn{S}.
#' As in the first step, the previously unselected feature with a greatest value of the criterion function is selected next.
#' This is repeated until the method would gather \eqn{k} features, or, in case of some methods, when no more informative features can be found.
#' The methods implemented in praznik consider the following criteria.
#'
#' The mutual information maximisation filter, \code{\link{MIM}}, simply selects top-\eqn{k} features of best mutual information, that is
#' \deqn{J_{MIM}=I(X;Y).}
#'
#' The minimal conditional mutual information maximisation proposed by F. Fleauret, \code{\link{CMIM}}, uses
#' \deqn{J_{CMIM}(X)=\min(I(X;Y),\min_{W\in S} I(X;Y|W));}
#' this method is also effectively identical to the information fragments method.
#'
#' The minimum redundancy maximal relevancy proposed by H. Peng et al., \code{\link{MRMR}}, uses
#' \deqn{J_{MRMR}=I(X;Y)-\frac{1}{|S|}\sum_{W\in S} I(X;W).}
#'
#' The joint mutual information filter by H. Yang and J. Moody, \code{\link{JMI}}, uses
#' \deqn{J_{JMI}=\sum_{W\in S} I(X,W;Y).}
#'
#' The double input symmetrical relevance filter by P. Meyer and G. Bontempi, \code{\link{DISR}}, uses
#' \deqn{J_{DISR}(X)=\sum_{W\in S} \frac{I(X,W;Y)}{H(X,W,Y)}.}
#'
#' The minimal joint mutual information maximisation filter by M. Bennasar, Y. Hicks and R. Setchi, \code{\link{JMIM}}, uses
#' \deqn{J_{JMIM}=\min_{W\in S} I(X,W;Y).}
#'
#' The minimal normalised joint mutual information maximisation filter by the same authors, \code{\link{NJMIM}}, uses
#' \deqn{J_{NJMIM}=\min_{W\in S} \frac{I(X,W;Y)}{H(X,W,Y)}.}
#'
#' The third-order joint mutual information filter by Sechidis et al., \code{\link{JMI3}}, uses
#' \deqn{J(X)=\frac{1}{2}\sum_{(U,W)\in S^2; U\neq W} I(X,U,W;Y).}
#'
#' While \code{\link{CMIM}}, \code{\link{JMIM}} and \code{\link{NJMIM}} consider minimal value over already selected features, they may use a somewhat more sophisticated and faster algorithm.
#'
#' The second part of the package provides methods for scoring features, useful on its own as well as building blocks for more sophisticated algorithms.
#' In particular, the package exposes the following functions:
#'
#' \code{\link{hScores}} returns
#' \deqn{H(X).}
#'
#' \code{\link{jhScores}} returns
#' \deqn{H(X,Y).}
#'
#' \code{\link{miScores}} returns
#' \deqn{I(X;Y).}
#'
#' \code{\link{cmiScores}} returns, for a given condition vector \eqn{Z},
#' \deqn{I(X;Y|Z).}
#'
#' \code{\link{jmiScores}} returns
#' \deqn{I(X,Z;Y).}
#'
#' \code{\link{njmiScores}} returns
#' \deqn{\frac{I(X,Z;Y)}{H(X,Y,Z)}.}
#'
#' \code{\link{minCmiScores}}, \code{\link{maxCmiScores}} and \code{\link{minMaxCmiScores}} return
#' \deqn{\min_j I(X_i;Y|X_j)} and/or \deqn{\max_j I(X_i;Y|X_j).}
#'
#' \code{\link{maxJmiScores}} returns
#' \deqn{\max_{j\neq i} I(X_i,X_j;Y).}
#'
#' \code{\link{triScores}} returns, for every triple of features,
#' \deqn{I(X_i;X_j;X_k).}
#'
#' These function generally also have their \code{*Matrix} counterparts, which efficiently build a matrix of scores between all pairs of features.
#' This is especially useful for network inference tasks.
#'
#' Estimation of mutual information and its generalisations is a hard task; still, praznik aims at speed and simplicity and hence only offers basic, maximum likelihood estimator applicable on discrete data.
#' For convenience, praznik automatically and silently coerces non-factor inputs into about ten equally-spaced bins, following the heuristic often used in literature.
#'
#' Furthermore, praznik provides \code{\link{kTransform}} function for converting continuous features into discrete ones with Kendall transformation, a novel approach based on Kendall correlation coefficient which allows for multivariate reasoning based on monotonicity agreement.
#'
#' Additionally, praznik has a limited, experimental support for replacing entropic statistics with Gini impurity-based; in such framework, entropy is replaced by Gini impurity
#' \deqn{g(X):=1-\sum_x p_x^2,}
#' which leads to an impurity gain
#' \deqn{G(X;Y):=g(Y)-E(g(Y)|X)=\sum_{xy}\frac{p_{xy}^2}{p_x}-\sum_{y} p_y^2,}
#' a counterpart of mutual information or information gain.
#' It does not possess most of elegant properties of mutual information, yet values of both are usually highly correlated; moreover, Gini gain is computationally easier to calculate, hence it often replaces MI in performance-sensitive applications, like optimising splits in decision trees.
#'
#' In a present version, praznik includes \code{\link{impScores}} for generating values of \eqn{G} for all features (an analog of \code{\link{miScores}}, as well as \code{\link{JIM}}, a Gini gain-based feature selection method otherwise identical to \code{\link{JMI}}.
#' @references "Praznik: High performance information-based feature selection" M.B. Kursa. SoftwareX 16, 100819 (2021).
#' @references "Conditional Likelihood Maximisation: A Unifying Framework for Information Theoretic Feature Selection" G. Brown et al. JMLR (2012).
"_PACKAGE"
# Load the native code
#' @useDynLib praznik, .registration=TRUE
NULL
# Some example data
#' Pre-discretised Madelon dataset
#'
#' Madelon is a synthetic data set from the NIPS 2003 feature selection challenge, generated by Isabelle Guyon.
#' It contains 480 irrelevant and 20 relevant features, including 5 informative and 15 redundant.
#' In this version, the originally numerical features have been pre-cut into 10 bins, as well as their names have been altered to reveal 20 relevant features (as identified by the Boruta method).
#' @format A list with two elements, \code{X} containing a data frame with predictors, and \code{Y}, the decision.
#' Features are in the same order as in the original data; the names of relevant ones start with \code{Rel}, while of irrelevant ones with \code{Irr}.
#' @source \url{https://archive.ics.uci.edu/ml/datasets/Madelon}
#' @usage data(MadelonD)
"MadelonD"
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.