evaluate_set: Inspect Set of Features
In ZytoHMGU/hetset: Identification of Heterogeneous Subsets in Data

Quantification of the importance of each selected feature in a hetset object.

evaluate_set(H, verbose)

`H`	`SummarizedExperiment` container with fitted two-component normal mixture model as returned by `scan_hetset`
`verbose`	should a short summary of the evaluated feature set be printed to console? (default = TRUE)

In scan_hetset, a set of features is selected that maximizes the squared Hellinger distance between the densities of the corresponding maximum likelihood fit of a two-component normal mixture model.
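
For reference, the squared Hellinger distance between two multivariate normal densities has a well-known closed form, written out below. That scan_hetset evaluates exactly this expression for the two mixture components is an assumption here, not something stated on this page; the symbols mu_1, mu_2, Sigma_1, Sigma_2 stand for the component means and covariance matrices.

```latex
H^2\bigl(\mathcal{N}(\mu_1,\Sigma_1),\,\mathcal{N}(\mu_2,\Sigma_2)\bigr)
  = 1 - \frac{\det(\Sigma_1)^{1/4}\,\det(\Sigma_2)^{1/4}}{\det(\bar\Sigma)^{1/2}}
    \exp\!\Bigl(-\tfrac{1}{8}\,(\mu_1-\mu_2)^{\top}\,\bar\Sigma^{-1}\,(\mu_1-\mu_2)\Bigr),
\qquad
\bar\Sigma = \tfrac{1}{2}\,(\Sigma_1+\Sigma_2)
```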

In this function, the contribution of each feature to the Hellinger distance is quantified by the absolute increase in distance gained by adding this feature. The calculation is done as follows. Given a set of features (f_i), i in 1:k, and a partitioning of the samples xi, let d be the squared Hellinger distance between the k-dimensional normal densities of the fitted mixture model. For each i in 1:k, remove f_i from the set of features, that is, drop the corresponding entries from the parameters mu and Sigma of the two components, and calculate the remaining squared Hellinger distance. The difference d_i between d and this remaining distance is attributed to the removed feature. Each feature's importance is quantified as explained distance (d_i/d) or relative importance (d_i/sum_j d_j).
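
The leave-one-feature-out calculation can be sketched in a few lines of base R. This is a minimal illustration, not the package's implementation: the helper `hellinger2` and the parameter values `mu1`, `mu2`, `S1`, `S2` are hypothetical, and the sketch assumes the distance is the closed-form squared Hellinger distance between the two normal components.

```r
# Hypothetical helper: closed-form squared Hellinger distance between
# N(mu1, S1) and N(mu2, S2). Not part of the hetset package.
hellinger2 <- function(mu1, S1, mu2, S2) {
  S1 <- as.matrix(S1)
  S2 <- as.matrix(S2)
  Sbar <- (S1 + S2) / 2
  dmu <- mu1 - mu2
  coef <- det(S1)^0.25 * det(S2)^0.25 / sqrt(det(Sbar))
  1 - coef * exp(-0.125 * drop(t(dmu) %*% solve(Sbar, dmu)))
}

# Made-up component parameters for k = 3 features (illustration only)
mu1 <- c(0, 0, 0)
mu2 <- c(3, 2, 0)
S1 <- diag(3)
S2 <- diag(3)

# Distance d using all k features
d <- hellinger2(mu1, S1, mu2, S2)

# d_i: loss of distance when feature i is dropped from mu and Sigma
d_i <- sapply(seq_along(mu1), function(i) {
  d - hellinger2(mu1[-i], S1[-i, -i, drop = FALSE],
                 mu2[-i], S2[-i, -i, drop = FALSE])
})

explained <- d_i / d         # explained distance per feature
relative  <- d_i / sum(d_i)  # relative importance per feature
```

In this toy setting the third feature has identical parameters in both components, so its d_i is zero. The relative importances sum to one by construction, whereas the explained distances need not.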

Daniel Samaga

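library(hetset)

# Five informative features: the last 50 samples differ in mean and/or sd
F1 <- c(rnorm(50, 0, 1), rnorm(50, 3, 1))
F2 <- c(rnorm(50, 0, 1), rnorm(50, 2, 1))
F3 <- c(rnorm(50, 0, 1), rnorm(50, 2, 2))
F4 <- c(rnorm(50, 0, 1), rnorm(50, 1, 1))
F5 <- c(rnorm(50, 0, 1), rnorm(50, 1, 2))
# Ten pure-noise features (10 x 100 matrix)
F0 <- matrix(data = rnorm(n = 1000, mean = 0, sd = 1), ncol = 100)

# Build the hetset object from the 15 x 100 matrix (features in rows)
Hds <- hetset(D = rbind(F1, F2, F3, F4, F5, F0))
rm(F0, F1, F2, F3, F4, F5)
# Select a feature set, then quantify each selected feature's contribution
Hds <- scan_hetset(H = Hds, level = "univariate", min_size = 3,
    max_size = 4, rel_imp = 0, em_steps = 5)
evaluate_set(Hds, verbose = TRUE)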