A Brief Introduction to D2MCS

Introduction

*D2MCS* is an object-oriented framework that identifies and exploits the intrinsic characteristics of input data to (i) accurately distribute features into groups (feature clustering) and (ii) design and deploy effective MCS (multiple classifier system) models. The code snippets below cover the four stages of the *D2MCS* framework: (i) Data Manipulation, (ii) Feature Clustering, (iii) MCS Creation and (iv) Classification Results. In addition, the descriptions and details of all functions are available through the *help(D2MCS)* command.

Stage 1: Data Manipulation

The first step uses the *DatasetLoader* class to convert the data to be analyzed into a structure compatible with *D2MCS*, called *Dataset* (or *HDDataset* when the dataset cannot be stored in memory). The following code fragment shows the parameters of the *load()* function, called after instantiating the *DatasetLoader* class.

```{R, echo = TRUE, results = "hide", eval = FALSE}
data.loader <- DatasetLoader$new()
data <- data.loader$load(filepath, header = TRUE, sep = ",", skip.lines = 0,
                         normalize.names = FALSE, ignore.columns = NULL)
```

The *createPartitions()* method divides the dataset into partitions, which the *createSubset()* and *createTrain()* methods then use to build the data structures required in the following phases. The first method creates a *Subset* object, used both for clustering operations and for validation purposes. The second method creates a *Trainset* object, which is required for the model training stage.
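To make the partitioning idea concrete, the following is a minimal base-R sketch of a class-balanced split into folds, followed by the assignment of folds to a clustering/validation part and a training part. It does not use or reproduce the *D2MCS* implementation: the helper `make_folds()`, the number of folds, the stratified assignment and the choice of which folds go to each part are assumptions made purely for illustration, and the built-in `iris` dataset stands in for user data.

```r
# Illustrative sketch only: base R, not the D2MCS implementation.
# make_folds() is a hypothetical helper that assigns each row to one of
# `num.folds` folds while keeping class proportions similar across folds.
make_folds <- function(classes, num.folds) {
  folds <- integer(length(classes))
  for (cl in unique(classes)) {
    idx <- which(classes == cl)
    # Shuffle the rows of this class, then deal them out fold by fold
    idx <- idx[sample.int(length(idx))]
    folds[idx] <- rep_len(seq_len(num.folds), length(idx))
  }
  folds
}

set.seed(1234)
data <- iris
folds <- make_folds(data$Species, num.folds = 4)

# Assumed assignment: folds 1-2 for clustering/validation, folds 3-4 for training
subset.part <- data[folds %in% c(1, 2), ]
train.part  <- data[folds %in% c(3, 4), ]

# Every fold holds a similar number of instances of each class
print(table(folds, data$Species))
```

Keeping the two parts disjoint matters because the data used to validate the final predictions should not also be used to train the models.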

Stage 2: Feature Clustering

After creating a *Subset* object, the second stage begins, in which the features are distributed into clusters. The code snippet below exemplifies the three steps necessary to create and execute the clustering strategy called *DependencyBasedStrategy*, included by default in *D2MCS*.

```{R, echo = TRUE, results = "hide", eval = FALSE}
## FEATURE-CLUSTERING ALGORITHM CREATION
conf <- DependencyBasedStrategyConfiguration$new()
dbs <- DependencyBasedStrategy$new(subset, heuristic, configuration = conf)

## FEATURE-CLUSTERING ALGORITHM EXECUTION
dbs$execute(verbose = FALSE)

## FEATURE-CLUSTERING ALGORITHM FUNCTIONALITIES
dbs$getBestClusterDistribution()
dbs.train <- dbs$createTrain(subset, num.clusters = NULL, num.groups = NULL,
                             include.unclustered = FALSE)
```

Stage 3: MCS Creation

In this stage, the ML models provided by the *caret* library are trained in order to find out which models offer the best performance for the dataset, taking into account the indicated parameters.
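The idea of building an MCS on top of feature clusters can be sketched in base R as follows: each group of features gets its own classifier, and the performance of each resulting model is measured. This is an illustration under stated assumptions rather than the *D2MCS* training procedure: the two clusters are fixed by hand instead of being produced by a clustering strategy, logistic regression (`glm`) stands in for the *caret* models, and training accuracy is used as a deliberately simple performance measure.

```r
# Illustrative sketch only: base R, not the D2MCS training procedure.
# Binary problem derived from iris: versicolor vs. virginica.
data <- droplevels(iris[iris$Species != "setosa", ])
data$Class <- as.integer(data$Species == "virginica")

# Hand-made feature clusters (an assumption; D2MCS derives them in Stage 2)
clusters <- list(
  sepal = c("Sepal.Length", "Sepal.Width"),
  petal = c("Petal.Length", "Petal.Width")
)

# Train one logistic regression model per feature cluster
models <- lapply(clusters, function(features) {
  glm(reformulate(features, response = "Class"),
      data = data, family = binomial())
})

# Training accuracy of each model, using 0.5 as the probability cutoff
accuracy <- sapply(models, function(model) {
  predicted <- as.integer(predict(model, type = "response") > 0.5)
  mean(predicted == data$Class)
})
print(accuracy)
```

In a realistic setting, performance would be estimated on held-out data or through resampling rather than on the training set itself.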

Stage 4: Classification Results

After building the MCS, the next stage classifies the data. To this end, *D2MCS* needs the *TrainOutput* structure obtained by training the MCS, the dataset to be predicted and the voting schemes chosen to combine the results of the different MCS models.

```{R, echo = TRUE, results = "hide", eval = FALSE}
## VOTING SCHEMES AVAILABLE IN THE CLASSIFICATION STAGE
voting.types <- c(SingleVoting$new(voting.schemes, metrics),
                  CombinedVoting$new(voting.schemes, combined.metrics,
                                     methodology, metrics))

## EXECUTE THE CLASSIFICATION STAGE
predictions <- d2mcs$classify(train.output, subset, voting.types,
                              positive.class = NULL)

## COMPUTE THE PERFORMANCE OF EACH VOTING SCHEME
predictions$getPerformances(test.set, measures, voting.names = NULL,
                            metric.names = NULL, cutoff.values = NULL)

## OBTAIN THE PREDICTIONS OF EACH VOTING SCHEME USED
predictions$getPredictions(voting.names = NULL, metric.names = NULL,
                           cutoff.values = NULL, type = NULL, target = NULL,
                           filter = FALSE)
```

When the classification stage is completed, the tool produces a *ClassificationOutput* object that lets the user obtain information about the classification performance (*getPerformances()* method) and inspect the predictions in depth (*getPredictions()* method).
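To illustrate how a voting scheme can combine the outputs of several models, here is a minimal base-R sketch of simple majority voting followed by a confusion matrix. The prediction table, the true labels and the helper `majority_vote()` are made up for this example. They are not part of the *D2MCS* API, and the *SingleVoting* and *CombinedVoting* classes may combine results differently.

```r
# Illustrative sketch only: base R, not the D2MCS voting implementation.
# Each column holds the class predicted by one model of the MCS (toy data).
predictions <- data.frame(
  model.1 = c("pos", "neg", "pos", "neg", "pos"),
  model.2 = c("pos", "pos", "neg", "neg", "pos"),
  model.3 = c("neg", "neg", "pos", "neg", "pos"),
  stringsAsFactors = FALSE
)
real.values <- c("pos", "neg", "pos", "neg", "neg")

# majority_vote() is a hypothetical helper: the most voted class wins.
# An odd number of models avoids ties in a two-class problem.
majority_vote <- function(votes) {
  counts <- table(votes)
  names(counts)[which.max(counts)]
}

final <- apply(predictions, 1, majority_vote)

# Confusion matrix and accuracy of the combined prediction
classes <- c("neg", "pos")
confusion <- table(predicted = factor(final, levels = classes),
                   real = factor(real.values, levels = classes))
accuracy <- sum(diag(confusion)) / sum(confusion)
print(confusion)
print(accuracy)
```

With this toy data the combined prediction matches four of the five true labels, so the accuracy is 0.8. The only error is the last instance, where all three models agree on the wrong class.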

Development

A development version of the D2MCS package is available on GitHub: github.com/drordas/D2MCS



D2MCS documentation built on Aug. 23, 2022, 5:07 p.m.