Internal structures of CORElearn C++ part

Description

The CORElearn package is an R port of the CORElearn data mining system. This document briefly describes the C++ part, which can also serve as a standalone data mining system on Linux or Windows: its organization, main classes, and data structures.

Details

The C++ part is called from the R functions collected in the file Rinterface.R. The C++ functions that are called from R and provide the interface to R are collected in Rfront.cpp and Rconvert.cpp. The front end for the standalone version is in the file frontend.cpp. Many parts of the code exist in two variants, one for classification and one for regression. The regression variant usually has Reg somewhere in its name. The main classes are

  • marray, mmatrix are templates for storing vectors and matrices

  • dataStore contains data storage and data manipulation methods; its most important members are

    • mmatrix<int> DiscData, DiscPredictData contain the values of discrete attributes and the class for training and, optionally, for prediction. In classification, column 0 always stores the class values.

    • mmatrix<double> ContData, ContPredictData contain the values of numeric attributes and the prediction values for training and, optionally, for prediction. In regression, column 0 always stores the target values.

    • marray<attribute> AttrDesc with information about attributes' types, number of values, min, max, column index in DiscData or ContData, ...

  • estimation, estimationReg evaluate attributes for different purposes: decision/regression tree splitting, binarization, discretization, constructive induction, feature selection, etc. For efficiency, these classes store their own data in

    • mmatrix<int> DiscValues containing discrete attributes and class values,

    • mmatrix<double> ContValues containing numeric attributes and prediction values.

  • Options stores and handles all the parameters of the system.

  • featureTree, regressionTree build all the models, predict with them, and create output.
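
The layout described in the list above can be illustrated with a small, self-contained C++ sketch. It is not taken from the CORElearn sources: SimpleMatrix, AttributeInfo, MiniDataStore, MiniEstimator and makeExample are hypothetical names, the row-major storage and the value coding are assumptions of the sketch, and the numeric column index is an arbitrary choice. It only shows the two ideas from the description: column 0 of the discrete matrix holds the class in classification, and an estimator keeps its own copy of the rows it works on.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the mmatrix template (row-major storage is an assumption).
template <typename T>
class SimpleMatrix {
public:
    SimpleMatrix(std::size_t rows, std::size_t cols, T init = T())
        : rows_(rows), cols_(cols), data_(rows * cols, init) {}
    T& operator()(std::size_t r, std::size_t c) { return data_[r * cols_ + c]; }
    const T& operator()(std::size_t r, std::size_t c) const { return data_[r * cols_ + c]; }
    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }
private:
    std::size_t rows_, cols_;
    std::vector<T> data_;
};

// Hypothetical descriptor, loosely following the fields listed for AttrDesc.
struct AttributeInfo {
    std::string name;
    bool continuous;     // true: the value lives in the numeric matrix
    int numberOfValues;  // meaningful for discrete attributes only
    double minValue;     // meaningful for numeric attributes only
    double maxValue;
    std::size_t column;  // column index in the discrete or numeric matrix
};

// Hypothetical miniature of a classification data store.
struct MiniDataStore {
    SimpleMatrix<int> discData;     // column 0 holds the class
    SimpleMatrix<double> contData;  // numeric attributes
    std::vector<AttributeInfo> attrDesc;

    MiniDataStore(std::size_t n, std::size_t nDisc, std::size_t nCont)
        : discData(n, nDisc, 0), contData(n, nCont, 0.0) {}

    // Class value of one instance: always column 0 of the discrete matrix.
    int classOf(std::size_t row) const { return discData(row, 0); }

    // Looks up an attribute value through its descriptor.
    double valueAsDouble(std::size_t row, std::size_t attr) const {
        const AttributeInfo& a = attrDesc[attr];
        return a.continuous ? contData(row, a.column)
                            : static_cast<double>(discData(row, a.column));
    }
};

// Hypothetical estimator that copies the selected rows into its own matrices.
struct MiniEstimator {
    SimpleMatrix<int> discValues;
    SimpleMatrix<double> contValues;

    MiniEstimator(const MiniDataStore& ds, const std::vector<std::size_t>& rows)
        : discValues(rows.size(), ds.discData.cols()),
          contValues(rows.size(), ds.contData.cols()) {
        for (std::size_t i = 0; i < rows.size(); ++i) {
            for (std::size_t c = 0; c < ds.discData.cols(); ++c)
                discValues(i, c) = ds.discData(rows[i], c);
            for (std::size_t c = 0; c < ds.contData.cols(); ++c)
                contValues(i, c) = ds.contData(rows[i], c);
        }
    }
};

// Builds three instances with a class, one discrete and one numeric attribute.
MiniDataStore makeExample() {
    MiniDataStore ds(3, 2, 1);
    ds.attrDesc.push_back(AttributeInfo{"class", false, 2, 0.0, 0.0, 0});
    ds.attrDesc.push_back(AttributeInfo{"color", false, 3, 0.0, 0.0, 1});
    ds.attrDesc.push_back(AttributeInfo{"weight", true, 0, 4.0, 7.5, 0});
    const int disc[3][2] = {{1, 3}, {2, 2}, {2, 1}};
    const double cont[3] = {4.0, 6.25, 7.5};
    for (std::size_t r = 0; r < 3; ++r) {
        ds.discData(r, 0) = disc[r][0];  // class value
        ds.discData(r, 1) = disc[r][1];  // discrete attribute
        ds.contData(r, 0) = cont[r];     // numeric attribute
    }
    return ds;
}
```

In this sketch, makeExample() returns a store whose classOf(row) reads column 0 of the discrete matrix, and constructing a MiniEstimator from a list of row indices copies just those rows, so later evaluation does not touch the original store.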

Author(s)

Marko Robnik-Sikonja

See Also

CORElearn, CoreModel, predict.CoreModel, modelEval, attrEval, ordEval, plot.ordEval, helpCore, paramCoreIO, infoCore, versionCore.
