Recm: Robust ensemble classifier machine (Recm)

Recm R Documentation

Description

Robust ensemble classifier machine (Recm).

Details

An object that holds data and an ensemble of classifiers. The ensemble is composed of XGBoost classifiers trained on binarized labels.

A list of classifiers, each trained on a random sample of the training data.
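
The "random sample of the training data" idea can be illustrated with a small base R sketch. This is not the package's code: the helper name `sample_rows_for_ensemble`, the use of a fraction `perc`, and sampling without replacement are all assumptions made for illustration.

```r
# Hypothetical helper (not part of Recm): draw one random subset of
# training-row indices for each member of an ensemble.
# Assumes `perc` is a fraction in (0, 1] and sampling is without replacement.
sample_rows_for_ensemble <- function(n_rows, size, perc) {
  n_take <- max(1, floor(n_rows * perc))
  lapply(seq_len(size), function(i) sort(sample.int(n_rows, n_take)))
}

set.seed(42)
idx <- sample_rows_for_ensemble(n_rows = 100, size = 5, perc = 0.5)
length(idx)    # one index vector per ensemble member
lengths(idx)   # number of rows drawn for each member
```

Each index vector would then select the rows used to fit one classifier, so that members of the ensemble see different subsets of the training data.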

Public fields

name

the object's name

data_mode

string vector, dictates the types of data engineering; acceptable values include any combination of: original, quartiles, pairs, ranks, sigpairs

signatures

lists of variables (each of which must be present in the data) that will be used together and compared to other signatures

file_name

the data file

data

the data.table used to train and test

the data column used as label, the target

data_split

numeric value indicating percent data to make into training data

train_data

the data.table used to train the ensemble

train_label

the vector used as the training label

test_data

the data.table used as test data

test_label

the vector used as test data labels

data_colnames

the column names used to train the models

unique_labels

the unique set of labels

the ensemble of predictors

the table of predictions, collected from the ensbl list of predictors

Active bindings

the data column used as label, the target

the ensemble of predictors

the table of predictions, collected from the ensbl list of predictors

Methods

Public methods


Method new()

Create a new 'Recm' object.

Usage
Recm$new(name = NA)
Arguments
name

The name to give the object.

Returns

A new 'Recm' object.


Method greet()

Creates a printable representation of the object.

Usage
Recm$greet()
Returns

A character string representing the object.


Method read_data()

Reads the data file.

Usage
Recm$read_data(file_name, sep, header)
Arguments
file_name

The name of the file.

sep

The separating character, ',' or '\t'.

header

boolean, whether the table has a header line


Method read_train_data()

Reads the training data file.

Usage
Recm$read_train_data(file_name, sep, header)
Arguments
file_name

The name of the file.

sep

The separating character, ',' or '\t'.

header

boolean, whether the table has a header line


Method read_test_data()

Reads the test data file.

Usage
Recm$read_test_data(file_name, sep, header)
Arguments
file_name

The name of the file.

sep

The separating character, ',' or '\t'.

header

boolean, whether the table has a header line


Method data_eng()

Data engineering, replaces the object's data.table.

Usage
Recm$data_eng(data_source = NULL)

Method data_setup()

Performs setup processing on the data file: drops columns, splits the data into training and test sets, and identifies the label column.

Usage
Recm$data_setup(
  file_name = NULL,
  sep = NULL,
  data_mode = NULL,
  signatures = NULL,
  label_name = NULL,
  sample_id = NULL,
  drop_list = NULL,
  data_split = NULL
)
Arguments
file_name

string, the name of the file

sep

string, the separating character

data_mode

string vector, the types of data engineering to apply (see the data_mode field)

label_name

string, the column name indicating the target label

drop_list

a vector of strings indicating what columns to drop

data_split

numeric value, the percent of data to use in training
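
The split controlled by data_split can be pictured with a minimal base R sketch. It assumes data_split is a fraction between 0 and 1 (the documentation says "percent", so the scale is an assumption), and the helper name `split_train_test` is hypothetical, not part of Recm.

```r
# Hypothetical helper (not part of Recm): split a table into training and
# test rows. Assumes `data_split` is a fraction in (0, 1).
split_train_test <- function(dat, data_split) {
  n <- nrow(dat)
  n_train <- round(n * data_split)
  stopifnot(n_train >= 1, n_train < n)
  train_idx <- sample.int(n, n_train)
  test_idx <- setdiff(seq_len(n), train_idx)
  list(train = dat[train_idx, , drop = FALSE],
       test  = dat[test_idx, , drop = FALSE])
}

set.seed(1)
parts <- split_train_test(iris, data_split = 0.6)
nrow(parts$train)  # rows assigned to training
nrow(parts$test)   # remaining rows, held out for testing
```

Every row lands in exactly one of the two sets, which is the property the train_data and test_data fields rely on.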


Method train_data_setup()

Performs setup processing on the training data file: drops columns and identifies the label column.

Usage
Recm$train_data_setup(
  file_name = NULL,
  sep = NULL,
  data_mode = NULL,
  signatures = NULL,
  label_name = NULL,
  sample_id = NULL,
  drop_list = NULL
)
Arguments
label_name

string, the column name indicating the target label

drop_list

a vector of strings indicating what columns to drop


Method test_data_setup()

Performs setup processing on the test data file: drops columns and identifies the label column. The data_mode and signatures will already have been set during training.

Usage
Recm$test_data_setup(
  file_name = NULL,
  sep = NULL,
  label_name = NULL,
  sample_id = NULL,
  drop_list = NULL
)
Arguments
label_name

string, the column name indicating the target label

drop_list

a vector of strings indicating what columns to drop


Method binarize_label()

Usage
Recm$binarize_label(label, x)
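
This method is undocumented here. One plausible reading, given that the ensemble is trained on binarized labels, is a one-vs-rest encoding: entries equal to the class x become 1 and all others become 0. The sketch below shows that reading only; the function name and the 0/1 encoding are assumptions.

```r
# Hypothetical one-vs-rest binarization (an assumed reading of
# binarize_label): 1 where the label equals class `x`, 0 elsewhere.
binarize_label_sketch <- function(label, x) {
  as.integer(label == x)
}

labels <- c("A", "B", "C", "A")
binarize_label_sketch(labels, "A")  # 1 0 0 1
binarize_label_sketch(labels, "C")  # 0 0 1 0
```

Repeating this for every value in unique_labels gives one binary target per class, each of which a `binary:logistic` classifier can be trained on.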

Method build_label_ensemble()

Builds a list of ensembles of XGBoost objects, each classifying one binary label.

Usage
Recm$build_label_ensemble(size, params)
Arguments
size

numeric, number of classifiers

mode

character vector, the types of data modalities to make; possible values: pairs, quartiles, set-pairs

label

string, the label vector of each data example

max_depth

numeric, the depth of the tree in XGBoost

eta

numeric, the eta parameter of XGBoost (the learning rate)

nrounds

numeric, the number of training rounds

nthreads

numeric, the number of threads to use in processing

objective

string, binary:logistic, see xgboost docs

Returns

An ensemble object is added to the list of objects in recm$enbl.
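
The usage line takes only size and params, while the argument list names individual XGBoost settings. One way to reconcile the two is to pass those settings as a named list. The sketch below assumes that params is such a list and that its element names follow the argument list above; the values are illustrative only.

```r
# Assumed shape of `params`: a named list of the settings documented above.
# The element names are taken from the argument list; values are examples.
params <- list(
  max_depth = 6,                  # depth of each tree
  eta       = 0.3,                # learning rate
  nrounds   = 10,                 # number of training rounds
  nthreads  = 2,                  # threads used in processing
  objective = "binary:logistic"   # one binary classifier per label
)
str(params)
```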


Method build_pred_table()

Usage
Recm$build_pred_table()

Method remap_multiclass_labels()

Usage
Recm$remap_multiclass_labels(label)

Method unmap_multiclass_labels()

Usage
Recm$unmap_multiclass_labels(labels)
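
Neither mapping method is documented here. XGBoost's multiclass objectives expect integer class labels starting at 0, so a remap/unmap pair typically converts between the original labels and those codes. The sketch below illustrates that round trip; the function names and the alphabetical ordering of classes are assumptions, not the package's behavior.

```r
# Hypothetical remap: original labels -> integer codes 0, 1, 2, ...
# Classes are ordered alphabetically here, which is an assumption.
remap_labels_sketch <- function(label) {
  lvls <- sort(unique(as.character(label)))
  list(codes = match(as.character(label), lvls) - 1L, levels = lvls)
}

# Hypothetical unmap: integer codes -> original labels.
unmap_labels_sketch <- function(codes, lvls) {
  lvls[codes + 1L]
}

labels <- c("setosa", "virginica", "versicolor", "setosa")
mapped <- remap_labels_sketch(labels)
mapped$codes                                      # 0 2 1 0
unmap_labels_sketch(mapped$codes, mapped$levels)  # back to the original labels
```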

Method build_final_ensemble()

Usage
Recm$build_final_ensemble(size, params)

Method train_models()

Usage
Recm$train_models(perc)

Method train_final()

Usage
Recm$train_final(perc)

Method ensemble_predict()

Usage
Recm$ensemble_predict(data, combine_function)

Method predict()

Usage
Recm$predict(data, combine_function)
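
The combine_function argument is not described here. A common pattern for ensembles is to collapse the per-member scores for each sample into one value with a summary function such as median or mean. The sketch below shows that pattern on a made-up score matrix; the helper name and the matrix layout are assumptions.

```r
# Made-up scores: rows are samples, columns are ensemble members.
scores <- matrix(c(0.9, 0.8, 0.7,
                   0.2, 0.4, 0.1),
                 nrow = 2, byrow = TRUE)

# Hypothetical helper: apply a summary function across members, per sample.
combine_scores <- function(score_matrix, combine_function = median) {
  apply(score_matrix, 1, combine_function)
}

combine_scores(scores, median)  # 0.8 0.2
combine_scores(scores, mean)
```

The median is less sensitive than the mean to a single member that produces an outlying score, which is one reason to leave the choice to the caller.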

Method print_error()

Usage
Recm$print_error(label, root, threshold)

Method accuracy()

Usage
Recm$accuracy(labels, calls)

Method precision()

Usage
Recm$precision(cmdf, i)

Method sensitivity()

Usage
Recm$sensitivity(cmdf, i)

Method specificity()

Usage
Recm$specificity(cmdf, i)
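
The metric methods above are undocumented. The standard definitions they are named after can be sketched from a confusion matrix, treating class i as the positive class and all other classes as negative. Here a base R `table` stands in for cmdf, whose actual structure is not documented, and the helper name is hypothetical.

```r
labels <- c("A", "A", "A", "B", "B", "C")  # true classes (made-up data)
calls  <- c("A", "A", "B", "B", "C", "C")  # predicted classes

# Accuracy: fraction of calls that match the labels.
accuracy <- mean(labels == calls)

# Confusion matrix: rows are true classes, columns are calls.
lv <- sort(unique(c(labels, calls)))
cm <- table(truth = factor(labels, lv), call = factor(calls, lv))

# Hypothetical helper: one-vs-rest metrics for class `i`.
class_metrics <- function(cm, i) {
  tp <- cm[i, i]
  fn <- sum(cm[i, ]) - tp
  fp <- sum(cm[, i]) - tp
  tn <- sum(cm) - tp - fn - fp
  c(precision   = tp / (tp + fp),
    sensitivity = tp / (tp + fn),
    specificity = tn / (tn + fp))
}

class_metrics(cm, "B")  # precision 0.5, sensitivity 0.5, specificity 0.75
```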

Method classification_metrics()

Usage
Recm$classification_metrics()

Method importance()

Usage
Recm$importance()

Method results()

Usage
Recm$results(include_label = FALSE)

Method autopred()

Usage
Recm$autopred(
  data_file = NULL,
  sep = NULL,
  label_name = NULL,
  sample_id = NULL,
  drop_list = NULL,
  data_split = NULL,
  data_mode = NULL,
  signatures = NULL,
  size = NULL,
  params = NULL,
  train_perc = NULL,
  combine_function = NULL
)

Method clone()

The objects of this class are cloneable with this method.

Usage
Recm$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.


Gibbsdavidl/RobustEnsembleClassifierMachine documentation built on Dec. 24, 2024, 1:53 a.m.