getImportanceFeaturesFBMobjects: Extract Important Features and Related Metrics from Final...

View source: R/analyzeImportantFeaturesFBM.R

getImportanceFeaturesFBMobjects    R Documentation

Extract Important Features and Related Metrics from Final Population

Description

This function processes the final population of models from a given classifier experiment ('clf_res'), selects the best population based on specified criteria, and computes the feature importance, prevalence, and effect sizes. The function returns a list of objects that can be used for plotting or further analysis of the most relevant features in the model.

Usage

getImportanceFeaturesFBMobjects(
  clf_res,
  X,
  y,
  verbose = TRUE,
  filter.cv.prev = 0.25,
  scaled.importance = FALSE,
  k_penalty = 0.75/100,
  k_max = 0
)

Arguments

clf_res

A classifier experiment result, as produced by the modeling function.

X

A feature matrix with rows representing features and columns representing samples.

y

A response variable, either a binary factor (for classification) or a continuous variable (for regression).

verbose

Logical. If 'TRUE', print detailed messages.

filter.cv.prev

Numeric threshold for filtering based on cross-validation prevalence (default is 0.25).

scaled.importance

Logical. If 'TRUE', scales the feature importance scores.

k_penalty

A numeric penalty factor applied to sparsity selection during model evaluation (default is '0.75/100').

k_max

Maximum allowed sparsity value during model selection (default is 0).
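
As a minimal sketch of the input layout described above, the snippet below builds toy data with features in rows and samples in columns, plus a matching binary factor. The names 'toy_X' and 'toy_y' and the values are hypothetical and not part of the package; only the orientation and the type of 'y' come from the argument descriptions.

```r
# Toy inputs only; toy_X and toy_y are illustrative names, not package objects.
set.seed(42)

n_features <- 5
n_samples  <- 8

# Features in rows, samples in columns, as the X argument expects.
toy_X <- matrix(
  runif(n_features * n_samples),
  nrow = n_features,
  ncol = n_samples,
  dimnames = list(
    paste0("feature_", seq_len(n_features)),
    paste0("sample_", seq_len(n_samples))
  )
)

# One label per sample (per column of toy_X); a binary factor for classification.
toy_y <- factor(rep(c("case", "control"), each = n_samples / 2))

# Sanity checks worth running before passing data to the function.
stopifnot(
  ncol(toy_X) == length(toy_y),
  nlevels(toy_y) == 2
)
```

If a data set is stored with samples in rows, transposing it with 't()' gives the orientation described for 'X'.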

Details

**Workflow**:
- Determines if the experiment is regression or classification based on the classifier's objective.
- Filters the best models from the population based on sparsity and evaluation criteria.
- Constructs data structures that capture the feature importance, prevalence, and effect sizes.
- Returns a list of data objects for easy plotting or further analysis.
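
The filtering step in the workflow can be pictured with a small, self-contained sketch. This is not the package's implementation. The data frame 'models', its columns, the helper 'penalize', and the rule "score minus 'k_penalty' times sparsity" are all assumptions made for illustration, as is the reading of 'k_max = 0' as "no upper limit on sparsity".

```r
# Hypothetical population summary: one row per model.
models <- data.frame(
  id       = paste0("m", 1:5),
  sparsity = c(2, 3, 5, 8, 12),               # number of features in each model
  score    = c(0.80, 0.84, 0.86, 0.87, 0.875) # raw evaluation score
)

# Assumed rule: subtract a penalty proportional to sparsity from the raw score.
penalize <- function(models, k_penalty = 0.75 / 100, k_max = 0) {
  # Assumption: k_max = 0 means "no cap"; otherwise drop models above the cap.
  if (k_max > 0) {
    models <- models[models$sparsity <= k_max, , drop = FALSE]
  }
  models$penalized <- models$score - k_penalty * models$sparsity
  # Rank models from best to worst penalized score.
  models[order(models$penalized, decreasing = TRUE), , drop = FALSE]
}

ranked <- penalize(models)
best   <- ranked$id[1]
```

With these toy numbers and the default penalty, the mid-sized model 'm3' ranks first: the larger models score slightly higher before the penalty but lose more to it. Setting 'k_penalty = 0' in this sketch would rank the largest model first instead.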

**Requirements**:
- 'isExperiment' must be available; it checks that 'clf_res' is a valid experiment.
- 'modelCollectionToPopulation', 'selectBestPopulation', and the other helper functions used to process population and feature data must be defined.

Value

A list with the following components:
- 'featprevFBM': A data frame containing feature prevalence data.
- 'featImp': A summary of feature importance across cross-validation folds.
- 'effectSizes': A data frame with effect sizes for each feature (Cliff's delta for classification or Spearman's rho for regression).
- 'featPrevGroups': Data used for plotting feature prevalence by group.
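
For readers unfamiliar with the two effect sizes named above, here is a minimal base-R sketch of each on toy vectors. It illustrates the standard definitions only and does not show how the package computes or stores 'effectSizes'; the helper 'cliffs_delta' and all data values are hypothetical.

```r
# Cliff's delta between two groups: P(a > b) - P(a < b) over all pairs (a, b).
cliffs_delta <- function(a, b) {
  mean(sign(outer(a, b, "-")))
}

# Toy values of one feature in the two classes (classification case).
cases    <- c(5, 6, 7, 8)
controls <- c(1, 2, 3, 6)
delta    <- cliffs_delta(cases, controls)

# Spearman's rho between one feature and a continuous response (regression case).
feature  <- c(0.1, 0.4, 0.2, 0.9, 0.7)
response <- c(1.0, 2.0, 2.5, 4.0, 3.0)
rho      <- cor(feature, response, method = "spearman")
```

Both measures range from -1 to 1. In this toy data, 14 of the 16 case/control pairs favour the cases and 1 favours the controls, so 'delta' is 13/16; the two rank orderings differ by a single adjacent swap, so 'rho' is 0.9.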

Examples

## Not run: 
# Extract feature importance and related metrics
feature_data <- getImportanceFeaturesFBMobjects(
  clf_res = my_experiment,
  X = my_data,
  y = my_labels
)

## End(Not run)


predomics/predomicspkg documentation built on Dec. 11, 2024, 11:06 a.m.