In predomics/predomicspkg: Interpretable Prediction in Omics Data

View source: R/analyzeImportantFeaturesFBM.R

getImportanceFeaturesFBMobjects

R Documentation

Extract Important Features and Related Metrics from Final Population

Description

This function processes the final population of models from a classifier experiment ('clf_res'), selects the best population according to the sparsity criteria ('k_penalty', 'k_max'), and computes feature importance, prevalence, and effect sizes. It returns a list of objects that can be used for plotting or further analysis of the most relevant features in the selected models.

Usage

getImportanceFeaturesFBMobjects(
  clf_res,
  X,
  y,
  verbose = TRUE,
  filter.cv.prev = 0.25,
  scaled.importance = FALSE,
  k_penalty = 0.75/100,
  k_max = 0
)

Arguments

`clf_res`	A classifier experiment result, as produced by the modeling function.
`X`	A feature matrix with rows representing features and columns representing samples.
`y`	A response variable, either a binary factor (for classification) or a continuous variable (for regression).
`verbose`	Logical. If 'TRUE', print detailed messages.
`filter.cv.prev`	Numeric threshold for filtering based on cross-validation prevalence (default is 0.25).
`scaled.importance`	Logical. If 'TRUE', scales the feature importance scores.
`k_penalty`	A numeric penalty factor applied to sparsity selection during model evaluation (default is '0.75/100', i.e. 0.0075).
`k_max`	Maximum allowed sparsity value during model selection (default is 0).
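As a quick illustration of the input shapes described above, the following sketch builds toy inputs in base R. The object names (`X_toy`, `y_toy`, `k_penalty_default`) and the random values are assumptions made for illustration only; they are not part of the package and say nothing about real omics data.

```r
# Toy data only: 5 features (rows) by 8 samples (columns), the orientation X expects.
set.seed(42)
X_toy <- matrix(
  runif(5 * 8),
  nrow = 5, ncol = 8,
  dimnames = list(paste0("feature_", 1:5), paste0("sample_", 1:8))
)

# Binary factor response for classification: one label per column of X_toy.
y_toy <- factor(rep(c("control", "case"), each = 4))

# Sanity checks worth running before passing data to the function.
stopifnot(
  is.matrix(X_toy),
  ncol(X_toy) == length(y_toy),  # samples are columns, so lengths must match
  nlevels(y_toy) == 2            # binary outcome
)

# The default penalty is written as a fraction in the usage section.
k_penalty_default <- 0.75 / 100
```

Checking `ncol(X)` against `length(y)` catches the most common mistake with this orientation, which is passing a samples-by-features matrix.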

Details

**Workflow**:

- Determines whether the experiment is regression or classification based on the classifier's objective.
- Filters the best models from the population based on sparsity and evaluation criteria.
- Constructs data structures that capture the feature importance, prevalence, and effect sizes.
- Returns a list of data objects for easy plotting or further analysis.

**Requirements**:

- 'isExperiment' should be a function that checks if 'clf_res' is a valid experiment.
- 'modelCollectionToPopulation', 'selectBestPopulation', and other helper functions should be defined for processing population and feature data.
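One way to confirm that the helpers named above are visible in the current session is sketched below. It relies only on base R's `exists()`; the variable names (`required_helpers`, `helper_available`, `missing_helpers`) are illustrative assumptions, and the check says nothing about whether the helpers behave as the function expects.

```r
# Helper names listed under Requirements on this page.
required_helpers <- c(
  "isExperiment",
  "modelCollectionToPopulation",
  "selectBestPopulation"
)

# exists() with mode = "function" is TRUE only for callable objects that are
# visible on the current search path (for example after attaching the package).
helper_available <- vapply(
  required_helpers,
  function(fn) exists(fn, mode = "function"),
  logical(1)
)

# Report any helper that could not be found.
missing_helpers <- names(helper_available)[!helper_available]
if (length(missing_helpers) > 0) {
  message("Missing helper functions: ", paste(missing_helpers, collapse = ", "))
}
```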

Value

A list with the following components:

- 'featprevFBM': A data frame containing feature prevalence data.
- 'featImp': A summary of feature importance across cross-validation folds.
- 'effectSizes': A data frame with effect sizes for each feature (Cliff's delta for classification or Spearman's rho for regression).
- 'featPrevGroups': Data used for plotting feature prevalence by group.
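To make the two effect-size measures concrete, here is a minimal base-R sketch of Cliff's delta and Spearman's rho for a single feature. The helper `cliffs_delta()` and the toy vectors are assumptions written for this illustration; the package's own computation, including its sign convention, is not documented here and may differ.

```r
# Cliff's delta: proportion of pairs with a > b minus proportion with a < b.
cliffs_delta <- function(a, b) {
  diffs <- outer(a, b, "-")  # all pairwise differences a_i - b_j
  (sum(diffs > 0) - sum(diffs < 0)) / (length(a) * length(b))
}

# Toy abundances of one feature across six samples (illustrative values).
abundance <- c(0.1, 0.2, 0.3, 0.6, 0.7, 0.9)
group <- factor(c("control", "control", "control", "case", "case", "case"))

# Classification: every case value exceeds every control value, so delta is 1.
delta <- cliffs_delta(
  abundance[group == "case"],
  abundance[group == "control"]
)

# Regression: rank correlation between the feature and a continuous outcome.
outcome <- c(1.0, 2.5, 2.0, 4.0, 5.5, 6.0)
rho <- cor(abundance, outcome, method = "spearman")
```

Both measures are rank-based and bounded in [-1, 1], which is why they suit skewed abundance data.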

Examples

## Not run: 
# Extract feature importance and related metrics.
# 'my_experiment', 'my_data', and 'my_labels' are placeholders for your own objects.
feature_data <- getImportanceFeaturesFBMobjects(
  clf_res = my_experiment,
  X = my_data,
  y = my_labels
)

## End(Not run)
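After a call like the one in the example, it can be useful to confirm that the result carries the four documented components before plotting. The sketch below uses a placeholder list in place of a real result; `missing_components()` and `feature_data_mock` are hypothetical names introduced here, not package objects.

```r
# Component names as documented in the Value section.
expected_components <- c("featprevFBM", "featImp", "effectSizes", "featPrevGroups")

# Returns the documented component names that are absent from a result list.
missing_components <- function(res, expected = expected_components) {
  setdiff(expected, names(res))
}

# Placeholder list standing in for a real result; the NULL entries carry no data.
feature_data_mock <- setNames(
  vector("list", length(expected_components)),
  expected_components
)

# All components present: returns character(0).
missing_components(feature_data_mock)

# Only the first two components kept: returns the two dropped names.
missing_components(feature_data_mock[1:2])
```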

predomics/predomicspkg documentation built on Dec. 11, 2024, 11:06 a.m.