analyzePopulationFeatures | R Documentation |
This function analyzes features in a population of models, allowing for the visualization and examination of feature importance, prevalence, and model coefficients. It can generate a variety of plots to understand the distribution and importance of features in the given population.
analyzePopulationFeatures(
pop,
X,
y,
res_clf,
makeplot = TRUE,
name = "",
ord.feat = "importance",
make.network = TRUE,
network.layout = "circular",
network.alpha = 1e-04,
verbose = TRUE,
pdf.dims = c(width = 25, height = 20),
filter.perc = 0.05,
k_penalty = 0.75/100,
k_max = 0
)
pop |
A population of models, typically obtained from 'modelCollectionToPopulation' or similar functions. |
X |
The data matrix containing features (rows represent features, columns represent samples). |
y |
The response variable (class labels or continuous values depending on the model). |
res_clf |
The classifier used for the analysis, typically a result from a classification experiment. |
makeplot |
Logical. If 'TRUE', the function generates plots and saves them as a PDF. If 'FALSE', it returns the analysis results without plotting. |
name |
A string representing the name of the analysis or output (used for saving files). |
ord.feat |
A string indicating the ordering method for features. Options are:
- "prevalence": Order by the prevalence of features across models.
- "importance": Order by feature importance based on cross-validation.
- "hierarchical": Order by hierarchical clustering of the feature-to-model coefficient matrix. |
make.network |
Logical. If 'TRUE', generates a network of feature co-occurrence across the population of models. |
network.layout |
A string indicating the layout of the network. Default is "circular". Other options may include "fr" for Fruchterman-Reingold layout. |
network.alpha |
A numeric value controlling the alpha transparency of the network plot. |
verbose |
Logical. If 'TRUE', prints additional information during execution. |
pdf.dims |
A vector of two numbers specifying the width and height of the PDF output (in inches). |
filter.perc |
A numeric value between 0 and 1 specifying the minimum prevalence of a feature to be included in the analysis. |
k_penalty |
A penalty value for model selection in the population filtering. |
k_max |
The maximum number of models to include in the final population after filtering. |
The function performs a variety of analyses on a population of models:
- It filters models based on feature prevalence.
- It orders features by various metrics such as prevalence, importance, or hierarchical clustering.
- It generates plots of feature prevalence, model coefficients, and other characteristics.
- If requested, it also generates a network of feature co-occurrence across the models.
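The prevalence filtering, feature ordering, and co-occurrence steps listed above can be illustrated with a minimal base R sketch. This is not the package's internal implementation: the toy matrix `coef_mat`, the variable names, and the use of `hclust` with Euclidean distance are assumptions made purely for illustration. Only the threshold mirrors the documented default of `filter.perc`.

```r
# Hypothetical feature-by-model coefficient matrix (toy data, not package output):
# rows are features, columns are models; 0 means the feature is absent from the model.
coef_mat <- matrix(
  c( 1,  1,  0,  1,
     0, -1,  0,  0,
    -1, -1, -1, -1,
     0,  0,  0,  0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("feat", 1:4), paste0("model", 1:4))
)

# Prevalence: fraction of models in which each feature has a non-zero coefficient.
prevalence <- rowMeans(coef_mat != 0)

# Keep features whose prevalence reaches a minimum threshold (analogous to filter.perc).
filter_perc <- 0.05
kept <- coef_mat[prevalence >= filter_perc, , drop = FALSE]

# Ordering 1: by decreasing prevalence.
by_prevalence <- rownames(kept)[order(prevalence[rownames(kept)], decreasing = TRUE)]

# Ordering 2: by hierarchical clustering of the coefficient rows (assumed distance: Euclidean).
by_hclust <- rownames(kept)[hclust(dist(kept))$order]

# Co-occurrence: number of models in which each pair of features appears together.
presence <- (kept != 0) * 1
cooccur <- presence %*% t(presence)
diag(cooccur) <- 0  # ignore self co-occurrence

print(by_prevalence)
print(cooccur)
```

In this toy example, `feat4` never appears in any model and is dropped by the prevalence filter, while the co-occurrence matrix could serve as the edge weights of a feature network.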
If 'makeplot = TRUE', generates a PDF with visualizations of feature importance, prevalence, and model coefficients, and saves it to file. If 'makeplot = FALSE', returns a list of the analysis results, including the normalized scores and feature importance.
Edi Prifti (IRD)
## Not run:
# Assuming 'pop' is a valid population of models, 'X' is the feature matrix, 'y' is the response variable,
# and 'res_clf' is the result of a classification experiment
analyzePopulationFeatures(pop = pop, X = X, y = y, res_clf = res_clf, makeplot = TRUE, name = "population_analysis")
## End(Not run)