plotAbundanceByClass: Plot Feature Abundance by Class
In predomics/predomicspkg: Interpretable Prediction in Omics Data

plotAbundanceByClass

R Documentation

Plot Feature Abundance by Class

Description

This function visualizes the abundance of selected features across the classes of a dataset. It draws boxplots of each feature's abundance per class and tests whether the differences between classes are statistically significant. Both classification and regression tasks are supported.

Usage

plotAbundanceByClass(
  features,
  X,
  y,
  topdown = TRUE,
  main = "",
  plot = TRUE,
  col.pt = c("deepskyblue4", "firebrick4"),
  col.bg = c("deepskyblue1", "firebrick1")
)

Arguments

`features`	A character vector of feature names to be plotted.
`X`	A data matrix or data frame where each row represents an observation and each column represents a feature.
`y`	A vector with one value per row of 'X': class labels (e.g., 1 and -1 for binary classification) or continuous values (for regression).
`topdown`	Logical; whether to arrange the features in a top-down order (default is 'TRUE').
`main`	A string for the title of the plot (default is '""', i.e. no title).
`plot`	Logical; if 'TRUE' (the default), the function displays the plot. If 'FALSE', it returns the statistical results of the test instead.
`col.pt`	Colors for points in the plot (default is 'c("deepskyblue4", "firebrick4")').
`col.bg`	Colors for boxplots in the plot (default is 'c("deepskyblue1", "firebrick1")').

Details

This function computes and visualizes the abundance of features in each class (group) of the dataset. For each feature, it draws a boxplot and runs a non-parametric test for differences in abundance between classes. Classification (e.g., binary or multi-class) is handled through class labels in 'y', and regression through continuous values in 'y'.
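The per-feature testing step described above can be illustrated in base R. This is only a minimal sketch, not the package's own code: this page does not name the test or the correction method, so the Wilcoxon rank-sum test (`wilcox.test`) and the Benjamini-Hochberg adjustment (`p.adjust`) are assumptions chosen for illustration, as is the simulated data.

```r
# Hypothetical stand-in for the testing step; not the package's implementation.
set.seed(42)
n <- 100
X <- data.frame(feature1 = rnorm(n), feature2 = rnorm(n), feature3 = rnorm(n))
y <- rep(c(1, -1), each = n / 2)

# Shift feature1 in class 1 so that at least one real difference exists.
X$feature1 <- X$feature1 + ifelse(y == 1, 2, 0)

features <- c("feature1", "feature2", "feature3")

# One two-sided Wilcoxon rank-sum test per feature (assumed choice of test).
pvals <- vapply(
  features,
  function(f) wilcox.test(X[[f]][y == 1], X[[f]][y == -1])$p.value,
  numeric(1)
)

# Benjamini-Hochberg adjustment, used here as an assumed definition of q-values.
res <- data.frame(
  feature = features,
  p = pvals,
  q = p.adjust(pvals, method = "BH")
)
print(res)
```

A table of this shape (one row per feature, with a p-value and a q-value) is what one would expect to work with when calling the function with 'plot = FALSE', although the exact columns it returns are not specified on this page.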

In classification mode, the plot compares the classes (e.g., 1 vs -1 for binary classification) and adds significance markers (e.g., asterisks) to features whose abundance differs significantly between classes. In regression mode, it shows feature abundance across all observations and computes the correlation of each feature with the response variable.
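The two ideas in this paragraph, significance markers and correlation with a continuous response, can be sketched in base R. Everything here is an assumption made for illustration: the helper `p_to_stars` is hypothetical, the thresholds (0.05, 0.01, 0.001) are a common convention rather than values documented for this function, and Spearman correlation is just one plausible choice of correlation measure.

```r
# Hypothetical helper: map p-values to asterisk markers.
# Thresholds are an assumed convention, not taken from the package.
p_to_stars <- function(p) {
  as.character(cut(
    p,
    breaks = c(-Inf, 0.001, 0.01, 0.05, Inf),
    labels = c("***", "**", "*", "")
  ))
}

p_to_stars(c(0.0005, 0.02, 0.3))

# Regression-style summary: correlation of each feature with a continuous y.
set.seed(1)
y_cont <- rnorm(100)
X_reg <- data.frame(
  feature1 = y_cont + rnorm(100, sd = 0.5),  # constructed to track y_cont
  feature2 = rnorm(100)                      # unrelated noise
)

# Spearman correlation per feature (assumed choice of correlation measure).
rho <- vapply(
  X_reg,
  function(col) cor(col, y_cont, method = "spearman"),
  numeric(1)
)
print(rho)
```

In this simulated data, 'feature1' is built to follow the response and therefore shows a strong positive correlation, while 'feature2' is independent noise.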

Value

If 'plot = TRUE', the function returns a ggplot object displaying the feature abundance by class. If 'plot = FALSE', it returns the statistical results of the test (p-values and q-values).

Author(s)

Edi Prifti (IRD)

Examples

# Example usage for a classification task
set.seed(1)  # make the simulated data reproducible
features <- c("feature1", "feature2", "feature3")
X <- data.frame(feature1 = rnorm(100), feature2 = rnorm(100), feature3 = rnorm(100))
y <- sample(c(1, -1), 100, replace = TRUE)

# Plot feature abundance
plotAbundanceByClass(features, X, y, main = "Feature Abundance Plot")

# Get statistical results without plotting
plotAbundanceByClass(features, X, y, plot = FALSE)

predomics/predomicspkg documentation built on Dec. 11, 2024, 11:06 a.m.