plotPrevalence: Plot Feature Prevalence and Enrichment
In predomics/predomicspkg: Interpretable Prediction in Omics Data

Description

This function visualizes the prevalence of features across the groups (classes) in a dataset and computes feature enrichment, optionally with a statistical test. The plot shows each feature's prevalence in each group and, when enrichment results are available, adds significance markers.

Usage

plotPrevalence(
  features,
  X,
  y,
  topdown = TRUE,
  main = "",
  plot = TRUE,
  col.pt = c("deepskyblue4", "firebrick4"),
  col.bg = c("deepskyblue1", "firebrick1"),
  zero.value = 0
)

Arguments

`features`	A character vector of feature names to be plotted.
`X`	A data matrix or data frame where each row is an observation and each column is a feature.
`y`	A vector of class labels (e.g., 1 and -1 for binary classification) corresponding to the rows in 'X'.
`topdown`	Logical; whether to arrange the features in a top-down order (default is 'TRUE').
`main`	A string for the title of the plot.
`plot`	Logical; if 'TRUE', the function will display the plot. If 'FALSE', the function will return the enrichment statistics.
`col.pt`	Colors for points in the plot (default is 'c("deepskyblue4", "firebrick4")').
`col.bg`	Colors for bars in the plot (default is 'c("deepskyblue1", "firebrick1")').
`zero.value`	The value used to replace missing values in 'y' (default is '0').

Details

The function computes the prevalence of each specified feature, that is, the percentage of observations in each class in which the feature occurs, and plots it per class. Enrichment is computed using chi-square tests; when test results are available, significant features are marked with asterisks.
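
The computation described above can be approximated in base R. The following is a minimal sketch, not the package's implementation: it assumes a feature is "present" when its value is non-zero, uses `chisq.test()` on a presence-by-class table, and marks features with p < 0.05. The threshold and the names `present`, `prevalence`, `p_values` and `signif_marker` are illustrative assumptions.

```r
# Illustrative sketch only: base R, not the predomics implementation.
set.seed(42)

# Toy presence/absence data: rows are observations, columns are features.
n <- 100
y <- rep(c(1, -1), each = n / 2)
X <- data.frame(
  feature1 = rbinom(n, 1, ifelse(y == 1, 0.8, 0.2)),  # enriched in class 1
  feature2 = rbinom(n, 1, 0.5)                        # similar in both classes
)

# Assumption: a feature counts as present when its value differs from zero.
present <- X != 0

# Prevalence (%) of each feature within each class.
classes <- sort(unique(y))
prevalence <- sapply(classes, function(cl) {
  100 * colMeans(present[y == cl, , drop = FALSE])
})
colnames(prevalence) <- paste0("class_", classes)

# Chi-square test of presence against class, one test per feature.
p_values <- apply(present, 2, function(p) {
  tab <- table(factor(p, levels = c(FALSE, TRUE)), y)
  chisq.test(tab)$p.value
})

# Asterisk marker for features with p < 0.05 (threshold is an assumption).
signif_marker <- ifelse(p_values < 0.05, "*", "")

print(round(prevalence, 1))
print(data.frame(p_value = signif(p_values, 3), marker = signif_marker))
```

In this toy data, `feature1` is simulated to be far more prevalent in class 1, so it should be flagged, while `feature2` is drawn with the same probability in both classes.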

Value

If 'plot = TRUE', the function returns a ggplot object displaying the feature prevalence and enrichment. If 'plot = FALSE', the function returns the enrichment results.

Author(s)

Edi Prifti (IRD)

Examples

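# Example usage: simulate presence/absence data so that prevalence differs
# between features (values drawn with rnorm() are never exactly zero, so
# every feature would appear present in every observation)
set.seed(123)
features <- c("feature1", "feature2", "feature3")
X <- data.frame(
  feature1 = rbinom(100, 1, 0.8),
  feature2 = rbinom(100, 1, 0.5),
  feature3 = rbinom(100, 1, 0.2)
)
y <- sample(c(1, -1), 100, replace = TRUE)

# Plot feature prevalence
plotPrevalence(features, X, y, main = "Feature Prevalence Plot")

# Get enrichment statistics without plotting
plotPrevalence(features, X, y, plot = FALSE)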

predomics/predomicspkg documentation built on Dec. 11, 2024, 11:06 a.m.