plotecdf: Plot Empirical Cumulative Distribution Function (ECDF)

plotecdf    R Documentation

Plot Empirical Cumulative Distribution Function (ECDF)

Description

This function generates an ECDF plot to analyze transcription density relative to the distance from the transcription start site (TSS) across different conditions. The plot displays AUC values, Kolmogorov-Smirnov (KS) statistics, and knee points, with options to display or save the plot.

Usage

plotecdf(dfmeandiff, unigroupdf, expdf, genename,
   colvec = c("#90AFBB", "#10AFBB", "#FF9A04", "#FC4E07"),
   outfold = tempdir(), digits = 2, middlewind = 100, pval = 0.01,
   plot = FALSE, formatname = "pdf", verbose = TRUE)

Arguments

dfmeandiff

A data frame containing the mean differences of transcription levels and cumulative distribution values (Fx) for different windows around the TSS (see meandifference).

unigroupdf

A data frame containing gene-specific statistics, including whether each gene belongs to the Universe or the Group (see universegroup).

expdf

A data frame of experiment data with columns named 'condition', 'replicate', 'strand', and 'path'.

genename

A string specifying the name of the gene of interest to plot.

colvec

A vector of colors used to distinguish different conditions in the plot. Default is c("#90AFBB", "#10AFBB", "#FF9A04", "#FC4E07").

outfold

A string specifying the output folder where the plot will be saved if plot = FALSE. Default is tempdir().

digits

The number of decimal places to which the AUC and KS values are rounded. Default is 2.

middlewind

The index of the middle window representing the region centered around the TSS. Default is 100.

pval

A numeric value for the p-value threshold to determine the significance of the KS test. Default is 0.01.

plot

A logical flag indicating whether to display the plot interactively (TRUE) or save it to a file (FALSE). Default is FALSE.

formatname

A string specifying the file format of the saved plot. Possible values are "eps", "ps", "tex" (pictex), "pdf", "jpeg", "tiff", "png", "bmp", and "svg". Default is "pdf".

verbose

A logical flag indicating whether to display detailed messages about the function's progress. Default is TRUE.

Details

The function processes data related to transcription levels and cumulative transcription density for a given gene across multiple experimental conditions. The ECDF plot is constructed with optional annotation of key statistics such as AUC values and significant KS test results. Knee points, representing significant changes in transcription density, are also displayed if the KS test passes the specified p-value threshold.
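
The quantities named above can be illustrated on toy data. The following is a minimal base R sketch, not tepr's internal code: the vector 'scores' is hypothetical, and treating the knee as the window where the cumulative curve lies farthest above the diagonal is an assumption made for illustration only.

```r
## Toy illustration only: NOT tepr's internal implementation.
## 'scores' is a hypothetical vector of transcription levels, one per window.
scores <- c(10, 9, 8, 6, 4, 3, 2, 1, 1, 1)
nbwindows <- length(scores)

## Cumulative distribution of the signal across windows (analogous to Fx)
fx <- cumsum(scores) / sum(scores)

## Reference curve: uniform cumulative distribution (the diagonal)
coord <- seq_len(nbwindows) / nbwindows

## Difference between the cumulative curve and the diagonal
difffx <- fx - coord

## Area under the difference curve, using the trapezoidal rule
auc <- sum(diff(coord) * (head(difffx, -1) + tail(difffx, -1)) / 2)

## KS-like statistic: largest absolute distance to the diagonal
ksstat <- max(abs(difffx))

## Assumed knee definition: window where the difference is largest
knee <- which.max(difffx)

print(round(c(auc = auc, ks = ksstat), digits = 2))
print(knee)
```

A curve that rises well above the diagonal in the first windows indicates that most of the signal is concentrated near the TSS, which is the pattern the AUC and KS annotations summarize.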

colvec: The number of colors should equal the number of rows of expdf divided by two, because a forward and a reverse file are provided for each experiment.
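
The colvec rule can be checked before calling plotecdf. The sketch below builds a hypothetical experiment table (the condition names, strand labels, and file paths are made up for illustration) and verifies that one color is supplied per experiment.

```r
## Hypothetical experiment table: 2 conditions x 2 replicates x 2 strands.
## All values below are invented for illustration.
expdf <- data.frame(
    condition = rep(c("ctrl", "HS"), each = 4),
    replicate = rep(rep(1:2, each = 2), times = 2),
    strand = rep(c("plus", "minus"), times = 4),
    path = paste0("file", 1:8, ".bg"),
    stringsAsFactors = FALSE)

colvec <- c("#90AFBB", "#10AFBB", "#FF9A04", "#FC4E07")

## Each experiment contributes two rows (forward and reverse),
## so the expected number of colors is half the number of rows.
nbcolors <- nrow(expdf) / 2
stopifnot(length(colvec) == nbcolors)
```

With this table, expdf has eight rows, so four colors are expected, which matches the length of the default colvec.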

Value

An ECDF plot showing the transcription density across windows around the TSS, with highlights for significant KS test results and knee points. The plot can either be displayed or saved as a file.

See Also

[meandifference], [universegroup]

Examples

exppath <-  system.file("extdata", "exptab.csv", package="tepr")
transpath <- system.file("extdata", "cugusi_6.tsv", package="tepr")
expthres <- 0.1

## Calculating necessary results
expdf <- read.csv(exppath)
transdf <- read.delim(transpath, header = FALSE)
avfilt <- averageandfilterexprs(expdf, transdf, expthres,
       showtime = FALSE, verbose = FALSE)
rescountna <- countna(avfilt, expdf, nbcpu = 1, verbose = FALSE)
ecdf <- genesECDF(avfilt, expdf, verbose = FALSE)
resecdf <- ecdf[[1]]
nbwindows <- ecdf[[2]]
resmeandiff <- meandifference(resecdf, expdf, nbwindows,
    verbose = FALSE)
bytranslistmean <- split(resmeandiff, factor(resmeandiff$transcript))
resknee <- kneeid(bytranslistmean, expdf, verbose = FALSE)
resauc <- allauc(bytranslistmean, expdf, nbwindows, verbose = FALSE)
resatt <- attenuation(resauc, resknee, rescountna, bytranslistmean, expdf,
        resmeandiff, verbose = FALSE)
resug <- universegroup(resatt, expdf, verbose = FALSE)

## Testing plotecdf
colvec <- c("#90AFBB", "#10AFBB", "#FF9A04", "#FC4E07")
plotecdf(resmeandiff, resug, expdf, "EGFR", colvec, plot = TRUE, verbose = FALSE)


tepr documentation built on June 8, 2025, 10:46 a.m.