Key Features

p_toolkit is a package designed to help adjust and visualize p-values when making multiple comparisons. Now that computers are powerful enough to run hundreds or even thousands of statistical tests, it is important to look closely at small p-values and ask whether a value is small simply by chance or whether the result is truly significant. There are many tools to help decide when to reject a null hypothesis, which can control either the family-wise error rate (FWER) or the false discovery rate (FDR).

We can use the raw p-values alone, or apply an adjustment method such as the Bonferroni or the Benjamini-Hochberg (BH) procedure. We can also use visualization methods, such as QQ-plots or a scatter plot of the p-values, to try to detect patterns.
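As a rough illustration of the two adjustment methods named above, here is a minimal Python sketch using only the standard library. The function names `bonferroni` and `benjamini_hochberg` and the sample p-values are assumptions made for this example; they are not part of p_toolkit.

```python
def bonferroni(pvalues):
    """Bonferroni: multiply each p-value by the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues):
    """BH: scale the i-th smallest p-value by m / i, then enforce monotonicity."""
    m = len(pvalues)
    # Indices of the p-values in ascending order of p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that each adjusted value is
    # never larger than the adjusted value of the next-larger p-value.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


pvalues = [0.01, 0.04, 0.03, 0.20]  # made-up values for illustration
print(bonferroni(pvalues))
print(benjamini_hochberg(pvalues))
```

Both functions return adjusted p-values in the original input order, so each can be compared directly against a chosen significance level. Bonferroni controls the family-wise error rate and is the more conservative of the two; BH controls the false discovery rate.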

This package aims to combine these methods in a simple-to-use format that outputs data frames containing the results of several adjustment methods.
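To give a feel for what a combined table of results might look like, the following Python sketch builds one row per test with the Bonferroni and BH decisions side by side. It is only an illustration: the function name `summarize`, the column names, and the default `alpha` are hypothetical and do not describe the package's actual output.

```python
def summarize(pvalues, alpha=0.05):
    """Return one row per test, sorted by p-value, with both decisions."""
    m = len(pvalues)
    ranked = sorted(pvalues)
    # BH step-up rule: find the largest rank k with p_(k) <= (k / m) * alpha,
    # then reject every hypothesis ranked at or below k.
    largest_k = 0
    for rank, p in enumerate(ranked, start=1):
        if p <= rank / m * alpha:
            largest_k = rank
    rows = []
    for rank, p in enumerate(ranked, start=1):
        rows.append({
            "p_value": p,
            "bonf_threshold": alpha / m,      # same cutoff for every test
            "bonf_reject": p <= alpha / m,
            "bh_threshold": rank / m * alpha,  # cutoff grows with the rank
            "bh_reject": rank <= largest_k,
        })
    return rows


# Made-up p-values for illustration.
for row in summarize([0.01, 0.04, 0.03, 0.20]):
    print(row)
```

A list of dictionaries is used here as a stand-in for a data frame; each dictionary corresponds to one row, and each key to one column.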

Package Functions

Credits

Package Dependencies

This package requires dplyr and ggplot2.

Similar Packages and Functions

Some packages for p-value adjustment already exist in both R and Python:

R:

The p.adjust function comes in the stats package, which ships with base R. It adjusts a vector of p-values using one of six methods: four that control the family-wise error rate ("holm", "hochberg", "hommel", "bonferroni") and two that control the false discovery rate ("BH" and "BY", with "fdr" accepted as an alias for "BH"). The advantages of this function are its simplicity and the fact that stats is loaded in the default R environment, so the user doesn't need to install external packages. However, it does not help the user look more deeply at what is going on with the tests, which is a key element of p_toolkit.

fdrtool is a package designed for analyzing the false discovery rate in statistical tests, and it is not limited exclusively to p-value adjustment. It has some functions related to p_toolkit, such as fdrtool, which calculates and plots the false discovery rate, and pval.estimate.eta0, which estimates the proportion of null p-values in a list.

Python:

The multipletests function is part of the statsmodels library, a complete set of functions for implementing statistical methods in Python. It works similarly to R's p.adjust: it receives an array of p-values as input and returns, among other values, an array of corrected p-values and a boolean array indicating which hypotheses are rejected after correction. It offers no diagnostics or further analysis of the results.

License

MIT License

Interested in contributing? See our Contributing Guidelines and Code of Conduct.


Created by

Amy Goldlist  ·  Esteban Angel  ·  Veronique Mulholland



UBC-MDS/ptoolkit documentation built on May 25, 2019, 1:36 p.m.