In tpq/propr: An R package to calculate proportionality and other measures for compositional data

knitr::opts_chunk$set(echo = TRUE)

Attention!!

If you have used propr previously, you will notice some changes. In version 5.0.0, I have completed a major revision of the code base to simplify the maintenance of propr going forward, including a restructure of back-end and front-end functionality. From the user's perspective, you will notice a few changes. First, all supporting visualization functions are gone. They were poorly implemented and not backwards compatible. Instead, you can use the unified getResults wrapper to pull data from propr and propd objects to pipe to ggplot2 for visualization. Second, many experimental functions have been removed, with the remaining ones all sharing the prefix run. The core routines called by the propr and propd functions remain unchanged.

Introduction

The propr package provides an interface for 4 distinct approaches to compositional data analysis (CoDA): proportionality, differential proportionality, logratio partial correlation with basis shrinkage, and ratio analysis.

If you use this software, please cite our work. We don't get paid to make software, but your citations help us to negotiate support for software maintenance and development.

citation("propr")

OK, now let's get started.

counts <- matrix(rpois(20*50, 100), 20, 50)
group <- sample(c("A", "B"), size = 20, replace = TRUE)
devtools::install_github("tpq/propr")
library(propr)

Proportionality

There are a few proportionality statistics available. Select one with the 'metric' argument.

pr <- propr(
        counts,  # rows as samples, like it should be
        metric = "rho",  # or "phi", "phs", "vlr"
        ivar = "clr",  # or can use another gene as reference, by giving the name or index
        alpha = NA,  # use to handle zeros
        p = 100  # used for permutation tests
      )

You can determine the "signficance" of proportionality using a built-in permutation procedure. It estimates the false discovery rate (FDR) for any cutoff. This method can take a while to run, but is parallelizable.

pr <- updateCutoffs(
        pr,
        number_of_cutoffs = 100,  # number of cutoffs to estimate FDR
        custom_cutoffs = NULL,  # or specify custom cutoffs
        tails = 'right',  # consider only the positive values ('right') or both sides ('both')
        ncores = 4  # parallelize here
      )

Choose the largest cutoff with an acceptable FDR.

Logratio partial correlation with basis shrinkage

There are many ways to calculate partial correlations, with or without shrinkage. The recommended one for datasets with p>>n and influenced by compositional bias is "pcor.bshrink".

pr <- propr(
        counts,  # rows as samples, like it should be
        metric = "pcor.bshrink",  # partial correlation without shrinkage "pcor" is also available
        ivar = "clr",  # "clr" is recommended
        p = 100  # used for permutation tests
      )

You can also determine the "significance" of logratio partial correlations with the built-in permutation approach.

pr <- updateCutoffs(
        pr,
        number_of_cutoffs = 100,  # number of cutoffs to estimate FDR
        custom_cutoffs = NULL,  # or specify custom cutoffs
        tails = 'right',  # consider only the positive values ('right') or both sides ('both')
        ncores = 4  # parallelize here
      )

Differential Proportionality

There are also a few differential proportionality statistics, but they all get calculated at once.

pd <- propd(
        counts,
        group,  # a vector of 2 or more groups
        alpha = NA,  # whether to handle zeros
        p = 100,  # used for permutation tests
        weighted = TRUE  # whether to weight log-ratios
      )

You can switch between the "disjointed" and "emergent" statistics.

setDisjointed(pd)

setEmergent(pd)

You can again permute an FDR with the updateCutoffs method. Alternatively, you can calculate an exact p-value for $\theta$ based on a F-test. This is handled by the updateF method.

pd <- updateF(
        pd,
        moderated = FALSE,  # moderate stats with limma-voom
        ivar = "clr"  # used for moderation
      )

Getters

Both functions return S4 objects. This package includes several helper functions that work for both the propr and propd output.

?getMatrix # get results as a square matrix
?getResults # get propr or propd results in long-format
?getRatios # get samples by ratios matrix

Use getResults to pipe to ggplot2 for visualization.

We also provide accesory functions to get the significant pairs.

?getSignificantResultsFDR
?getSignificantResultsFstat
?getAdjacencyFDR
?getAdjacencyFstat
?getCutoffFDR
?getCutoffFstat

Notice that for the getter functions to work properly on propd objects, you have to set the target theta value active:

setActive(pd, "theta_d")
setActive(pd, "theta_e")
setActive(pd, "theta_mod")

Ratio Methods

COMING SOON!!

tpq/propr documentation built on Feb. 3, 2025, 4:23 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

tpq/propr
An R package to calculate proportionality and other measures for compositional data

In tpq/propr: An R package to calculate proportionality and other measures for compositional data

Attention!!

Introduction

Proportionality

Logratio partial correlation with basis shrinkage

Differential Proportionality

Getters

Ratio Methods

R Package Documentation

Browse R Packages

We want your feedback!

tpq/propr An R package to calculate proportionality and other measures for compositional data

In tpq/propr: An R package to calculate proportionality and other measures for compositional data

Attention!!

Introduction

Proportionality

Logratio partial correlation with basis shrinkage

Differential Proportionality

Getters

Ratio Methods

R Package Documentation

Browse R Packages

We want your feedback!

tpq/propr
An R package to calculate proportionality and other measures for compositional data