Calculates cell type proportions, performs a variance stabilising transformation on the proportions, and tests whether the cell type proportions differ significantly between groups using linear modelling.

    propeller(
      x = NULL,
      clusters = NULL,
      sample = NULL,
      group = NULL,
      trend = FALSE,
      robust = TRUE,
      transform = "logit"
    )

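| Argument | Description |
| --- | --- |
| `x` | object of class `SingleCellExperiment` or `Seurat`. |
| `clusters` | a factor specifying the cluster or cell type for every cell. For `SingleCellExperiment` objects this should be a column named `clusters` in the `colData`; for `Seurat` objects it is extracted using `Idents`. |
| `sample` | a factor specifying the biological replicate for each cell. For `SingleCellExperiment` and `Seurat` objects this should be a column named `sample` in the object's metadata. |
| `group` | a factor specifying the groups of interest for performing the differential proportions analysis. For `SingleCellExperiment` and `Seurat` objects this should be a column named `group` in the object's metadata. |
| `trend` | logical, if true fits a mean variance trend on the transformed proportions. |
| `robust` | logical, if true performs robust empirical Bayes shrinkage of the variances. |
| `transform` | a character scalar specifying which transformation of the proportions to perform. Possible values include "asin" or "logit". Defaults to "logit". |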

This function takes a `SingleCellExperiment` or `Seurat` object and extracts the `group`, `sample` and `clusters` cell information. The user can either supply these factor vectors explicitly in the call to `propeller`, or let internal functions extract them from the relevant object. In the latter case the user must ensure that `group` and `sample` are columns in the metadata of the relevant object (any combination of upper/lower case is acceptable). For `Seurat` objects the clusters are extracted using the `Idents` function. For `SingleCellExperiment` objects, `clusters` needs to be a column in the `colData`.
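The case-insensitive column matching described above can be pictured with a small base-R sketch. This is only an illustration under stated assumptions: `meta` is a made-up stand-in for an object's cell-level metadata, and `find_column()` is a hypothetical helper, not a function from `speckle`.

```r
# Made-up stand-in for the cell-level metadata of a single-cell object
meta <- data.frame(
  Sample   = c("s1", "s1", "s2", "s2"),
  GROUP    = c("grp1", "grp1", "grp2", "grp2"),
  clusters = c("c0", "c1", "c0", "c0"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of speckle): look up a metadata column by
# name, ignoring upper/lower case, and return it as a factor
find_column <- function(meta, name) {
  hit <- which(tolower(names(meta)) == tolower(name))
  if (length(hit) != 1L) {
    stop("expected exactly one metadata column named '", name, "'")
  }
  factor(meta[[hit]])
}

sample_fac <- find_column(meta, "sample")  # matches the "Sample" column
group_fac  <- find_column(meta, "group")   # matches the "GROUP" column
```

Passing the factors explicitly through the `clusters`, `sample` and `group` arguments avoids relying on column names altogether.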

The `propeller` function calculates cell type proportions for each biological replicate, performs a variance stabilising transformation on the matrix of proportions and fits a linear model for each cell type or cluster using the `limma` framework. There are two options for the transformation: arcsin square root or logit. `propeller` tests whether the cell type proportions differ between groups. If there are exactly 2 groups, a t-test is used to calculate p-values; if there are more than 2 groups, an F-test (ANOVA) is used. Cell type proportions of 1 or 0 are accommodated. Benjamini and Hochberg false discovery rates are calculated to account for multiple testing of cell types/clusters.
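The steps above can be sketched in base R for intuition. This is not the `speckle` implementation: an ordinary equal-variance two-sample t-test stands in for the `limma` moderated t-test, and the 0.5 pseudo-count added before the logit is an assumption made here so that proportions of 0 or 1 stay finite. The toy cell counts are made up.

```r
# Made-up cell-level annotations: 4 samples, 3 clusters
sample_id <- rep(c("s1", "s2", "s3", "s4"), c(1000, 1500, 900, 1200))
cluster <- c(
  rep(c("c0", "c1", "c2"), c(500, 300, 200)),
  rep(c("c0", "c1", "c2"), c(900, 450, 150)),
  rep(c("c0", "c1", "c2"), c(270, 360, 270)),
  rep(c("c0", "c1", "c2"), c(480, 360, 360))
)
# One group label per sample, in the order s1, s2, s3, s4
grp <- factor(c("grp1", "grp1", "grp2", "grp2"))

# Cell counts per sample (rows) and cluster (columns)
counts <- unclass(table(sample_id, cluster))

# Proportions of each cluster within each sample; every row sums to 1
props <- counts / rowSums(counts)

# Variance stabilising transformations
asin_props  <- asin(sqrt(props))                   # arcsin square root
pseudo      <- (counts + 0.5) / rowSums(counts + 0.5)
logit_props <- log(pseudo / (1 - pseudo))          # logit with assumed offset

# Stand-in test: ordinary two-sample t-test per cluster
# (propeller itself fits limma linear models instead)
pvals <- apply(asin_props, 2, function(y) {
  t.test(y ~ grp, var.equal = TRUE)$p.value
})

# Benjamini-Hochberg false discovery rates across clusters
fdr <- p.adjust(pvals, method = "BH")
```

With more than two groups, a one-way ANOVA per cluster would play the same stand-in role for the F-test.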

`propeller` returns a data frame of results.

Belinda Phipson

Smyth, G.K. (2004). Linear models and empirical Bayes methods
for assessing differential expression in microarray experiments.
*Statistical Applications in Genetics and Molecular Biology*, Volume
**3**, Article 3.

Benjamini, Y., and Hochberg, Y. (1995). Controlling the false discovery
rate: a practical and powerful approach to multiple testing. *Journal
of the Royal Statistical Society, Series B*, **57**, 289-300.

`propeller.ttest`, `propeller.anova`, `lmFit`, `eBayes`, `getTransformedProps`

    library(speckle)
    library(ggplot2)
    library(limma)

    # Make up some data
    # True cell type proportions for 4 samples
    p_s1 <- c(0.5,0.3,0.2)
    p_s2 <- c(0.6,0.3,0.1)
    p_s3 <- c(0.3,0.4,0.3)
    p_s4 <- c(0.4,0.3,0.3)

    # Total numbers of cells per sample
    numcells <- c(1000,1500,900,1200)

    # Generate cell-level vector for sample info
    biorep <- rep(c("s1","s2","s3","s4"),numcells)
    length(biorep)

    # Numbers of cells for each of the 3 clusters per sample
    n_s1 <- p_s1*numcells[1]
    n_s2 <- p_s2*numcells[2]
    n_s3 <- p_s3*numcells[3]
    n_s4 <- p_s4*numcells[4]

    # Assign cluster labels for 4 samples
    cl_s1 <- rep(c("c0","c1","c2"),n_s1)
    cl_s2 <- rep(c("c0","c1","c2"),n_s2)
    cl_s3 <- rep(c("c0","c1","c2"),n_s3)
    cl_s4 <- rep(c("c0","c1","c2"),n_s4)

    # Generate cell-level vector for cluster info
    clust <- c(cl_s1,cl_s2,cl_s3,cl_s4)
    length(clust)

    # Assume s1 and s2 belong to group 1 and s3 and s4 belong to group 2
    grp <- rep(c("grp1","grp2"),c(sum(numcells[1:2]),sum(numcells[3:4])))

    propeller(clusters = clust, sample = biorep, group = grp,
              robust = FALSE, trend = FALSE, transform="asin")

