testPseudotime: Test for differences along pseudotime
In scran: Methods for Single-Cell RNA-Seq Data Analysis


Implements a simple method of testing for significant differences in expression with respect to pseudotime, based on fitting linear models with a spline basis matrix. This function is now deprecated, as it has been moved to the TSCAN package.

testPseudotime(x, ...)

## S4 method for signature 'ANY'
testPseudotime(
  x,
  pseudotime,
  df = 5,
  get.lfc = TRUE,
  get.spline.coef = FALSE,
  trend.only = TRUE
)

## S4 method for signature 'SummarizedExperiment'
testPseudotime(x, ..., assay.type = "logcounts")

`x`	A numeric matrix-like object containing log-expression values for cells (columns) and genes (rows). Alternatively, a SummarizedExperiment containing such a matrix.
`...`	For the generic, further arguments to pass to specific methods. For the ANY method, further arguments to pass to `fitLinearModel`. For the SummarizedExperiment method, further arguments to pass to the ANY method.
`pseudotime`	A numeric matrix with one row per cell in `x` and one column per path (i.e., lineage). A vector is treated the same as a 1-column matrix.
`df`	Integer scalar specifying the degrees of freedom for the splines.
`get.lfc`	Logical scalar indicating whether to return an overall log-fold change along each path.
`get.spline.coef`	Logical scalar indicating whether to return the estimates of the spline coefficients.
`trend.only`	Logical scalar indicating whether only differences in the trend should be considered when testing for differences between paths.
`assay.type`	String or integer scalar specifying the assay containing the log-expression matrix.

For a single path in pseudotime, this function fits a natural spline to the expression of each gene with respect to pseudotime. It then does an ANOVA to test whether any of the spline coefficients are non-zero. In this manner, genes exhibiting a significant (and potentially non-linear) trend with respect to the pseudotime can be detected as those with low p-values.
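The single-path test described above can be sketched for one gene using only base R and the bundled splines package. This is an illustration of the idea, not the function's actual implementation: the simulated gene, the variable names and the use of `lm` and `anova` are all assumptions made for the example.

```r
library(splines)  # ships with R; provides ns() for natural splines

set.seed(100)
n <- 100
pseudotime <- runif(n)

# Hypothetical gene with a non-linear trend plus noise (assumed for illustration).
expr <- sin(2 * pi * pseudotime) + rnorm(n, sd = 0.5)

# Full model: intercept plus a natural spline basis with 5 degrees of freedom.
full <- lm(expr ~ ns(pseudotime, df = 5))

# Null model: intercept only, i.e., all spline coefficients are zero.
null <- lm(expr ~ 1)

# F-test comparing the nested models; a low p-value indicates a trend.
tab <- anova(null, full)
p.value <- tab[["Pr(>F)"]][2]
p.value
```

Because the test compares all spline coefficients against zero at once, it can pick up non-monotonic trends that a simple linear regression on pseudotime would miss.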

For multiple paths in pseudotime, the null hypothesis is that all paths have the same trend (if trend.only=TRUE) or the same trend and intercept (if trend.only=FALSE). The test effectively fits a separate trend to each path and performs an ANOVA to detect differences in the trend alone or in both the trend and intercept. In this manner, genes exhibiting differences in behavior between paths can be detected.
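The two null hypotheses for multiple paths can be sketched as nested model comparisons in base R. This is a hedged illustration only: the simulated data, the shared spline basis computed across all cells and the specific formulas are assumptions for the example and may differ from how the function builds its design matrices.

```r
library(splines)  # ships with R; provides ns()

set.seed(200)
n <- 200
pseudotime <- runif(n)
path <- factor(sample(c("path1", "path2"), n, replace = TRUE))

# Hypothetical gene: increases along path1, flat along path2.
expr <- ifelse(path == "path1", 2 * pseudotime, 1) + rnorm(n, sd = 0.3)

# Spline basis shared across paths (an assumption of this sketch).
basis <- ns(pseudotime, df = 5)

# Full model: a separate intercept and spline trend for each path.
full <- lm(expr ~ path * basis)

# Null analogous to trend.only=TRUE: same trend, path-specific intercepts allowed.
null.trend <- lm(expr ~ path + basis)

# Null analogous to trend.only=FALSE: same trend and same intercept.
null.both <- lm(expr ~ basis)

tab.trend <- anova(null.trend, full)
tab.both <- anova(null.both, full)
p.trend <- tab.trend[["Pr(>F)"]][2]
p.both <- tab.both[["Pr(>F)"]][2]
```

In this sketch, the second comparison tests one more coefficient than the first (the path-specific intercept), so a gene that differs between paths only by a constant shift would be detected by `p.both` but not by `p.trend`.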

The expected format of pseudotime is the same as that returned by orderClusterMST. Each cell is assigned to a path if it has a non-NA value in the corresponding column. For single path testing, cells with NA values in pseudotime are ignored; for multiple path testing, cells assigned to multiple paths are ignored.
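The assignment rule above can be made concrete with a small mock pseudotime matrix. The matrix and variable names below are hypothetical. Dropping cells that are assigned to no path at all is an assumption of this sketch (such cells carry no usable pseudotime), not something the text above states explicitly.

```r
# Hypothetical pseudotime matrix for 6 cells (rows) and 2 paths (columns).
p <- cbind(
    path1 = c(0.1, 0.5, NA, NA, 0.3, NA),
    path2 = c(NA, NA, 0.2, 0.9, 0.4, NA)
)

# A cell is assigned to a path if it has a non-NA value in that column.
assigned <- !is.na(p)
n.paths <- rowSums(assigned)

# For a multiple-path comparison, keep cells assigned to exactly one path.
# Cell 5 (both paths) and cell 6 (no path) are dropped here.
keep <- n.paths == 1

# Path label for each retained cell.
path.id <- colnames(p)[apply(assigned[keep, , drop = FALSE], 1, which)]
path.id
```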

By default, estimates of the spline coefficients are not returned as they are difficult to interpret. Rather, a log-fold change of expression along each path is estimated to provide some indication of the overall magnitude and direction of any change.

A DataFrame is returned containing the statistics for each gene (row), including the p-value and its BH-adjusted equivalent. If get.lfc=TRUE, an overall log-fold change is returned for each path.

If get.spline.coef=TRUE, the estimated spline coefficients are also returned (single path), or the differences in the spline fits relative to the first path are returned (multiple paths).
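As a reminder of what the BH-adjusted p-value represents, the adjustment can be reproduced with base R's `p.adjust` on a vector of raw p-values. The p-values below are made up for illustration, and whether the function calls `p.adjust` internally is an assumption.

```r
# Hypothetical raw p-values for five genes.
p.values <- c(0.001, 0.01, 0.02, 0.04, 0.5)

# Benjamini-Hochberg adjustment to control the false discovery rate.
fdr <- p.adjust(p.values, method = "BH")
fdr
```

Genes with an adjusted p-value below a chosen threshold (e.g., 0.05) would then be reported as significant at that false discovery rate.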

Aaron Lun

orderClusterMST, to generate the pseudotime matrix.

testLinearModel, which performs the tests under the hood.

y <- matrix(rnorm(10000), ncol=100)

# Testing for a trend along a single path:
u <- runif(100)
testPseudotime(y, u)

# Testing for differences in multiple paths
# by mocking up a pseudotime matrix.
p <- cbind(path1=u, path2=u)
path1 <- rbinom(length(u), 1, 0.5)==0
p[!path1,1] <- NA
p[path1,2] <- NA

testPseudotime(y, p)