# weitrix_sd_confects: Find rows with confidently excessive variability in a... In pfh/weitrix: Tools for matrices with precision weights, test and explore weighted or sparse data

 weitrix_sd_confects R Documentation

## Find rows with confidently excessive variability in a calibrated weitrix

### Description

Find rows with confident excess standard deviation beyond what is expected based on the weights of a calibrated weitrix. This may be used, for example, to find potential marker genes.

### Usage

``````
weitrix_sd_confects(
    weitrix,
    design = ~1,
    fdr = 0.05,
    step = 0.001,
    assume_normal = TRUE
)
``````

### Arguments

• `weitrix` A weitrix object, or an object that can be converted to a weitrix with `as_weitrix`.

• `design` A formula in terms of `colData(weitrix)` or a design matrix, which will be fitted to the weitrix on each row. Can also be a pre-existing Components object, in which case the existing fits (`design$row`) are used.

• `fdr` False Discovery Rate to control for.

• `step` Granularity of effect sizes to test.

• `assume_normal` Assume weighted residuals are normally distributed? Normality is quite a strong assumption here. If TRUE, tests are based on the weighted squared residuals following a chi-squared distribution. If FALSE, tests are based on assuming the dispersion follows an asymptotically normal distribution, with variance estimated from the weighted squared residuals. If FALSE, a reasonably large number of columns is required. Defaults to TRUE.

### Details

Important note: With the default setting of `assume_normal=TRUE`, the "confect" values produced by this method are only valid if the weighted residuals are close to normally distributed. If you have a reasonably large number of columns (e.g. single-cell data), you can and should relax this assumption by specifying `assume_normal=FALSE`.

This is a conversion of the "dispersion" statistic for each row into units that are more readily interpretable, accompanied by confidence bounds with a multiple testing correction.

We are looking for further perturbation of observed values beyond what is accounted for by a linear model and, further, beyond what is expected based on the observation weights (assumed to be calibrated and so interpreted as 1/variance). We are seeking to estimate the standard deviation of this further perturbation.
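The idea can be illustrated with a small base R simulation. This is only a sketch under stated assumptions, not the package's implementation: it uses a single row, an intercept-only design, and equal weights, and the names `row_dispersion`, `obs_sd` and `extra_sd` are made up for the illustration. With calibrated weights and no further perturbation the dispersion should be close to 1; adding a perturbation with standard deviation `extra_sd` should raise it to roughly `1 + extra_sd^2 / obs_sd^2`.

```r
# Illustrative simulation in base R; not the package's implementation.
set.seed(1)

n <- 10000                     # number of observations in one row
obs_sd <- 0.5                  # known measurement error of each observation
w <- rep(1 / obs_sd^2, n)      # calibrated weights, interpreted as 1/variance
extra_sd <- 1                  # further perturbation we would like to detect

noise <- rnorm(n, sd = obs_sd)
x_null <- 10 + noise                            # measurement error only
x_excess <- x_null + rnorm(n, sd = extra_sd)    # plus excess variability

# Weighted sum of squared residuals divided by residual degrees of freedom,
# after fitting a design matrix to one row by weighted least squares.
row_dispersion <- function(x, w, design = matrix(1, length(x), 1)) {
    fit <- lm.wfit(design, x, w)
    sum(w * fit$residuals^2) / fit$df.residual
}

disp_null <- row_dispersion(x_null, w)       # expected to be close to 1
disp_excess <- row_dispersion(x_excess, w)   # expected to be close to 5

# Convert the excess dispersion back to a standard deviation on the data scale
est_extra_sd <- obs_sd * sqrt(max(disp_excess - 1, 0))
```

Here `est_extra_sd` should land near the simulated `extra_sd`. The sketch gives a point estimate only; the confidence bounds and multiple testing correction that `weitrix_sd_confects` provides are not reproduced.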

The weitrix must have been calibrated for results to make sense.

Top confident effect sizes are found using the `topconfects` method, based on the model that the observed weighted sum of squared residuals is non-central chi-square distributed.

Note that all calculations are based on weighted residuals, with a rescaling to place results on the original scale. When a row has highly variable weights, this is an approximation that is only sensible if the weights are unrelated to the values themselves.

### Value

A topconfects result. The `$table` data frame contains columns:

• effect Estimated excess standard deviation, in the same units as the observations themselves. 0 if the dispersion is less than 1.

• confect A lower confidence bound on effect.

• row_mean Weighted mean of observations in this row.

• typical_obs_err Typical accuracy of each observation.

• dispersion Dispersion. Weighted sum of squared residuals divided by residual degrees of freedom.

• n_present Number of observations with non-zero weight.

• df Degrees of freedom. n minus the number of coefficients in the model.

• fdr_zero FDR-adjusted p-value for the null hypothesis that effect is zero.

Note that `dispersion = effect^2/typical_obs_err^2 + 1` for non-zero effect values.
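Rearranging this relationship gives the point estimate of the effect from the dispersion. The helper below is a hypothetical illustration in base R (the name `excess_sd_from_dispersion` is not part of the package); it also applies the rule that the effect is 0 when the dispersion is less than 1.

```r
# Hypothetical helper, not part of weitrix: invert
#   dispersion = effect^2 / typical_obs_err^2 + 1
# and clamp to 0 when the dispersion is below 1.
excess_sd_from_dispersion <- function(dispersion, typical_obs_err) {
    typical_obs_err * sqrt(pmax(dispersion - 1, 0))
}

# Three rows with the same typical observation error of 0.5
effects <- excess_sd_from_dispersion(c(0.5, 1, 10), 0.5)
```

For these inputs `effects` is 0, 0 and 1.5: only the third row, with a dispersion of 10, shows variability beyond what the weights predict.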

### Examples

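``````
library(weitrix)

# weitrix_sd_confects should only be used with a calibrated weitrix
calwei <- weitrix_calibrate_all(simwei, ~1, ~1)

# Intercept-only design; the result's $table holds the per-row estimates
result <- weitrix_sd_confects(calwei, ~1)
result$table
``````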

pfh/weitrix documentation built on Oct. 13, 2023, 1:01 p.m.