detect_outliers: Detect outliers by network reconstruction
In skinnider/modern: Model-free outlier detection for robust network inference


View source: R/detect_outliers.R

modern is a method to detect outliers in high-dimensional data based on their impact on network reconstruction. The core idea is that the topology of the network reconstructed from a matrix of data should be robust to the inclusion or exclusion of each individual data point in the matrix. Single points that have a large impact on the global interaction profile of a node (e.g., a gene, protein, or metabolite) compromise the robustness of network inference, and are likely to be outliers.

detect_outliers(mat, min_pairs = 10, method = c("pearson", "kendall", "spearman"), bins = NA)

`mat`	a numeric matrix, with nodes (e.g., analytes such as genes, proteins, or metabolites) in columns, and samples in rows
`min_pairs`	minimum number of paired, non-missing observations to calculate a correlation coefficient; correlations between vectors with fewer than this number of paired observations will be replaced with `NA`
`method`	the correlation coefficient to be computed; one of `"pearson"` (default), `"kendall"`, or `"spearman"`; can be abbreviated
`bins`	optionally, the number of bins into which to group nodes on the basis of the number of observations
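
To make the `min_pairs` rule concrete, the following is a minimal sketch of a correlation that returns `NA` when too few paired, non-missing observations are available. The helper `cor_min_pairs` is hypothetical and written only for illustration; it is not part of modern and may differ from how the package applies the threshold internally.

```r
# Hypothetical helper (not part of modern): correlation between two vectors
# that is only computed when enough paired, non-missing observations exist.
cor_min_pairs <- function(x, y, min_pairs = 10, method = "pearson") {
  paired <- !is.na(x) & !is.na(y)
  if (sum(paired) < min_pairs) {
    return(NA_real_)
  }
  cor(x[paired], y[paired], method = method)
}

set.seed(1)
x <- rnorm(20)
y <- x + rnorm(20, sd = 0.5)
y[1:12] <- NA  # only 8 paired observations remain

few_pairs <- cor_min_pairs(x, y, min_pairs = 10)    # below the threshold
enough_pairs <- cor_min_pairs(x, y, min_pairs = 5)  # above the threshold
```

Here `few_pairs` is `NA` because only 8 pairs are available, while `enough_pairs` is an ordinary correlation coefficient.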

The degree to which a single point compromises the robustness of the network inference is quantified using autocorrelation. For each observation of a given node, the correlations between that node and all of its possible neighbors in the network are calculated with and without the inclusion of that observation. This yields two vectors of correlation coefficients. The correlation between these vectors, or autocorrelation, reflects the impact of the observation on the global interaction profile of that node, where a low correlation is indicative of network inference that is strongly dependent on the inclusion or exclusion of that single data point. This situation is reflective of a likely outlier that compromises the robustness of network inference.
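
The leave-one-out procedure described above can be sketched for a single node as follows. This is an illustration under stated assumptions, not the package's implementation: the function name `loo_autocorrelation` is hypothetical, the input is assumed to contain no missing values, and the two correlation vectors are compared with a Pearson correlation (the text does not say which coefficient is used for that step).

```r
# Hypothetical sketch (not the package source): leave-one-out autocorrelation
# for one node of a complete matrix with samples in rows and nodes in columns.
loo_autocorrelation <- function(mat, node, method = "pearson") {
  others <- setdiff(seq_len(ncol(mat)), node)
  # Correlations between the node and every other node, using all samples
  full <- as.vector(cor(mat[, node], mat[, others], method = method))
  vapply(seq_len(nrow(mat)), function(i) {
    # The same correlations with sample i left out
    without <- as.vector(cor(mat[-i, node], mat[-i, others], method = method))
    # Agreement between the two interaction profiles (assumed Pearson)
    cor(full, without)
  }, numeric(1))
}

# Simulated data: 11 nodes driven by one latent factor, with opposite loadings
set.seed(42)
n <- 30
f <- rnorm(n)
f[5] <- 2
loadings <- c(1, rep(1, 5), rep(-1, 5))
mat <- outer(f, loadings) + matrix(rnorm(n * 11, sd = 0.3), nrow = n)
mat[5, 1] <- -30  # inject an observation that contradicts the node's usual pattern

ac <- loo_autocorrelation(mat, node = 1)
```

In this simulation, removing sample 5 changes the interaction profile of node 1 far more than removing any other sample, so `ac[5]` is the lowest value in `ac`.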

The matrix of autocorrelations is subsequently converted to a matrix of Z scores, such that the matrix has a mean of zero and a standard deviation of one. If the matrix contains missing values, this scaling is performed separately for each group of columns with the same number of missing values. Optionally, if the columns span many different numbers of missing values, the Z scores can instead be calculated within approximately equal-sized bins of missing value counts, using the `bins` parameter.
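
The grouped scaling can be sketched as below. The function `zscore_by_missingness` is hypothetical, and the binning rule (equal-width intervals over the ranks of the missing value counts, with `bins` of at least 2) is an assumption made for illustration; the package may form its bins differently.

```r
# Hypothetical sketch (not the package source): Z-score a matrix of
# autocorrelations within groups of columns that share a missing value count.
zscore_by_missingness <- function(ac, mat, bins = NA) {
  n_missing <- colSums(is.na(mat))
  if (is.na(bins)) {
    # One group per distinct number of missing values
    groups <- n_missing
  } else {
    # Assumed binning rule: equal-width intervals over the ranks of the counts
    groups <- cut(rank(n_missing, ties.method = "min"),
                  breaks = bins, labels = FALSE)
  }
  z <- ac
  for (g in unique(groups)) {
    cols <- which(groups == g)
    values <- as.vector(ac[, cols])
    z[, cols] <- (ac[, cols] - mean(values, na.rm = TRUE)) /
      sd(values, na.rm = TRUE)
  }
  z
}

set.seed(7)
mat <- matrix(rnorm(20 * 6), nrow = 20)
mat[1:2, 4:6] <- NA                       # columns 4 to 6 each miss two values
ac <- matrix(runif(20 * 6, 0.8, 1), nrow = 20)
ac[is.na(mat)] <- NA                      # no score for missing observations

z <- zscore_by_missingness(ac, mat)
```

Each group of columns (here, columns 1 to 3 and columns 4 to 6) ends up with a mean of zero and a standard deviation of one, and missing observations remain `NA`.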

a matrix with identical dimensions to the input matrix, containing the autocorrelation Z score assigned to each non-missing observation

skinnider/modern documentation built on Feb. 20, 2020, 1:52 p.m.