# Identifies outliers in a similarity matrix.

### Description

By default, similarities are transformed with the Fisher z-transform for Pearson correlation (`atanh`). Outliers are the values above a quantile of a skew-t distribution whose mean and standard deviation are estimated from the z-transformed matrix. The quantile corresponds to the Bonferroni-corrected cumulative probability of the upper tail.
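The transform step can be illustrated with base R alone. This is a minimal sketch, not the package's code: the toy matrix `expr` and the choice to keep only the upper triangle are assumptions made for the example.

```r
# Toy data: 4 hypothetical samples measured on 50 features.
set.seed(1)
expr <- matrix(rnorm(200), nrow = 50,
               dimnames = list(NULL, paste0("s", 1:4)))

# Pearson correlation between samples (columns).
cor.mat <- cor(expr)

# Keep each pair once and drop the diagonal, where atanh(1) is Inf.
r <- cor.mat[upper.tri(cor.mat)]

# Fisher z-transform: atanh(r) = 0.5 * log((1 + r) / (1 - r)).
z <- atanh(r)
```

On the z scale the values are closer to normally distributed than raw correlations, which is what makes a mean and standard deviation a reasonable summary.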

### Usage


### Arguments

`similarity.mat`
A matrix of similarities; larger values mean more similar.

`bonf.prob`
Bonferroni-corrected probability. A `raw.prob` is calculated by dividing this by the number of non-missing values in `similarity.mat`. The rejection threshold is `qnorm(1-raw.prob, mean, sd)`, where `mean` and `sd` are estimated from the `transFun`-transformed `similarity.mat`.

`transFun`
A function applied to the numeric values of `similarity.mat`. It should produce normally distributed values.

`normal.upper.thresh`
Instead of specifying `bonf.prob` and `transFun`, an upper similarity threshold can be set, and values above it will be considered likely duplicates. If specified, this overrides `bonf.prob`.

`tail`
"upper" to look for samples with very high similarity values, "lower" to look for very low values, or "both" to look for both.
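The threshold rule described under `bonf.prob` can be sketched as follows. `sketch_threshold` is a hypothetical helper written for this illustration, not a function from the package. It assumes each pair is counted once (upper triangle), drops non-finite values, uses the normal quantile given in the argument description, and covers only the upper tail.

```r
# Hypothetical helper: normal-quantile threshold on the transformed scale.
sketch_threshold <- function(similarity.mat, bonf.prob, transFun = atanh) {
  # Assumption: use each pair once and ignore the diagonal.
  z <- transFun(similarity.mat[upper.tri(similarity.mat)])
  z <- z[is.finite(z)]                 # drop NA and infinite values

  # Bonferroni: divide the corrected probability by the number of values.
  raw.prob <- bonf.prob / length(z)

  # Upper-tail rejection threshold, with mean and sd estimated from z.
  qnorm(1 - raw.prob, mean = mean(z), sd = sd(z))
}

# Toy data with one planted near-duplicate pair (s1, s2).
set.seed(42)
expr <- matrix(rnorm(100 * 10), nrow = 100,
               dimnames = list(NULL, paste0("s", 1:10)))
expr[, "s2"] <- expr[, "s1"] + rnorm(100, sd = 0.1)
sim <- cor(expr)

thresh.z <- sketch_threshold(sim, bonf.prob = 0.05)  # 0.05 is illustrative
is.outlier <- atanh(sim["s1", "s2"]) > thresh.z
```

The threshold lives on the transformed scale, so a similarity has to be passed through the same `transFun` before it is compared against it.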

### Value

Returns either NULL or a data frame with three columns: sample1, sample2, and similarity.
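A rough sketch of a return value with this shape, using a fixed similarity cut-off in the spirit of `normal.upper.thresh`. `sketch_flag_pairs` and its `upper.thresh` argument are hypothetical names for this illustration, and returning NULL when nothing is flagged is an assumption about when the NULL case arises.

```r
# Hypothetical helper: list sample pairs whose similarity exceeds a cut-off.
sketch_flag_pairs <- function(similarity.mat, upper.thresh) {
  # Row/column indices of pairs above the cut-off, each pair counted once.
  hits <- which(upper.tri(similarity.mat) & similarity.mat > upper.thresh,
                arr.ind = TRUE)
  if (nrow(hits) == 0) return(NULL)    # assumption: NULL means no pairs flagged
  data.frame(sample1 = rownames(similarity.mat)[hits[, "row"]],
             sample2 = colnames(similarity.mat)[hits[, "col"]],
             similarity = similarity.mat[hits],
             stringsAsFactors = FALSE)
}

# Small symmetric similarity matrix with one very similar pair (a, b).
sim3 <- matrix(c(1,    0.99, 0.1,
                 0.99, 1,    0.2,
                 0.1,  0.2,  1),
               nrow = 3,
               dimnames = list(c("a", "b", "c"), c("a", "b", "c")))

flagged <- sketch_flag_pairs(sim3, upper.thresh = 0.9)    # one pair: a, b
nothing <- sketch_flag_pairs(sim3, upper.thresh = 0.999)  # NULL
```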

### Author(s)

Levi Waldron, Markus Riester, Marcel Ramos

### Examples

