

The outlier concept is ubiquitous in real-world data analysis. Concern with exceptional values in scientific observation spans the entire history of science. The literature on statistical methods for analyzing data in the presence of outliers is extensive and growing.

The outlier concept frequently emerges in the definition of methods for the analysis of tumor genomes. Two examples are the outlier sums method of Tibshirani and Hastie (2007) and the DriverNet algorithm of Bashashati et al. (2013). ....

There are two primary motivations for this paper. First, we describe more formal approaches to univariate and multivariate outlier identification in tumor expression profiles. We show that the formal and informal approaches lead to different enumerations of outlying cases.

The second motivation for this paper is to demonstrate the effects of siloing of methods and data in cancer genomics. Siloing persists despite many efforts at federation. We illustrate this by detailing the steps needed to link outlying expression patterns to mutation and survival profiles.

vjcitn/conkout documentation built on May 7, 2019, 9:32 a.m.