Identifying differentially expressed genes within the same species or between different species is an urgent need in biological and medical research. RNA-seq experiments usually involve systematic technical effects and differing sequencing depths, so normalization is regarded as an essential step in discovering biologically important changes in expression. Existing methods usually normalize the data with a single scaling factor and then detect significant genes. However, because of the complexity of real data, more than one scaling factor may exist. Consequently, methods that normalize data by a single scaling factor may deliver suboptimal performance or may not work at all. Modern machine learning techniques provide a new perspective on discriminating between differentially expressed (DE) and non-DE genes. In practice, however, the known non-DE genes form only a small set, which may contain housekeeping genes (within the same species) or conserved orthologous genes (across different species). The detection of DE genes can therefore be formulated as a one-class classification problem, in which only non-DE genes are observed and DE genes are completely absent from the training data. We transform this into an outlier detection problem by treating DE genes as outliers, and we propose a normalization-invariant minimum enclosing ball (NIMEB) method that constructs the smallest possible ball containing the known non-DE genes in a feature space. Genes that fall outside the minimum enclosing ball are then naturally considered DE genes. Unlike existing methods, the NIMEB method does not require data normalization, which is particularly attractive when the RNA-seq data involve more than one scaling factor. Furthermore, the NIMEB method can be easily extended to comparisons between different species without normalization.
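The enclosing-ball idea described above can be illustrated with a minimal Python sketch. It is not the package's R interface and not the NIMEB algorithm itself: it swaps in the simple Bădoiu-Clarkson iteration to approximate a minimum enclosing ball in plain Euclidean space, rather than in the feature space used by the method. The function names, the two-dimensional synthetic features, and the iteration count are all assumptions made for illustration.

```python
import numpy as np

def approx_min_enclosing_ball(points, n_iter=1000):
    """Approximate the minimum enclosing ball with the Badoiu-Clarkson
    iteration: repeatedly move the centre a shrinking step towards the
    point that is currently farthest away."""
    center = points[0].astype(float).copy()
    for k in range(1, n_iter + 1):
        dists = np.linalg.norm(points - center, axis=1)
        farthest = points[np.argmax(dists)]
        center += (farthest - center) / (k + 1)
    # Choose the radius so that every training point is enclosed.
    radius = np.linalg.norm(points - center, axis=1).max()
    return center, radius

def flag_outside(points, center, radius):
    """Return True for points lying strictly outside the ball."""
    return np.linalg.norm(points - center, axis=1) > radius

# Hypothetical feature vectors for known non-DE genes (synthetic data).
rng = np.random.default_rng(0)
non_de = rng.normal(loc=0.0, scale=1.0, size=(200, 2))

# Fit the ball on the non-DE genes only (one-class training).
center, radius = approx_min_enclosing_ball(non_de)

# Hypothetical candidate genes: one near the training cloud, one far away.
candidates = np.array([[0.1, -0.2], [8.0, 8.0]])
flags = flag_outside(candidates, center, radius)
print(flags)
```

In this sketch the ball is fitted on the non-DE examples alone, and any candidate outside it is flagged as a putative DE gene, which mirrors the one-class, outlier-detection framing of the description.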
Package details | |
---|---|
Author | Yan Zhou, Jiadi Zhu |
Bioconductor views | Classification DifferentialExpression GeneExpression Normalization Sequencing |
Maintainer | Jiadi Zhu <2160090406@email.szu.edu.cn>, Yan Zhou <zhouy1016@szu.edu.cn> |
License | GPL-2 |
Version | 1.4.0 |
Package repository | Bioconductor |
Installation | Install the latest version of this package from Bioconductor within R, for example with `BiocManager::install()`. |