POMA Normalization Methods

Compiled date: r Sys.Date()

Last edited: 2020-08-01

License: r packageDescription("POMA")[["License"]]

knitr::opts_chunk$set(
  collapse = TRUE,
  # fig.align = "center",
  comment = ">"
)

Installation

Run the following code to install the Bioconductor version of package.

# install.packages("BiocManager")
BiocManager::install("POMA")

Load Packages

library(POMA)
library(Biobase)
library(ggplot2)
library(patchwork)

Load Data and Imputation

Let's create a cleaned MSnSet object from example st000336 data to explore the normalization effects.

# load example data
data("st000336")

# imputation using the default method KNN
example_data <- st000336 %>% PomaImpute()
example_data

Normalization

Here we will evaluate ALL normalization methods that POMA offers on the same MSnSet object to compare them [@normalization].

none <- PomaNorm(example_data, method = "none")
auto_scaling <- PomaNorm(example_data, method = "auto_scaling")
level_scaling <- PomaNorm(example_data, method = "level_scaling")
log_scaling <- PomaNorm(example_data, method = "log_scaling")
log_transformation <- PomaNorm(example_data, method = "log_transformation")
vast_scaling <- PomaNorm(example_data, method = "vast_scaling")
log_pareto <- PomaNorm(example_data, method = "log_pareto")

Normalization effect on data dimensions

When we check for the dimension of the data after normalization we can see that ALL methods have the same effect on data dimension. PomaNorm only change the data dimension when the data have features that only have zeros and when the data have features with 0 variance. Only in these two cases PomaNorm will remove features of the data, changing the data dimensions.

dim(Biobase::exprs(none))
dim(Biobase::exprs(auto_scaling))
dim(Biobase::exprs(level_scaling))
dim(Biobase::exprs(log_scaling))
dim(Biobase::exprs(log_transformation))
dim(Biobase::exprs(vast_scaling))
dim(Biobase::exprs(log_pareto))

Normalization effect on samples

Here we can evaluate the different normalization effects on samples [@normalization].

a <- PomaBoxplots(none, group = "samples", jitter = FALSE) +
  ggtitle("Not Normalized")

b <- PomaBoxplots(auto_scaling, group = "samples", jitter = FALSE) +
  ggtitle("Auto Scaling") +
  theme(axis.text.x = element_blank(),
        legend.position = "none")

c <- PomaBoxplots(level_scaling, group = "samples", jitter = FALSE) +
  ggtitle("Level Scaling") +
  theme(axis.text.x = element_blank(),
        legend.position = "none")

d <- PomaBoxplots(log_scaling, group = "samples", jitter = FALSE) +
  ggtitle("Log Scaling") +
  theme(axis.text.x = element_blank(),
        legend.position = "none")

e <- PomaBoxplots(log_transformation, group = "samples", jitter = FALSE) +
  ggtitle("Log Transformation") +
  theme(axis.text.x = element_blank(),
        legend.position = "none")

f <- PomaBoxplots(vast_scaling, group = "samples", jitter = FALSE) +
  ggtitle("Vast Scaling") +
  theme(axis.text.x = element_blank(),
        legend.position = "none")

g <- PomaBoxplots(log_pareto, group = "samples", jitter = FALSE) +
  ggtitle("Log Pareto") +
  theme(axis.text.x = element_blank(),
        legend.position = "none")

a  
(b + c + d) / (e + f + g)

Normalization effect on features

Here we can evaluate the different normalization effects on features.

h <- PomaDensity(none, group = "features") +
  ggtitle("Not Normalized")

i <- PomaDensity(auto_scaling, group = "features") +
  ggtitle("Auto Scaling") +
  theme(axis.title.x = element_blank(),
        axis.title.y = element_blank())

j <- PomaDensity(level_scaling, group = "features") +
  ggtitle("Level Scaling") +
  theme(axis.title.x = element_blank(),
        axis.title.y = element_blank())

k <- PomaDensity(log_scaling, group = "features") +
  ggtitle("Log Scaling") +
  theme(axis.title.x = element_blank(),
        axis.title.y = element_blank())

l <- PomaDensity(log_transformation, group = "features") +
  ggtitle("Log Transformation") +
  theme(axis.title.x = element_blank(),
        axis.title.y = element_blank())

m <- PomaDensity(vast_scaling, group = "features") +
  ggtitle("Vast Scaling") +
  theme(axis.title.x = element_blank(),
        axis.title.y = element_blank())

n <- PomaDensity(log_pareto, group = "features") +
  ggtitle("Log Pareto") +
  theme(axis.title.x = element_blank(),
        axis.title.y = element_blank())

h  
(i + j + k) / (l + m + n)

Session Information

sessionInfo()

References



Try the POMA package in your browser

Any scripts or data that you put into this service are public.

POMA documentation built on Nov. 8, 2020, 6:26 p.m.