require(knitr)
require(pander)
knitr::opts_chunk$set(message=FALSE,warning=FALSE)

Introduction

The main aim of this vignette is to give a brief example of the LMWiRe package. This package has been created to analyze models with wide response and categorical parameters. The model used in this example is a three-way ANOVA with fixed effects. This document presents all the usual step of the analysis starting from the data importation to the results visualisation. The methods used come from the article of M. Thiel @Thiel2017 and S. Guisset @Guisset2019.

Installation

The package is actually in its developping stage and is available on GitHub at the url : https://github.com/FranceschiniS/LMWiRe. It can be installed via the devtools::install_github() function. The package needs to be loaded afterwards.

# devtools::install_github("FranceschiniS/LMWiRe")

library("LMWiRe")

Data importation

Before any analysis the UCH data needs to be loaded. The LMWiRe package contains the dataset and it can be load with the data() function. The load() function is also useful to import your own data.

data("UCH")

Data visualisation

The UCH dataset stands for Urine-Citrate-Hippurate and comes from \textit{Rousseau et al.} It contains 3 elements : a matrix outcomes with 34 observations of 600 response variables representing the spectra from the H-NMR spectroscopy, a formula for the GLM model used and design matrix with 34 observations and 5 explanatory variables.

Spectrum Visualization (= Outcomes)

The 600 response variables are the spectrum of the H-NMR for each observations.

LinePlot(UCH$outcomes[1:2,],main="H-NMR spectrum")

Formula

The formula is the formula of the ANOVA-GLM model used in this analysis.

UCH$formula

Design

The design matrix contains the information about each observation for the five variables: Hippurate, Citrate, Dilution, Day and Time. Only 3 of these variables are used in the model. The function table is useful to observe the design.

pander(head(UCH$design))

table(UCH$design$Hippurate,UCH$design$Citrate,UCH$design$Time)

Principal Component Analysis

The function SVDforPCA is useful to compute a PCA decomposition of the outcomes matrix. The scores and loadings can then be plotted with the function DrawScores and DrawLoadings.

ResPCA = SVDforPCA(UCH$outcomes)
eig.res = rbind(ResPCA$var[1:6], ResPCA$cumvar[1:6])
rownames(eig.res) = c("Variances", "Cum Var Values")
pander::pander(eig.res)

ggplot2::ggplot(data=as.data.frame(eig.res[1,]),ggplot2::aes(x=colnames(eig.res),y=eig.res[1,]))+
    ggplot2::geom_bar(stat="identity")+
    ggplot2::xlab("Principal Components")+
    ggplot2::ylab("Variance Percentage")
ScatterPlot(ResPCA$scores[,1],ResPCA$scores[,2])

DrawScores(ResPCA, type.obj = "PCA", drawNames = TRUE, createWindow = F, 
    main = "Reponse matrix score plot", color = UCH$design$Citrate, pch = UCH$design$Hippurate, 
    axes = c(1, 2), size = 2.5) + ggplot2::scale_color_discrete(name = "Citrate") + ggplot2::scale_shape_discrete(name = "Hippurate")
DrawLoadings(ResPCA,type.obj = "PCA", main="Loadings from the two first component")

GLM decomposition

The analysis consists in decomposing the model matrix into effect matrices and performing a PCA on each of the effect matrices. Here, we have 34 observations, 5 categorical explanatory variables, 600 response variables and 8 model terms.

The first step is to make the model matrix from the matrix of the experimental design. Each explanatory variavle is reencoded with multiple binary variables. The model matrix is a \emph(34xK) with K being the total number of new binary variables.

The function LMModelMatrix() encodes the design matrix as a model matrix.

ResLMModelMatrix = LMModelMatrix(as.formula(UCH$formula),UCH$design)
pander::pander(head(ResLMModelMatrix$ModelMatrix))

The model matrix can then be decompose into effect matrices for every model terms with the function LMEffectMatrices()

ResLMEffectMatrices = LMEffectMatrices(ResLMModelMatrix,UCH$outcomes)

Bootstrap test

We can use bootstrap to determine whether an effect is significant. We suggest the function LMBootstrapTest().

ResLMBootstrapTest= LMBootstrapTest(ResLMEffectMatrices = ResLMEffectMatrices,nboot=5)

# Print Pvalue

pander::pander(ResLMBootstrapTest$Pvalue)

ASCA/ASCA-E/APCA

These methods allow to represent the information from the effect matrices in a space of reduced dimensions. The function PCALMEffects() have a method argument to define which method to use.

ASCA

The ASCA method realises PCAs on the effect matrices.

ResASCA = PCALMEffects(ResLMEffectMatrices = ResLMEffectMatrices,method="ASCA")

The contributions from each effect and their component are estimated and reported in tables with the function PrintContributions(). Moreover the function also output a barplot with the ordered contributions.

ResPrintContributions = PrintContributions(ResASCA)

pander::pander(ResPrintContributions$EffectTable)
pander::pander(ResPrintContributions$ContribTable)

ResPrintContributions$Barplot

The score matrices are then represented on two components with the function PlotScoresXY().

PlotScoresXY(ResASCA,UCH$design,EffectVector = c("Hippurate","Citrate","Hippurate:Citrate"),varname.color = "Time",varname.pch = "Time")

Finally we can represent the scores with a matrix of plots. This graph allows to observe multiple variables simultaneously.

PlotScoresMatrix(ResASCA,
                 alleffect = FALSE,
                 EffectNames = c("Hippurate","Citrate","Time","Hippurate:Citrate","Hippurate:Time","Citrate:Time","Hippurate:Citrate:Time","Residuals"),
                 ModelAbbrev = TRUE,
                 PCdim=c(1,1,1,1,1,1,1,2),
                 design=UCH$design,
                 varname.colorup = "Citrate",
                 vec.colorup = c("blue","forestgreen","red"),
                 varname.colordown  = "Time",
                 vec.colordown = c("orange","black"),
                 varname.pchup="Hippurate",
                 vec.pchup = c(4,16,2),
                 varname.pchdown="Time",
                 vec.pchdown = c(1,3))

ASCA-E

The ASCA-E method realises PCAs on the effect matrices then add the residuals. The same functions are used.

ResASCAE = PCALMEffects(ResLMEffectMatrices = ResLMEffectMatrices,method="ASCA-E")
ResPrintContributions = PrintContributions(ResASCAE)

pander::pander(ResPrintContributions$EffectTable)
pander::pander(ResPrintContributions$ContribTable)

ResPrintContributions$Barplot
PlotScoresXY(ResASCAE,UCH$design,EffectVector = c("Hippurate","Hippurate:Citrate"),varname.color = "Time",varname.pch = "Time")
PlotScoresMatrix(ResASCAE,
                 alleffect = FALSE,
                 EffectNames = c("Hippurate","Citrate","Time","Hippurate:Citrate","Hippurate:Time","Citrate:Time","Hippurate:Citrate:Time","Residuals"),
                 ModelAbbrev = TRUE,
                 PCdim=c(1,1,1,1,1,1,1,2),
                 design=UCH$design,
                 varname.colorup = "Citrate",
                 vec.colorup = c("blue","forestgreen","red"),
                 varname.colordown  = "Time",
                 vec.colordown = c("orange","black"),
                 varname.pchup="Hippurate",
                 vec.pchup = c(4,16,2),
                 varname.pchdown="Time",
                 vec.pchdown = c(1,3))

APCA

The APCA method realises PCAs on the effect matrices augmented by the residuals. The same functions are used.

ResAPCA = PCALMEffects(ResLMEffectMatrices = ResLMEffectMatrices,method="APCA")
ResPrintContributions = PrintContributions(ResAPCA)

pander::pander(ResPrintContributions$EffectTable)
pander::pander(ResPrintContributions$ContribTable)

ResPrintContributions$Barplot
PlotScoresXY(ResAPCA,UCH$design,EffectVector = c("Hippurate","Citrate","Hippurate:Citrate"),varname.color = "Time",varname.pch = "Time")
PlotScoresMatrix(ResAPCA,
                 alleffect = FALSE,
                 EffectNames = c("Hippurate","Citrate","Time","Hippurate:Citrate","Hippurate:Time","Citrate:Time","Hippurate:Citrate:Time","Residuals"),
                 ModelAbbrev = TRUE,
                 PCdim=c(1,1,1,1,1,1,1,2),
                 design=UCH$design,
                 varname.colorup = "Citrate",
                 vec.colorup = c("blue","forestgreen","red"),
                 varname.colordown  = "Time",
                 vec.colordown = c("orange","black"),
                 varname.pchup="Hippurate",
                 vec.pchup = c(4,16,2),
                 varname.pchdown="Time",
                 vec.pchdown = c(1,3))

Session info

sessionInfo()

References



FranceschiniS/LMWiRe documentation built on Oct. 30, 2019, 6:20 p.m.