require(knitr) require(pander) knitr::opts_chunk$set(message=FALSE,warning=FALSE)
The main aim of this vignette is to give a brief example of the LMWiRe package. This package has been created to analyze models with wide response and categorical parameters. The model used in this example is a three-way ANOVA with fixed effects. This document presents all the usual step of the analysis starting from the data importation to the results visualisation. The methods used come from the article of M. Thiel @Thiel2017 and S. Guisset @Guisset2019.
The package is actually in its developping stage and is available on GitHub at the url : https://github.com/FranceschiniS/LMWiRe. It can be installed via the devtools::install_github()
function. The package needs to be loaded afterwards.
# devtools::install_github("FranceschiniS/LMWiRe") library("LMWiRe")
Before any analysis the UCH data
needs to be loaded. The LMWiRe package contains the dataset and it can be load with the data()
function. The load()
function is also useful to import your own data.
data("UCH")
The UCH dataset stands for Urine-Citrate-Hippurate and comes from \textit{Rousseau et al.} It contains 3 elements : a matrix outcomes
with 34 observations of 600 response variables representing the spectra from the H-NMR spectroscopy, a formula for the GLM model used and design
matrix with 34 observations and 5 explanatory variables.
The 600 response variables are the spectrum of the H-NMR for each observations.
LinePlot(UCH$outcomes[1:2,],main="H-NMR spectrum")
The formula is the formula of the ANOVA-GLM model used in this analysis.
UCH$formula
The design matrix contains the information about each observation for the five variables: Hippurate, Citrate, Dilution, Day and Time. Only 3 of these variables are used in the model. The function table
is useful to observe the design.
pander(head(UCH$design)) table(UCH$design$Hippurate,UCH$design$Citrate,UCH$design$Time)
The function SVDforPCA
is useful to compute a PCA decomposition of the outcomes
matrix. The scores
and loadings
can then be plotted with the function DrawScores
and DrawLoadings
.
ResPCA = SVDforPCA(UCH$outcomes)
eig.res = rbind(ResPCA$var[1:6], ResPCA$cumvar[1:6]) rownames(eig.res) = c("Variances", "Cum Var Values") pander::pander(eig.res) ggplot2::ggplot(data=as.data.frame(eig.res[1,]),ggplot2::aes(x=colnames(eig.res),y=eig.res[1,]))+ ggplot2::geom_bar(stat="identity")+ ggplot2::xlab("Principal Components")+ ggplot2::ylab("Variance Percentage")
ScatterPlot(ResPCA$scores[,1],ResPCA$scores[,2]) DrawScores(ResPCA, type.obj = "PCA", drawNames = TRUE, createWindow = F, main = "Reponse matrix score plot", color = UCH$design$Citrate, pch = UCH$design$Hippurate, axes = c(1, 2), size = 2.5) + ggplot2::scale_color_discrete(name = "Citrate") + ggplot2::scale_shape_discrete(name = "Hippurate")
DrawLoadings(ResPCA,type.obj = "PCA", main="Loadings from the two first component")
The analysis consists in decomposing the model matrix into effect matrices and performing a PCA on each of the effect matrices. Here, we have 34 observations, 5 categorical explanatory variables, 600 response variables and 8 model terms.
The first step is to make the model matrix from the matrix of the experimental design. Each explanatory variavle is reencoded with multiple binary variables. The model matrix is a \emph(34xK) with K being the total number of new binary variables.
The function LMModelMatrix()
encodes the design matrix as a model matrix.
ResLMModelMatrix = LMModelMatrix(as.formula(UCH$formula),UCH$design) pander::pander(head(ResLMModelMatrix$ModelMatrix))
The model matrix can then be decompose into effect matrices for every model terms with the function LMEffectMatrices()
ResLMEffectMatrices = LMEffectMatrices(ResLMModelMatrix,UCH$outcomes)
We can use bootstrap to determine whether an effect is significant. We suggest the function LMBootstrapTest()
.
ResLMBootstrapTest= LMBootstrapTest(ResLMEffectMatrices = ResLMEffectMatrices,nboot=5) # Print Pvalue pander::pander(ResLMBootstrapTest$Pvalue)
These methods allow to represent the information from the effect matrices in a space of reduced dimensions. The function PCALMEffects()
have a method argument to define which method to use.
The ASCA method realises PCAs on the effect matrices.
ResASCA = PCALMEffects(ResLMEffectMatrices = ResLMEffectMatrices,method="ASCA")
The contributions from each effect and their component are estimated and reported in tables with the function PrintContributions()
. Moreover the function also output a barplot with the ordered contributions.
ResPrintContributions = PrintContributions(ResASCA) pander::pander(ResPrintContributions$EffectTable) pander::pander(ResPrintContributions$ContribTable) ResPrintContributions$Barplot
The score matrices are then represented on two components with the function PlotScoresXY()
.
PlotScoresXY(ResASCA,UCH$design,EffectVector = c("Hippurate","Citrate","Hippurate:Citrate"),varname.color = "Time",varname.pch = "Time")
Finally we can represent the scores with a matrix of plots. This graph allows to observe multiple variables simultaneously.
PlotScoresMatrix(ResASCA, alleffect = FALSE, EffectNames = c("Hippurate","Citrate","Time","Hippurate:Citrate","Hippurate:Time","Citrate:Time","Hippurate:Citrate:Time","Residuals"), ModelAbbrev = TRUE, PCdim=c(1,1,1,1,1,1,1,2), design=UCH$design, varname.colorup = "Citrate", vec.colorup = c("blue","forestgreen","red"), varname.colordown = "Time", vec.colordown = c("orange","black"), varname.pchup="Hippurate", vec.pchup = c(4,16,2), varname.pchdown="Time", vec.pchdown = c(1,3))
The ASCA-E method realises PCAs on the effect matrices then add the residuals. The same functions are used.
ResASCAE = PCALMEffects(ResLMEffectMatrices = ResLMEffectMatrices,method="ASCA-E")
ResPrintContributions = PrintContributions(ResASCAE) pander::pander(ResPrintContributions$EffectTable) pander::pander(ResPrintContributions$ContribTable) ResPrintContributions$Barplot
PlotScoresXY(ResASCAE,UCH$design,EffectVector = c("Hippurate","Hippurate:Citrate"),varname.color = "Time",varname.pch = "Time")
PlotScoresMatrix(ResASCAE, alleffect = FALSE, EffectNames = c("Hippurate","Citrate","Time","Hippurate:Citrate","Hippurate:Time","Citrate:Time","Hippurate:Citrate:Time","Residuals"), ModelAbbrev = TRUE, PCdim=c(1,1,1,1,1,1,1,2), design=UCH$design, varname.colorup = "Citrate", vec.colorup = c("blue","forestgreen","red"), varname.colordown = "Time", vec.colordown = c("orange","black"), varname.pchup="Hippurate", vec.pchup = c(4,16,2), varname.pchdown="Time", vec.pchdown = c(1,3))
The APCA method realises PCAs on the effect matrices augmented by the residuals. The same functions are used.
ResAPCA = PCALMEffects(ResLMEffectMatrices = ResLMEffectMatrices,method="APCA")
ResPrintContributions = PrintContributions(ResAPCA) pander::pander(ResPrintContributions$EffectTable) pander::pander(ResPrintContributions$ContribTable) ResPrintContributions$Barplot
PlotScoresXY(ResAPCA,UCH$design,EffectVector = c("Hippurate","Citrate","Hippurate:Citrate"),varname.color = "Time",varname.pch = "Time")
PlotScoresMatrix(ResAPCA, alleffect = FALSE, EffectNames = c("Hippurate","Citrate","Time","Hippurate:Citrate","Hippurate:Time","Citrate:Time","Hippurate:Citrate:Time","Residuals"), ModelAbbrev = TRUE, PCdim=c(1,1,1,1,1,1,1,2), design=UCH$design, varname.colorup = "Citrate", vec.colorup = c("blue","forestgreen","red"), varname.colordown = "Time", vec.colordown = c("orange","black"), varname.pchup="Hippurate", vec.pchup = c(4,16,2), varname.pchdown="Time", vec.pchdown = c(1,3))
sessionInfo()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.