NormQcmets: Normalisation methods based on quality control metabolites
In NormalizeMets: Analysis of Metabolomics Data


Normalise a metabolomic data matrix using internal standards, external standards and other quality control metabolites.

NormQcmets(featuredata, factors = NULL, factormat = NULL, method = c("is",
  "nomis", "ccmn", "ruv2", "ruvrand", "ruvrandclust"), isvec = NULL,
  ncomp = NULL, k = NULL, plotk = TRUE, lambda = NULL, qcmets = NULL,
  maxIter = 200, nUpdate = 100, lambdaUpdate = TRUE, p = 2,
  saveoutput = FALSE, outputname = NULL, ...)

`featuredata`	A data frame in the featuredata format: metabolites in columns and samples in rows. Unique sample names should be provided as row names.
`factors`	For the ccmn method. A vector or a dataframe containing biological factors.
`factormat`	For the ruv2 method. A design matrix for the linear model, consisting of biological factors of interest.
`method`	A character string indicating the required normalization method. Must be one of "`is`", "`nomis`", "`ccmn`", "`ruv2`", "`ruvrand`" or "`ruvrandclust`". See Details for information.
`isvec`	A vector of internal standards to be used with the method "`is`".
`ncomp`	Number of PCA components to be used for the "`ccmn`" method. If `NULL`, this will be determined by cross validation as described by Redestig (2012).
`k`	Number of factors of unwanted variation to be included in the "`ruv`" models.
`plotk`	For the "`ruvrand`" method. A logical indicating whether a bar graph of the proportion of variance explained by the factors of unwanted variation should be plotted.
`lambda`	The regularization parameter for the "`ruvrand`" method, which depends on `k`. If not supplied, it will be estimated. See De Livera et al. (2015) for details.
`qcmets`	A vector indicating which metabolites should be used as the internal, external standards or other quality control metabolites in the "`ruv`" models, or as multiple internal standards in the "`ccmn`" and "`nomis`" methods.
`maxIter`	For the "`ruvrandclust`" method. Maximum number of iterations.
`nUpdate`	For the "`ruvrandclust`" method. Update the unwanted variation component every nUpdate iterations.
`lambdaUpdate`	For the "`ruvrandclust`" method. A logical indicating whether the regularization parameter needs to be updated.
`p`	For the "`ruvrandclust`" method. The number of clusters to be used in the k-means clustering.
`saveoutput`	A logical indicating whether the normalised data matrix should be saved as a .csv file.
`outputname`	The name of the output file if the output has to be saved.
`...`	Other arguments to be passed onto `LinearModelFit`.
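To make the argument descriptions concrete, the following is a minimal sketch of inputs in the shape described above, built in base R only. All names and values here (`IS_1`, `Met_A`, `metabolite_type`, `group`) are made up for illustration and are not part of the package; whether `factormat` should carry an intercept column is not stated on this page, so the `model.matrix` call is an assumption.

```r
# Toy featuredata: samples in rows, metabolites in columns, unique row names.
# Values are illustrative log abundances, not real measurements.
featuredata <- data.frame(
  IS_1  = c(10.1, 10.4,  9.8, 10.2),
  Met_A = c(12.3, 12.9, 11.7, 12.5),
  Met_B = c( 8.4,  8.8,  9.9, 10.3),
  row.names = c("sample_1", "sample_2", "sample_3", "sample_4")
)

# Hypothetical per-metabolite annotation used to locate the QC metabolites
metabolite_type <- c("IS", "Met", "Met")
qcmets <- which(metabolite_type == "IS")   # column indices of QC metabolites

# A biological factor of interest, as a vector (for `factors`) and as a
# design matrix (for `factormat`); the intercept choice is an assumption.
group <- factor(c("A", "A", "B", "B"))
factormat <- model.matrix(~ group)
```

Here `qcmets` is a vector of column indices, which matches how the Examples on this page build it with `which()`.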

The available methods are: "is", which uses a single standard; "ccmn", Cross-contribution Compensating Multiple internal standard Normalisation (Redestig et al., 2009); "nomis", normalisation using optimal selection of multiple internal standards (Sysi-Aho et al., 2007); "ruv2" (De Livera et al., 2012a); and "ruvrand" and "ruvrandclust" (De Livera et al., 2015).
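As an illustration of the idea behind single-standard normalisation, here is a minimal base R sketch. It assumes log-transformed data, so that dividing by the standard on the raw scale becomes a subtraction; it is not the package's implementation, and the toy matrix and the `shift` vector are made up for the example.

```r
# Toy log-scale data: 4 samples (rows) x 3 metabolites (columns)
set.seed(1)
toy <- matrix(rnorm(12, mean = 10), nrow = 4,
              dimnames = list(paste0("S", 1:4), c("IS1", "M1", "M2")))

# A sample-specific shift mimicking unwanted variation that affects
# every metabolite in a sample equally (illustrative assumption)
shift <- c(0.5, -0.3, 0.2, -0.4)
observed <- toy + shift            # shift[i] is added to every entry of row i

# Subtract the internal standard's log abundance from each sample (row)
is_vec <- observed[, "IS1"]
normalised <- sweep(observed, 1, is_vec, FUN = "-")
```

Because the shift is common to all metabolites within a sample, it cancels in the subtraction; unwanted variation that affects metabolites differently would not be removed by a single standard, which is the motivation for the multiple-standard and RUV methods.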

An overview of these normalisation methods is given by De Livera et al. (2015).
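The roles of `k`, `lambda` and `qcmets` in the RUV-random approach can be sketched in base R as follows. This is a simplified illustration on simulated data, assuming the unwanted factors are estimated from a singular value decomposition of the QC metabolite columns and their coefficients by a ridge-type estimate; it is not the package's code, and all object names (`W_hat`, `alpha_hat`, `uv_comp`) are hypothetical.

```r
set.seed(42)
n <- 20; m <- 30      # samples, metabolites
k <- 2                # number of unwanted factors (assumed known here)
lambda <- 0.1         # ridge penalty, fixed by hand for illustration

# Simulate data driven only by unwanted variation plus noise
W_true <- matrix(rnorm(n * k), n, k)
alpha_true <- matrix(rnorm(k * m), k, m)
Y <- W_true %*% alpha_true + matrix(rnorm(n * m, sd = 0.1), n, m)
qc_idx <- 1:5         # pretend the first five columns are QC metabolites

# Centre each metabolite, then estimate unwanted factors from QC columns
Yc <- scale(Y, center = TRUE, scale = FALSE)
sv <- svd(Yc[, qc_idx, drop = FALSE])
W_hat <- sv$u[, 1:k, drop = FALSE] %*% diag(sv$d[1:k], k)

# Proportion of variance in the QC metabolites explained by each factor
prop_var <- sv$d^2 / sum(sv$d^2)

# Ridge-type estimate of each metabolite's loading on the unwanted factors
alpha_hat <- solve(crossprod(W_hat) + lambda * diag(k), crossprod(W_hat, Yc))

# Unwanted variation component and the normalised data
uv_comp <- W_hat %*% alpha_hat
Y_norm <- Yc - uv_comp
```

A bar graph of `prop_var` (for example with `barplot(prop_var)`) gives the kind of display that the `plotk` argument refers to, and can help when choosing `k`.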

If the method is "ruv2", the function returns an object of class MArrayLM, containing F statistics, t statistics, corresponding confidence intervals, and adjusted and unadjusted p-values. See LinearModelFit. For all other methods, the result is an object of class alldata. Additionally, the list contains the removed unwanted variation component (UVcomp), and the results from the optimization algorithm (opt) for the "ruvrandclust" method. See also normFit.

Alysha M De Livera, Gavriel Olshansky

De Livera, A. M., Sysi-Aho, M., Jacob, L., Gagnon-Bartsch, J. A., Castillo, S., Simpson, J. A., Speed, T. P. (2015) Statistical methods for handling unwanted variation in metabolomics data. Analytical Chemistry 87(7): 3606-3615.

De Livera, A. M., Dias, D. A., De Souza, D., Rupasinghe, T., Pyke, J., Tull, D., Roessner, U., McConville, M., Speed, T. P. (2012a) Normalizing and integrating metabolomics data. Analytical Chemistry 84(24): 10768-10776.

De Livera, A.M., Olshansky, M., Speed, T. P. (2013) Statistical analysis of metabolomics data. Methods in Molecular Biology In press.

Gagnon-Bartsch, J. A., Speed, T. P. (2012) Using control genes to correct for unwanted variation in microarray data. Biostatistics 13(3): 539-552.

Redestig, H., Fukushima, A., Stenlund, H., Moritz, T., Arita, M., Saito, K., Kusano, M. (2009) Compensation for systematic cross-contribution improves normalization of mass spectrometry based metabolomics data. Analytical Chemistry 81(19): 7974-7980.

Sysi-Aho, M., Katajamaa, M., Yetukuri, L., Oresic, M. (2007) Normalization method for metabolomics data using optimal selection of multiple internal standards. BMC Bioinformatics 8(1): 93.

    ## Reading the data
    data(mixdata)
    featuredata <- mixdata$featuredata
    sampledata <- mixdata$sampledata
    metabolitedata <- mixdata$metabolitedata
    # Abundances of the first internal standard, used by the "is" method
    isvec <- featuredata[, which(metabolitedata$type == "IS")[1]]
    factors <- sampledata$type
    # Column indices of all internal standards, used as QC metabolites
    qcmets <- which(metabolitedata$type == "IS")

    ## Normalise by an internal or an external standard
    norm_is <- NormQcmets(featuredata, method = "is", isvec = isvec)
    PcaPlots(norm_is$featuredata, factors)

    ## Normalise by the NOMIS method
    norm_nomis <- NormQcmets(featuredata, method = "nomis", qcmets = qcmets)
    PcaPlots(norm_nomis$featuredata, factors)

    ## Normalise by the CCMN method
    norm_ccmn <- NormQcmets(featuredata, factors, method = "ccmn", qcmets = qcmets, ncomp = 2)
    PcaPlots(norm_ccmn$featuredata, factors)

    ## Normalise using the RUV-random method
    norm_ruvrand <- NormQcmets(featuredata, method = "ruvrand", qcmets = qcmets, k = 2)
    PcaPlots(norm_ruvrand$featuredata, factors)
    PcaPlots(norm_ruvrand$uvdata, sampledata$batch, main = "Unwanted batch variation")

    ## Normalise using the RUV-random clustering method
    ## Not run
    # norm_ruvrandclust <- NormQcmets(featuredata, method = "ruvrandclust", qcmets = qcmets, k = 2)
    # PcaPlots(norm_ruvrandclust$featuredata, factors)
    # PcaPlots(norm_ruvrandclust$uvdata, sampledata$batch, main = "Unwanted batch variation")