LinearModelFit: Linear models
In NormalizeMets: Analysis of Metabolomics Data


Fit a linear model to each metabolite in a metabolomics data matrix, and obtain the coefficients, 95% confidence intervals, and p-values. The featuredata must be log transformed, but does not have to be normalised a priori, as the LinearModelFit function can apply the ruv2 method to accommodate the unwanted variation in the model. Either ordinary statistics or empirical Bayes statistics can be obtained.
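As a rough illustration of what "a linear model per metabolite" means, the following base R sketch fits one model per column of a simulated log-scale matrix and collects the coefficient, confidence interval, and p-value for a single factor. It is not the package's implementation: the data, the `group` factor, and the helper `fit_one` are hypothetical, and only `lm`, `confint`, and `p.adjust` from base R are used.

```r
# Minimal sketch (assumed, simulated data): one linear model per metabolite.
set.seed(1)
n <- 12
group <- factor(rep(c("control", "case"), each = n / 2),
                levels = c("control", "case"))

# Hypothetical log-transformed data: samples in rows, metabolites in columns.
logdata <- matrix(rnorm(n * 3, mean = 10), nrow = n,
                  dimnames = list(paste0("S", seq_len(n)),
                                  c("M1", "M2", "M3")))
# Give metabolite M1 a true group difference of 2 on the log scale.
logdata[group == "case", "M1"] <- logdata[group == "case", "M1"] + 2

ci_alpha <- 0.05  # significance level, so the intervals have level 1 - ci_alpha

# Hypothetical helper: fit y ~ group and return statistics for the group effect.
fit_one <- function(y) {
  fit <- lm(y ~ group)
  est <- coef(summary(fit))["groupcase", ]
  ci  <- confint(fit, "groupcase", level = 1 - ci_alpha)
  c(coef  = est[["Estimate"]],
    lower = ci[[1, 1]],
    upper = ci[[1, 2]],
    t     = est[["t value"]],
    p     = est[["Pr(>|t|)"]])
}

results <- as.data.frame(t(apply(logdata, 2, fit_one)))

# Adjust the p-values across metabolites (Benjamini-Hochberg).
results$adj.p <- p.adjust(results$p, method = "BH")
print(round(results, 4))
```

Each row of `results` corresponds to one metabolite, which mirrors the per-metabolite layout of the statistics described on this page.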

LinearModelFit(featuredata, factormat = NULL, contrastmat = NULL,
  ruv2 = TRUE, k = NULL, qcmets = NULL, moderated = FALSE,
  padjmethod = "BH", ci_alpha = 0.05, saveoutput = FALSE,
  outputname = "results", ...)

`featuredata`	A data frame in the featuredata format: metabolites in columns and samples in rows. Unique sample names should be provided as row names.
`factormat`	A design matrix for the linear model, consisting of biological factors of interest.
`contrastmat`	An optional contrast matrix indicating which contrasts need to be tested to answer the biological question of interest.
`ruv2`	A logical indicating whether to use the `ruv2` method for removing unwanted variation.
`k`	If `ruv2=TRUE`, the number of unwanted variation factors to be included in the model.
`qcmets`	If `ruv2=TRUE`, a vector indicating which metabolites should be used as internal standards, external standards, or other quality control metabolites.
`moderated`	A logical indicating whether moderated statistics should be computed.
`padjmethod`	A character string specifying p value adjustment method for multiple comparisons. Must be one of "`bonferroni`", "`holm`" (Holm 1979), "`hochberg`" (Hochberg 1988), "`hommel`" (Hommel 1988), "`BH`" (Benjamini and Hochberg 1995), "`BY`" (Benjamini and Yekutieli 2001), or "`none`". The default method is set to "`BH`".
`ci_alpha`	Significance level for the confidence intervals.
`saveoutput`	A logical indicating whether the normalised data matrix should be saved as a csv file.
`outputname`	The name of the output file if the output has to be saved.
`...`	further arguments to be passed to or from methods.
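To make the roles of `factormat` and `contrastmat` more concrete, here is a small base R sketch. It assumes the limma-style convention in which the rows of the contrast matrix correspond to the columns of the design matrix and each column defines one contrast; the three-level `group` factor, the simulated response `y`, and the contrast names are all hypothetical.

```r
# Minimal sketch (assumed convention: contrast rows match design columns).
set.seed(2)
group <- factor(rep(c("A", "B", "C"), each = 4))

# Design matrix without an intercept: one column per group mean.
factormat <- model.matrix(~ 0 + group)
colnames(factormat) <- levels(group)

# Contrast matrix: each column is a comparison of interest.
contrastmat <- cbind("B-A" = c(-1, 1, 0),
                     "C-A" = c(-1, 0, 1))
rownames(contrastmat) <- colnames(factormat)

# Hypothetical log-abundances for a single metabolite.
y <- rnorm(12, mean = c(10, 11, 13)[group])

# Group means from the linear model, then the contrast estimates.
beta <- lm.fit(factormat, y)$coefficients
contrast_est <- drop(t(contrastmat) %*% beta)
print(contrast_est)
```

With this parameterisation each contrast estimate is simply a difference of fitted group means, which is often easier to interpret than coefficients from a design with an intercept.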

The result is an object of class MArrayLM, containing F statistics, t statistics, corresponding confidence intervals, and adjusted and unadjusted p-values (see De Livera et al., 2012a, 2012b). If `moderated=TRUE`, moderated statistics will be computed by empirical Bayes shrinkage of the standard errors towards a common value (Loennstedt and Speed, 2002; Smyth, 2004).
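The shrinkage behind the moderated statistics can be written compactly, following Smyth (2004). The notation here is introduced only for illustration: $s_g^2$ is the residual variance for metabolite $g$ on $d_g$ degrees of freedom, $s_0^2$ and $d_0$ are the prior variance and prior degrees of freedom estimated from all metabolites, $\hat{\beta}_{gj}$ is the estimated coefficient $j$, and $v_{gj}$ is its unscaled variance.

```latex
\tilde{s}_g^{2} = \frac{d_0 s_0^{2} + d_g s_g^{2}}{d_0 + d_g},
\qquad
\tilde{t}_{gj} = \frac{\hat{\beta}_{gj}}{\tilde{s}_g \sqrt{v_{gj}}}
```

Under the null hypothesis the moderated statistic $\tilde{t}_{gj}$ follows a t distribution with $d_0 + d_g$ degrees of freedom, compared with $d_g$ for the ordinary t statistic.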

Alysha M De Livera, Gavriel Olshansky

Benjamini, Y., Hochberg, Y. (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society. Series B (Methodological) 57(1): 289-300.

Benjamini, Y., Yekutieli, D. (2001) The Control of the False Discovery Rate in Multiple Testing under Dependency. The Annals of Statistics 29(4): 1165-1188.

De Livera, A. M., Dias, D. A., De Souza, D., Rupasinghe, T., Pyke, J., Tull, D., Roessner, U., McConville, M., Speed, T. P. (2012a) Normalising and integrating metabolomics data. Analytical Chemistry 84(24): 10768-10776.

De Livera, A. M., Sysi-Aho, M., Jacob, L., Gagnon-Bartsch, J. A., Castillo, S., Simpson, J. A., Speed, T. P. (2015) Statistical methods for handling unwanted variation in metabolomics data. Analytical Chemistry 87(7): 3606-3615.

De Livera, A.M., Olshansky, M., Speed, T. P. (2013) Statistical analysis of metabolomics data. Methods in Molecular Biology (Clifton, N.J.) 1055: 291-307.

Gagnon-Bartsch, Johann A., Speed, T. P. (2012) Using control genes to correct for unwanted variation in microarray data. Biostatistics 13(3): 539-552.

Hochberg, Y. (1988) A sharper Bonferroni procedure for multiple tests of significance. Biometrika 75(4): 800-802.

Holm, S. (1979) A simple sequentially rejective multiple test procedure. Scandinavian Journal of Statistics 6(2): 65-70.

Hommel, G. (1988) A stagewise rejective multiple test procedure based on a modified Bonferroni test. Biometrika 75(2): 383-386.

Loennstedt, I., Speed, T. P. (2002) Replicated microarray data. Statistica Sinica 12: 31-46.

Smyth, G. K. (2004). Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Statistical Applications in Genetics and Molecular Biology 3(1): 3.

eBayes, ContrastMatrix

data("alldata_eg")
featuredata_eg<-alldata_eg$featuredata
dataview(featuredata_eg)
sampledata_eg<-alldata_eg$sampledata
dataview(sampledata_eg)
metabolitedata_eg<-alldata_eg$metabolitedata
dataview(metabolitedata_eg)

logdata <- LogTransform(featuredata_eg)
dataview(logdata$featuredata)
imp <-  MissingValues(logdata$featuredata,sampledata_eg,metabolitedata_eg,
                     feature.cutoff=0.8, sample.cutoff=0.8, method="knn")
dataview(imp$featuredata)

#Linear model fit using unadjusted data
factormat<-model.matrix(~gender +Age +bmi, sampledata_eg)
unadjustedFit<-LinearModelFit(featuredata=imp$featuredata,
                             factormat=factormat,
                             ruv2=FALSE)
unadjustedFit

#Linear model fit using "is" normalized data
Norm_is <-NormQcmets(imp$featuredata, method = "is", 
                    isvec = imp$featuredata[,which(metabolitedata_eg$IS ==1)[1]])
isFit<-LinearModelFit(featuredata=Norm_is$featuredata,
                     factormat=factormat,
                     ruv2=FALSE)
isFit

#Linear model fit with ruv-2 normalization
ruv2Fit<-LinearModelFit(featuredata=imp$featuredata,
                       factormat=factormat,
                       ruv2=TRUE,k=2,
                       qcmets = which(metabolitedata_eg$IS ==1))
ruv2Fit

#Linear model fit with ruv-2 normalization, obtaining moderated statistics
ruv2FitMod<-LinearModelFit(featuredata=imp$featuredata,
                          factormat=factormat,
                          ruv2=TRUE,k=2,moderated = TRUE,
                          qcmets = which(metabolitedata_eg$IS ==1))
ruv2FitMod
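The idea behind the `ruv2=TRUE` fits can be sketched in base R, following the description of RUV-2 in Gagnon-Bartsch and Speed (2012): estimate `k` unwanted factors from the quality control metabolites by a singular value decomposition, then include those factors as extra columns in the design matrix. This is an assumed, simplified illustration on simulated data, not the code used inside LinearModelFit; all object names below are hypothetical.

```r
# Minimal sketch of the RUV-2 idea on simulated data (not the package internals).
set.seed(3)
n <- 20   # samples
m <- 30   # metabolites
group <- rep(c(0, 1), each = n / 2)
X <- cbind(intercept = 1, group = group)

# One hidden unwanted factor (for example, batch drift) with per-metabolite loadings.
w_true   <- rnorm(n)
loadings <- rnorm(m, mean = 1)
Y <- outer(w_true, loadings) + matrix(rnorm(n * m, sd = 0.3), n, m)

# Metabolites 1-5 truly differ between groups; 21-30 act as quality controls
# that are assumed to be unaffected by the factor of interest.
Y[, 1:5] <- Y[, 1:5] + outer(group, rep(1.5, 5))
qcmets <- 21:30
k <- 1

# Step 1: estimate k unwanted factors from the control metabolites only.
Yc <- scale(Y[, qcmets], center = TRUE, scale = FALSE)
W  <- svd(Yc)$u[, seq_len(k), drop = FALSE]
colnames(W) <- paste0("W", seq_len(k))

# Step 2: fit every metabolite with and without the estimated factors.
beta_naive <- lm.fit(X, Y)$coefficients["group", ]
beta_ruv2  <- lm.fit(cbind(X, W), Y)$coefficients["group", ]

# The estimated factor should track the hidden one (up to sign).
print(abs(cor(W[, 1], w_true)))
```

Comparing `beta_naive` and `beta_ruv2` for the metabolites with no true group effect gives a feel for how much of the apparent signal was driven by the unwanted factor.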