```r
knitr::opts_chunk$set(echo = TRUE)
library('EVALFQ')
set.seed(135)
```

## Introduction

EVALFQ provides an open-access online service enabling (1) label-free proteome quantification (LFQ) based on three quantification measurements (SWATH-MS, Peak Intensity and Spectral Counting), (2) the evaluation of LFQ performance from multiple perspectives, and (3) the identification of the optimal LFQ workflows based on a comprehensive performance ranking. EVALFQ mainly includes two functions, lfq_access and lfq_spiked, which not only automatically detect the diverse formats of data generated by quantification software, but also provide the most complete set of processing methods among available tools, including methods for transformation, pretreatment (centering, scaling and normalization) and missing value imputation.

This tutorial walks the reader through an example analysis (see 'Examples' below).

## Installation

```r
# Download the source package EVALFQ_0.1.0.tar.gz and install it from the local file.
install.packages("EVALFQ_0.1.0.tar.gz", repos = NULL, type = "source")

# Alternatively, EVALFQ can be installed from GitHub:
# install.packages("devtools")
devtools::install_github("idrblab/EVALFQ")
library(EVALFQ)

# EVALFQ depends on several packages, which can be installed using the commands below:
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("Biobase")
BiocManager::install("BiocGenerics")
BiocManager::install("ROTS")
BiocManager::install("limma")
BiocManager::install("ProteoMM")
BiocManager::install("impute")
BiocManager::install("pcaMethods")
BiocManager::install("vsn")
BiocManager::install("affy")
devtools::install_github("cran/metabolomics")
```
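Before loading EVALFQ, it can help to confirm that the dependencies listed above are actually present. The following is a minimal sketch using only base R; the names `deps`, `is_installed` and `missing_deps` are illustrative and not part of EVALFQ.

```r
# Dependencies named in the installation commands above.
deps <- c("Biobase", "BiocGenerics", "ROTS", "limma", "ProteoMM",
          "impute", "pcaMethods", "vsn", "affy", "metabolomics")

# system.file() returns "" when a package is not installed, so nzchar()
# is TRUE for installed packages; nothing is loaded or attached.
is_installed <- vapply(deps,
                       function(p) nzchar(system.file(package = p)),
                       logical(1))

missing_deps <- names(is_installed)[!is_installed]

if (length(missing_deps) > 0) {
  message("Missing packages: ", paste(missing_deps, collapse = ", "))
} else {
  message("All listed dependencies are installed.")
}
```

Any package reported as missing can then be installed with the corresponding command from the list above.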

## Usage

```r
library(EVALFQ)
```

#### 1. Prepare input file for evaluating label-free proteome quantification.

```r
my_data <- PrepareInuputFiles(acquisitionmethods, rawdataset, lable)
```

`acquisitionmethods` 
Input the number corresponding to the acquisition technique, as follows: <br>
If set 1, the user chooses to process the data based on SWATH-MS. <br>
If set 2, the user chooses to process the data based on Peak Intensity. <br>
If set 3, the user chooses to process the data based on Spectral Counting.

`rawdataset` 
Input the name of the raw dataset file obtained directly from the quantification software.<br>
EVALFQ supports a variety of data generated by 18 kinds of popular quantification software. <br>
An example file in the format of each software can be downloaded below (<b>Right Click to Save</b>). <br>
(1) A list of software for pre-processing the data acquired based on SWATH-MS.<br>
<a href='https://idrblab.org/evalfq/download/SWATH_MS/DIAumpire_ProteinSummary.csv'>DIA-UMPIRE</a>; <a href='https://idrblab.org/evalfq/download/SWATH_MS/example_OpenSWATH.csv'>OpenSWATH</a>; <a href='https://idrblab.org/evalfq/download/SWATH_MS/ProtSummary_201604130950.csv'>PeakView</a>; <a href='https://idrblab.org/evalfq/download/SWATH_MS/Skyline_HYE124_TTOF6600_32fix_it1_IntLibFixed1603.tsv'>Skyline</a>; <a href='https://idrblab.org/evalfq/download/SWATH_MS/example_Spectronaut.tsv'>Spectronaut</a> <br>
(2) A list of software for pre-processing the data acquired based on Peak Intensity.<br>
<a href='https://idrblab.org/evalfq/download/Peak_Intensity/MaxQuant_proteinGroups_LFQ.txt'>MaxQuant</a>; <a href='https://idrblab.org/evalfq/download/Peak_Intensity/example_MFPaQ_Peak_Intensity.csv'>MFPaQ</a>; <a href='https://idrblab.org/evalfq/download/Peak_Intensity/example_OpenMS.csv'>OpenMS</a>; <a href='https://idrblab.org/evalfq/download/Peak_Intensity/proteinGroups_Peak_Intensity.txt'>PEAKS</a>; <a href='https://idrblab.org/evalfq/download/Peak_Intensity/Progenesis_guide_Output.csv'>Progenesis</a>; <a href='https://idrblab.org/evalfq/download/Peak_Intensity/example_Proteios_SE.csv'>Proteios SE</a>; <a href='https://idrblab.org/evalfq/download/Peak_Intensity/scaffold-label-free_Peak_Intensity.csv'>Scaffold</a>; <a href='https://idrblab.org/evalfq/download/Peak_Intensity/example_Proteome_Discoverer.csv'>Thermo Proteome Discoverer</a> <br>
(3) A list of software for pre-processing the data acquired based on Spectral Counting.<br>
<a href='https://idrblab.org/evalfq/download/Peak_Intensity/abacus_data_output.csv'>Abacus</a>; <a href='https://idrblab.org/evalfq/download/Peak_Intensity/example_Census.txt'>Census</a>; <a href='https://idrblab.org/evalfq/download/Peak_Intensity/example_DTASelect.csv'>DTASelect</a>; <a href='https://idrblab.org/evalfq/download/Peak_Intensity/example_IRMa-hEIDI.csv'>IRMa-hEIDI</a>; <a href='https://idrblab.org/evalfq/download/Peak_Intensity/MaxQuant_proteinGroups_SC.txt'>MaxQuant</a>; <a href='https://idrblab.org/evalfq/download/Peak_Intensity/example_MFPaQ_Spectral_Counting.csv'>MFPaQ</a>; <a href='https://idrblab.org/evalfq/download/Peak_Intensity/example_ProteinProphet.csv'>ProteinProphet</a>; <a href='https://idrblab.org/evalfq/download/Peak_Intensity/scaffold-label-free_Spectral_Counting.csv'>Scaffold</a> <br>

`lable` 
Input the name of the label file of your dataset.
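The numeric codes for `acquisitionmethods` are easy to mix up, so a small lookup can make calling code more readable. This is a minimal sketch; `acquisition_codes` and `acquisition_code()` are hypothetical names, not part of EVALFQ. The codes are kept as character strings because the example call in this document passes `acquisitionmethods = "2"`.

```r
# Codes documented above for the `acquisitionmethods` argument.
acquisition_codes <- c("SWATH-MS" = "1",
                       "Peak Intensity" = "2",
                       "Spectral Counting" = "3")

# Hypothetical helper (not part of EVALFQ): return the code for a technique,
# stopping with an error if the name is not one of the three documented ones.
acquisition_code <- function(technique) {
  if (length(technique) != 1 || !technique %in% names(acquisition_codes)) {
    stop("Unknown acquisition technique; expected one of: ",
         paste(names(acquisition_codes), collapse = ", "))
  }
  unname(acquisition_codes[technique])
}

acquisition_code("Peak Intensity")
```

The returned value could then be passed as the `acquisitionmethods` argument of `PrepareInuputFiles`.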

#### 2. Conduct LFQ and assess performance of all possible LFQ workflows.

```r
allranks <- lfqevalueall(data_q,
                         assum_a = "Y",
                         assum_b = "Y",
                         assum_c = "Y",
                         Ca = "1",
                         Cb = "1",
                         Cc = "1",
                         Cd = "1")
```

`data_q` 
All columns of this input should be numeric except the first and second, which contain the names and labels (control or case) of the studied samples, respectively. The intensity data should be arranged with samples in rows and proteins/peptides in columns. Missing values (NA) of protein intensity are allowed.

`assum_a` 
All proteins were assumed to be equally important.<br>
Input the letter "Y" to indicate that this assumption holds for the studied dataset, or the letter "N" to indicate that it does not.

`assum_b` 
The level of protein abundance was assumed to be constant among all samples.<br>
Input the letter "Y" to indicate that this assumption holds for the studied dataset, or the letter "N" to indicate that it does not.

`assum_c` 
The intensities of the vast majority of the proteins were assumed to be unchanged under the studied conditions.<br>
Input the letter "Y" to indicate that this assumption holds for the studied dataset, or the letter "N" to indicate that it does not.

`Ca` 
Criterion (a): precision of LFQ based on the proteomes among replicates<sup>1</sup>.<br>
If set 1, the user chooses to assess LFQ workflows using Criterion (a).<br>
If set 0, the user excludes Criterion (a) from performance assessment.<br>
The default setting of this value is "1".

`Cb` 
Criterion (b): classification ability of LFQ between distinct sample groups<sup>2</sup>.<br>
If set 1, the user chooses to assess LFQ workflows using Criterion (b).<br>
If set 0, the user excludes Criterion (b) from performance assessment.<br>
The default setting of this value is "1".

`Cc` 
Criterion (c): differential expression analysis by reproducibility-optimization<sup>3</sup>.<br>
If set 1, the user chooses to assess LFQ workflows using Criterion (c).<br>
If set 0, the user excludes Criterion (c) from performance assessment.<br>
The default setting of this value is "1".

`Cd` 
Criterion (d): reproducibility of the identified protein markers among different datasets<sup>4</sup>.<br>
If set 1, the user chooses to assess LFQ workflows using Criterion (d).<br>
If set 0, the user excludes Criterion (d) from performance assessment.<br>
The default setting of this value is "1".
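Because the assumption and criterion arguments are passed as short strings, a typo can go unnoticed until the evaluation is already running. The sketch below validates the documented values ("Y"/"N" for the assumptions, "0"/"1" for the criteria) up front. `check_lfq_flags()` is a hypothetical helper written for illustration; it is not part of EVALFQ.

```r
# Hypothetical helper (not part of EVALFQ): check that the flags use only
# the values documented above before starting an evaluation.
check_lfq_flags <- function(assum = c(assum_a = "Y", assum_b = "Y", assum_c = "Y"),
                            criteria = c(Ca = "1", Cb = "1", Cc = "1", Cd = "1")) {
  bad_assum <- names(assum)[!assum %in% c("Y", "N")]
  bad_crit  <- names(criteria)[!criteria %in% c("0", "1")]
  problems  <- c(bad_assum, bad_crit)
  if (length(problems) > 0) {
    stop("Invalid value for: ", paste(problems, collapse = ", "))
  }
  invisible(TRUE)
}

# Example: exclude Criterion (c) but keep the other criteria.
check_lfq_flags(criteria = c(Ca = "1", Cb = "1", Cc = "0", Cd = "1"))
```

The same check could be extended with `Ce` for the functions that accept a spiked-protein file.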

#### 3. Conduct LFQ and assess performance by collectively considering the spiked proteins.

```r
allranks <- lfqspikedall(data_s,
                         spiked,
                         assum_a = "Y",
                         assum_b = "Y",
                         assum_c = "Y",
                         Ca = "1",
                         Cb = "1",
                         Cc = "1",
                         Cd = "1",
                         Ce = "1")
```

`data_s` 
All columns of this input should be numeric except the first and second, which contain the names and labels (control or case) of the studied samples, respectively. The intensity data should be arranged with samples in rows and proteins/peptides in columns. Missing values (NA) of protein intensity are allowed.

`spiked` 
The file should provide the concentrations of known proteins (such as spiked proteins). This file is required if the user wants to conduct the assessment using Criterion (e). It should contain the class of samples and the Sample ID. The Sample ID should be unique and can be defined according to the preference of the EVALFQ user, and the class of samples refers to the group of the Sample ID. The IDs of the spiked proteins should be consistent between "data_s" and "spiked". Detailed information is provided in the online Example.

`assum_a` 
All proteins were assumed to be equally important.<br>
Input the letter "Y" to indicate that this assumption holds for the studied dataset, or the letter "N" to indicate that it does not.

`assum_b` 
The level of protein abundance was assumed to be constant among all samples.<br>
Input the letter "Y" to indicate that this assumption holds for the studied dataset, or the letter "N" to indicate that it does not.

`assum_c` 
The intensities of the vast majority of the proteins were assumed to be unchanged under the studied conditions.<br>
Input the letter "Y" to indicate that this assumption holds for the studied dataset, or the letter "N" to indicate that it does not.

`Ca` 
Criterion (a): precision of LFQ based on the proteomes among replicates<sup>1</sup>.<br>
If set 1, the user chooses to assess LFQ workflows using Criterion (a).<br>
If set 0, the user excludes Criterion (a) from performance assessment.<br>
The default setting of this value is 1.

`Cb` 
Criterion (b): classification ability of LFQ between distinct sample groups<sup>2</sup>.<br>
If set 1, the user chooses to assess LFQ workflows using Criterion (b).<br>
If set 0, the user excludes Criterion (b) from performance assessment.<br>
The default setting of this value is 1.

`Cc` 
Criterion (c): differential expression analysis by reproducibility-optimization<sup>3</sup>.<br>
If set 1, the user chooses to assess LFQ workflows using Criterion (c).<br>
If set 0, the user excludes Criterion (c) from performance assessment.<br>
The default setting of this value is 1.

`Cd` 
Criterion (d): reproducibility of the identified protein markers among different datasets<sup>4</sup>.<br>
If set 1, the user chooses to assess LFQ workflows using Criterion (d).<br>
If set 0, the user excludes Criterion (d) from performance assessment.<br>
The default setting of this value is 1.

`Ce` 
Criterion (e): accuracy of LFQ based on spiked and background proteins<sup>5</sup>.<br>
If set 1, the user chooses to assess LFQ workflows using Criterion (e).<br>
If set 0, the user excludes Criterion (e) from performance assessment.<br>
The default setting of this value is 1.
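The requirement that spiked protein IDs be consistent between `data_s` and `spiked` can be checked before running the assessment. The sketch below uses toy data and makes two assumptions that should be adapted to the real files: the protein IDs of `data_s` are its column names from the third column onward (as the documented layout suggests), and the spiked protein IDs have already been extracted into a character vector, here called `spiked_ids`. The layout of the `spiked` file itself is not assumed.

```r
# Toy table in the documented data_s layout: sample name, label, then proteins.
data_s <- data.frame(sample = c("S1", "S2", "S3", "S4"),
                     label  = c("control", "control", "case", "case"),
                     P001   = c(10.1, 9.8, 12.3, 12.0),
                     P002   = c(5.2, NA, 5.1, 5.3),
                     P003   = c(7.7, 7.9, 7.6, NA),
                     stringsAsFactors = FALSE)

# Illustrative spiked protein IDs (hypothetical values); in practice these
# would be taken from the 'spiked' file.
spiked_ids <- c("P001", "P004")

# Protein IDs are the column names after the first two columns.
protein_ids <- colnames(data_s)[-(1:2)]

# Spiked IDs that do not appear among the quantified proteins.
unmatched <- setdiff(spiked_ids, protein_ids)

if (length(unmatched) > 0) {
  warning("Spiked IDs not found in data_s: ", paste(unmatched, collapse = ", "))
}
```

In this toy case the check flags `P004`, which is listed as spiked but has no column in `data_s`.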

#### 4. Draw heatmap and save as EVALFQ-OUTPUT.Figure-Top.XXX.workflows.pdf.

```r
lfqvisualize(object, top = 100)
```

`object` 
The input is the output file of lfq_access or lfq_spiked.

`top` 
The number of top-ranked workflows to show in the heatmap.<br>
The default 'top' value is 100.

#### 5. Conduct LFQ and assess performance of one specific LFQ workflow.

```r
res <- lfqevalupart(data_q,
                    selectFile,
                    Ca = "1",
                    Cb = "1",
                    Cc = "1",
                    Cd = "1")
```

`data_q` 
Same as the description for 'lfq_access' above.

`selectFile` 
Input the name of your preferred strategies. Sample data of this type is available in the GitHub repository at "idrblab/EVALFQ/data/selectworkflows.rda".

`Ca` 
Criterion (a): precision of LFQ based on the proteomes among replicates<sup>1</sup>.<br>
If set 1, the user chooses to assess LFQ workflows using Criterion (a).<br>
If set 0, the user excludes Criterion (a) from performance assessment.<br>
The default setting of this value is “1”.

`Cb` 
Criterion (b): classification ability of LFQ between distinct sample groups<sup>2</sup>.<br>
If set 1, the user chooses to assess LFQ workflows using Criterion (b).<br>
If set 0, the user excludes Criterion (b) from performance assessment.<br>
The default setting of this value is “1”.

`Cc` 
Criterion (c): differential expression analysis by reproducibility-optimization<sup>3</sup>.<br>
If set 1, the user chooses to assess LFQ workflows using Criterion (c).<br>
If set 0, the user excludes Criterion (c) from performance assessment.<br>
The default setting of this value is “1”.

`Cd` 
Criterion (d): reproducibility of the identified protein markers among different datasets<sup>4</sup>.<br>
If set 1, the user chooses to assess LFQ workflows using Criterion (d).<br>
If set 0, the user excludes Criterion (d) from performance assessment.<br>
The default setting of this value is “1”.

#### 6. Conduct LFQ and assess performance of one specific LFQ workflow by collectively considering the spiked proteins.

```r
res <- lfqspikepart(data_s,
                    spiked,
                    selectFile,
                    Ca = "1",
                    Cb = "1",
                    Cc = "1",
                    Cd = "1")
```

`data_s` 
Same as the description for 'lfq_spiked' above.

`spiked` 
Same as the description for 'lfq_spiked' above.

`selectFile` 
Input the name of your preferred strategies. Sample data of this type is available in the GitHub repository at "idrblab/EVALFQ/data/selectworkflows.rda".

`Ca` 
Criterion (a): precision of LFQ based on the proteomes among replicates<sup>1</sup>.<br>
If set 1, the user chooses to assess LFQ workflows using Criterion (a).<br>
If set 0, the user excludes Criterion (a) from performance assessment.<br>
The default setting of this value is "1".

`Cb` 
Criterion (b): classification ability of LFQ between distinct sample groups<sup>2</sup>.<br>
If set 1, the user chooses to assess LFQ workflows using Criterion (b).<br>
If set 0, the user excludes Criterion (b) from performance assessment.<br>
The default setting of this value is "1".

`Cc` 
Criterion (c): differential expression analysis by reproducibility-optimization<sup>3</sup>.<br>
If set 1, the user chooses to assess LFQ workflows using Criterion (c).<br>
If set 0, the user excludes Criterion (c) from performance assessment.<br>
The default setting of this value is "1".

`Cd` 
Criterion (d): reproducibility of the identified protein markers among different datasets<sup>4</sup>.<br>
If set 1, the user chooses to assess LFQ workflows using Criterion (d).<br>
If set 0, the user excludes Criterion (d) from performance assessment.<br>
The default setting of this value is "1".

`Ce` 
Criterion (e): accuracy of LFQ based on spiked and background proteins<sup>5</sup>.<br>
If set 1, the user chooses to assess LFQ workflows using Criterion (e).<br>
If set 0, the user excludes Criterion (e) from performance assessment.<br>
The default setting of this value is "1".

## Examples

Step 1: Prepare input file for evaluating label-free proteome quantification.

```r
my_df <- PrepareInuputFiles(acquisitionmethods = "2",
                            rawdataset = "MaxQuant_proteinGroups_LFQ.txt",
                            lable = "MaxQuant_LFQ_Label.txt")

# OR

my_df <- read.csv(file = "EVALFQ_Unified_Data.csv", header = TRUE, stringsAsFactors = FALSE)
```

Note: the file should be in Comma-Separated Values (CSV) format and provide the intensity data of proteins/peptides. All columns should be numeric except the first and second, which contain the names and labels (control or case) of the studied samples, respectively. The intensity data should be arranged with samples in rows and proteins/peptides in columns. Missing values (NA) of protein intensity are allowed.

Example input files can be downloaded here:<br>
<a href='https://idrblab.org/evalfq/download/Peak_Intensity/MaxQuant_proteinGroups_LFQ.txt'>MaxQuant_proteinGroups_LFQ.txt</a><br>
<a href='https://idrblab.org/evalfq/download/Peak_Intensity/MaxQuant_LFQ_Label.txt'>MaxQuant_LFQ_Label.txt</a><br>
<a href='https://idrblab.org/evalfq/download/EVALFQ_Unified_Data.csv'>EVALFQ_Unified_Data.csv</a>
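To make the input layout concrete, the sketch below builds a small toy table in the documented arrangement (sample names, labels, then one numeric column per protein), writes it to a temporary CSV file, reads it back with `read.csv()`, and checks the layout. The toy values and the helper `check_unified_layout()` are hypothetical and not part of EVALFQ; the real file may carry additional conventions not covered here.

```r
# Toy table in the documented layout: sample names, labels, then intensities.
toy <- data.frame(
  sample = c("S1", "S2", "S3", "S4"),
  label  = c("control", "control", "case", "case"),
  P001   = c(10.1, 9.8, 12.3, NA),   # NA marks a missing intensity
  P002   = c(5.2, 5.0, 5.1, 5.3),
  stringsAsFactors = FALSE
)

# Round-trip through a temporary CSV file, mirroring the read.csv() call above.
csv_path <- tempfile(fileext = ".csv")
write.csv(toy, csv_path, row.names = FALSE)
my_toy <- read.csv(file = csv_path, header = TRUE, stringsAsFactors = FALSE)

# Hypothetical check (not part of EVALFQ): every column after the second
# must be numeric; NA entries are allowed.
check_unified_layout <- function(df) {
  if (ncol(df) < 3) stop("Expected at least three columns.")
  intensity_is_numeric <- vapply(df[-(1:2)], is.numeric, logical(1))
  if (!all(intensity_is_numeric)) {
    stop("Non-numeric intensity columns: ",
         paste(names(intensity_is_numeric)[!intensity_is_numeric], collapse = ", "))
  }
  invisible(TRUE)
}

check_unified_layout(my_toy)
```

Running such a check on a real file before the evaluation can catch stray text in an intensity column, which would otherwise turn the whole column into character data.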

```r
load("../data/my_spiked.rda")
dim(my_spiked)
head(my_spiked[1:4])
load("../data/spiked_data.rda")
dim(spiked_data)
head(spiked_data[1:5])
```

Step 2: Conduct LFQ and assess performance of all possible LFQ workflows, or assess performance by collectively considering the spiked proteins.

Note: 'my_df' should follow the input format described in the note under Step 1.

```r
allranks <- lfqevalueall(data_q = my_df, assum_a = "Y", assum_b = "Y", assum_c = "Y", Ca = "1", Cb = "1", Cc = "1", Cd = "1")
```

OR

Note: the spiked file should be in Comma-Separated Values (CSV) format and provide the concentrations of known proteins (such as spiked proteins). This file is required if the user wants to conduct the assessment using Criterion (e). It should contain the class of samples and the Sample ID. The Sample ID should be unique and can be defined according to the preference of the EVALFQ user, and the class of samples refers to the group of the Sample ID. The IDs of the spiked proteins should be consistent between "my_spiked" and "spiked_data".

```r
allranks <- lfqspikedall(data_s = my_spiked, spiked = spiked_data, assum_a = "Y", assum_b = "Y", assum_c = "Y", Ca = "1", Cb = "1", Cc = "1", Cd = "1", Ce = "1")
```

Note: 'allranks' contains all information on the performance assessment, the criteria selected and the ranking.

```r
load("../data/allranks.rda")
head(allranks)
```

Step 3: Draw a heatmap illustrating the performance ranking of all LFQ workflows based on the criteria selected by the user.

```r
lfqvisualize(object = "EVALFQ-OUTPUT.Data-Overall.Ranking.csv", top = 100)
```

Note: 'EVALFQ-OUTPUT.Figure-Top.XXX.workflows.pdf' will be saved in the current working directory. Use 'getwd()' to find the current path.

```r
# Users can also use EVALFQ by selecting one specific LFQ workflow as follows:

res <- lfqevalupart(data_q = my_df,
                    selectFile = selectworkflows,
                    Ca = "1",
                    Cb = "1",
                    Cc = "1",
                    Cd = "1")

# OR

res <- lfqspikepart(data_s = my_spiked,
                    spiked = spiked_data,
                    selectFile = selectworkflows,
                    Ca = "1",
                    Cb = "1",
                    Cc = "1",
                    Cd = "1")
```

Note: please select the appropriate numeric codes representing the transformation, centering, scaling, normalization and imputation methods (see the details above).

Should you have any questions, please contact Jianbo Fu at fujianbo@zju.edu.cn

## References

  1. Kuharev J, Navarro P, Distler U, et al. In-depth evaluation of software tools for data-independent acquisition based label-free quantification. Proteomics 2015;15:3140–3151.

  2. Griffin NM, Yu J, Long F, et al. Label-free, normalized quantification of complex mass spectrometry data for proteomic analysis. Nat Biotechnol 2010;28:83–89.

  3. Risso D, Ngai J, Speed TP, et al. Normalization of RNA-seq data using factor analysis of control genes or samples. Nat Biotechnol 2014;32:896–902.

  4. Wang X, Gardiner EJ, Cairns MJ. Optimal consistency in microRNA expression analysis using reference-gene-based normalization. Mol Biosyst 2015;11:1235–1240.

  5. Navarro P, Kuharev J, Gillet LC, et al. A multicenter study benchmarks software tools for label-free proteome quantification. Nat Biotechnol 2016;34:1130–1136.

  6. Lo, K., and Gottardo, R. (2012) Flexible mixture modeling via the multivariate t distribution with the Box-Cox transformation: An alternative to the skew-t distribution. Stat. Comput. 22, 33–52.

  7. Callister, S. J., Barry, R. C., Adkins, J. N., Johnson, E. T., Qian, W. J., Webb-Robertson, B. J., Smith, R. D., and Lipton, M. S. (2006) Normalization approaches for removing systematic biases associated with mass spectrometry and label-free proteomics. J. Proteome Res. 5, 277–286.

  8. van den Berg, R. A., Hoefsloot, H. C., Westerhuis, J. A., Smilde, A. K., and van der Werf, M. J. (2006) Centering, scaling, and transformations: Improving the biological information content of metabolomics data. BMC Genomics 7, 142.

  9. Wang, S. Y., Kuo, C. H., and Tseng, Y. J. (2013) Batch normalizer: A fast total abundance regression calibration method to simultaneously adjust batch and injection order effects in liquid chromatography/time-of-flight mass spectrometry-based metabolomics data and comparison with current calibration methods. Anal. Chem. 85, 1037–1046.

  10. Wang, X., Zhang, A., Han, Y., Wang, P., Sun, H., Song, G., Dong, T., Yuan, Y., Yuan, X., Zhang, M., Xie, N., Zhang, H., Dong, H., and Dong, W. (2012) Urine metabolomics analysis for biomarker discovery and detection of jaundice syndrome in patients with liver disease. Mol. Cell. Proteomics 11, 370–380.

  11. Di Guida, R., Engel, J., Allwood, J. W., Weber, R. J., Jones, M. R., Sommer, U., Viant, M. R., and Dunn, W. B. (2016) Non-targeted UHPLC-MS metabolomic data processing methods: a comparative investigation of normalisation, missing value imputation, transformation and scaling. Metabolomics 12, 93.

  12. Smilde, A. K., van der Werf, M. J., Bijlsma, S., van der Werff-van der Vat, B. J., and Jellema, R. H. (2005) Fusion of mass spectrometry-based metabolomics data. Anal. Chem. 77, 6729–6736.

  13. Cox, J., Hein, M. Y., Luber, C. A., Paron, I., Nagaraj, N., and Mann, M. (2014) Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ. Mol. Cell. Proteomics 13, 2513–2526.

  14. Ballman, K. V., Grill, D. E., Oberg, A. L., and Therneau, T. M. (2004) Faster cyclic loess: normalizing RNA arrays via linear models. Bioinformatics 20, 2778–2786.

  15. Välikangas, T., Suomi, T., and Elo, L. L. (2018) A systematic evaluation of normalization methods in quantitative label-free proteomics. Brief Bioinform. 19, 1–11.

  16. Leek, J. T., and Storey, J. D. (2007) Capturing heterogeneity in gene expression studies by surrogate variable analysis. PLoS Genet. 3, 1724–1735.

  17. Karpievitch, Y. V., Taverner, T., Adkins, J. N., Callister, S. J., Anderson, G. A., Smith, R. D., and Dabney, A. R. (2009) Normalization of peak intensities in bottom-up MS-based proteomics using singular value decomposition. Bioinformatics 25, 2573–2580.

  18. Adriaens, M. E., Jaillard, M., Eijssen, L. M., Mayer, C. D., and Evelo, C. T. (2012) An evaluation of two-channel ChIP-on-chip and DNA methylation microarray normalization strategies. BMC Genomics 13, 42.

  19. Craig, A., Cloarec, O., Holmes, E., Nicholson, J. K., and Lindon, J. C. (2006) Scaling and normalization effects in NMR spectroscopic metabonomic data sets. Anal. Chem. 78, 2262–2267.

  20. Fundel, K., Haag, J., Gebhard, P. M., Zimmer, R., and Aigner, T. (2008) Normalization strategies for mRNA expression data in cartilage research. Osteoarthritis Cartilage 16, 947–955.

  21. De Livera, A. M., Dias, D. A., De Souza, D., Rupasinghe, T., Pyke, J., Tull, D., Roessner, U., McConville, M., and Speed, T. P. (2012) Normalizing and integrating metabolomics data. Anal. Chem. 84, 10768–10776.

  22. Tobin, J., Walach, J., de Beer, D., Williams, P. J., Filzmoser, P., and Walczak, B. (2017) Untargeted analysis of chromatographic data for green and fermented rooibos: Problem with size effect removal. J. Chromatogr. A 1525, 109–115.

  23. Wang, B., Wang, X. F., and Xi, Y. (2011) Normalizing bead-based microRNA expression data: A measurement error model-based approach. Bioinformatics 27, 1506–1512.

  24. Smolinska, A., Hauschild, A. C., Fijten, R. R., Dallinga, J. W., Baumbach, J., and van Schooten, F. J. (2014) Current breathomics—A review on data pre-processing techniques and machine learning in metabolomics breath analysis. J. Breath Res. 8, 027105.

  25. Branson, O. E., and Freitas, M. A. (2016) A multi-model statistical approach for proteomic spectral count quantitation. J. Proteomics 144, 23–32.

  26. Rausch, T. K., Schillert, A., Ziegler, A., Lüking, A., Zucht, H. D., and Schulz-Knappe, P. (2016) Comparison of pre-processing methods for multiplex bead-based immunoassays. BMC Genomics 17, 601.

  27. Lin, S. M., Du, P., Huber, W., and Kibbe, W. A. (2008) Model-based variance-stabilizing transformation for Illumina microarray data. Nucleic Acids Res. 36, e11.

  28. Stacklies, W., Redestig, H., Scholz, M., Walther, D., and Selbig, J. (2007) pcaMethods—A bioconductor package providing PCA methods for incomplete data. Bioinformatics 23, 1164–1167.

  29. Troyanskaya, O., Cantor, M., Sherlock, G., Brown, P., Hastie, T., Tibshirani, R., Botstein, D., and Altman, R. B. (2001) Missing value estimation methods for DNA microarrays. Bioinformatics 17, 520–525.



idrblab/EVALFQ documentation built on Sept. 29, 2022, 6:34 p.m.