ELBOW-package: Evaluating foLd change By the lOgit Way


Description

The "elbow" method is an improved fold change test for determining the cut-off for biologically significant changes in expression levels in transcriptomics.

Details

Package: ELBOW
Type: Package
Version: 1.0
Date: 2013-08-08
License: Creative Commons 3.0 Attribution + ShareAlike
(see: http://creativecommons.org/licenses/by-sa/3.0/)

Elbow is an improved fold change test that uses cluster analysis and pattern recognition to set cut-off limits. The limits are derived directly from intrareplicate variance, without assuming a normal distribution, and can be computed from as few as 2 biological replicates. Elbow also provides the same consistency as fold testing in cross-platform analysis. Elbow has lower false positive and false negative rates than standard fold testing when both are evaluated using t-testing and Significance Analysis of Microarrays (SAM) with 12 replicates (six replicates each for initial and final conditions). Elbow provides a null value based on initial condition replicates and gives error bounds for results to allow better evaluation of significance.
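As a rough illustration of the quantity being ranked, the sketch below computes a per-probe fold value from replicate columns and sorts it, giving the kind of ordered curve on which an elbow would be looked for. It assumes log-scale expression values and defines fold as the mean of the final replicates minus the mean of the initial replicates. The matrices, probe names, and that definition are illustrative assumptions, not the package's internal code.

```r
# Illustrative sketch only: toy log-scale expression values for four
# hypothetical probes, with two replicates per condition.
initial <- matrix(c(5.0, 5.2,
                    7.1, 6.9,
                    3.0, 3.1,
                    8.0, 8.2),
                  ncol = 2, byrow = TRUE,
                  dimnames = list(c("p1", "p2", "p3", "p4"), NULL))
final <- matrix(c(5.1, 5.1,
                  9.0, 9.2,
                  1.0, 1.1,
                  8.1, 8.1),
                ncol = 2, byrow = TRUE,
                dimnames = list(c("p1", "p2", "p3", "p4"), NULL))

# Assumed definition: mean of final replicates minus mean of initial replicates
fold <- rowMeans(final) - rowMeans(initial)

# Sorting the fold values gives an ordered curve: most probes sit near
# zero, and the strongly changed probes form the two tails
sorted_fold <- sort(fold)
print(sorted_fold)
```

Plotting `sorted_fold` against its rank (for example with `plot(sorted_fold)`) would show the flat middle and the bending tails that motivate the method's name.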

Abstract Reference:
Conference Proceeding: Zhang, X., Bjorklund, N. K., Rydzak, T., Sparling, R., Alvare, G., Fristensky, B. (April 2013) “The Elbow Method for deciding significant fold change cutoffs of differentially expressed genes.” RECOMB 2013, 17th International Conference on Research in Computational Molecular Biology.

Paper Reference:
Zhang, X., Bjorklund, N. K., Alvare, G., Rydzak, T., Sparling, R., Fristensky, B. (2013) “Elbow, an improved fold test method for transcriptomics.” Departments of Plant Science and Microbiology, University of Manitoba, Winnipeg, Canada, R3T 2N2

The corresponding author: Brian Fristensky frist@cc.umanitoba.ca

Author(s)

Xiangli Zhang, Natalie Bjorklund, Graham Alvare, Tom Rydzak, Richard Sparling, Brian Fristensky

Maintainers: Graham Alvare alvare@cc.umanitoba.ca, Xiangli Zhang justinzhang.xl@gmail.com

References

Claeskens, G. and Hjort, N. L. (2008) Model Selection and Model Averaging. Cambridge, England: Cambridge University Press.

Cui, X. and Churchill, G. A. (2003) “Statistical tests for differential expression in cDNA microarray experiments.” Genome Biol, Vol. 4, p. 210.

Dalman, M. R., Deeter, A., Nimishakavi, G. and Duan, Z. H. (2012) “Fold change and p-value cutoffs significantly alter microarray interpretations.” BMC Bioinformatics, Vol. 13 (Suppl. 2), p. S11.

Faraway, J. J. (2006) Extending the Linear Model with R: Generalized Linear, Mixed Effects and Nonparametric Regression Models. Boca Raton, Florida: Chapman and Hall/CRC.

Guo, L., Lobenhofer, E. K., et al. (2006) “Rat toxicogenomic study reveals analytical consistency across microarray platforms.” Nature Biotechnol, Vol. 24, pp. 1162–1169.

Klebanov, L., Qiu, X., Welle, S., Yakovlev, A. (2007) “Statistical methods and microarray data.” Nature Biotechnol, Vol. 25, p. 1.

MAQC Consortium (2006) “The MicroArray Quality Control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements.” Nature Biotechnol, September, Vol. 24, No. 9, pp. 1151–1161.

Minty, J. J., Lesnefsky, A. A., et al. (2011) “Evolution combined with genomic study elucidates genetic bases of isobutanol tolerance in Escherichia coli.” Microb Cell Fact. Vol. 10, p. 18.

NIST/SEMATECH e-Handbook of Statistical Methods, April 2012, http://www.itl.nist.gov/div898/handbook/.

Oshlack, A., Robinson, M. D., Young, M. D. (2010) “From RNA-seq reads to differential expression results.” Genome Biol, Vol. 11, p. 220.

Shi, L., Tong, W., et al. (2005) “Cross-platform comparability of microarray technology: intra-platform consistency and appropriate data analysis procedures are essential.” BMC Bioinformatics, Vol. 6 (Suppl. 2), p. S12.

Sjogren, A., Kristiansson, E., Rudemo, M., Nerman, O. (2007) “Weighted analysis of general microarray experiments.” BMC Bioinformatics, Vol. 8, p. 387.

Tan, P. K., Downey, T. J., et al. (2003) “Evaluation of gene expression measurements from commercial microarray platforms.” Nucleic Acids Res, Vol. 31, No. 19, pp. 5676–5684.

Thorndike, R. L. (December 1953) “Who Belong in the Family?” Psychometrika, Vol. 18, No. 4, pp. 267–276.

Tilstone, C. (2003) “Vital statistics.” Nature, Vol. 424, p. 611.

Tusher, V. G., Tibshirani, R., Chu, G. (2001) “Significance analysis of microarrays applied to the ionizing radiation response.” Proc Natl Acad Sci, Aug. 28, Vol. 98 No. 9: pp. 5116–5121.

See Also

See analyze_elbow for doing a full ELBOW analysis and plot.
See do_elbow if you want to extract only the ELBOW cut-off values.

Examples

	# load the package, then read in the EcoliMutMA sample data from it
	library(ELBOW)
	data(EcoliMutMA, package="ELBOW")
	csv_data <- EcoliMutMA
	# - OR - read in a CSV file: uncomment the line below (remove the
	#        leading '#') and replace 'filename' with the path to
	#        the CSV file
	# csv_data <- read.csv(filename)
	
	# set the number of initial and final condition replicates both to three
	init_count  <- 3
	final_count <- 3
	
	# Parse the probes, initial conditions and final conditions
	# out of the CSV data.  Please see: extract_working_sets
	# for more information.
	#
	# init_count should be the number of columns associated with
	#       the initial conditions of the experiment.
	# final_count should be the number of columns associated with
	#       the final conditions of the experiment.
	working_sets <- extract_working_sets(csv_data, init_count, final_count)
	
	probes <- working_sets[[1]]
	initial_conditions <- working_sets[[2]]
	final_conditions <- working_sets[[3]]

	# Uncomment to output the plot to a PNG file (optional)
	# png(file="output_plot.png")

	# Analyze the elbow curve.
	sig <- analyze_elbow(probes, initial_conditions, final_conditions)

	# If png() was uncommented above, also uncomment this to close the file
	# dev.off()

	# write the significant probes to 'signprobes.csv'
	write.table(sig,file="signprobes.csv",sep=",",row.names=FALSE)
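The comments in the example describe init_count and final_count as column counts, which suggests input laid out as probe IDs first, then the initial-condition replicates, then the final-condition replicates. The sketch below checks that layout on a toy data frame in base R before any analysis is run. The column names, the `toy_*` variables, and the positional split are assumptions made for illustration; they are not the implementation of extract_working_sets.

```r
# Hypothetical stand-in for a CSV laid out as: probe IDs, then the
# initial-condition replicates, then the final-condition replicates.
toy <- data.frame(
  ID_REF = c("a", "b", "c"),
  init1  = c(1.0, 2.0, 3.0),
  init2  = c(1.1, 2.1, 3.1),
  final1 = c(1.5, 2.0, 0.5),
  final2 = c(1.6, 2.2, 0.4)
)
toy_init_count  <- 2
toy_final_count <- 2

# Split by column position (an assumption about the expected layout)
toy_probes  <- toy[[1]]
toy_initial <- toy[, seq(2, length.out = toy_init_count), drop = FALSE]
toy_final   <- toy[, seq(2 + toy_init_count, length.out = toy_final_count),
                   drop = FALSE]

# Sanity checks: the counts must account for every data column, and
# each piece must describe the same set of probes
stopifnot(ncol(toy) == 1 + toy_init_count + toy_final_count)
stopifnot(length(toy_probes) == nrow(toy_initial),
          nrow(toy_initial) == nrow(toy_final))
```

A mismatch between the counts and the actual number of columns is an easy mistake to make with a new CSV file, so a check like the first `stopifnot` may be worth running before the real analysis.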

Example output

[1] "rowsums"
[1] "fold"
[1]  0.06873333 -0.08933333  0.34013333  0.19313333  0.00940000 -0.02596667
[1] "bound data"
    ID_REF        fold
1 1001_115  0.06873333
2  1002_33 -0.08933333
3 1003_942  0.34013333
4 1004_552  0.19313333
5 1005_657  0.00940000
6 1006_393 -0.02596667
[1] "firsta_data"
    ID_REF        fold
1 1001_115  0.06873333
2  1002_33 -0.08933333
3 1003_942  0.34013333
4 1004_552  0.19313333
5 1005_657  0.00940000
6 1006_393 -0.02596667
[1] "sorted"
[1] "headers"
[1] "rowsums"
[1] "fold"
[1]  0.2138 -0.2044  0.1604  0.1214 -0.1342 -0.0477
[1] "bound data"
    ID_REF    fold
1 1001_115  0.2138
2  1002_33 -0.2044
3 1003_942  0.1604
4 1004_552  0.1214
5 1005_657 -0.1342
6 1006_393 -0.0477
[1] "firsta_data"
    ID_REF    fold
1 1001_115  0.2138
2  1002_33 -0.2044
3 1003_942  0.1604
4 1004_552  0.1214
5 1005_657 -0.1342
6 1006_393 -0.0477
[1] "sorted"
[1] "headers"
[1] "rowsums"
[1] "fold"
[1]  0.2071  0.0046  0.0323  0.3172  0.0230 -0.0754
[1] "bound data"
    ID_REF    fold
1 1001_115  0.2071
2  1002_33  0.0046
3 1003_942  0.0323
4 1004_552  0.3172
5 1005_657  0.0230
6 1006_393 -0.0754
[1] "firsta_data"
    ID_REF    fold
1 1001_115  0.2071
2  1002_33  0.0046
3 1003_942  0.0323
4 1004_552  0.3172
5 1005_657  0.0230
6 1006_393 -0.0754
[1] "sorted"
[1] "headers"
[1] "rowsums"
[1] "fold"
[1]  0.2611  0.0195  0.0503  0.3531 -0.0252 -0.0441
[1] "bound data"
    ID_REF    fold
1 1001_115  0.2611
2  1002_33  0.0195
3 1003_942  0.0503
4 1004_552  0.3531
5 1005_657 -0.0252
6 1006_393 -0.0441
[1] "firsta_data"
    ID_REF    fold
1 1001_115  0.2611
2  1002_33  0.0195
3 1003_942  0.0503
4 1004_552  0.3531
5 1005_657 -0.0252
6 1006_393 -0.0441
[1] "sorted"
[1] "headers"
[1] "rowsums"
[1] "fold"
[1]  0.0346 -0.2222  0.5502  0.0479 -0.0125  0.0560
[1] "bound data"
    ID_REF    fold
1 1001_115  0.0346
2  1002_33 -0.2222
3 1003_942  0.5502
4 1004_552  0.0479
5 1005_657 -0.0125
6 1006_393  0.0560
[1] "firsta_data"
    ID_REF    fold
1 1001_115  0.0346
2  1002_33 -0.2222
3 1003_942  0.5502
4 1004_552  0.0479
5 1005_657 -0.0125
6 1006_393  0.0560
[1] "sorted"
[1] "headers"
[1] "rowsums"
[1] "fold"
[1]  0.0279 -0.0132  0.4221  0.2437  0.1447  0.0283
[1] "bound data"
    ID_REF    fold
1 1001_115  0.0279
2  1002_33 -0.0132
3 1003_942  0.4221
4 1004_552  0.2437
5 1005_657  0.1447
6 1006_393  0.0283
[1] "firsta_data"
    ID_REF    fold
1 1001_115  0.0279
2  1002_33 -0.0132
3 1003_942  0.4221
4 1004_552  0.2437
5 1005_657  0.1447
6 1006_393  0.0283
[1] "sorted"
[1] "headers"
[1] "rowsums"
[1] "fold"
[1] 0.0819 0.0017 0.4401 0.2796 0.0965 0.0596
[1] "bound data"
    ID_REF   fold
1 1001_115 0.0819
2  1002_33 0.0017
3 1003_942 0.4401
4 1004_552 0.2796
5 1005_657 0.0965
6 1006_393 0.0596
[1] "firsta_data"
    ID_REF   fold
1 1001_115 0.0819
2  1002_33 0.0017
3 1003_942 0.4401
4 1004_552 0.2796
5 1005_657 0.0965
6 1006_393 0.0596
[1] "sorted"
[1] "headers"
[1] "rowsums"
[1] "fold"
[1] -0.0828 -0.2743  0.5480 -0.0174 -0.0913 -0.0621
[1] "bound data"
    ID_REF    fold
1 1001_115 -0.0828
2  1002_33 -0.2743
3 1003_942  0.5480
4 1004_552 -0.0174
5 1005_657 -0.0913
6 1006_393 -0.0621
[1] "firsta_data"
    ID_REF    fold
1 1001_115 -0.0828
2  1002_33 -0.2743
3 1003_942  0.5480
4 1004_552 -0.0174
5 1005_657 -0.0913
6 1006_393 -0.0621
[1] "sorted"
[1] "headers"
[1] "rowsums"
[1] "fold"
[1] -0.0895 -0.0653  0.4199  0.1784  0.0659 -0.0898
[1] "bound data"
    ID_REF    fold
1 1001_115 -0.0895
2  1002_33 -0.0653
3 1003_942  0.4199
4 1004_552  0.1784
5 1005_657  0.0659
6 1006_393 -0.0898
[1] "firsta_data"
    ID_REF    fold
1 1001_115 -0.0895
2  1002_33 -0.0653
3 1003_942  0.4199
4 1004_552  0.1784
5 1005_657  0.0659
6 1006_393 -0.0898
[1] "sorted"
[1] "headers"
[1] "rowsums"
[1] "fold"
[1] -0.0355 -0.0504  0.4379  0.2143  0.0177 -0.0585
[1] "bound data"
    ID_REF    fold
1 1001_115 -0.0355
2  1002_33 -0.0504
3 1003_942  0.4379
4 1004_552  0.2143
5 1005_657  0.0177
6 1006_393 -0.0585
[1] "firsta_data"
    ID_REF    fold
1 1001_115 -0.0355
2  1002_33 -0.0504
3 1003_942  0.4379
4 1004_552  0.2143
5 1005_657  0.0177
6 1006_393 -0.0585
[1] "sorted"
[1] "headers"
[1] "upper elbow limit = 0.82 (replicate variance error 0.67  to  1.05 )"
[1] "lower elbow limit = -0.45 ( replicate variance error -0.37  to  -0.63 )"
[1] "log chi squared p = 1.08e-44"
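The two limits reported at the end of the output can be read as a band around zero: probes whose fold value falls outside it are the candidates for significant change. The sketch below applies the limits printed above to a small table of made-up fold values. The probe names and fold values are hypothetical, and the filtering rule is an assumption about how the limits are meant to be used, not code from the package.

```r
# Limits copied from the example output above
upper_limit <- 0.82
lower_limit <- -0.45

# Made-up fold values for illustration (not taken from EcoliMutMA)
folds <- data.frame(
  ID_REF = c("probe_a", "probe_b", "probe_c", "probe_d"),
  fold   = c(1.30, 0.10, -0.60, -0.20)
)

# Keep probes whose fold lies outside the band [lower_limit, upper_limit]
outside <- folds$fold > upper_limit | folds$fold < lower_limit
significant <- folds[outside, ]
print(significant)
```

The replicate variance error ranges printed next to each limit could be substituted for `upper_limit` and `lower_limit` to see how sensitive the selection is to the choice of cut-off.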

ELBOW documentation built on Nov. 8, 2020, 8:14 p.m.