Wisconsin Breast Cancer Database
A list containing a training dataset and a test dataset. Both are drawn from a data frame with 699 observations on 11 variables; the ID and class columns have been removed, leaving the nine variables listed below. The train-to-test ratio is 0.8.
Cl.thickness: Clump Thickness
Cell.size: Uniformity of Cell Size
Cell.shape: Uniformity of Cell Shape
Marg.adhesion: Marginal Adhesion
Epith.c.size: Single Epithelial Cell Size
Bare.nuclei: Bare Nuclei
Bl.cromatin: Bland Chromatin
Normal.nucleoli: Normal Nucleoli
Mitoses: Mitoses
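As a rough illustration of the split described above, the sketch below partitions 699 row indices into training and test sets. It is written in Python purely for illustration and is not the package's own code. It assumes that the ratio of 0.8 means 80% of the observations go to the training set, that rows are shuffled before splitting, and that the count is rounded to the nearest integer; none of these details are stated in this page. The helper `split_rows` and the seed value are hypothetical.

```python
import random

N_OBSERVATIONS = 699   # row count stated in the description
TRAIN_FRACTION = 0.8   # assumed to mean 80% of rows go to the training set

# The nine variables that remain after the ID and class columns are removed.
FEATURES = [
    "Cl.thickness", "Cell.size", "Cell.shape", "Marg.adhesion",
    "Epith.c.size", "Bare.nuclei", "Bl.cromatin", "Normal.nucleoli",
    "Mitoses",
]


def split_rows(rows, train_fraction, seed=0):
    """Shuffle rows and return (train, test) lists. Hypothetical helper."""
    rng = random.Random(seed)      # fixed seed so the sketch is reproducible
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# Row indices stand in for the observations; no real data values are used.
train_idx, test_idx = split_rows(range(N_OBSERVATIONS), TRAIN_FRACTION)

print(len(FEATURES))                  # 9
print(len(train_idx), len(test_idx))  # 559 140
```

Under these assumptions the split yields 559 training rows and 140 test rows. The actual sizes in the packaged list may differ, for example if rows with missing values were dropped before splitting.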
Creator: Dr. William H. Wolberg (physician), University of Wisconsin Hospital, Madison, Wisconsin, USA
Donor: Olvi Mangasarian (mangasarian@cs.wisc.edu)
Received: David W. Aha (aha@cs.jhu.edu)
These data have been taken from the UCI Repository of Machine Learning Databases and were converted to R format by Evgenia Dimitriadou.
1. Wolberg, W.H., & Mangasarian, O.L. (1990). Multisurface method of pattern separation for medical diagnosis applied to breast cytology. In Proceedings of the National Academy of Sciences, 87, 9193-9196.
- Size of data set: only 369 instances (at that point in time)
- Collected classification results: 1 trial only
- Two pairs of parallel hyperplanes were found to be consistent with 50% of the data
- Accuracy on remaining 50% of dataset: 93.5%
- Three pairs of parallel hyperplanes were found to be consistent with 67% of data
- Accuracy on remaining 33% of dataset: 95.9%
2. Zhang, J. (1992). Selecting typical instances in instance-based learning. In Proceedings of the Ninth International Machine Learning Conference (pp. 470-479). Aberdeen, Scotland: Morgan Kaufmann.
- Size of data set: only 369 instances (at that point in time)
- Applied 4 instance-based learning algorithms
- Collected classification results averaged over 10 trials
- Best accuracy result:
  - 1-nearest neighbor: 93.7%
  - trained on 200 instances, tested on the other 169
- Also of interest:
  - Using only typical instances: 92.2% (storing only 23.1 instances)
  - trained on 200 instances, tested on the other 169
Newman, D.J., Hettich, S., Blake, C.L., & Merz, C.J. (1998). UCI Repository of machine learning databases [http://www.ics.uci.edu/~mlearn/MLRepository.html]. Irvine, CA: University of California, Department of Information and Computer Science.