The data contain measurements on cells in suspicious lumps in a woman's breast. Features are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass. They describe characteristics of the cell nuclei present in the image. All samples are classified as either benign or malignant.
wdbc is a data.frame with 31 columns. The first column indicates whether the sample is classified as benign (B) or malignant (M). The remaining columns contain measurements for 30 features.
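As a minimal sketch of working with this layout, the snippet below separates the class column from the feature columns. It uses a small made-up data frame rather than wdbc itself; the column names (diagnosis, radius_mean, texture_mean) and all values are hypothetical stand-ins, and the real data set has 30 feature columns instead of two.

```r
# Toy stand-in with the layout described above: a class column first,
# followed by numeric feature columns. Names and values are hypothetical.
toy <- data.frame(
  diagnosis    = factor(c("B", "M", "B", "M")),
  radius_mean  = c(12.3, 20.1, 11.5, 18.7),
  texture_mean = c(17.2, 23.4, 15.9, 21.0)
)

# The first column holds the class labels (B = benign, M = malignant).
labels <- toy[[1]]

# Every remaining column is a feature; drop = FALSE keeps a data.frame
# even if only one feature column is left.
features <- toy[, -1, drop = FALSE]

# Count samples per class and check the feature dimensions.
class_counts <- table(labels)
print(class_counts)
print(dim(features))
```

Selecting by position rather than by name mirrors the description above, which only states that the class comes first and the features follow.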
Ten real-valued features are computed for each cell nucleus:
a) radius (mean of distances from center to points on the perimeter)
b) texture (standard deviation of gray-scale values)
c) perimeter
d) area
e) smoothness (local variation in radius lengths)
f) compactness (perimeter^2 / area - 1.0)
g) concavity (severity of concave portions of the contour)
h) concave points (number of concave portions of the contour)
i) symmetry
j) fractal dimension ("coastline approximation" - 1)
The references listed below contain detailed descriptions of how these features are computed.
The mean, standard error, and "worst" or largest (mean of the three largest values) of these features were computed for each image, resulting in 30 features.
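The three summaries can be sketched for a single feature as follows. This is an illustration only: the per-nucleus radius values are made up, summarise_feature is a hypothetical helper, and the standard error is assumed to be the usual sd / sqrt(n), which the description above does not spell out.

```r
# Hypothetical per-nucleus radius values for one image (made-up numbers).
radius <- c(12.1, 14.3, 11.8, 15.6, 13.2, 16.4, 12.9)

# Hypothetical helper: reduce the per-nucleus values of one feature to the
# three per-image summaries described above.
summarise_feature <- function(x) {
  c(
    mean  = mean(x),
    # Assumption: standard error of the mean, sd / sqrt(n).
    se    = sd(x) / sqrt(length(x)),
    # "Worst": mean of the three largest values.
    worst = mean(head(sort(x, decreasing = TRUE), 3))
  )
}

radius_summary <- summarise_feature(radius)
print(radius_summary)
```

Applying the same reduction to each of the ten features gives 10 x 3 = 30 values per image, which matches the 30 feature columns.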
This breast cancer database was obtained from the University of Wisconsin Hospitals, Madison from Dr. William H. Wolberg.
http://mlr.cs.umass.edu/ml/datasets/Breast+Cancer+Wisconsin+(Diagnostic)
Bache, K. & Lichman, M. (2013). UCI Machine Learning Repository. Irvine, CA: University of California, School of Information and Computer Science.
O. L. Mangasarian and W. H. Wolberg: "Cancer diagnosis via linear programming", SIAM News, Volume 23, Number 5, September 1990, pp 1 & 18.
William H. Wolberg and O. L. Mangasarian: "Multisurface method of pattern separation for medical diagnosis applied to breast cytology", Proceedings of the National Academy of Sciences, U.S.A., Volume 87, December 1990, pp 9193-9196.
K. P. Bennett & O. L. Mangasarian: "Robust linear programming discrimination of two linearly inseparable sets", Optimization Methods and Software 1, 1992, 23-34 (Gordon & Breach Science Publishers).