diabetes: Pima Indians Diabetes Data Set


Description

Diabetes data for Pima Indian women, originally from the National Institute of Diabetes and Digestive and Kidney Diseases.

Format

X is a data frame of 768 female patients with 8 attributes.

no.pregnant number of pregnancies
glucose plasma glucose concentration in an oral glucose tolerance test
blood.press diastolic blood pressure (mm Hg)
triceps.thick triceps skin fold thickness (mm)
insulin 2-hour serum insulin (mu U/ml)
BMI body mass index (weight in kg/(height in m)^2)
pedigree diabetes pedigree function
age age in years

y contains the class labels, Yes or No, indicating whether the patient is diabetic according to WHO criteria.

The training set diabetes.tr contains 600 randomly selected subjects, the test set diabetes.te contains the remaining 168 subjects, and diabetes contains all 768 subjects.
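
A split of this shape can be sketched in base R. This is only an illustration of a random 600/168 partition of 768 row indices. The seed and the exact procedure used to build diabetes.tr and diabetes.te are not stated here, so the indices below will not match the packaged split, and the names train_idx and test_idx are hypothetical.

```r
# Hypothetical reconstruction of a 600/168 split; the seed and procedure
# actually used for diabetes.tr and diabetes.te are not documented here.
set.seed(1)                                  # arbitrary seed, for reproducibility only
n <- 768                                     # total number of subjects
train_idx <- sample(n, 600)                  # 600 distinct row indices drawn at random
test_idx  <- setdiff(seq_len(n), train_idx)  # the remaining 168 row indices

length(train_idx)
length(test_idx)
```

Given a data frame X and a label vector y of length 768, X[train_idx, ] and y[train_idx] would then form the training part, with test_idx selecting the held-out part.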

Details

Several constraints were placed on the selection of these instances from a larger database. In particular, all patients here are females at least 21 years old of Pima Indian heritage.

Source

Chih-Chung Chang and Chih-Jen Lin, LIBSVM: a library for support vector machines, 2001. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm.

References

Smith, J.W., Everhart, J.E., Dickson, W.C., Knowler, W.C., & Johannes, R.S. (1988). Using the ADAP learning algorithm to forecast the onset of diabetes mellitus. In Proceedings of the Symposium on Computer Applications and Medical Care (pp. 261–265). IEEE Computer Society Press.

Examples

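A minimal sketch of inspecting the data, under two assumptions that this page does not confirm: that the SVMMaj package is installed and makes diabetes, diabetes.tr and diabetes.te available once attached, and that each of these objects has the components X and y described under Format. The variable sizes is introduced here for illustration only.

```r
# Assumption: SVMMaj is installed and exposes diabetes, diabetes.tr and
# diabetes.te, each with components X (data frame) and y (class labels).
if (requireNamespace("SVMMaj", quietly = TRUE)) {
  library(SVMMaj)

  summary(diabetes$X)   # per-attribute summary of the 8 predictors
  table(diabetes$y)     # counts of the Yes / No class labels

  # Row counts of the training and test parts
  sizes <- c(train = nrow(diabetes.tr$X), test = nrow(diabetes.te$X))
  print(sizes)
}
```

The requireNamespace() guard lets the snippet run without error when the package is not installed.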

SVMMaj documentation built on May 2, 2019, 9:58 a.m.