PimaIndiansDiabetes_long: Pima Indians Diabetes Dataset, long

PimaIndiansDiabetes_long    R Documentation

Pima Indians Diabetes Dataset, long

Description

The data set PimaIndiansDiabetes2 contains a corrected version of the original data set. While the UCI repository index claims that there are no missing values, closer inspection of the data shows several physical impossibilities, e.g., a blood pressure or body mass index of 0. In PimaIndiansDiabetes2, all zero values of glucose, pressure, triceps, insulin and mass have been set to NA; see also Wahba et al. (1995) and Ripley (1996). PimaIndiansDiabetes_long is derived from this corrected data set (see Details).
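The zero-to-NA correction described above can be sketched on a small, made-up data frame. The rows and the names toy, zero_is_missing and recoded are hypothetical and are not part of mlbench or spinifex; this only illustrates the idea and is not the code used to build PimaIndiansDiabetes2.

```r
# Hypothetical toy rows; values are illustrative, not taken from the real data.
toy <- data.frame(
  pregnant = c(6, 1, 0),
  glucose  = c(148, 0, 137),
  pressure = c(72, 66, 0),
  mass     = c(33.6, 26.6, 0)
)

# Columns where a zero is physically impossible and so marks a missing value.
zero_is_missing <- c("glucose", "pressure", "mass")

recoded <- toy
for (col in zero_is_missing) {
  recoded[[col]][recoded[[col]] == 0] <- NA
}

# Zero pregnancies is a legitimate value, so that column is left untouched.
print(recoded)
```

Only the listed columns are recoded, which is why a zero in pregnant survives while zeros in glucose, pressure and mass become NA.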

Usage

PimaIndiansDiabetes_long

Format

A data frame with 724 observations of 6 numeric variables and the target factor diabetes.

  • pregnant, Number of times pregnant

  • glucose, Plasma glucose concentration (glucose tolerance test)

  • pressure, Diastolic blood pressure (mm Hg)

  • mass, Body mass index (weight in kg / (height in m)^2)

  • pedigree, Diabetes pedigree function

  • age, Age (years)

  • diabetes, Class variable (test for diabetes), either "pos" or "neg"

Details

This is a cleaned subset of mlbench's PimaIndiansDiabetes2. See help(PimaIndiansDiabetes2, package = "mlbench").

Replicating this dataset:

require("mlbench")
data(PimaIndiansDiabetes2)

d <- PimaIndiansDiabetes2
d <- d[, c(1:3, 6:9)] ## Drop columns 4 and 5 (triceps, insulin), the 2 columns with the most NAs
d <- d[complete.cases(d), ] ## Remove the ~44 rows that still contain an NA
PimaIndiansDiabetes_long <- d
## save(PimaIndiansDiabetes_long, file = "./data/PimaIndiansDiabetes_long.rda")
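
As a rough check on the steps above, the per-column NA counts and the resulting dimensions can be inspected. This is a sketch that assumes the mlbench package is installed; the name kept is introduced here for illustration only.

```r
# Runs only when mlbench is available; otherwise the block is skipped.
if (requireNamespace("mlbench", quietly = TRUE)) {
  data("PimaIndiansDiabetes2", package = "mlbench")
  d <- PimaIndiansDiabetes2

  # NA count per column, largest first, to see which columns dominate the missingness.
  print(sort(colSums(is.na(d)), decreasing = TRUE))

  # Same subsetting as in the replication code: drop columns 4 and 5, keep complete rows.
  kept <- d[, c(1:3, 6:9)]
  kept <- kept[complete.cases(kept), ]

  # The Format section documents 724 observations of 6 numeric variables plus diabetes.
  print(dim(kept))
}
```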

Source

J.W. Smith, et al. 1988. Using the ADAP learning algorithm to forecast the onset of diabetes mellitus. In Proceedings of the Symposium on Computer Applications and Medical Care (pp. 261–265). IEEE Computer Society Press.

mlbench, R package. F. Leisch & E. Dimitriadou, 2021. mlbench: Machine Learning Benchmark Problems. https://CRAN.R-project.org/package=mlbench

Examples

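library(spinifex)
str(PimaIndiansDiabetes_long)
## Scale the 6 numeric variables; keep the class factor for color and shape
dat  <- scale_sd(PimaIndiansDiabetes_long[, 1:6])
clas <- PimaIndiansDiabetes_long$diabetes

## Start from a PCA basis and build a manual tour on the chosen manipulation variable
bas <- basis_pca(dat)
mv  <- manip_var_of(bas)
mt  <- manual_tour(bas, mv)

## Set up the tour plot, mapping class to point color and shape
ggt <- ggtour(mt, dat, angle = .2) +
  proto_default(aes_args = list(color = clas, shape = clas))

## Render the tour as an interactive plotly animation
animate_plotly(ggt)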


spinifex documentation built on May 29, 2024, 1:23 a.m.