seeds: seeds Data Set

seeds    R Documentation

seeds Data Set

Description

Measurements of geometrical properties of kernels belonging to three different varieties of wheat. A soft X-ray technique and the GRAINS package were used to construct all seven real-valued attributes.

Format

A data frame with 209 rows, 7 covariates, and 1 response variable.

Details

The variables listed below, from left to right, are:

  • area A

  • perimeter P

  • compactness C = 4*pi*A/P^2

  • length of kernel

  • width of kernel

  • asymmetry coefficient

  • length of kernel groove

  • varieties of wheat (1, 2, 3 for Kama, Rosa, and Canadian, respectively)
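The compactness definition above can be checked with a small worked example. The sketch below uses base R only; the function name `compactness` and the test shapes are hypothetical illustrations and are not part of the data set or of the ODRF package. Under this definition a circle should give a value of 1 (up to floating-point error), and any other shape a smaller value.

```r
# Hypothetical helper (not part of ODRF): compactness C = 4*pi*A/P^2
compactness <- function(area, perimeter) {
  4 * pi * area / perimeter^2
}

# Circle of radius r: A = pi*r^2, P = 2*pi*r, so C should equal 1
r <- 2
c_circle <- compactness(pi * r^2, 2 * pi * r)

# Square of side s: A = s^2, P = 4*s, so C should equal pi/4
s <- 3
c_square <- compactness(s^2, 4 * s)

print(c(circle = c_circle, square = c_square))
```

Because the measure depends only on the ratio A/P^2, it is unchanged when a shape is rescaled, which is why it can be compared across kernels of different sizes.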

Source

https://archive.ics.uci.edu/ml/datasets/seeds

References

M. Charytanowicz, J. Niewczas, P. Kulczycki, P.A. Kowalski, S. Lukasik, S. Zak, 'A Complete Gradient Clustering Algorithm for Features Analysis of X-ray Images', in: Information Technologies in Biomedicine, Ewa Pietka, Jacek Kawa (eds.), Springer-Verlag, Berlin-Heidelberg, 2010, pp. 15-24.

See Also

body_fat, breast_cancer

Examples

library(ODRF)
data(seeds)
set.seed(221212)
train <- sample(1:209, 80)
train_data <- data.frame(seeds[train, ])
test_data <- data.frame(seeds[-train, ])

forest <- ODRF(varieties_of_wheat ~ ., train_data,
  split = "gini", parallel = FALSE, ntrees = 50
)
pred <- predict(forest, test_data[, -8])
# classification error
(mean(pred != test_data[, 8]))

tree <- ODT(varieties_of_wheat ~ ., train_data, split = "gini")
pred <- predict(tree, test_data[, -8])
# classification error
(mean(pred != test_data[, 8]))
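The classification error computed above is a single number. As a rough sketch of how it relates to a per-class breakdown, the base R snippet below builds a confusion table from two label vectors. The vectors `actual` and `predicted` are made-up values for illustration only; they are not results from the seeds data or from any ODRF model.

```r
# Hypothetical labels for illustration (not taken from the seeds data)
actual    <- factor(c(1, 1, 2, 2, 3, 3, 3, 1), levels = 1:3)
predicted <- factor(c(1, 2, 2, 2, 3, 3, 1, 1), levels = 1:3)

# Overall classification error: share of mismatched labels
err <- mean(predicted != actual)

# Confusion table: rows are predicted classes, columns are actual classes
conf_mat <- table(predicted = predicted, actual = actual)

# Share of each actual class that was predicted correctly
per_class_recall <- diag(conf_mat) / colSums(conf_mat)

print(conf_mat)
print(err)
print(per_class_recall)
```

With real predictions, the same calls could be applied to `pred` and `test_data[, 8]`, assuming `pred` holds class labels on the same coding as the response column.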

ODRF documentation built on May 31, 2023, 8:22 p.m.