Datasets dragons and dragons_test are artificial, generated from the same ground truth model, but sometimes with different data distributions.




a data frame with 2000 rows and 8 columns


Values are generated so that the data have nonlinearity in year_of_birth and height, and so that the test set exhibits concept drift.

  • year_of_birth - year in which the dragon was born. A negative year means a year BC, e.g. -1200 = 1201 BC.

  • year_of_discovery - year in which the dragon was found.

  • height - height of the dragon in yards.

  • weight - weight of the dragon in tons.

  • scars - number of scars.

  • colour - colour of the dragon.

  • number_of_lost_teeth - number of teeth that the dragon lost.

  • life_length - life length of the dragon.
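
The sign convention for year_of_birth can be illustrated with a small sketch. The helper name format_year is hypothetical and not part of the package. It assumes that the offset in the example (-1200 = 1201 BC) holds for every non-positive year, so that year 0 would mean 1 BC; the source states only the single example.

```python
def format_year(year: int) -> str:
    """Render a signed year as a human-readable label.

    Assumption: a non-positive year y corresponds to (1 - y) BC,
    which reproduces the documented example -1200 = 1201 BC.
    Treating year 0 as 1 BC is an extrapolation of that example.
    """
    if year > 0:
        return f"{year} AD"
    return f"{1 - year} BC"


print(format_year(-1200))  # 1201 BC
print(format_year(850))    # 850 AD
```

The same offset matters when computing differences such as year_of_discovery minus year_of_birth: with this convention, plain subtraction of the signed values gives the elapsed number of years directly.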

