caravan: Caravan insurance data set

caravan R Documentation

Caravan insurance data set

Description

The caravan data set contains 5822 customer records from an insurance company, each described by 86 variables. These comprise 43 sociodemographic features derived from zip codes and 43 product-ownership variables, the last of which, Purchase, indicates whether a customer bought a caravan insurance policy. The data were collected for the CoIL 2000 Challenge, which posed the question: Can you predict who would be interested in buying a caravan insurance policy and explain why? Further details on the individual variables are available at http://www.liacs.nl/~putten/library/cc2000/data.html.

Usage

data(caravan)

Format

A data frame with 5822 observations (rows) and 86 features (columns).

Source

The data was supplied by Sentient Machine Research: https://www.smr.nl

References

P. van der Putten and M. van Someren (eds.). CoIL Challenge 2000: The Insurance Company Case. Published by Sentient Machine Research, Amsterdam. Also a Leiden Institute of Advanced Computer Science Technical Report 2000-09. June 22, 2000. http://www.liacs.nl/~putten/library/cc2000.

James, G., Witten, D., Hastie, T., and Tibshirani, R. (2013). An Introduction to Statistical Learning with Applications in R. Springer-Verlag. https://www.statlearning.com

See Also

adult, risk, churn, churnTel, bank, advertising, marketing, insurance, cereal, housePrice, house

Examples

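library(liver)  # caravan ships with the liver package
data(caravan)

# Compact overview of the variables and their types
str(caravan)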
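
Because the data set was built around predicting Purchase, a natural first check is how the outcome is distributed. The following is a minimal sketch, assuming the liver package is installed; it only inspects the dimensions stated in the Format section and tabulates the Purchase column, and the object name purchase_counts is introduced here purely for illustration.

```r
# Sketch only: assumes the liver package (which provides caravan) is installed.
if (requireNamespace("liver", quietly = TRUE)) {
  data(caravan, package = "liver")

  # Dimensions; the Format section states 5822 rows and 86 columns
  print(dim(caravan))

  # Counts of each Purchase outcome, then the same as proportions
  purchase_counts <- table(caravan$Purchase)
  print(purchase_counts)
  print(round(prop.table(purchase_counts), 3))
}
```

If one outcome is far rarer than the other, plain accuracy can be a misleading measure for any model fitted to this data, so it may be worth looking at these proportions before modelling.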

liver documentation built on Sept. 9, 2025, 5:49 p.m.