churn: Customer Churn Data


Description

A data set from the MLC++ machine learning software for modeling customer churn. There are 19 predictors, mostly numeric: state (categorical), account_length, area_code, international_plan (yes/no), voice_mail_plan (yes/no), number_vmail_messages, total_day_minutes, total_day_calls, total_day_charge, total_eve_minutes, total_eve_calls, total_eve_charge, total_night_minutes, total_night_calls, total_night_charge, total_intl_minutes, total_intl_calls, total_intl_charge, and number_customer_service_calls.

Details

The outcome is contained in a column called churn (also yes/no).

The training data has 3333 samples and the test set contains 1667.
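
The column layout and split sizes described above can be written down and checked directly. This is a minimal Python sketch, not part of the C50 package: the names PREDICTORS, OUTCOME, N_TRAIN, N_TEST, and missing_columns are hypothetical, and only the column names and counts are taken from this page.

```python
# Column names as listed in the Description; the Python names are illustrative only.
PREDICTORS = [
    "state", "account_length", "area_code",
    "international_plan", "voice_mail_plan", "number_vmail_messages",
    "total_day_minutes", "total_day_calls", "total_day_charge",
    "total_eve_minutes", "total_eve_calls", "total_eve_charge",
    "total_night_minutes", "total_night_calls", "total_night_charge",
    "total_intl_minutes", "total_intl_calls", "total_intl_charge",
    "number_customer_service_calls",
]
OUTCOME = "churn"  # yes/no, per the Details section

# Split sizes quoted in the Details section.
N_TRAIN, N_TEST = 3333, 1667


def missing_columns(record):
    """Return the expected column names that are absent from a dict-like record."""
    return [name for name in PREDICTORS + [OUTCOME] if name not in record]


print(len(PREDICTORS))                          # number of predictors
print(N_TRAIN + N_TEST)                         # total number of samples
print(round(N_TRAIN / (N_TRAIN + N_TEST), 3))   # share of samples used for training
```

A helper such as missing_columns can be used to confirm that a record exported from churnTrain or churnTest carries every documented column before modeling.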

A note in one of the source files states that the data are "artificial based on claims similar to real world".

A rule-based model shown on the RuleQuest website contains 19 rules, including:

Rule 1: (60, lift 6.8)
        international_plan = yes
        total_intl_calls <= 2
        ->  class yes  [0.984]

Rule 5: (43/2, lift 6.4)
        international_plan = no
        voice_mail_plan = no
        total_day_minutes > 246.6
        total_eve_charge > 20.5
        ->  class yes  [0.933]

Rule 10: (211/84, lift 4.1)
        total_day_minutes > 264.4
        ->  class yes  [0.601]
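
The counts and bracketed confidences in these rules can be cross-checked. As described in the RuleQuest See5 tutorial, (n/m) gives the number of training cases a rule covers and how many of them do not belong to the predicted class, and the bracketed value is the Laplace ratio (n - m + 1) / (n + 2). The Python sketch below recomputes that ratio for the three rules shown and encodes Rule 5 as a predicate. The function names and the example record are hypothetical and are not part of C50.

```python
def laplace_confidence(n, m=0):
    """Laplace ratio (n - m + 1) / (n + 2) for a rule covering n cases, m of them misclassified."""
    return (n - m + 1) / (n + 2)


# (n, m, reported confidence) copied from the rules above.
RULES = {
    "Rule 1": (60, 0, 0.984),
    "Rule 5": (43, 2, 0.933),
    "Rule 10": (211, 84, 0.601),
}

for name, (n, m, reported) in RULES.items():
    estimate = laplace_confidence(n, m)
    # Each recomputed value agrees with the reported one to three decimals.
    assert abs(estimate - reported) < 0.0005, name
    print(f"{name}: {estimate:.3f}")


def rule_5_fires(record):
    """True when a dict-like record satisfies all four conditions of Rule 5."""
    return (
        record["international_plan"] == "no"
        and record["voice_mail_plan"] == "no"
        and record["total_day_minutes"] > 246.6
        and record["total_eve_charge"] > 20.5
    )


# Made-up record for illustration; not a row from the data set.
example = {
    "international_plan": "no",
    "voice_mail_plan": "no",
    "total_day_minutes": 250.0,
    "total_eve_charge": 21.0,
}
print(rule_5_fires(example))
```

The same tutorial describes lift as the estimated accuracy divided by the relative frequency of the predicted class in the training data. Under that reading, 0.984 / 6.8 suggests that roughly 14 to 15 percent of the training cases are churners.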

Value

churnTrain

The training set.

churnTest

The test set.

Source

http://www.sgi.com/tech/mlc/, http://www.rulequest.com/see5-examples.html


C50 documentation built on Dec. 2, 2017, 1:04 a.m.