This function calculates the variable importance (aka attribute usage) for C5.0 models.
object: an object of class C5.0
metric: either 'usage' or 'splits' (see Details below)
pct: a logical: should the importance values be converted to be between 0 and 100?
...: other options (not currently used)
By default, C5.0 measures predictor importance by determining the
percentage of training set samples that fall into all the terminal
nodes after the split (this is used when metric = "usage"). For
example, the predictor in the first split automatically has an
importance measurement of 100 percent. Other predictors may be used
frequently in splits, but if the terminal nodes cover only a handful
of training set samples, the importance scores may be close to
zero. The same strategy is applied to rule-based models as well as the
corresponding boosted versions of the model.
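To make the "usage" idea concrete, here is a minimal base R sketch on a made-up tree. It does not call C5.0 and is not the package's internal code; the `paths` list (the predictors tested on the way to each training case's terminal node) and the predictor names `x1` to `x4` are hypothetical.

```r
# Hypothetical toy tree fitted on 10 training cases.
# Each element lists the predictors tested along one case's path to its leaf.
paths <- c(
  rep(list(c("x1", "x2")), 5),        # 5 cases pass splits on x1 and x2
  rep(list(c("x1", "x2", "x3")), 1),  # 1 case also passes a split on x3
  rep(list(c("x1")), 4)               # 4 cases only pass the root split
)
predictors <- c("x1", "x2", "x3", "x4")  # x4 is never used in a split

# Usage = percentage of training cases whose path involves the predictor.
usage <- sapply(predictors, function(p) {
  n_used <- sum(vapply(paths, function(path) p %in% path, logical(1)))
  100 * n_used / length(paths)
})

# Same shape as the documented return value: one Overall column,
# predictors as row names, sorted from most to least used.
imp <- data.frame(Overall = usage, row.names = predictors)
imp <- imp[order(-imp$Overall), , drop = FALSE]
print(imp)
```

In this toy case the root predictor `x1` scores 100 because every case passes through the first split, while `x3` scores low even though it appears in the tree, because only one case reaches it.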
There is a difference in the attribute usage numbers between this output and the nominal command line output. Although the calculations are almost exactly the same (we do not add 1/2 to everything), the C code does not display that an attribute was used if the percentage of training samples covered by the corresponding splits is very low. Here, the threshold was lowered and the fractional usage is shown.
When metric = "splits", the percentage of splits associated
with each predictor is calculated.
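The "splits" calculation can be sketched in base R as a simple tally. This is an illustration under assumptions, not the package's implementation: `split_vars` (the predictor used at each split of a made-up model) and the names `x1` to `x4` are hypothetical.

```r
# Hypothetical list of the predictor used at each split in a model.
split_vars <- c("x1", "x2", "x1", "x3")
predictors <- c("x1", "x2", "x3", "x4")  # x4 never appears in a split

# Count splits per predictor, keeping unused predictors at zero.
counts <- table(factor(split_vars, levels = predictors))

# Convert counts to the percentage of all splits.
split_pct <- 100 * as.numeric(counts) / sum(counts)

imp_splits <- data.frame(Overall = split_pct, row.names = predictors)
imp_splits <- imp_splits[order(-imp_splits$Overall), , drop = FALSE]
print(imp_splits)
```

Unlike the usage values, these percentages sum to 100 across predictors, and a predictor's score does not depend on how many training cases reach its splits.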
A data frame with a column Overall containing the predictor usage values. The row names indicate the predictor.
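A short usage sketch follows. It assumes the C50 package is installed and uses the built-in iris data purely for illustration (this is not the data behind the output shown on this page); the object names `mod`, `imp_usage`, and `imp_splits` are arbitrary.

```r
# Run only if the C50 package is available (assumption: it may not be installed).
if (requireNamespace("C50", quietly = TRUE)) {
  library(C50)

  # Fit a single C5.0 tree on the iris data (illustrative choice of data).
  mod <- C5.0(Species ~ ., data = iris)

  # Default metric: percentage of training samples covered by the splits.
  imp_usage <- C5imp(mod, metric = "usage")

  # Alternative metric: percentage of splits that use each predictor.
  imp_splits <- C5imp(mod, metric = "splits")

  # Both results are data frames with one column, Overall,
  # and the predictor names as row names.
  print(imp_usage)
  print(imp_splits)
}
```

The exact values depend on the fitted model, so no expected output is shown here.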
Original GPL C code by Ross Quinlan, R code and modifications to C by Max Kuhn, Steve Weston and Nathan Coulter
Quinlan R (1993). C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers, http://www.rulequest.com/see5-unix.html
                              Overall
total_day_minutes              100.00
number_customer_service_calls   93.67
international_plan              87.73
total_eve_charge                20.73
voice_mail_plan                  8.97
total_intl_calls                 8.01
total_intl_minutes               6.48
total_night_minutes              6.33
total_eve_minutes                4.74
total_eve_calls                  0.57
account_length                   0.18
total_day_charge                 0.18
state                            0.00
area_code                        0.00
number_vmail_messages            0.00
total_day_calls                  0.00
total_night_calls                0.00
total_night_charge               0.00
total_intl_charge                0.00

                                Overall
total_day_minutes             26.923077
total_eve_charge              15.384615
total_night_minutes           15.384615
international_plan             7.692308
voice_mail_plan                7.692308
account_length                 3.846154
number_customer_service_calls  3.846154
total_day_charge               3.846154
total_eve_calls                3.846154
total_eve_minutes              3.846154
total_intl_calls               3.846154
total_intl_minutes             3.846154
state                          0.000000
area_code                      0.000000
number_vmail_messages          0.000000
total_day_calls                0.000000
total_night_calls              0.000000
total_night_charge             0.000000
total_intl_charge              0.000000