This is Lab 13 on Data Mining.
First we load the datamining
package and the data for the lab.
devtools::load_all()
load('Credit.rda')
x.tr <- x[i.tr,]
x.te <- x[i.te,]
attr(x.te, "terms") = NULL
nx.tr <- make_numeric_data_frame(x.tr, "Class", warn=FALSE)
nx.te <- make_numeric_data_frame(x.te, "Class", warn=FALSE)
fmla = Class ~ Checking.200 + CheckingNone + Months + History +
  PurposeUsed.car + PurposeOther + PurposeFurniture.equipment +
  PurposeRadio.TV + PurposeRepairs + PurposeBusiness +
  StatusMale.single + StatusMale.married + GuarantorCo.applicant +
  Other.plansStores + Foreign
truth   Bad  Good
Bad       0     5
Good      1     0
What is the average cost (total cost divided by the test set size) of a classifier which always reports
“Good”? What is the average cost of a classifier which always reports “Bad”? The minimum of
these two is the baseline cost.
11. To minimize cost, the bank should say “Good” only when the probability of “Good” exceeds
5/6. For each of the four classifiers, compute the average cost of this policy on the test set.
Which are below baseline?
12. You can now get checked off. Save all of your results for the homework.
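The average-cost questions above can be sketched in a few lines of base R. This is only an illustration under stated assumptions, not the lab's solution. It reads the cost matrix with rows as the truth and columns as the reported class, which is the orientation consistent with the 5/6 threshold: if p is the probability of “Good”, saying “Good” has expected cost 5(1 - p) and saying “Bad” has expected cost p, so “Good” is cheaper only when p > 5/6. The helper average_cost, the toy vector truth, and the probabilities p.good are hypothetical stand-ins. In the lab, the true classes and the predicted probabilities would come from the test set and from each of the four classifiers.

```r
# Cost matrix from the lab. Assumed orientation: rows = truth,
# columns = class reported by the classifier.
cost <- matrix(c(0, 1, 5, 0), nrow = 2,
               dimnames = list(truth = c("Bad", "Good"),
                               response = c("Bad", "Good")))

# Hypothetical helper: average cost = total cost of the
# (truth, response) pairs divided by the number of cases.
average_cost <- function(truth, response, cost) {
  mean(cost[cbind(as.character(truth), as.character(response))])
}

# Hypothetical toy test set (NOT the Credit data): true classes and a
# made-up predicted probability of "Good" for each case.
truth  <- c("Good", "Good", "Bad", "Good", "Bad", "Good")
p.good <- c(0.95, 0.70, 0.60, 0.90, 0.20, 0.85)

# Constant classifiers and the baseline cost.
cost.good <- average_cost(truth, rep("Good", length(truth)), cost)
cost.bad  <- average_cost(truth, rep("Bad", length(truth)), cost)
baseline  <- min(cost.good, cost.bad)

# Policy from question 11: report "Good" only when P(Good) > 5/6.
response    <- ifelse(p.good > 5/6, "Good", "Bad")
cost.policy <- average_cost(truth, response, cost)

print(c(always.good = cost.good, always.bad = cost.bad,
        baseline = baseline, policy = cost.policy))
```

Indexing the cost matrix with a two-column character matrix looks up one (truth, response) cell per case, so the same helper works for any classifier once its responses are expressed as “Good” or “Bad”.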
Logistic regression
The logistic function is similar to tree:
fit = logistic(