# Load the rchallenge package library(rchallenge) # General settings data_dir = "@DATA_DIR@" # directory of the data submissions_dir = "@SUBMISSIONS_DIR@" # directory of the submissions hist_dir = "@HIST_DIR@" # directory where the submissions history is stored email = "@EMAIL@" # email address of the challenge administrator date_start = "@DATE_START@" # starting date of the challenge deadline = as.POSIXct("@DEADLINE@") # deadline of the challenge baseline = "@BASELINE@" # name of the team considered as the baseline # Load the data load(file.path(data_dir, "data_train.rda")) load(file.path(data_dir, "data_test.rda")) load(file.path(data_dir, "y_test.rda")) load(file.path(data_dir, "ind_quiz.rda")) # Toggle test or quiz dataset for performance evaluation test_or_quiz <- ifelse(Sys.time()<deadline, "quiz", "test") # comment to disable quiz # test_or_quiz <- "test" # uncomment to disable quiz # Define custom functions that compute performance criterions error_rate <- function(y_pred, y_test) { FP = (y_pred == "Good") & (y_test == "Bad") FN = (y_pred == "Bad") & (y_test == "Good") return(sum(FP+FN)/length(y_test)) } average_cost <- function(y_pred, y_test) { FP = (y_pred == "Good") & (y_test == "Bad") FN = (y_pred == "Bad") & (y_test == "Good") return(sum(5*FP+FN)/length(y_test)) } metrics = list(error = error_rate, cost = average_cost) # Define custom function that reads a submission file and throws errors or # warnings if it is not valid. read_pred <- function(file, n = length(y_test)) { y_pred <- scan(file, what = "character") y_pred <- factor(y_pred, levels = c("Bad", "Good")) if (length(y_pred) != n) stop("incorrect number of predictions") if (any(is.na(y_pred))) stop("predictions contain missing values (NA)") return(y_pred) } # Store new submission files. read_err = store_new_submissions(submissions_dir, hist_dir, deadline = deadline, valid_fun = read_pred) # Compute metrics of the submissions in the history. history = compute_metrics(hist_dir, metrics, y_test, ind_quiz, read_fun = read_pred) # Load the current best submissions results if (file.exists(file.path(hist_dir, "best.rda"))) { load(file.path(hist_dir, "best.rda")) } else best = list() n_team = 0 n_submissions = 0 # Update the current best submissions results if (length(history)>0) { # compute one leaderboard per metric # in both cases, the other metric is used to break ties best_error = get_best(history, metrics=c("error", "cost"), test_name=test_or_quiz) best_cost = get_best(history, metrics=c("cost", "error"), test_name=test_or_quiz) if ("error" %in% names(best)) best_error = update_rank_diff(best_error, best$error) if ("cost" %in% names(best)) best_cost = update_rank_diff(best_cost, best$cost) best$error = best_error best$cost = best_cost # save best save(best, file = file.path(hist_dir, "best.rda")) # get stats n_team = sum(best[[1]]$team != baseline) n_submissions = sum(best[[1]]$n_submissions[best[[1]]$team != baseline]) }
Welcome to the challenge webpage!
| r icon("fa-cloud-upload fa-2x")
\ r n_submissions
| r icon("fa-users fa-2x")
\ r n_team
| r icon("fa-calendar fa-2x")
\ r countdown(deadline)
|
|:-------------:|:-------------:|:-------------:|
| submissions | teams | r ifelse(Sys.time()<deadline, "active", "completed")
|
Last update: r last_update(deadline)
r date_start
:
~ The challenge is open!
colnames = c(paste(icon(c("fa-trophy", "fa-users", "fa-cloud-upload", "fa-calendar", "fa-bullseye", "fa-bullseye")), c("Rank", "Team", "Submissions", "Date", "Error rate", "Average cost"))) if ("error" %in% names(best)) print_leaderboard(best[["error"]], c("error", "cost"), test_name=test_or_quiz, col.names=colnames, caption="Best error rate")
colnames[5:6] = colnames[6:5] if ("cost" %in% names(best)) print_leaderboard(best[["cost"]], c("cost", "error"), test_name=test_or_quiz, col.names=colnames, caption="Best average cost")
Binary classification: predict the creditworthiness or default risk of a set of customers for the purpose of credit allowance.
To this aim, we dispose of a supervised training set: a set of customers whose answer is known. The goal is to get the highest prediction score on a test set whose answer is hidden.
Send an email to <r email
> with the following information:
r icon("fa-dropbox fa-lg")
You will receive an invitation to a shared Dropbox folder named after your team.
Download the data.
Submit test prediction files in csv format in the shared Dropbox folder.
Note: The number of submissions is not limited. However the scores are updated every hour.
submissions deadline: r format(deadline, "%A %d %b %Y %H:%M", usetz=TRUE)
| Name | File | Description | Links |
| ---- | ---- | ----------- | ----- |
| Training set | data_train.rda
| data.frame
with r nrow(data_train)
rows/customers and r ncol(data_train)
columns/variables | r icon("fa-download fa-lg")
|
| Test set | data_test.rda
| data.frame
with r nrow(data_test)
rows/customers and r ncol(data_test)
columns/variables | r icon("fa-download fa-lg")
|
Load the files in R with:
load("data_train.rda") load("data_test.rda")
The variable to predict is variable Class
which takes values Bad
or Good
. Training values of this variable are provided in the last column of data_train
. data_test
does not contain this variable as it must be predicted.
The full dataset contains 30% of Bad
and 70% of Good
. This proportion is kept in the training set as well as the test set.
table(data_train$Class)/nrow(data_train)
For the prediction task, we have r ncol(data_test)
input variables with:
r sum(sapply(data_test[1,], is.numeric))
quantitative variables of class numeric
r sum(sapply(data_test[1,], is.factor))
qualitative variables of class factor
str(data_test)
Univariate statistics of these variables can easily be obtained with:
summary(rbind(data_train[,-ncol(data_train)], data_test))
A classifier is a function that assigns Bad
or Good
to each sample in the test set. Such a function can be:
predict_all_good <- function(data_test, ...) { y_pred = rep("Good", nrow(data_test)) return(y_pred) }
that assigns Good
to every sample. Such a classifier does not use the training set. It corresponds to accepting every credit application.
We obtain the following result
y_pred = predict_all_good(data_test)
Your goal is to program one or several classifiers using the training data to improve the perfomance of such a default decision.
Your prediction performance is calculated based on real answers to the test set. We use two different criteria.
The error rate measures the misclassification rate of your predictions, i.e. the number of false positives FP
plus the number of false negativesFN
divided by the total number. It is measured by the error_rate
function.
dump("error_rate", "")
This performance metric is the 0-1 cost averaged over all predictions. The goal is to minimize the error rate. Since 70% of individuals are Good
, the error rate associated withpredict_all_good
predictor is 0.3, while its counterpart predict_all_bad
provides 0.7. Here predict_all_good
is preferable.
However, it is considered 5 times more risky/expensive to give credit to insolvent person (false positive) than not to grant credit to a creditworthy person (false negative). The average cost is measured by the average_cost
function.
dump("average_cost", "")
The goal is to minimize the average cost predict_all_bad
here better with 0.7 topredict_all_good
with 1.5. From the average cost point of view, it is safer not to grant credit.
The submissions consist in text files with the extension .csv
, that you can export with the following command:
write(y_pred, file = "my_pred.csv")
The file must contain r nrow(data_test)
lines containing with one of the words Bad
or Good
.
All .csv
files in your shared Dropbox folder will be automatically imported by theread_pred
function.
dump("read_pred", "")
Use this function to check that your file is correctly imported.
Read errors during the import are displayed at the [Read errors] section.
Once a file is imported, the score is calculated and stored. You can delete or replace submission files, the history is kept.
The ranking and scores displayed are calculated on only half of the test data. The final scores computed on the whole test set will be revealed at the end of the challenge.
Only the highest score by team among all submissions is retained.
The team r baseline
corresponds to the score of the best classifier among predict_all_good
or predict_all_bad
, in lieu of reference to improve.
# color palette Dark2 <- colorRampPalette(c("#1B9E77", "#D95F02", "#7570B3", "#E7298A", "#66A61E", "#E6AB02", "#A6761D", "#666666")) palette(Dark2(max(length(history)-1, 3))) par(mar = c(5,4,4,7) + 0.1) plot_history(history, "error", test_name=test_or_quiz, baseline=baseline, ylab = "Error rate")
par(mar = c(5,4,4,7) + 0.1) plot_history(history, "cost", test_name=test_or_quiz, baseline=baseline, ylab = "Average cost")
par(mar = c(5,4,4,7) + 0.1) plot_activity(history, baseline=baseline)
colnames = c(paste(icon(c("fa-users", "fa-file", "fa-comment")), c("Team", "File", "Message"))) print_readerr(read_err)
Powered by rchallenge.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.