clean | R Documentation |
Cleans the dataset and returns a data.frame ready to be passed to the welofit
function.
clean(x, MNM = 10, MRANK = 500)
x |
Data to be cleaned. It must be a data.frame coming from http://www.tennis-data.co.uk/. |
MNM |
Optional. Minimum number of matches a player must have played to be included in the cleaned dataset. Defaults to 10, meaning that matches of players with fewer than 10 matches are dropped |
MRANK |
Optional. Maximum rank of the players to consider. Defaults to 500, meaning that all matches involving a player ranked below 500th (rank greater than 500) are dropped |
The cleaning operations are:
Remove all the uncompleted matches;
Remove all the matches with NAs in the B365 odds;
Remove all the matches with NAs in the variable "ranking";
Remove all the matches with NAs in the variable "games";
Remove all the matches with NAs in the variable "sets";
Remove all the matches where the B365 odds are equal;
Define players i and j and their outcomes (Y_i and Y_j);
Remove all the matches of players who played fewer than MNM matches;
Remove all the matches of players with rank greater than MRANK;
Sort the matches by date.
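The filtering steps above can be illustrated with a minimal base R sketch. This is not the package's implementation: the function name sketch_clean and the toy data are hypothetical, and the column names (Comment, WRank, LRank, B365W, B365L, Winner, Loser, Date) are assumed to follow the layout of the tennis-data.co.uk files. The NA checks on "games" and "sets" and the definition of players i and j are omitted for brevity.

```r
# Toy data; column names are ASSUMED to match tennis-data.co.uk files.
matches <- data.frame(
  Date    = as.Date(c("2019-03-02", "2019-01-05", "2019-02-10",
                      "2019-01-20", "2019-04-01")),
  Winner  = c("A", "B", "A", "C", "A"),
  Loser   = c("B", "A", "C", "D", "B"),
  WRank   = c(5, 12, 5, 40, 5),
  LRank   = c(12, 5, 40, 900, 12),
  B365W   = c(1.4, 2.1, 1.2, 1.5, NA),
  B365L   = c(2.9, 1.7, 4.5, 2.5, 3.0),
  Comment = c("Completed", "Completed", "Retired", "Completed", "Completed"),
  stringsAsFactors = FALSE
)

# Hypothetical helper mirroring the documented steps (not welo's own code).
sketch_clean <- function(x, MNM = 10, MRANK = 500) {
  # Keep completed matches only
  x <- x[x$Comment %in% "Completed", ]
  # Drop matches with NA odds or NA rankings
  x <- x[complete.cases(x[, c("B365W", "B365L", "WRank", "LRank")]), ]
  # Drop matches where the two B365 odds are equal
  x <- x[x$B365W != x$B365L, ]
  # Count matches per player (as winner or loser), keep those with >= MNM
  n_played <- table(c(x$Winner, x$Loser))
  keep <- as.vector(n_played[x$Winner] >= MNM & n_played[x$Loser] >= MNM)
  x <- x[keep, ]
  # Drop matches involving a player with rank greater than MRANK
  x <- x[x$WRank <= MRANK & x$LRank <= MRANK, ]
  # Sort the remaining matches by date
  x[order(x$Date), ]
}

res <- sketch_clean(matches, MNM = 2, MRANK = 500)
```

Here MNM is lowered to 2 only because the toy data is tiny. With real data the defaults would normally apply.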
The cleaned data.frame.
data(atp_2019)
db_clean <- clean(atp_2019)
str(db_clean)