fifa: FIFA 20 preprocessed data

The fifa dataset is a preprocessed version of players_20.csv, which is part of the "FIFA 20 complete player dataset" on Kaggle.




a data frame with 5000 rows, 42 columns and rownames


It contains the 5000 best players by 'overall' rating and 43 variables: 42 columns plus the player name (short_name), which is stored as row names.

It is advised to leave only one target variable for modeling.
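One way to follow this advice is sketched below. The helper keep_one_target() is hypothetical and not part of any package, and the set of candidate target columns is passed in by the caller, because this page does not say which columns count as targets.

```r
# Hypothetical helper (an assumption, not part of the dataset's package):
# keep `target` and drop every other column listed in `candidates`.
# The caller decides which columns are candidate targets.
keep_one_target <- function(data, target, candidates) {
  stopifnot(target %in% candidates,
            all(candidates %in% colnames(data)))
  to_drop <- setdiff(candidates, target)
  # drop = FALSE keeps a data frame even if a single column remains
  data[, !(colnames(data) %in% to_drop), drop = FALSE]
}

# Toy illustration with made-up values (not real FIFA data)
toy <- data.frame(value_eur = c(10, 20, 30),
                  overall   = c(70, 80, 90),
                  age       = c(21, 25, 30))
toy_model <- keep_one_target(toy, target = "value_eur",
                             candidates = c("value_eur", "overall"))
```

Row names are preserved by this kind of column subsetting, so player names would stay attached to the rows.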


All transformations:

  1. take 43 columns: [3, 5, 7:9, 11:14, 45:78] (R indexing)

  2. take rows with value_eur > 0

  3. convert short_name to ASCII

  4. remove rows with duplicated short_name (keep first)

  5. sort rows by overall in descending order and take the top 5000

  6. set short_name column as rownames

  7. transform nationality to factor

  8. reorder columns
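The steps above can be sketched in base R roughly as follows. This is a reconstruction under stated assumptions, not the original preprocessing script: preprocess_fifa() is a hypothetical name, iconv() with "ASCII//TRANSLIT" stands in for the unspecified ASCII conversion (its output can differ between platforms), and the final column order of step 8 is not documented here, so that step is left out.

```r
# Hypothetical sketch of the documented transformations.
# `players` is the raw players_20.csv table read into a data frame.
preprocess_fifa <- function(players,
                            cols = c(3, 5, 7:9, 11:14, 45:78),
                            n_top = 5000) {
  x <- players[, cols]                            # 1. keep the selected columns
  x <- x[x$value_eur > 0, ]                       # 2. rows with value_eur > 0
  # 3. ASCII names; the conversion method is an assumption
  x$short_name <- iconv(x$short_name, to = "ASCII//TRANSLIT")
  x <- x[!duplicated(x$short_name), ]             # 4. keep first duplicate
  x <- x[order(x$overall, decreasing = TRUE), ]   # 5. best players first
  x <- head(x, n_top)
  rownames(x) <- x$short_name                     # 6. names become row names
  x$short_name <- NULL
  x$nationality <- factor(x$nationality)          # 7. nationality as factor
  x                                               # 8. reordering omitted
}

# Toy illustration with made-up values; real use would pass the full CSV,
# e.g. preprocess_fifa(read.csv("players_20.csv", stringsAsFactors = FALSE))
toy_players <- data.frame(
  short_name  = c("A. One", "B. Two", "A. One", "C. Three", "D. Four"),
  overall     = c(80, 90, 70, 85, 95),
  value_eur   = c(10, 20, 30, 40, 0),
  nationality = c("X", "Y", "X", "Z", "Y"),
  stringsAsFactors = FALSE
)
toy_fifa <- preprocess_fifa(toy_players, cols = 1:4, n_top = 2)
```

Because short_name is moved to the row names in step 6, the 43 selected columns leave 42 columns in the result, which matches the format described for the dataset.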


The players_20.csv dataset was downloaded from the Kaggle site and went through a few transformations. The complete dataset was obtained from Kaggle on January 1, 2020.

