fifa: FIFA 20 preprocessed data

fifa    R Documentation

FIFA 20 preprocessed data


The fifa dataset is a preprocessed version of players_20.csv, which is part of the "FIFA 20 complete player dataset" on Kaggle.




a data frame with 5000 rows and 42 columns; the player names (short_name) are stored as rownames


It contains the 5000 best players by 'overall' rating and 43 variables (42 columns plus the rownames). These are:

  • short_name (rownames)

  • nationality of the player (not used in modeling)

  • overall, potential, value_eur, wage_eur (4 potential target variables)

  • age, height, weight, attacking skills, defending skills, goalkeeping skills (37 variables)

It is advised to leave only one target variable for modeling.
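One way to follow this advice is sketched below. The helper keep_one_target() is hypothetical (it is not part of DALEX), and the toy data frame only mimics the column names listed above; its values are made up. Dropping nationality as well is an assumption based on the note that it is not used in modeling.

```r
# Hypothetical helper: keep one target variable and drop the other candidates.
# 'drop_extra' lists further columns to remove (nationality is not used in modeling).
keep_one_target <- function(df, target,
                            targets = c("overall", "potential", "value_eur", "wage_eur"),
                            drop_extra = "nationality") {
  stopifnot(target %in% targets)
  drop <- c(setdiff(targets, target), drop_extra)
  df[, !(names(df) %in% drop), drop = FALSE]
}

# Toy stand-in that reuses the column names of fifa (values are invented)
toy <- data.frame(
  nationality = factor(c("A", "B")),
  overall     = c(90, 85),
  potential   = c(92, 88),
  value_eur   = c(1e7, 5e6),
  wage_eur    = c(2e5, 1e5),
  age         = c(30, 24),
  row.names   = c("Player One", "Player Two")
)

toy_value <- keep_one_target(toy, "value_eur")
names(toy_value)
```

With DALEX installed, the same call should work on the real data, for example keep_one_target(DALEX::fifa, "value_eur").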


All transformations:

  1. take 43 columns: [3, 5, 7:9, 11:14, 45:78] (R indexing)

  2. take rows with value_eur > 0

  3. convert short_name to ASCII

  4. remove rows with duplicated short_name (keep first)

  5. sort rows by overall in descending order and take the top 5000

  6. set short_name column as rownames

  7. transform nationality to factor

  8. reorder columns
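The steps above can be sketched in base R roughly as follows. This is an illustration under stated assumptions, not the code actually used to build the dataset: the function name preprocess_players() is hypothetical, iconv() with "ASCII//TRANSLIT" is one possible way to do the ASCII conversion (its result can differ between platforms), and the final column order is a guess that follows the variable listing above.

```r
# Hypothetical re-implementation of the eight listed steps (base R only).
preprocess_players <- function(raw,
                               cols  = c(3, 5, 7:9, 11:14, 45:78),  # 43 columns
                               n_top = 5000) {
  df <- raw[, cols, drop = FALSE]                          # 1. take the columns
  df <- df[which(df$value_eur > 0), , drop = FALSE]        # 2. value_eur > 0
  df$short_name <- iconv(as.character(df$short_name),      # 3. convert to ASCII
                         to = "ASCII//TRANSLIT")           #    (assumed method)
  df <- df[!duplicated(df$short_name), , drop = FALSE]     # 4. keep first duplicate
  df <- df[order(df$overall, decreasing = TRUE), , drop = FALSE]
  df <- head(df, n_top)                                    # 5. top n_top by overall
  rownames(df) <- df$short_name                            # 6. names become rownames
  df$short_name <- NULL
  df$nationality <- factor(df$nationality)                 # 7. nationality as factor
  front <- c("nationality", "overall", "potential", "value_eur", "wage_eur")
  df[, c(front, setdiff(names(df), front)), drop = FALSE]  # 8. reorder (assumed order)
}

# Small made-up input to exercise the function; 'cols' is overridden to match it
toy_raw <- data.frame(
  id          = 1:5,
  short_name  = c("A", "B", "A", "C", "D"),
  age         = c(30, 25, 22, 33, 28),
  nationality = c("X", "Y", "X", "Z", "Y"),
  overall     = c(80, 90, 95, 70, 85),
  potential   = c(82, 91, 96, 70, 88),
  value_eur   = c(10, 20, 30, 0, 5),
  wage_eur    = c(1, 2, 3, 0, 1),
  stringsAsFactors = FALSE
)
out <- preprocess_players(toy_raw, cols = 2:8, n_top = 2)
out
```

Note that duplicates are removed before sorting, as in the listed order, so the first occurrence in the file is the one that survives, even if a later duplicate has a higher overall rating. Applied to a local copy of players_20.csv read with read.csv(), the default cols and n_top follow the indices and row count given above, though an exact match with the packaged dataset is not guaranteed.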


The players_20.csv dataset was downloaded from the Kaggle site and went through a few transformations. The complete dataset was obtained on January 1, 2020.

DALEX documentation built on Jan. 16, 2023, 1:06 a.m.
