adult: adult data set

adult R Documentation

adult data set

Description

The adult dataset was collected from the US Census Bureau, and the primary task is to predict whether a given adult makes more than $50K a year based on attributes such as education and hours of work per week. The target feature is income, a factor with levels "<=50K" and ">50K"; the remaining 14 variables are predictors.

Usage

 data(adult) 

Format

The adult dataset, as a data frame, contains 48598 rows and 15 columns (variables/features). The 15 variables are:

  • age: age in years.

  • workclass: a factor with 6 levels.

  • demogweight: the demographics to describe a person.

  • education: a factor with 16 levels.

  • education.num: number of years of education.

  • marital.status: a factor with 5 levels.

  • occupation: a factor with 15 levels.

  • relationship: a factor with 6 levels.

  • race: a factor with 5 levels.

  • gender: a factor with levels "Female","Male".

  • capital.gain: capital gains.

  • capital.loss: capital losses.

  • hours.per.week: number of hours of work per week.

  • native.country: a factor with 42 levels.

  • income: yearly income as a factor with levels "<=50K" and ">50K".
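The format listed above could be checked against the loaded data along the following lines. This is only a sketch: factor_level_counts is a hypothetical helper (not part of any package), and the package name liver is an assumption taken from this page's footer.

```r
# Hypothetical helper (not part of liver): number of levels for each
# factor column of a data frame, returned as a named integer vector.
factor_level_counts <- function(df) {
  is_fac <- vapply(df, is.factor, logical(1))
  vapply(df[is_fac], nlevels, integer(1))
}

# Assumption: the dataset ships with the 'liver' package.
if (requireNamespace("liver", quietly = TRUE)) {
  data(adult, package = "liver")
  print(dim(adult))                  # rows and columns
  print(factor_level_counts(adult))  # compare with the level counts listed above
}
```

Comparing the printed counts with the list above is a quick way to spot a mismatch between the documentation and the installed version of the data.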

Details

For more information about the dataset, see the DELVE descriptions hosted at the University of Toronto:
http://www.cs.toronto.edu/~delve/data/adult/desc.html
http://www.cs.toronto.edu/~delve/data/adult/adultDetail.html

Source

This dataset comes from the UCI repository of machine learning databases:
https://archive.ics.uci.edu

References

Kohavi, R. (1996). Scaling up the accuracy of naive-Bayes classifiers: A decision-tree hybrid. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD-96).

Mohammadi, R. (2025). Data Science Foundations and Machine Learning with R: From Data to Decisions. https://book-data-science-r.netlify.app.

See Also

bank, churn_mlc, churn, churn_tel, risk, cereal, advertising, marketing, drug, house, house_price, red_wines, white_wines, insurance, caravan, fertilizer, corona

Examples

data(adult)
str(adult)
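
Because the stated task is to predict income, a natural next step is to look at how the two classes are balanced. The sketch below is illustrative only: class_proportions is a hypothetical helper, and loading the data from a package named liver is an assumption based on this page's footer.

```r
# Hypothetical helper (not part of liver): share of each class in a
# factor target, as a one-dimensional table of proportions.
class_proportions <- function(y) {
  stopifnot(is.factor(y))
  prop.table(table(y))
}

# Assumption: the dataset ships with the 'liver' package.
if (requireNamespace("liver", quietly = TRUE)) {
  data(adult, package = "liver")
  print(round(class_proportions(adult$income), 3))
}
```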

liver documentation built on Feb. 19, 2026, 1:07 a.m.