adult: Adult Data Set

adult R Documentation

Adult Data Set

Description

The data were extracted by Barry Becker from the 1994 Census database. A set of reasonably clean records was selected using the following conditions: ((AAGE > 16) && (AGI > 100) && (AFNLWGT > 1) && (HRSWK > 0))
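The extraction conditions above can be sketched as a row filter in R. This is only an illustration: the column names AAGE, AGI, AFNLWGT and HRSWK come from the conditions themselves, but the `raw` data frame and its values are made up for the example and are not part of the Census database or of this package.

```r
# Hypothetical raw extract; the values are invented for illustration only.
raw <- data.frame(
  AAGE    = c(15, 30, 45, 52, 38),
  AGI     = c(500, 50, 20000, 35000, 28000),
  AFNLWGT = c(1000, 2000, 1, 3000, 1500),
  HRSWK   = c(10, 40, 40, 0, 40)
)

# Vectorized form of ((AAGE > 16) && (AGI > 100) && (AFNLWGT > 1) && (HRSWK > 0)).
# `&` is used instead of `&&` because the test is applied to every row at once.
keep <- with(raw, AAGE > 16 & AGI > 100 & AFNLWGT > 1 & HRSWK > 0)

# Rows 1 to 4 each fail exactly one condition; only the last row is kept.
clean <- raw[keep, , drop = FALSE]
```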

Usage

adult
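A minimal sketch of loading the data set, assuming the ucidata package is installed. The helper name `load_adult` is hypothetical and not part of the package; it returns NULL when the package is unavailable so the snippet runs either way.

```r
# Hypothetical helper: returns the adult data frame, or NULL if the
# ucidata package is not installed.
load_adult <- function() {
  if (!requireNamespace("ucidata", quietly = TRUE)) {
    return(NULL)
  }
  env <- new.env()
  utils::data("adult", package = "ucidata", envir = env)
  env$adult
}

adult_df <- load_adult()
if (!is.null(adult_df)) {
  str(adult_df)           # column types as listed under Format
  table(adult_df$income)  # counts of the two income classes
}
```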

Format

A data frame with 32561 observations on the following 15 variables.

  • age: Integer

    • Age in years

  • workclass: Factor

    • Private, Self-emp-not-inc, Self-emp-inc, Federal-gov, Local-gov, State-gov, Without-pay, Never-worked.

  • fnlwgt: Numeric

    • The final weight, which serves as a sampling weight. See the names file listed in References for more details.

  • education: Factor

    • Highest level of education attained

    • Bachelors, Some-college, 11th, HS-grad, Prof-school, Assoc-acdm, Assoc-voc, 9th, 7th-8th, 12th, Masters, 1st-4th, 10th, Doctorate, 5th-6th, Preschool.

  • education_num: Numeric

    • Number of years of education

  • marital_status: Factor

    • Married-civ-spouse, Divorced, Never-married, Separated, Widowed, Married-spouse-absent, Married-AF-spouse

  • occupation: Factor

    • Tech-support, Craft-repair, Other-service, Sales, Exec-managerial, Prof-specialty, Handlers-cleaners, Machine-op-inspct, Adm-clerical, Farming-fishing, Transport-moving, Priv-house-serv, Protective-serv, Armed-Forces.

  • relationship: Factor

    • Wife, Own-child, Husband, Not-in-family, Other-relative, Unmarried.

  • race: Factor

    • White, Asian-Pac-Islander, Amer-Indian-Eskimo, Other, Black.

  • sex: Factor

    • Female, Male

  • capital_gain: Integer

    • Income from investment sources, apart from wages/salary

  • capital_loss: Integer

    • Losses from investment sources, apart from wages/salary

  • hours_per_week: Integer

    • Number of hours worked per week

  • native_country: Factor

    • Country of origin

    • United-States, Cambodia, England, Puerto-Rico, Canada, Germany, Outlying-US(Guam-USVI-etc), India, Japan, Greece, South, China, Cuba, Iran, Honduras, Philippines, Italy, Poland, Jamaica, Vietnam, Mexico, Portugal, Ireland, France, Dominican-Republic, Laos, Ecuador, Taiwan, Haiti, Columbia, Hungary, Guatemala, Nicaragua, Scotland, Thailand, Yugoslavia, El-Salvador, Trinadad&Tobago, Peru, Hong, Holand-Netherlands.

  • income: Factor

    • Whether income is greater than $50,000.

    • <=50K, >50K
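Because fnlwgt is described above as a sampling weight, one possible use is a weighted summary of income. The sketch below is an assumption about how the weight might be applied, not a recommendation from the data set's authors; whether weighting is appropriate depends on the analysis, so consult the names file first. The function name `weighted_share_over_50k` and the three-row `toy` data frame are hypothetical.

```r
# Hypothetical helper: weighted share of rows with income ">50K",
# using fnlwgt as the weight.
weighted_share_over_50k <- function(df) {
  stats::weighted.mean(df$income == ">50K", w = df$fnlwgt)
}

# Toy data with the same column names and levels as adult; values are invented.
toy <- data.frame(
  income = factor(c("<=50K", ">50K", "<=50K"), levels = c("<=50K", ">50K")),
  fnlwgt = c(100, 300, 100)
)

unweighted <- mean(toy$income == ">50K")   # 1 of 3 rows
weighted   <- weighted_share_over_50k(toy) # 300 / 500
```

With the real data, the same call would be `weighted_share_over_50k(adult)`.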

Details

The prediction task is to determine whether a person makes over $50K a year.
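As a rough sketch of this prediction task, a logistic regression can be fit on a few of the numeric columns. The choice of model and predictors is an assumption made for illustration, not the method used by the data set's authors. The helper name `fit_income_model` is hypothetical, and the `toy` data frame is synthetic so that the example runs without the package installed.

```r
# Hypothetical helper: logistic regression for P(income == ">50K").
# With a factor response, glm() treats the first level ("<=50K") as failure.
fit_income_model <- function(df) {
  stats::glm(income ~ age + education_num + hours_per_week,
             data = df, family = stats::binomial())
}

# Synthetic stand-in with adult-style column names; values are simulated.
set.seed(1)
n <- 200
toy <- data.frame(
  age            = sample(17:90, n, replace = TRUE),
  education_num  = sample(1:16, n, replace = TRUE),
  hours_per_week = sample(1:99, n, replace = TRUE)
)
lp <- -8 + 0.05 * toy$age + 0.3 * toy$education_num + 0.03 * toy$hours_per_week
toy$income <- factor(ifelse(runif(n) < plogis(lp), ">50K", "<=50K"),
                     levels = c("<=50K", ">50K"))

fit <- fit_income_model(toy)

# Classify at a 0.5 probability threshold and compute in-sample accuracy.
pred <- ifelse(predict(fit, type = "response") > 0.5, ">50K", "<=50K")
accuracy <- mean(pred == toy$income)
```

On the real data the call would be `fit_income_model(adult)`; a held-out test split would give a more honest accuracy estimate than the in-sample figure computed here.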

Source

Ronny Kohavi and Barry Becker, Data Mining and Visualization, Silicon Graphics. E-mail: ronnyk '@' live.com for questions.

References

https://archive.ics.uci.edu/ml/machine-learning-databases/adult/

http://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.names

https://archive.ics.uci.edu/ml/datasets/adult


coatless/ucidata documentation built on Nov. 17, 2023, 9:19 a.m.