The dataset is an extract from the marketing survey described in the source note below. It consists of 14 demographic attributes. The dataset is a good mixture of categorical and continuous variables with a lot of missing data, which is characteristic of data mining applications.
A data frame with 8993 observations on the following 14 variables.
Income: ANNUAL INCOME OF HOUSEHOLD (PERSONAL INCOME IF SINGLE) 1. Less than \$10,000 2. \$10,000 to \$14,999 3. \$15,000 to \$19,999 4. \$20,000 to \$24,999 5. \$25,000 to \$29,999 6. \$30,000 to \$39,999 7. \$40,000 to \$49,999 8. \$50,000 to \$74,999 9. \$75,000 or more
Sex: 1. Male 2. Female
Marital: 1. Married 2. Living together, not married 3. Divorced or separated 4. Widowed 5. Single, never married
Age: 1. 14 thru 17 2. 18 thru 24 3. 25 thru 34 4. 35 thru 44 5. 45 thru 54 6. 55 thru 64 7. 65 and Over
Edu: 1. Grade 8 or less 2. Grades 9 to 11 3. Graduated high school 4. 1 to 3 years of college 5. College graduate 6. Grad Study
Occupation: 1. Professional/Managerial 2. Sales Worker 3. Factory Worker/Laborer/Driver 4. Clerical/Service Worker 5. Homemaker 6. Student, HS or College 7. Military 8. Retired 9. Unemployed
Lived: HOW LONG HAVE YOU LIVED IN THE SAN FRAN./OAKLAND/SAN JOSE AREA? 1. Less than one year 2. One to three years 3. Four to six years 4. Seven to ten years 5. More than ten years
Dual_Income: DUAL INCOMES (IF MARRIED) 1. Not Married 2. Yes 3. No
Household: PERSONS IN YOUR HOUSEHOLD 1. One 2. Two 3. Three 4. Four 5. Five 6. Six 7. Seven 8. Eight 9. Nine or more
Householdu18: PERSONS IN HOUSEHOLD UNDER 18 0. None 1. One 2. Two 3. Three 4. Four 5. Five 6. Six 7. Seven 8. Eight 9. Nine or more
Status: HOUSEHOLDER STATUS 1. Own 2. Rent 3. Live with Parents/Family
Home_Type: 1. House 2. Condominium 3. Apartment 4. Mobile Home 5. Other
Ethnic: 1. American Indian 2. Asian 3. Black 4. East Indian 5. Hispanic 6. Pacific Islander 7. White 8. Other
Language: WHAT LANGUAGE IS SPOKEN MOST OFTEN IN YOUR HOME? 1. English 2. Spanish 3. Other
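Because every variable is stored as an integer code, it can help to attach the category labels before tabulating or modelling. The following is a minimal sketch, not part of the package: `toy` is a made-up stand-in for three of the columns, and `code_labels` is a hypothetical lookup list whose labels are copied from the variable descriptions above.

```r
# Toy stand-in for three columns of the dataset (values are invented).
toy <- data.frame(
  Sex      = c(2L, 1L, 2L, 1L),
  Status   = c(1L, 2L, 3L, NA),
  Language = c(NA, 1L, 1L, 2L)
)

# Hypothetical lookup: code k maps to the k-th label.
code_labels <- list(
  Sex      = c("Male", "Female"),
  Status   = c("Own", "Rent", "Live with Parents/Family"),
  Language = c("English", "Spanish", "Other")
)

# Replace each integer column by a labelled factor; NA stays NA.
for (v in names(code_labels)) {
  toy[[v]] <- factor(toy[[v]],
                     levels = seq_along(code_labels[[v]]),
                     labels = code_labels[[v]])
}

str(toy)
```

Ordered variables such as Income, Age or Edu could be converted the same way with `ordered = TRUE`, so that comparisons between levels remain meaningful.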
The goal is to predict the Annual Income of Household from the other 13 demographic attributes.
Number of instances: 8993.
These were obtained from the original dataset of 9409 instances by removing the 416 observations with a missing response (Annual Income).
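The filtering step described here can be sketched in base R. This is only an illustration under stated assumptions: `raw` is an invented five-row data frame, not the original 9409-row file, and the column names are borrowed from the dataset.

```r
# Invented stand-in for the unfiltered survey extract.
raw <- data.frame(
  Income = c(9L, NA, 1L, NA, 6L),
  Sex    = c(2L, 1L, 2L, 1L, 1L)
)

# Keep only the rows where the response (Income) is observed.
kept <- raw[!is.na(raw$Income), ]

# Number of rows dropped by the filter.
n_dropped <- nrow(raw) - nrow(kept)
n_dropped
```

Only the response is filtered, so missing values in the predictors remain in the data.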
Source: Impact Resources, Inc., Columbus, OH (1987). A total of N=9409 questionnaires containing 502 questions were filled out by shopping mall customers in the San Francisco Bay area.
'data.frame': 8993 obs. of 14 variables:
$ Income : int 9 9 9 1 1 8 1 6 2 4 ...
$ Sex : int 2 1 2 2 2 1 1 1 1 1 ...
$ Marital : int 1 1 1 5 5 1 5 3 1 1 ...
$ Age : int 5 5 3 1 1 6 2 3 6 7 ...
$ Edu : int 4 5 5 2 2 4 3 4 3 4 ...
$ Occupation : int 5 5 1 6 6 8 9 3 8 8 ...
$ Lived : int 5 5 5 5 3 5 4 5 5 4 ...
$ Dual_Income : int 3 3 2 1 1 3 1 1 3 3 ...
$ Household : int 3 5 3 4 4 2 3 1 3 2 ...
$ Householdu18: int 0 2 1 2 2 0 1 0 0 0 ...
$ Status : int 1 1 2 3 3 1 2 2 2 2 ...
$ Home_Type : int 1 1 3 1 1 1 3 3 3 3 ...
$ Ethnic : int 7 7 7 7 7 7 7 7 7 7 ...
$ Language : int NA 1 1 1 1 1 1 1 1 1 ...
Income Sex Marital Age
Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000
1st Qu.:2.000 1st Qu.:1.000 1st Qu.:1.000 1st Qu.:2.000
Median :5.000 Median :2.000 Median :3.000 Median :3.000
Mean :4.895 Mean :1.547 Mean :3.031 Mean :3.415
3rd Qu.:7.000 3rd Qu.:2.000 3rd Qu.:5.000 3rd Qu.:4.000
Max. :9.000 Max. :2.000 Max. :5.000 Max. :7.000
NA's :160
Edu Occupation Lived Dual_Income
Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000
1st Qu.:3.000 1st Qu.:1.000 1st Qu.:4.000 1st Qu.:1.000
Median :4.000 Median :4.000 Median :5.000 Median :1.000
Mean :3.835 Mean :3.788 Mean :4.198 Mean :1.545
3rd Qu.:5.000 3rd Qu.:6.000 3rd Qu.:5.000 3rd Qu.:2.000
Max. :6.000 Max. :9.000 Max. :5.000 Max. :3.000
NA's :86 NA's :136 NA's :913
Household Householdu18 Status Home_Type
Min. :1.000 Min. :0.0000 Min. :1.000 Min. :1.000
1st Qu.:2.000 1st Qu.:0.0000 1st Qu.:1.000 1st Qu.:1.000
Median :3.000 Median :0.0000 Median :2.000 Median :1.000
Mean :2.852 Mean :0.6669 Mean :1.837 Mean :1.856
3rd Qu.:4.000 3rd Qu.:1.0000 3rd Qu.:2.000 3rd Qu.:3.000
Max. :9.000 Max. :9.0000 Max. :3.000 Max. :5.000
NA's :375 NA's :240 NA's :357
Ethnic Language
Min. :1.000 Min. :1.000
1st Qu.:5.000 1st Qu.:1.000
Median :7.000 Median :1.000
Mean :5.956 Mean :1.127
3rd Qu.:7.000 3rd Qu.:1.000
Max. :8.000 Max. :3.000
NA's :68 NA's :359
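The `NA's` rows in the summary show how many values are missing in each column. A compact way to get the same counts, and to see how many rows would survive a complete-case analysis, is sketched below. `toy` is an invented data frame that only reuses three column names from the dataset.

```r
# Invented stand-in with a few missing predictor values.
toy <- data.frame(
  Income  = c(9L, 1L, 8L, 2L),
  Marital = c(1L, NA, 5L, 3L),
  Lived   = c(5L, NA, NA, 4L)
)

# Missing values per column, comparable to the NA's rows of summary().
na_counts <- colSums(is.na(toy))
na_counts

# Rows with no missing value in any column.
n_complete <- sum(complete.cases(toy))
n_complete
```

Comparing `n_complete` with `nrow(toy)` gives a quick sense of how much data a method that cannot handle missing values would discard.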