census | R Documentation |

Includes a data frame of 1994 US census income data on 48,842 people,
divided into a training set of 32,561 and an independent test set
of 16,281. The training outcome variable `y` (`yt` for test) is
binary and indicates whether or not a person's income is greater
than $50,000 per year. There are 12 predictor variables `x` (`xt` for
test) consisting of various demographic and financial
properties associated with each person. It also includes estimates
of `Pr(y=1|x)` obtained by several machine learning methods:
gradient boosting on the logistic scale using maximum likelihood (GBL),
random forest (RF), and gradient boosting on the probability scale
using least squares (GBP).

```
census
```

`census`

A list of 10 items.

- x
training data frame of 32561 observations on 12 predictor variables

- y
training binary response whether salary is above $50K or not

- xt
test data frame of 16281 observations on 12 predictor variables

- yt
test binary response whether salary is above $50K or not

- gbl
training GBL response variable

- gblt
test GBL response variable

- gbp
training GBP response variable

- gbpt
test GBP response variable

- rf
training RF response probabilities

- rft
test RF response probabilities
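
The layout described above can be checked, and one set of estimates scored, with a short sketch like the following. It is illustrative only: `check_census_layout()` and `brier()` are hypothetical helpers, not part of any package, and `toy` is a tiny synthetic stand-in with invented columns (`age`, `hours`) and random values, not the real data. The sketch also assumes the binary responses are coded as numeric 0/1, which this page does not state.

```r
# Hypothetical helper: TRUE if a list follows the documented 10-item layout,
# i.e. every training vector matches nrow(x) and every test vector nrow(xt).
check_census_layout <- function(d) {
  stopifnot(is.list(d), is.data.frame(d$x), is.data.frame(d$xt))
  train <- c("y", "gbl", "gbp", "rf")
  test <- c("yt", "gblt", "gbpt", "rft")
  ok_train <- vapply(train, function(n) length(d[[n]]) == nrow(d$x), logical(1))
  ok_test <- vapply(test, function(n) length(d[[n]]) == nrow(d$xt), logical(1))
  all(ok_train, ok_test)
}

# Hypothetical helper: mean squared difference between probability
# estimates p and a 0/1 outcome y (the Brier score).
brier <- function(p, y) mean((p - y)^2)

# Synthetic stand-in for the real list (assumed structure, made-up values).
set.seed(1)
n <- 8
nt <- 4
toy <- list(
  x = data.frame(age = sample(20:60, n), hours = sample(20:50, n)),
  y = rbinom(n, 1, 0.25),
  xt = data.frame(age = sample(20:60, nt), hours = sample(20:50, nt)),
  yt = c(0, 1, 0, 0),
  gbl = runif(n), gblt = runif(nt),
  gbp = runif(n), gbpt = runif(nt),
  rf = runif(n), rft = c(0.1, 0.8, 0.3, 0.2)
)

layout_ok <- check_census_layout(toy)
rf_test_brier <- brier(toy$rft, toy$yt)
```

With the real data loaded, the same two calls would be `check_census_layout(census)` and `brier(census$rft, census$yt)`. Only the RF estimates are scored here because they are the ones explicitly documented as probabilities.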
