census | R Documentation |
Includes a data frame of 1994 US census income data from 48,842 people,
divided into a training set of 32,561 and an independent test set
of 16,281. The training outcome variable y (yt for test) is
binary and indicates whether or not a person’s income is greater
than $50,000 per year. There are 12 predictor variables x (xt for
test) consisting of various demographic and financial
properties associated with each person. It also includes estimates
of Pr(y=1|x) obtained by several machine learning methods:
gradient boosting on the logistic scale using maximum likelihood (GBL),
random forest (RF), and gradient boosting on the probability scale
using least squares (GBP).
census
A list of 10 items.
training data frame of 32561 observations on 12 predictor variables
training binary response whether salary is above $50K or not
test data frame of 16281 observations on 12 predictor variables
test binary response whether salary is above $50K or not
training GBL response variable
test GBL response variable
training GBP response variable
test GBP response variable
training RF response probabilities
test RF response probabilities
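
The probability estimates above can be compared with the binary outcomes using the two criteria named in the description: squared error (the least-squares criterion behind GBP) and the Bernoulli log-likelihood (the maximum-likelihood criterion behind GBL). The following is only a minimal sketch on simulated data, not the package's own evaluation code. The names y and yt come from this page; the list `toy`, its elements `p_train` and `p_test`, and the helpers `make_part`, `brier`, and `log_loss` are hypothetical and introduced for illustration.

```r
# Simulated stand-in for the real list. Only `y` and `yt` are names taken
# from this page; `p_train` and `p_test` are hypothetical names for one
# pair of probability estimates.
set.seed(1)
make_part <- function(n) {
  p <- runif(n)                       # simulated Pr(y = 1 | x)
  y <- rbinom(n, size = 1, prob = p)  # simulated binary outcome
  list(y = y, p = p)
}
tr <- make_part(200)
te <- make_part(100)
toy <- list(y = tr$y, yt = te$y, p_train = tr$p, p_test = te$p)

# Mean squared error between outcome and estimated probability (Brier score).
brier <- function(y, p) mean((y - p)^2)

# Average negative Bernoulli log-likelihood. Probabilities are clipped
# away from 0 and 1 so that log() stays finite.
log_loss <- function(y, p, eps = 1e-12) {
  p <- pmin(pmax(p, eps), 1 - eps)
  -mean(y * log(p) + (1 - y) * log(1 - p))
}

scores <- c(
  brier_train  = brier(toy$y,  toy$p_train),
  brier_test   = brier(toy$yt, toy$p_test),
  logloss_test = log_loss(toy$yt, toy$p_test)
)
print(round(scores, 3))
```

With the real data, the same helpers could be applied to yt and each of the test probability vectors in the list, using whatever element names the list actually carries.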