
`toys`

is a simple simulated dataset for a binary classification
problem, introduced by Weston et al. (2003).

The format is a list of 2 components:

- x
a data frame of input variables, with 100 observations of 200 variables

- y
output variable: a factor with 2 levels "-1" and "1"

It is an equiprobable two-class problem, with Y in {-1,1} and six
true variables; all the other variables are noise.
The simulation model is defined through the conditional distribution
of the *X^j* given Y=y:

with probability 0.7, X^j ~ N(y*j,1) for j=1,2,3 and X^j ~ N(0,1) for j=4,5,6;

with probability 0.3, X^j ~ N(0,1) for j=1,2,3 and X^j ~ N(y*(j-3),1) for j=4,5,6;

the other variables are noise, X^j ~ N(0,1) for j=7,...,p.

After simulation, all variables are standardized.
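
The model above can be sketched in base R as follows. This is an illustrative re-implementation, not the code used to generate `toys`: the function name `simulate_toys` and its arguments are hypothetical, the mean `yj` is read as the product of y and j, and standardization is assumed to mean centering and scaling each column with `scale()`.

```r
# Illustrative sketch of the simulation model (hypothetical helper,
# not the code that produced the shipped dataset).
simulate_toys <- function(n = 100, p = 200) {
  stopifnot(p >= 6)
  # Equiprobable class labels in {-1, 1}
  y <- sample(c(-1, 1), n, replace = TRUE)
  # Start from pure noise: every entry is N(0, 1)
  x <- matrix(rnorm(n * p), nrow = n, ncol = p)
  # Mixture component for each observation:
  # TRUE (prob. 0.7) puts the signal on variables 1-3,
  # FALSE (prob. 0.3) puts it on variables 4-6
  first <- runif(n) < 0.7
  for (j in 1:3) {
    # Column j becomes N(y * j, 1) in the first component
    x[first, j] <- x[first, j] + y[first] * j
    # Column j + 3 becomes N(y * j, 1), i.e. N(y * (k - 3), 1) with k = j + 3
    x[!first, j + 3] <- x[!first, j + 3] + y[!first] * j
  }
  # Assumption: "standardized" means mean 0 and standard deviation 1 per column
  x <- scale(x)
  list(x = as.data.frame(x), y = factor(y, levels = c(-1, 1)))
}

set.seed(1)  # arbitrary seed, only for reproducibility of the sketch
sim <- simulate_toys()
dim(sim$x)
levels(sim$y)
```

The returned list mirrors the format described above (a data frame `x` and a two-level factor `y`), but the simulated values will not match those stored in `toys`.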

Weston, J., Elisseeff, A., Schoelkopf, B., Tipping, M. (2003),
*Use of the zero norm with linear models and Kernel methods*,
J. Machine Learn. Res. 3, 1439-1461.
