toys: A simulated dataset called toys data


Description

toys is a simple simulated dataset for a binary classification problem, introduced by Weston et al. (2003).

Format

The format is a list of 2 components:

x

a data frame containing the input variables, with 100 observations of 200 variables

y

output variable: a factor with 2 levels "-1" and "1"
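
The two components can be inspected directly. The following is a minimal sketch, assuming the VSURF package is installed; the guard with requireNamespace() is only there so the snippet does nothing on a machine without it.

```r
# Inspect the structure of toys (runs only if VSURF is available)
if (requireNamespace("VSURF", quietly = TRUE)) {
  data(toys, package = "VSURF")
  str(toys, max.level = 1)   # a list with components x and y
  print(dim(toys$x))         # observations by variables
  print(levels(toys$y))      # class labels
  print(table(toys$y))       # class counts
}
```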

Details

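It is an equiprobable two-class problem, with Y in {-1, 1} and six true variables; the remaining variables are noise. The simulation model is defined through the conditional distribution of the X_j given Y = y:

with probability 0.7, X_j ~ N(y*j, 1) for j = 1, 2, 3 and X_j ~ N(0, 1) for j = 4, 5, 6;

with probability 0.3, X_j ~ N(0, 1) for j = 1, 2, 3 and X_j ~ N(y*(j-3), 1) for j = 4, 5, 6;

the other variables are noise, X_j ~ N(0, 1) for j = 7, ..., p.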

After simulation, the obtained variables are finally standardized.
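
To make the simulation model concrete, here is a hedged sketch of a generator for toys-like data. It assumes the mixture from Weston et al. (2003): given Y = y, with probability 0.7 the first three variables have mean y*j (j = 1, 2, 3) and the next three are N(0, 1); with probability 0.3 the roles are swapped; all other variables are N(0, 1) noise. The helper simulate_toys() and the column names are hypothetical and not part of VSURF, and standardization is assumed to mean centering and scaling each column with scale(). It will not reproduce the shipped toys object exactly.

```r
# Sketch of a toys-like simulator; simulate_toys() is a hypothetical helper,
# not a function exported by VSURF.
simulate_toys <- function(n = 100, p = 200, seed = NULL) {
  stopifnot(p >= 6)
  if (!is.null(seed)) set.seed(seed)

  # Equiprobable labels in {-1, 1}
  y <- sample(c(-1, 1), n, replace = TRUE)

  # Start with pure N(0, 1) noise in every column
  x <- matrix(rnorm(n * p), nrow = n, ncol = p)

  # Per observation, pick which group of three variables is informative:
  # variables 1-3 with probability 0.7, variables 4-6 otherwise
  first_group <- runif(n) < 0.7

  # Shift the means of the informative variables by y * j, j = 1, 2, 3
  for (j in 1:3) {
    x[first_group, j] <- x[first_group, j] + y[first_group] * j
    x[!first_group, j + 3] <- x[!first_group, j + 3] + y[!first_group] * j
  }

  # Standardize each column (assumption: zero mean, unit variance)
  x <- as.data.frame(scale(x))
  names(x) <- paste0("V", seq_len(p))

  list(x = x, y = factor(y, levels = c(-1, 1)))
}

sim <- simulate_toys(n = 100, p = 200, seed = 1)
dim(sim$x)
levels(sim$y)
```

The returned list mirrors the format of toys (a data frame x and a two-level factor y), so it can be passed to the same functions as in the examples.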

Source

Weston, J., Elisseeff, A., Schoelkopf, B., Tipping, M. (2003), Use of the zero-norm with linear models and kernel methods, J. Machine Learn. Res. 3, 1439-1461

Examples

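library(VSURF)  # provides the toys dataset and VSURF()
data(toys)
# Fit a random forest on all 200 variables with default settings
toys.rf <- randomForest::randomForest(toys$x, toys$y)
toys.rf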

## Not run: 
# VSURF applied for toys data:
# (a few minutes to execute)
data(toys)
toys.vsurf <- VSURF(toys$x, toys$y)
toys.vsurf

## End(Not run)

robingenuer/VSURF documentation built on Oct. 16, 2018, 11:09 a.m.