toys.data: Toys data


Description

toys.data is a simple simulated dataset for a binary classification problem, introduced by Weston et al. (2003).

Usage

toys.data

Format

An object of class list of length 2.

Details

The data frame x consists of 2 independent clusters, each containing 25 correlated variables. It is an equiprobable two-class problem, with Y taking values in {-1, 1} and 12 true variables (6 in each cluster), the others being noise. The simulation model is defined through the conditional distribution of the X^j given Y = y. In the first cluster, the X^j are simulated in the following way:

The second cluster of 25 variables is simulated in a similar way.
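
The layout described above can be sketched in base R without loading the package. This is only an illustration of the variable layout, not the simulation itself: it assumes, as the scoreX vector in the Examples suggests, that the true variables are the first 6 columns of each block of 25. The names n_clusters, cluster_size, n_true and layout are hypothetical and do not come from the package.

```r
# Sketch of the variable layout only (no data are simulated here).
# Assumption: the 6 true variables are the first 6 columns of each cluster.
n_clusters   <- 2
cluster_size <- 25
n_true       <- 6   # true variables per cluster

# Cluster membership and within-cluster position of each of the 50 columns
cluster  <- rep(seq_len(n_clusters), each = cluster_size)
position <- rep(seq_len(cluster_size), times = n_clusters)

layout <- data.frame(
  variable = seq_along(cluster),
  cluster  = cluster,
  is_true  = position <= n_true
)

# Column indices of the assumed true variables
which(layout$is_true)

# Counts of noise (FALSE) and true (TRUE) variables per cluster
table(cluster = layout$cluster, is_true = layout$is_true)
```

Under this assumption the true variables sit in columns 1 to 6 and 26 to 31, which matches the positions given a non-zero score in the Examples.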

Source

Weston, J., Elisseeff, A., Schoelkopf, B., Tipping, M. (2003), Use of the zero norm with linear models and kernel methods, J. Machine Learn. Res. 3, 1439-1461.

Examples

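library(armada)      # provides toys.data, ARMADA and ARMADA.heatmap
library(ClustOfVar)
library(impute)
library(FAMT)
library(VSURF)
library(glmnet)
library(anapuce)
library(qvalue)

# Predictors and class labels
X <- toys.data$x
Y <- toys.data$Y

# Scores: 8 for the first 6 variables of each block of 25, 0 for the other 19
scoreX <- data.frame(c(rep(8, 6), rep(0, 19), rep(8, 6), rep(0, 19)))
rownames(scoreX) <- colnames(X)
select <- ARMADA.heatmap(X, Y, scoreX, threshold = 1)

## Not run: 
# Run ARMADA with 2 clusters, then pass the third element of its result to the heatmap
result <- ARMADA(X, Y, nclust = 2)
select <- ARMADA.heatmap(X, Y, result[[3]], threshold = 5)

## End(Not run)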

armada documentation built on May 2, 2019, 6:37 a.m.
