# varset: Simulation Model In ipred: Improved Predictors


## Simulation Model

### Description

Three sets of variables are calculated: explanatory, intermediate and response variables.

### Usage

``````
varset(N, sigma=0.1, theta=90, threshold=0, u=1:3)
``````

### Arguments

| Argument | Description |
|---|---|
| `N` | number of simulated observations. |
| `sigma` | standard deviation of the error term. |
| `theta` | angle between two u vectors. |
| `threshold` | cutpoint for classifying to 0 or 1. |
| `u` | starting values. |

### Details

For each observation, values of two explanatory variables `x = (x_1, x_2)^{\top}` and of two responses `y = (y_1, y_2)^{\top}` are simulated according to the formula:

``` y = U x + e = \begin{pmatrix} u_1^{\top} \\ u_2^{\top} \end{pmatrix} x + e ```

where x is a realisation of a standard normal random variable and e is generated by a normal variable with standard deviation `sigma`. U is a 2 × 2 matrix, where

``` u_1 = \begin{pmatrix} u_{1,1} \\ u_{1,2} \end{pmatrix}, \quad u_2 = \begin{pmatrix} u_{2,1} \\ u_{2,2} \end{pmatrix}, \quad \|u_1\| = \|u_2\| = 1, ```

i.e. a matrix of two normalised vectors.
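The formula above can be illustrated with a minimal base R sketch. It is not the code used inside `varset`: the matrix `U` below is built from two arbitrary vectors chosen for illustration, and how `varset` derives U from `theta` and `u` is not shown here. The names `N`, `sigma`, `U`, `x`, `e` and `y` simply mirror the symbols in the text.

```r
set.seed(1)                      # reproducible draws
N     <- 5                       # number of simulated observations
sigma <- 0.1                     # standard deviation of the error term

# Hypothetical U: two arbitrary vectors, each scaled to unit length.
# This is an assumption, not necessarily how varset() builds U.
U <- rbind(c(1, 2), c(3, -1))
U <- U / sqrt(rowSums(U^2))      # now ||u_1|| = ||u_2|| = 1

x <- matrix(rnorm(2 * N), ncol = 2)              # N x 2, standard normal
e <- matrix(rnorm(2 * N, sd = sigma), ncol = 2)  # N x 2, error term

# Row-wise version of y = U x + e: row i of y equals U %*% x[i, ] + e[i, ]
y <- x %*% t(U) + e
```

Storing observations as rows means the matrix product is written `x %*% t(U)` rather than `U %*% x`; both express the same per-observation relation.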

### Value

A list containing the following components:

| Component | Description |
|---|---|
| `explanatory` | N × 2 matrix of 2 explanatory variables. |
| `intermediate` | N × 2 matrix of 2 intermediate variables. |
| `response` | response vectors with values 0 or 1. |
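This page does not spell out how the two intermediate variables are reduced to a single 0/1 response. One plausible rule, assumed here purely for illustration and not confirmed for `varset`, labels an observation 1 only when both intermediate variables exceed `threshold`. The matrix below is made-up data, not output of `varset`.

```r
# Hypothetical intermediate values: 3 observations, 2 variables
intermediate <- rbind(c( 0.5,  1.2),
                      c(-0.3,  0.8),
                      c(-1.0, -0.2))
threshold <- 0                   # cutpoint for classifying to 0 or 1

# Assumed rule (an illustration only): class 1 if both values exceed the cutpoint
response <- as.integer(rowSums(intermediate > threshold) == 2)
response
```

Only the first row has both values above the cutpoint, so under this assumed rule it is the only observation assigned to class 1.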

### References

David J. Hand, Hua Gui Li, Niall M. Adams (2001), Supervised classification with structured class definitions. Computational Statistics & Data Analysis 36, 209–225.

### Examples

``````
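library(ipred)

# Simulate 1000 observations for two settings of theta
theta90 <- varset(N = 1000, sigma = 0.1, theta = 90, threshold = 0)
theta0 <- varset(N = 1000, sigma = 0.1, theta = 0, threshold = 0)

# Plot the intermediate variables of both settings side by side
par(mfrow = c(1, 2))
plot(theta0$intermediate)
plot(theta90$intermediate)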
``````

ipred documentation built on March 31, 2023, 11:08 p.m.