# Simulated Data

### Description

Generates training and test data from a nonlinear regression model for use in simulations.

### Usage

```
sim.data.ppls(ntrain,ntest,stnr,p,a=NULL,b=NULL)
```

### Arguments

| Argument | Description |
|---|---|
| `ntrain` | number of training observations |
| `ntest` | number of test observations |
| `stnr` | signal-to-noise ratio |
| `p` | number of predictor variables |
| `a` | vector of length 5 holding the coefficients of the linear terms of the regression functions (see Details); default `NULL` |
| `b` | vector of length 5 holding the coefficients inside the sine terms of the regression functions (see Details); default `NULL` |

### Details

The matrices of training and test data are drawn from a uniform distribution over $[-1,1]$ for each of the `p` variables. The response is generated via a nonlinear regression model of the form

$$Y=\sum_{j=1}^{5} f_j(X_j) + \varepsilon$$

where $f_j(x)=a_j x + \sin(6 b_j x)$, so only the first five of the `p` variables enter the response. The values of $a_j$ and $b_j$ can be specified via `a` and `b`. If no values for `a` or `b` are given, they are drawn randomly from $[-1,1]$. The variance of the noise term is chosen such that the signal-to-noise ratio equals `stnr` on the training data.
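The data-generating process described above can be sketched in base R. This is an illustrative re-implementation, not the package source: the function name `simulate_sketch` is hypothetical, the noise is assumed to be Gaussian, and the signal-to-noise ratio is assumed to mean the variance of the noise-free training signal divided by the noise variance.

```r
# Hypothetical sketch of the model in Details; not the code of sim.data.ppls.
simulate_sketch <- function(ntrain, ntest, stnr, p, a = NULL, b = NULL) {
  stopifnot(p >= 5)

  # Draw the coefficients from U[-1, 1] when they are not supplied
  if (is.null(a)) a <- runif(5, -1, 1)
  if (is.null(b)) b <- runif(5, -1, 1)

  # Predictors: independent U[-1, 1] entries
  Xtrain <- matrix(runif(ntrain * p, -1, 1), nrow = ntrain, ncol = p)
  Xtest  <- matrix(runif(ntest * p, -1, 1), nrow = ntest, ncol = p)

  # Noise-free signal: sum over j = 1..5 of a_j * x_j + sin(6 * b_j * x_j)
  signal <- function(X) {
    X5 <- X[, 1:5, drop = FALSE]
    drop(X5 %*% a) + rowSums(sin(6 * sweep(X5, 2, b, "*")))
  }
  ftrain <- signal(Xtrain)
  ftest  <- signal(Xtest)

  # Assumption: stnr = var(training signal) / sigma^2
  sigma <- sqrt(var(ftrain) / stnr)

  # Assumption: Gaussian noise with standard deviation sigma
  list(Xtrain = Xtrain, ytrain = ftrain + rnorm(ntrain, sd = sigma),
       Xtest = Xtest, ytest = ftest + rnorm(ntest, sd = sigma),
       sigma = sigma, a = a, b = b)
}

set.seed(1)
sim <- simulate_sketch(ntrain = 50, ntest = 200, stnr = 16, p = 16)
str(sim, max.level = 1)
```

Under these assumptions, a larger `stnr` gives a smaller `sigma`, and the variables beyond the fifth act as pure noise predictors.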

### Value

| Component | Description |
|---|---|
| `Xtrain` | matrix of size `ntrain` x `p` |
| `ytrain` | vector of length `ntrain` |
| `Xtest` | matrix of size `ntest` x `p` |
| `ytest` | vector of length `ntest` |
| `sigma` | standard deviation of the noise term |
| `a` | vector that determines the nonlinear function |
| `b` | vector that determines the nonlinear function |

### Author(s)

Nicole Kraemer

### References

N. Kraemer, A.-L. Boulesteix, and G. Tutz (2008). *Penalized Partial Least Squares with Applications to B-Spline Transformations and Functional Data*. Chemometrics and Intelligent Laboratory Systems, 94, 60-69. http://dx.doi.org/10.1016/j.chemolab.2008.06.009

### See Also

`ppls.splines.cv`

### Examples

```
dummy <- sim.data.ppls(ntrain = 50, ntest = 200, p = 16, stnr = 16)
```
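As a further sketch, the coefficients can be fixed through `a` and `b` and the returned components inspected. This assumes the `ppls` package is installed and that the components have the names and shapes listed under Value; the coefficient values below are arbitrary illustrations.

```r
library(ppls)

set.seed(1)
# Arbitrary illustrative coefficients, each of length 5
a <- c(1, -0.5, 0, 0.25, 0.75)
b <- c(0.5, 0, 1, -1, 0.2)

sim <- sim.data.ppls(ntrain = 50, ntest = 200, stnr = 16, p = 16, a = a, b = b)

dim(sim$Xtrain)    # training predictors
length(sim$ytest)  # test responses
sim$sigma          # noise standard deviation chosen to match stnr
```

Fixing `a` and `b` keeps the regression functions identical across repeated simulation runs, so that only the predictors and the noise vary.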