friedman: Generate data for an example of Friedman (1991)

View source: R/gen.data.R

friedman    R Documentation

Generate data for an example of Friedman (1991)

Description

Generate data, including response and predictor values, according to an example from Friedman, J. H. (1991). "Multivariate adaptive regression splines." Ann. Statist. 19 1–141.

Usage

friedman(n, p, sigma, binary)

Arguments

n

The number of observations.

p

The number of predictors.

sigma

The error variance.

binary

A logical value: binary = TRUE generates binary responses and binary = FALSE generates continuous responses.

Details

Sample the predictors x_1, ..., x_p from Uniform(0, 1) independently. If binary = FALSE, sample the continuous response y from Normal(f0(x), σ^2), where

f0(x) = 10 sin(π x_1 x_2) + 20 (x_3 - 0.5)^2 + 10 x_4 + 5 x_5.

If binary = TRUE, sample the binary response y from Bernoulli(Φ(f0(x))), where f0 is defined above and Φ is the cumulative distribution function of the standard normal distribution.
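
The sampling scheme described above can be sketched in base R as follows. This is an illustrative re-implementation, not the package source: the name friedman_sketch is hypothetical, the p >= 5 check is an added assumption (f0 uses x_1 through x_5), and sigma is assumed to play the role of σ in Normal(f0(x), σ^2), which may differ from how the package interprets its sigma argument.

```r
# Hypothetical sketch of the data-generating process; not the BartMixVs code.
friedman_sketch <- function(n, p, sigma, binary) {
  stopifnot(p >= 5)  # assumption: f0 needs at least the predictors x_1, ..., x_5

  # Predictors: n * p independent draws from Uniform(0, 1)
  X <- as.data.frame(matrix(runif(n * p), nrow = n, ncol = p))

  # Mean function f0(x) from the Details section
  f0 <- 10 * sin(pi * X[, 1] * X[, 2]) + 20 * (X[, 3] - 0.5)^2 +
    10 * X[, 4] + 5 * X[, 5]

  if (binary) {
    prob <- pnorm(f0)                      # Phi(f0(x))
    Y <- rbinom(n, size = 1, prob = prob)  # Bernoulli(prob) draws
    list(X = X, Y = Y, f0 = f0, prob = prob)
  } else {
    # Assumption: sigma is used as the standard deviation of the noise
    Y <- rnorm(n, mean = f0, sd = sigma)
    list(X = X, Y = Y, f0 = f0, sigma = sigma)
  }
}
```

Only the first five predictors enter f0; the remaining p - 5 columns are pure noise variables.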

Value

Return a list with the following components.

X

An n by p data frame representing predictor values, with each row corresponding to an observation.

Y

A vector of length n representing response values.

f0

A vector of length n representing the values of f0(x).

sigma

The error variance, which is returned only when binary = FALSE.

prob

A vector of length n representing the values of Φ(f0(x)), which is only returned when binary = TRUE.

Author(s)

Chuji Luo: cjluo@ufl.edu and Michael J. Daniels: daniels@ufl.edu.

References

Friedman, J. H. (1991). "Multivariate adaptive regression splines." Ann. Statist. 19 1–141.

Luo, C. and Daniels, M. J. (2021) "Variable Selection Using Bayesian Additive Regression Trees." arXiv preprint arXiv:2112.13998.

Examples

library(BartMixVs)
data = friedman(n = 100, p = 10, sigma = 1, binary = FALSE)
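
A binary-response call might look like the sketch below, which inspects the components listed under Value. It assumes the BartMixVs package is installed and is skipped otherwise; the seed and the object name bin are arbitrary choices for illustration.

```r
# Assumes BartMixVs is installed; the block does nothing if it is not.
if (requireNamespace("BartMixVs", quietly = TRUE)) {
  set.seed(2022)  # arbitrary seed, used only for reproducibility
  bin <- BartMixVs::friedman(n = 100, p = 10, sigma = 1, binary = TRUE)

  str(bin$X)         # predictors, one row per observation
  table(bin$Y)       # counts of the 0/1 responses
  summary(bin$prob)  # Phi(f0(x)); documented as returned only when binary = TRUE
}
```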

BartMixVs documentation built on May 5, 2022, 9:05 a.m.