logistic_gaussian_dgp | R Documentation |
Generate independent normally-distributed covariates and logistic response data.
logistic_gaussian_dgp(
  n, p, s = p, betas = NULL, betas_sd = 1, intercept = 0,
  data_split = FALSE, train_prop = 0.5,
  return_values = c("X", "y", "support"), ...
)
n |
Number of samples. |
p |
Number of features. |
s |
Sparsity level of features. Coefficients corresponding to features after the s-th position are set to 0. |
betas |
Coefficient vector for observed design matrix. If a scalar is provided, the coefficient vector is constant. If NULL (default), entries in the coefficient vector are drawn i.i.d. from N(0, betas_sd^2). |
betas_sd |
(Optional) SD of normal distribution from which to draw betas. |
intercept |
Scalar intercept term. |
data_split |
Logical; if TRUE, splits data into training and test sets according to train_prop. |
train_prop |
Proportion of data in training set if data_split = TRUE. |
return_values |
Character vector indicating what objects to return in list. Elements in vector must be one of "X", "y", "support". |
... |
Not used. |
Data is generated via:
log(p / (1 - p)) = intercept + betas %*% X,
where p = P(y = 1 | X), X is a standard Gaussian random matrix, and the true underlying support of this data is the first s features in X (unless specified otherwise by betas).
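The generative model above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the helper name sketch_logistic_gaussian is hypothetical, the coefficients are assumed to be drawn from N(0, betas_sd^2) on the first s features and set to 0 elsewhere, and the linear predictor is computed as X %*% betas with X stored as an n-by-p matrix.

```r
# Hypothetical helper (not part of the package): a base-R sketch of the
# logistic model with independent standard Gaussian covariates.
sketch_logistic_gaussian <- function(n, p, s = p, betas_sd = 1, intercept = 0) {
  # X is n x p with i.i.d. N(0, 1) entries
  X <- matrix(rnorm(n * p), nrow = n, ncol = p)
  # Assumption: first s coefficients are N(0, betas_sd^2), the rest are 0
  betas <- c(rnorm(s, mean = 0, sd = betas_sd), rep(0, p - s))
  # Linear predictor on the log-odds scale
  log_odds <- intercept + as.vector(X %*% betas)
  # Inverse logit gives P(y = 1 | X); draw binary responses from it
  prob <- plogis(log_odds)
  y <- rbinom(n, size = 1, prob = prob)
  list(X = as.data.frame(X), y = y, support = seq_len(s))
}

set.seed(1)
sim <- sketch_logistic_gaussian(n = 100, p = 10, s = 2)
```

The returned list mirrors the "X", "y", and "support" elements described for return_values, so the sketch can be used to check intuition about the shapes of those objects.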
A list of the named objects that were requested in return_values. See brief descriptions below.
X: A data.frame.
y: A response vector of length nrow(X).
support: A vector of feature indices indicating all features used in the true support of the DGP.
Note that if data_split = TRUE and "X", "y" are in return_values, then the returned list also contains slots for "Xtest" and "ytest".
# generate data from: log(p / (1 - p)) = betas_1 * x_1 + betas_2 * x_2, where
# betas_1, betas_2 ~ N(0, 1) and X ~ N(0, I_10)
sim_data <- logistic_gaussian_dgp(n = 100, p = 10, s = 2, betas_sd = 1)
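For the data_split = TRUE case, the shape of the returned "X", "y", "Xtest", and "ytest" slots can be illustrated with a base-R sketch. The helper name sketch_split is hypothetical, and the use of a uniformly random split with round(train_prop * n) training rows is an assumption; the documentation does not say how the function itself assigns rows.

```r
# Hypothetical helper (not part of the package): randomly split a design
# matrix and response into training and test parts.
sketch_split <- function(X, y, train_prop = 0.5) {
  n <- nrow(X)
  # Assumption: training rows are sampled uniformly without replacement
  train_idx <- sample.int(n, size = round(train_prop * n))
  test_idx <- setdiff(seq_len(n), train_idx)
  list(
    X = X[train_idx, , drop = FALSE],
    y = y[train_idx],
    Xtest = X[test_idx, , drop = FALSE],
    ytest = y[test_idx]
  )
}

set.seed(1)
X_demo <- as.data.frame(matrix(rnorm(40), nrow = 10))
y_demo <- rbinom(10, size = 1, prob = 0.5)
parts <- sketch_split(X_demo, y_demo, train_prop = 0.7)
```

Using setdiff() for the test indices, rather than negative indexing, keeps the split correct even when the training set is empty.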