sample.multinom: Generates covariate matrix X with correlated block of...
In plsgenomics: PLS Analyses for Genomics

sample.multinom

R Documentation

Generates covariate matrix X with correlated blocks of covariates and a multi-label random response depending on X through a multinomial model

Description

The function sample.multinom generates a random sample of n observations, composed of p predictors collected in the n x p matrix X, and a categorical response with nb.class classes, stored in a vector Y of length n. The response depends on X through a multinomial logistic model with sparse coefficients B; with nb.class = 2 this is the logistic model, where Y is a Bernoulli random variable with parameter logit^{-1}(XB). In addition, the covariate matrix X is composed of correlated blocks of predictors.

Usage

sample.multinom(
  n,
  p,
  nb.class = 2,
  kstar,
  lstar,
  beta.min,
  beta.max,
  mean.H = 0,
  sigma.H = 1,
  sigma.F = 1,
  seed = NULL
)

Arguments

`n`	the number of observations in the sample.
`p`	the number of covariates in the sample.
`nb.class`	the number of classes (groups) of the response `Y`.
`kstar`	the number of underlying latent variables used to generate the covariate matrix `X`, `kstar <= p`. `kstar` is also the number of blocks in the covariate matrix (see details).
`lstar`	the number of blocks in the covariate matrix `X` that are used to generate the response `Y`, i.e. with non-null coefficients in vector `B`, `lstar <= kstar`.
`beta.min`	the lower bound for non-null coefficients (see details).
`beta.max`	the upper bound for non-null coefficients (see details).
`mean.H`	the mean of the latent variables used to generate `X`.
`sigma.H`	the standard deviation of the latent variables used to generate `X`.
`sigma.F`	the standard deviation of the noise added to the latent variables used to generate `X`.
`seed`	a positive integer; if non-NULL, it fixes the seed (with the command `set.seed`) used for random number generation.

Details

The set (1:p) of predictors is partitioned into kstar blocks. Each block k (k=1,...,kstar) depends on a latent variable H.k; the H.k are independent and identically distributed following a Gaussian distribution N(mean.H, sigma.H^2). Each column X.j of the matrix X is generated as H.k + F.j for j in block k, where the F.j are independent and identically distributed Gaussian noise N(0, sigma.F^2).
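
The block construction described above can be sketched in base R. This is an illustration only, not the package's source code: the even split of predictors into consecutive blocks (`block`) and all variable names are assumptions made for the example. Under the stated model, two distinct columns of the same block have correlation sigma.H^2 / (sigma.H^2 + sigma.F^2), while columns of different blocks are uncorrelated; the last lines compare this with the empirical values.

```r
set.seed(1)
n <- 2000; p <- 40; kstar <- 4
mean.H <- 0.2; sigma.H <- 10; sigma.F <- 5

# Assumed partition: consecutive blocks of equal size (hypothetical choice)
block <- rep(seq_len(kstar), each = p / kstar)

# One latent variable per block, i.i.d. N(mean.H, sigma.H^2)
H <- matrix(rnorm(n * kstar, mean = mean.H, sd = sigma.H),
            nrow = n, ncol = kstar)

# Independent Gaussian noise N(0, sigma.F^2) for every entry of X
noise <- matrix(rnorm(n * p, mean = 0, sd = sigma.F), nrow = n, ncol = p)

# Column j of X is H.k + F.j, where k = block[j]
X <- H[, block] + noise

# Correlation implied by the model for two columns of the same block
rho <- sigma.H^2 / (sigma.H^2 + sigma.F^2)

within  <- cor(X[, 1], X[, 2])    # columns 1 and 2 share block 1
between <- cor(X[, 1], X[, 11])   # column 11 belongs to block 2
```

A larger sigma.H relative to sigma.F pushes the within-block correlation towards 1, so these two arguments control how strongly the predictors of a block move together.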

The coefficients B are drawn at random between beta.min and beta.max on lstar randomly chosen blocks, and are null otherwise. The variables with non-null coefficients are then relevant to explain the response, whereas the ones with null coefficients are not.
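
A minimal base R sketch of this sparsity pattern follows. It assumes that the non-null coefficients are drawn uniformly on [beta.min, beta.max] and that the blocks are consecutive and of equal size; the text above only states that the coefficients are random between the two bounds, so both points are assumptions, as are the variable names. A single coefficient vector is shown, which corresponds to the two-class case.

```r
set.seed(2)
p <- 40; kstar <- 4; lstar <- 2
beta.min <- 0.25; beta.max <- 0.75

# Assumed partition: consecutive blocks of equal size (hypothetical choice)
block <- rep(seq_len(kstar), each = p / kstar)

# Choose lstar blocks at random, without replacement
block.sel <- sort(sample(seq_len(kstar), size = lstar))

# Predictors in the selected blocks are relevant, the others are not
sel   <- which(block %in% block.sel)
nosel <- setdiff(seq_len(p), sel)

# Sparse coefficient vector: uniform draws (assumption) on selected blocks
B <- numeric(p)
B[sel] <- runif(length(sel), min = beta.min, max = beta.max)

p0 <- length(sel)   # number of predictors with non-null coefficients
```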

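The response is generated by drawing one observation from each of n independent Bernoulli random variables with parameters logit^{-1}(XB) in the two-class case; with more than two classes, the draws come from the corresponding multinomial model.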
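
The response step can be illustrated in base R as follows. The binary part follows the Bernoulli description directly, using `plogis` as the inverse logit. The multi-class part is only one common way to write a multinomial logistic model (class 0 as reference, one coefficient column per other class); this parameterisation, the 0-based class labels, and the names `Bmat`, `eta`, `P` and `Ymulti` are assumptions, not taken from the package.

```r
set.seed(3)
n <- 500; p <- 10
X <- matrix(rnorm(n * p), nrow = n, ncol = p)

# Binary case: Bernoulli draws with parameter logit^{-1}(XB)
B <- c(rep(0.5, 3), rep(0, p - 3))
proba <- plogis(drop(X %*% B))
Y <- rbinom(n, size = 1, prob = proba)

# Multi-class case (assumed reference-class parameterisation)
nb.class <- 3
Bmat <- matrix(0, nrow = p, ncol = nb.class - 1)
Bmat[1:3, ] <- runif(3 * (nb.class - 1), min = 0.25, max = 0.75)

# Linear predictors; the reference class has linear predictor 0
eta <- cbind(0, X %*% Bmat)

# Softmax over classes: each row of P sums to 1
P <- exp(eta) / rowSums(exp(eta))

# One categorical draw per observation, labels 0, ..., nb.class - 1
Ymulti <- apply(P, 1, function(pr) sample(0:(nb.class - 1), size = 1, prob = pr))
```

With nb.class = 2 the softmax probability of class 1 equals plogis of the single linear predictor, so the second construction contains the first as a special case.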

The details of the procedure are described in Durif et al. (2018).

Value

An object with the following attributes:

`X`	the (n x p) covariate matrix, containing the `n` observations for each of the `p` predictors.
`Y`	the vector of length n of response observations.
`proba`	the vector of length n of Bernoulli parameters used to generate the response, i.e. `logit^{-1}(X %*% B)`.
`sel`	the indices in (1:p) of covariates with non-null coefficients in `B`.
`nosel`	the indices in (1:p) of covariates with null coefficients in `B`.
`B`	the (p) vector of coefficients.
`block.partition`	a (p) vector indicating the block, in (1:kstar), of each predictor.
`p`	the number of covariates in the sample.
`kstar`	the number of underlying latent variables used to generate the covariate matrix `X`, `kstar <= p`. `kstar` is also the number of blocks in the covariate matrix (see details).
`lstar`	the number of blocks in the covariate matrix `X` that are used to generate the response `Y`, i.e. with non-null coefficients in vector `B`, `lstar <= kstar`.
`p0`	the number of predictors with non-null coefficients in `B`.
`block.sel`	a (lstar) vector indicating the indices in (1:kstar) of the blocks whose predictors have non-null coefficients in `B`.
`beta.min`	the lower bound for non-null coefficients (see details).
`beta.max`	the upper bound for non-null coefficients (see details).
`mean.H`	the mean of the latent variables used to generate `X`.
`sigma.H`	the standard deviation of the latent variables used to generate `X`.
`sigma.F`	the standard deviation of the noise added to the latent variables used to generate `X`.
`seed`	a positive integer; if non-NULL, it fixes the seed (with the command `set.seed`) used for random number generation.

Author(s)

Ghislain Durif (https://gdurif.perso.math.cnrs.fr/).

References

Durif, G., Modolo, L., Michaelsson, J., Mold, J.E., Lambert-Lacroix, S., Picard, F., 2018. High dimensional classification with combined adaptive sparse PLS and logistic regression. Bioinformatics 34, 485–493. doi:10.1093/bioinformatics/btx571. Available at http://arxiv.org/abs/1502.05933.

Examples

### load plsgenomics library
library(plsgenomics)

### generating data
n <- 100
p <- 1000
nclass <- 3
sample1 <- sample.multinom(n=n, p=p, nb.class=nclass,
                           kstar=20, lstar=2, beta.min=0.25,
                           beta.max=0.75, mean.H=0.2,
                           sigma.H=10, sigma.F=5)

str(sample1)

plsgenomics documentation built on June 22, 2024, 7:30 p.m.