makedata: Simulates data from the aforementioned model

Description

This function simulates observations from the aforementioned model when G(\cdot) is chosen as a finite Gaussian mixture. It also returns the true local false discovery rates, which determine the optimal multiple testing procedure.

Usage

makedata(n, x, sx, atoms, probs, variances)

Arguments

n

Number of z-scores to be generated.

x

n \times p data matrix. Do not add an additional column of 1's.

sx

The vector of coefficients for the logistic function. The first entry is treated as the intercept term. Its length must be compatible with x. See Details.

atoms

The vector of means for each component of the mixing distribution.

probs

The probability vector for the mixing distribution.

variances

The vector of variances for each component of the mixing distribution. Its length must be compatible with atoms and probs. See Details.

Details

Given X=x, a Bernoulli(π^*(x)) sample is drawn. If the outcome is 1, a z-score is drawn from φ_1(\cdot); if the outcome is 0, a z-score is drawn from φ(\cdot). Observations with Bernoulli outcome 1 are termed non-null observations, and those with outcome 0 are termed null observations.
The length of sx must be 1 more than the number of columns of the data matrix x.
The vectors atoms, probs and variances must all have the same length.
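
The sampling scheme described above can be sketched in base R. This is an illustration only, not the package source. The helper name simulate_sketch is hypothetical. The sketch assumes that π^*(x) is the logistic function of sx[1] + x %*% sx[-1], and that φ_1(\cdot) is the mixing distribution G(\cdot) convolved with standard Gaussian noise, so that component k yields a draw from N(atoms[k], variances[k] + 1).

```r
# Hypothetical helper for illustration; not part of NPMLEmix.
simulate_sketch <- function(n, x, sx, atoms, probs, variances) {
  # Compatibility checks stated in Details.
  stopifnot(nrow(x) == n,
            length(sx) == ncol(x) + 1,
            length(atoms) == length(probs),
            length(probs) == length(variances))

  # Assumed form of pi^*(x): logistic link with sx[1] as the intercept.
  pix <- as.vector(plogis(sx[1] + x %*% sx[-1]))

  # Bernoulli(pi^*(x)) draw: 1 = non-null, 0 = null.
  nonnull <- rbinom(n, size = 1, prob = pix)

  # Null z-scores come from the standard Gaussian density phi.
  y <- rnorm(n)

  # Non-null z-scores: pick a mixture component, then draw from
  # N(atom, variance + 1) (assumption: G convolved with N(0, 1) noise).
  idx <- which(nonnull == 1)
  comp <- sample.int(length(atoms), length(idx), replace = TRUE, prob = probs)
  y[idx] <- rnorm(length(idx), mean = atoms[comp],
                  sd = sqrt(variances[comp] + 1))

  list(y = y, pix = pix, nnind = idx)
}
```

The length checks at the top mirror the compatibility requirements on sx, atoms, probs and variances.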

Value

The output is a list with the following entries:

y

The vector of simulated z-scores.

x

The input data matrix.

pix

The vector of signal proportions.

f0y

The vector of standard Gaussian densities evaluated at simulated z-scores.

f1y

The vector of signal densities evaluated at simulated z-scores.

den

The vector of conditional densities evaluated at simulated z-scores.

localfdr

The vector of local false discovery rates evaluated at simulated z-scores. Note that the local FDR can be interpreted as one minus the posterior probability that a given observation is non-null.

ll

The average conditional log-likelihood.

nnind

The indices corresponding to non-null observations.
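
The entries above are tied together by the two-component mixture structure. The base R sketch below recomputes them for a few hand-picked z-scores. It is an illustration under assumptions, not the package's code: it assumes den = (1 - pix) * f0y + pix * f1y, localfdr = (1 - pix) * f0y / den, that ll is the mean of log(den), and that f1y is a Gaussian mixture density with component variances variances + 1. The values of y and pix are made up.

```r
# Made-up inputs for illustration only.
y   <- c(-2.5, 0.1, 3.0)     # z-scores
pix <- c(0.10, 0.20, 0.30)   # signal proportions pi^*(x)
atoms <- c(-2, 0, 2)
probs <- c(0.48, 0.04, 0.48)
variances <- c(1, 16, 1)

# Standard Gaussian (null) density at each z-score.
f0y <- dnorm(y)

# Assumed signal density: Gaussian mixture with variances + 1.
f1y <- sapply(y, function(z) {
  sum(probs * dnorm(z, mean = atoms, sd = sqrt(variances + 1)))
})

# Conditional density of each z-score given its covariates.
den <- (1 - pix) * f0y + pix * f1y

# Local FDR: posterior probability that the observation is null.
localfdr <- (1 - pix) * f0y / den

# Average conditional log-likelihood.
ll <- mean(log(den))
```

Under these assumptions, 1 - localfdr equals pix * f1y / den, the posterior probability that an observation is non-null.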

References

Basu, P., Cai, T.T., Das, K. and Sun, W., 2018. Weighted false discovery rate control in large-scale multiple testing. Journal of the American Statistical Association, 113(523), pp.1172-1183.

Scott, J.G., Kelly, R.C., Smith, M.A., Zhou, P. and Kass, R.E., 2015. False discovery rate regression: an application to neural synchrony detection in primary visual cortex. Journal of the American Statistical Association, 110(510), pp.459-471.

Examples

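library(NPMLEmix)
n <- 1000
# n x 2 covariate matrix; no column of 1's is added
x <- cbind(runif(n), runif(n))
atoms <- c(-2, 0, 2)
probs <- c(0.48, 0.04, 0.48)
variances <- c(1, 16, 1)
# Intercept followed by one coefficient per column of x
sx <- c(-3, 1.5, 1.5)
### Generating the data ###
st <- makedata(n, x, sx, atoms, probs, variances)
### Output the vector of local false discovery rates ###
st$localfdr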
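
One way to turn a vector of local false discovery rates into rejections is the step-up rule that rejects the hypotheses with the smallest values for as long as their running average stays at or below a target level. The sketch below is a generic base R illustration of that rule, not a function of this package. The name lfdr_reject, the default alpha and the input values are all hypothetical.

```r
# Hypothetical helper for illustration; not part of NPMLEmix.
lfdr_reject <- function(lfdr, alpha = 0.1) {
  # Order hypotheses from smallest to largest local FDR.
  ord <- order(lfdr)

  # Running average of the sorted local FDRs (nondecreasing).
  running_mean <- cumsum(lfdr[ord]) / seq_along(lfdr)

  # Largest k whose running average is at most alpha.
  k <- sum(running_mean <= alpha)

  # Flag the k hypotheses with the smallest local FDR.
  reject <- logical(length(lfdr))
  reject[ord[seq_len(k)]] <- TRUE
  reject
}

# Made-up local FDR values.
lfdr <- c(0.01, 0.9, 0.05, 0.4, 0.02)
decisions <- lfdr_reject(lfdr, alpha = 0.1)
```

With simulated data, the same helper could be applied to st$localfdr in place of the made-up vector.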

NPMLEmix documentation built on Dec. 6, 2020, 9:06 a.m.