npmleEM: Implements the full likelihood approach based on the EM...


View source: R/myFUN.R

Description

This function estimates the signal proportion and the signal density by maximizing the full likelihood of the sample with an EM-based algorithm. It returns the vector of estimated local false discovery rates and the corresponding rejection set at a prespecified level for the false discovery rate.

Usage

npmleEM(y, x, level = 0.05, initp = 1)

Arguments

y

The observed vector of z-scores.

x

The n \times p data matrix, where n must be equal to the length of y. To include an intercept, add a column of 1's to x.

level

The level at which the false discovery rate is to be controlled. Must be a scalar in [0,1]. Defaults to 0.05.

initp

The initialization method for the EM algorithm. Must be one of 1, 2, 3 or 4: 1 uses a marg1() initialization, 2 uses a marg2() initialization, 3 uses a FDRreg() initialization (see Details and References), and 4 picks whichever of the marg1(), marg2() and FDRreg() initializations yields the highest sample likelihood. Defaults to 1.

Details

The key observation in the full likelihood approach is that the M-step of the EM algorithm results in two decoupled optimization problems, one involving π^*(\cdot) and the other involving φ_1(\cdot). These two problems are then solved using the BFGS algorithm and the Rmosek optimization suite, respectively, as described in the Details sections of marg1() and marg2().
The FDRreg() method was introduced in Scott et al. (2015). We recommend using the version of the FDRreg package available at https://github.com/jgscott/FDRreg/tree/master/R_pkg.
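
The decoupling mentioned above can be sketched in standard two-groups notation. This is an illustration, not a transcription of the package internals: the null density φ_0 and the E-step weights w_i are notation introduced here, and the exact parametrization should be taken from Deb et al. (2018). Assuming each observation follows the mixture (1 - π^*(x_i)) φ_0(y_i) + π^*(x_i) φ_1(y_i), one EM iteration reads:

```latex
% E-step: posterior probability that observation i is a signal
w_i = \frac{\pi(x_i)\,\varphi_1(y_i)}
           {(1-\pi(x_i))\,\varphi_0(y_i) + \pi(x_i)\,\varphi_1(y_i)},
\qquad i = 1,\dots,n.

% M-step: the expected complete-data log-likelihood splits into two sums
Q(\pi,\varphi_1)
  = \underbrace{\sum_{i=1}^{n}\Big[ w_i \log \pi(x_i)
      + (1-w_i)\log\big(1-\pi(x_i)\big) \Big]}_{\text{depends only on } \pi}
  \;+\; \underbrace{\sum_{i=1}^{n} w_i \log \varphi_1(y_i)}_{\text{depends only on } \varphi_1}
  \;+\; \sum_{i=1}^{n} (1-w_i)\log \varphi_0(y_i).
```

The last sum does not involve either unknown, so the first term can be maximized over the logistic coefficients and the second over the mixing distribution G(\cdot) separately. Under this notation, 1 - w_i at convergence is the local false discovery rate of observation i.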

Value

This function returns a list consisting of the following:

atoms

The vector of means for the Gaussian distributions used to approximate G(\cdot).

probs

The vector of probabilities for each Gaussian component used to approximate G(\cdot).

f1y

The vector of estimated signal densities evaluated at the data points.

f0y

The vector of null densities evaluated at the data points.

b

The estimates for the coefficient vector in the logistic function.

p

The estimated prior probabilities, i.e., \hatπ(\cdot) evaluated at the data points.

ll

The log-likelihood evaluated at the estimated optima.

rejset

The vector of 1s and 0s where 1 indicates that the corresponding hypothesis is to be rejected.

den

The vector of estimated conditional densities evaluated at the data points.

localfdr

The vector of estimated local false discovery rates evaluated at the data points.
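
How the returned fields relate to one another can be sketched as follows. This is a self-contained toy illustration under stated assumptions, not the package's code: the vectors p, f0y and f1y below are hypothetical stand-ins for the corresponding list elements, the null is assumed to be standard normal, and the rejection rule (reject the largest set of hypotheses whose average local fdr stays at or below level) is a common choice that may differ from what npmleEM() does internally.

```r
# Hypothetical stand-ins for fields of the list returned by npmleEM();
# in practice these would be read from the fitted object.
set.seed(1)
n   <- 200
y   <- c(rnorm(150), rnorm(50, mean = 3))  # toy z-scores
p   <- rep(0.25, n)                        # stand-in for $p
f0y <- dnorm(y)                            # stand-in for $f0y (assumed N(0,1) null)
f1y <- dnorm(y, mean = 3)                  # stand-in for $f1y

# Conditional density of y given x, and the local false discovery rate
den      <- (1 - p) * f0y + p * f1y
localfdr <- (1 - p) * f0y / den

# Assumed rejection rule: sort by local fdr and reject the largest set
# whose running mean of local fdr does not exceed the level.
level  <- 0.05
ord    <- order(localfdr)
runavg <- cumsum(localfdr[ord]) / seq_along(ord)
k      <- sum(runavg <= level)             # runavg is non-decreasing
rejset <- integer(n)
if (k > 0) rejset[ord[seq_len(k)]] <- 1L

sum(rejset)  # number of rejected hypotheses
```

The running mean of the sorted local fdr values estimates the false discovery rate of the corresponding rejection set, which is why thresholding it at level gives a set of the kind described by rejset.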

References

Deb, N., Saha, S., Guntuboyina, A. and Sen, B., 2018. Two-component Mixture Model in the Presence of Covariates. arXiv preprint arXiv:1810.07897.

Scott, J.G., Kelly, R.C., Smith, M.A., Zhou, P. and Kass, R.E., 2015. False discovery rate regression: an application to neural synchrony detection in primary visual cortex. Journal of the American Statistical Association, 110(510), pp.459-471.

Examples

require(NPMLEmix)
### Use example data ###
st=makedata(100,cbind(runif(100),runif(100)),c(0,1,-1),c(0,1),c(0.4,0.6),c(1,1))
### Use the default rejection level and default initialization ###
npmle1=npmleEM(st$y, cbind(1, st$xs))
### Use a new rejection level of 0.1 and marg2() initialization ###
npmle2=npmleEM(st$y, cbind(1, st$xs), level = 0.1, initp = 2)
### Use a new rejection level of 0.1 and FDRreg() initialization ###
npmle3=npmleEM(st$y, cbind(1, st$xs), level = 0.1, initp = 3)
### Use a rejection level of 0.2 and the best initialization among the other three ###
npmle4=npmleEM(st$y, cbind(1, st$xs), level = 0.2, initp = 4)
### Output the vector of prior probabilities ###
npmle1$p
### Output the rejection set ###
npmle2$rejset
### Output the vector of local false discovery rates ###
npmle3$localfdr
### Output the vector of estimated conditional densities ###
npmle4$den

NabarunD/NPMLEmix documentation built on June 19, 2020, 12:11 p.m.