marg1: Implements a profile likelihood based algorithm for...
In NPMLEmix: Two-Groups Mixture Model with Covariates


View source: R/myFUN.R

This function estimates the signal proportion and the signal density by using the marginal distribution of Y, followed by a profile likelihood based approach. It returns the vector of estimated local false discovery rates and the corresponding rejection set at a prespecified level for the false discovery rate.

marg1(y, x, blambda = 1e-06/length(y), level = 0.05)

`y`	The observed vector of z-scores.
`x`	The n \times p data matrix, where n must equal the length of y. To include an intercept, add a column of 1's to x.
`blambda`	The tolerance threshold used by the quasi-Newton method that estimates the signal proportion. Default is 1e-6/length(y). We recommend leaving it unchanged unless you have a specific reason to tune it.
`level`	The level at which the false discovery rate is to be controlled. Should be a scalar in [0,1]. Default set to 0.05.
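The argument requirements above can be sketched as follows. This is a minimal illustration on simulated data, not output from the package: the covariate names `z1` and `z2` and the simulated values are hypothetical, and the call is guarded because it assumes that NPMLEmix (with `marg1` exported) and Rmosek are both installed.

```r
# Illustrative sketch: building inputs that satisfy the argument requirements.
set.seed(42)
n  <- 200
z1 <- rnorm(n)              # hypothetical covariate
z2 <- runif(n)              # hypothetical covariate
x  <- cbind(1, z1, z2)      # leading column of 1's supplies the intercept
y  <- rnorm(n)              # placeholder z-scores (pure noise here)

# x must have one row per entry of y.
stopifnot(nrow(x) == length(y))

blambda <- 1e-6 / length(y) # the documented default tolerance
level   <- 0.05             # target false discovery rate, in [0, 1]

# Assumption: both packages are installed; otherwise the call is skipped.
if (requireNamespace("NPMLEmix", quietly = TRUE) &&
    requireNamespace("Rmosek", quietly = TRUE)) {
  fit <- NPMLEmix::marg1(y, x, blambda = blambda, level = level)
  str(fit$rejset)
}
```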

Note that the marginal distribution of Y under the two-groups model with covariates (Deb et al. 2018, see References) is the same as that in a standard two-groups model (Efron 2008, see References). Fixing \bar{\pi} = \mathbf{E}[\pi(X)], the signal density \phi_1(\cdot) is estimated using the Rmosek optimization suite. The primary idea is to approximate the mixing distribution G(\cdot) using \max\{100, \sqrt{n}\} components, each having a suitable Gaussian distribution. The signal proportion is then estimated using the BFGS algorithm. Finally, the algorithm chooses the best value of \bar{\pi} based on a profile likelihood approach.
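To make the Gaussian approximation of G(\cdot) concrete, here is a minimal sketch of how a signal density can be evaluated from a set of atoms and mixing probabilities. It is not the package's implementation: the equally spaced atom grid, the unit variance of each component, the rounding of \sqrt{n}, and the uniform placeholder weights are all assumptions made for illustration (in `marg1` the weights come from the Rmosek optimization).

```r
# Illustrative sketch: evaluating a Gaussian-mixture signal density.
set.seed(1)
n <- 500
y <- c(rnorm(400), rnorm(100, mean = 3))   # simulated z-scores

# Number of components, following the max{100, sqrt(n)} rule.
m <- max(100, ceiling(sqrt(n)))            # rounding up is an assumption

# Assumption: atoms lie on an equally spaced grid over the range of y.
atoms <- seq(min(y), max(y), length.out = m)

# Placeholder weights (uniform); a real fit would estimate these.
probs <- rep(1 / m, m)

# Assumption: component j is N(atoms[j], 1), so
# f1(y_i) = sum_j probs[j] * dnorm(y_i - atoms[j]).
kernel <- outer(y, atoms, function(a, b) dnorm(a - b))  # n x m matrix
f1y    <- as.vector(kernel %*% probs)

# Average log signal density over the data points.
ll <- mean(log(f1y))
```

The names `atoms`, `probs`, `f1y` and `ll` mirror the items of the `kwo` component in the returned list, but the values here are placeholders rather than estimates.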

This function returns a list consisting of the following:

`p`	The estimated prior probabilities, i.e., \hat{\pi}(\cdot) evaluated at the data points.
`b`	The estimates for the coefficient vector in the logistic function.
`f1y`	The vector of estimated signal density evaluated at the data points.
`kwo`	A list with four items: (i) atoms: the vector of means for the Gaussian distributions used to approximate G(\cdot); (ii) probs: the vector of probabilities for each Gaussian component used to approximate G(\cdot); (iii) f1y: same as f1y above; (iv) ll: the average of the logarithms of the entries of f1y.
`localfdr`	The vector of estimated local false discovery rates evaluated at the data points.
`den`	The vector of estimated conditional densities evaluated at the data points.
`ll`	The log-likelihood evaluated at the estimated optima.
`rejset`	The vector of 1s and 0s where 1 indicates that the corresponding hypothesis is to be rejected.
`pi0`	The average of the entries of the vector p.
`ll_list`	The vector of profile log-likelihoods corresponding to a pre-determined set of grid points for \bar{\pi}. The largest element of this vector is returned as ll.
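The relationship between `p`, `f1y`, `den`, `localfdr` and `rejset` can be sketched as follows. This is a hedged illustration with hypothetical stand-in values, not the package's code. It assumes that `p` holds signal probabilities, that the null density is standard normal (as is usual for z-scores), and that rejections follow one common rule: reject the hypotheses with the smallest local false discovery rates for as long as their running average stays at or below `level`. The page does not state which rule `marg1` applies.

```r
# Illustrative sketch: from estimated components to a rejection set.
# Hypothetical stand-ins for entries of the list returned by marg1.
y   <- c(-0.3, 0.5, 4.2, 3.8, 1.1, -0.7)      # z-scores
p   <- c(0.10, 0.10, 0.60, 0.50, 0.20, 0.10)  # assumed signal probabilities
f1y <- dnorm(y, mean = 3)                     # placeholder signal density

# Assumption: the null density is N(0, 1).
phi0 <- dnorm(y)

# Conditional density and local false discovery rate at each data point.
den      <- (1 - p) * phi0 + p * f1y
localfdr <- (1 - p) * phi0 / den

# Assumed rejection rule: take the k smallest local fdr values, where k is
# the largest count whose running mean does not exceed the target level.
level        <- 0.05
ord          <- order(localfdr)
running_mean <- cumsum(localfdr[ord]) / seq_along(ord)
k            <- sum(running_mean <= level)    # running mean is nondecreasing

rejset <- integer(length(y))
rejset[ord[seq_len(k)]] <- 1L                 # 1 marks a rejected hypothesis
```

Averaging the smallest local false discovery rates gives an estimate of the false discovery rate among the rejected hypotheses, which is why the cutoff is placed on the running mean rather than on individual values.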

Deb, N., Saha, S., Guntuboyina, A. and Sen, B., 2018. Two-component Mixture Model in the Presence of Covariates. arXiv preprint arXiv:1810.07897.

Koenker, R. and Mizera, I., 2014. Convex optimization, shape constraints, compound decisions, and empirical Bayes rules. Journal of the American Statistical Association, 109(506), pp.674-685.

Efron, B., 2008. Microarrays, empirical Bayes and the two-groups model. Statistical Science, 23(1), pp.1-22.

NPMLEmix documentation built on Dec. 6, 2020, 9:06 a.m.