nparncp: Nonparametric estimation of noncentrality parameters
In pi0: Estimating the proportion of true null hypotheses for FDR


The functions use Gaussian basis functions to estimate the noncentrality parameters (ncp) from a large number of t-statistics.

nparncpt(tstat, df, ...)
nparncpt.sqp(tstat, df, penalty=3L, lambdas=10^seq(-1,5,by=1), starts,
             IC=c('BIC','CAIC','HQIC','AIC'), K=100,
             bounds=quantile(tstat,c(.01,.99)),
             solver=c('solve.QP','lsei','ipop','LowRankQP'),
             plotit=FALSE, verbose=FALSE, approx.hess=TRUE, ... )
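To make the default `lambdas` and `bounds` expressions in the usage concrete, the following base-R sketch evaluates them on simulated data. The vector `tstat_demo` is a made-up stand-in for illustration, not the package's `simulatedTstat` data set.

```r
# Hypothetical stand-in for a vector of observed t-statistics
set.seed(1)
tstat_demo <- rt(1000, df = 8)

# Default grid of smoothness tuning parameters: 0.1, 1, 10, ..., 1e5
lambdas_default <- 10^seq(-1, 5, by = 1)

# Default bounds: the 1st and 99th percentiles of the t-statistics
bounds_default <- quantile(tstat_demo, c(.01, .99))

print(lambdas_default)
print(bounds_default)
```

If the fitted mixing proportions are not small near the edges of this range, widening `bounds` by hand may be worth trying.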

`tstat`	Numeric vector of observed t-statistics
`df`	Numeric vector of degrees of freedom
`penalty`	An integer scalar from 1 through 5, giving the order of the derivative of the estimated ncp density function used in the penalty. The penalty on the log likelihood function is the integral of the square of that derivative. A character value among `c('1st.deriv','2nd.deriv','3rd.deriv','4th.deriv','5th.deriv')` is also accepted but deprecated.
`lambdas`	Numeric vector of values of the smoothness tuning parameter `lambda` to be tried. The one that minimizes the network information criterion (NIC) will be chosen.
`starts`	Optional numeric vector of starting values. If missing, `parncpt` is called with `zeromean` set to `FALSE` to get an initial estimate of `pi0`, and the starting values (`theta`) are set equal to each other so that they sum to `1-pi0`. Note that these starting values are used for the largest of `lambdas` only. For smaller `lambdas`, the estimates from larger `lambdas` are used as starting values (i.e., a warm start).

`IC`	Character; one of `AIC`, `BIC`, `CAIC`, `HQIC`, specifying the factor by which the effective number of parameters (ENP) is multiplied in computing the information criterion (IC).
`K`	The number of basis Gaussian density functions.
`bounds`	A numeric vector of length 2, giving the approximate bounds where most of the probability of ncp lies.
`solver`	Character. The name of the function used to solve the quadratic programming problems. Note that `ipop` (from the kernlab package) is not very reliable. `solve.QP` is faster, but `lsei` is more stable.
`plotit`	logical; whether `plot.nparncpt` should be called after estimation. Inspecting this plot is always recommended before accepting the results.
`verbose`	logical; if `TRUE`, extensive messages will be printed.
`approx.hess`	either logical or a number between 0 and 1. This helps reduce the time spent evaluating the Hessian matrix. If it is set to `TRUE`, then for the kth Gaussian basis function and the gth `tstat`, the marginal t-statistic density evaluated at this `tstat` is set to zero if it is below the average of all `K*length(tstat)` such values. If it is set to `FALSE` or 0, none of the density values are treated as zero, no matter how small they are. If it is set to a number between 0 and 1, values below this quantile are treated as zero. Note that this approximation only affects the computation of the Hessian matrix, which does not need to be exact in an optimization routine. Hence, reasonable sparseness speeds up computation of the Hessian matrix but might increase the number of iterations needed to converge. Setting this to `TRUE` seems a reasonable trade-off between the two effects and usually saves computing time.
`...`	other parameters passed to `dtn.mix`, usually the `approximation` argument.
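The thresholding rule described for `approx.hess` can be sketched in base R as follows. This is only an illustration of the rule as stated above, not the package's internal code: the function name `sparsify_density` and the toy matrix `dens_demo` are hypothetical.

```r
# Hypothetical helper: zero out small density values according to approx.hess.
# `dens` stands for the matrix of marginal t-statistic densities
# (one row per tstat, one column per Gaussian basis function).
sparsify_density <- function(dens, approx.hess = TRUE) {
  if (is.logical(approx.hess)) {
    if (!approx.hess) return(dens)        # FALSE: keep every value
    cutoff <- mean(dens)                  # TRUE: cut below the overall average
  } else {
    if (approx.hess == 0) return(dens)    # 0: keep every value
    cutoff <- quantile(dens, probs = approx.hess, names = FALSE)
  }
  dens[dens < cutoff] <- 0
  dens
}

# Toy 2 x 2 matrix of made-up density values
dens_demo   <- matrix(c(0.001, 0.2, 0.3, 0.5), nrow = 2)
sparse_mean <- sparsify_density(dens_demo, TRUE)   # cutoff is the mean
sparse_q25  <- sparsify_density(dens_demo, 0.25)   # cutoff is the 25% quantile
dense       <- sparsify_density(dens_demo, FALSE)  # unchanged
```

A sparser matrix makes each Hessian evaluation cheaper, at the possible cost of more iterations, which is the trade-off noted for `approx.hess`.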

nparncpt is a wrapper for nparncpt.sqp, the latter of which uses a sequential quadratic programming algorithm to find the mixing proportions of the basis Gaussian density functions.
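The role of the mixing proportions can be illustrated with a short base-R sketch. It assumes, purely for illustration, that the basis means are equally spaced across `bounds` with a common SD equal to that spacing, and that the mixing proportions are the `theta` values rescaled to sum to one; the package's actual construction may differ. The names `ncp_density`, `mus`, `sigs`, `theta_demo` and `beta_demo` are hypothetical.

```r
K      <- 10
bounds <- c(-4, 4)

# Assumed layout of the Gaussian basis: equally spaced means, common SD
mus  <- seq(bounds[1], bounds[2], length.out = K)
sigs <- rep(diff(mus)[1], K)

# Equal starting weights that sum to 1 - pi0, as described for `starts`
pi0_demo   <- 0.75
theta_demo <- rep((1 - pi0_demo) / K, K)

# Assumed relation: mixing proportions are theta rescaled to sum to 1
beta_demo <- theta_demo / sum(theta_demo)

# Density of the nonzero ncp as a mixture of the Gaussian basis functions
ncp_density <- function(x, beta, mus, sigs) {
  # rows index x values, columns index basis functions
  basis <- sapply(seq_along(mus), function(k) dnorm(x, mean = mus[k], sd = sigs[k]))
  drop(basis %*% beta)
}

# Numerical check that the mixture integrates to about 1
x_grid <- seq(-15, 15, by = 0.01)
mass   <- sum(ncp_density(x_grid, beta_demo, mus, sigs)) * 0.01
```

In this picture, estimation amounts to choosing the weights subject to nonnegativity and sum constraints, which is why a quadratic programming solver is a natural fit.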

A list with class attribute c("nparncpt", "ncpest")

`pi0`	estimated proportion of true nulls
`mu.ncp`	mean of ncp
`sd.ncp`	SD of ncp
`logLik`	an object of class `logLik`. The associated `df` is the estimated effective number of parameters (ENP). The log likelihood reported here is the penalized log likelihood. See also `logLik.ncpest` and `AIC`.
`enp`	estimated ENP
`par`	estimated parameters `theta`
`lambda`	the lambda that minimizes NIC
`gradiant`	analytic gradient at the estimate
`hessian`	analytic Hessian at the estimate
`beta`	estimated mixing proportions for the NCP distribution
`IC`	the information criterion specified by the user
`all.mus`	mean of each basis Gaussian density
`all.sigs`	SD of each basis Gaussian density
`data`	a list of `tstat` and `df`
`i.final`	the index of `lambdas` that minimizes NIC
`all.pi0s`	estimated pi0 for each lambda
`all.enps`	ENP for each lambda
`all.thetas`	parameter estimates for each lambda
`all.nics`	Network information criterion (NIC) for each lambda
`all.nic.sd`	SD of NIC for each lambda
`all.lambdas`	the `lambdas` argument itself
`nobs`	the number of test statistics
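How an information criterion factor and the ENP might combine to pick one `lambda` can be sketched as below. This assumes the conventional form -2 × log likelihood + factor × ENP and the textbook factors for AIC, BIC, CAIC and HQIC; whether the package computes its criterion in exactly this way is not confirmed here. The helper `ic_factor` and all numbers are made up for illustration.

```r
# Hypothetical helper: textbook penalty factor per effective parameter
ic_factor <- function(IC = c("BIC", "CAIC", "HQIC", "AIC"), n) {
  IC <- match.arg(IC)
  switch(IC,
         AIC  = 2,
         BIC  = log(n),
         CAIC = log(n) + 1,
         HQIC = 2 * log(log(n)))
}

# Made-up log likelihoods and ENPs for three candidate lambdas
loglik_demo <- c(-1510, -1512, -1530)
enp_demo    <- c(9.5, 3.2, 2.1)
n_demo      <- 1000

# Criterion value for each lambda and the index of the smallest one
crit_demo <- -2 * loglik_demo + ic_factor("BIC", n_demo) * enp_demo
i_best    <- which.min(crit_demo)
```

In this toy case the middle candidate wins: it gives up a little log likelihood but uses far fewer effective parameters than the first.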

`df` may be `Inf` for z-tests. In that case, the `approximation` argument is ignored.
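The reason `df=Inf` corresponds to z-tests can be checked directly in base R: with infinite degrees of freedom, the t density coincides with the normal density, with the noncentrality parameter acting as the mean. The sketch below uses only `dt` and `dnorm` from the stats package; the variable names are illustrative.

```r
x <- seq(-4, 4, by = 0.5)

# Central case: t density with infinite df versus standard normal density
max_diff_central <- max(abs(dt(x, df = Inf) - dnorm(x)))

# Noncentral case: ncp = 1 versus a normal density with mean 1
max_diff_noncentral <- max(abs(dt(x, df = Inf, ncp = 1) - dnorm(x, mean = 1)))
```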

Long Qu

Qu L, Nettleton D, Dekkers JCM. (2012) Improved Estimation of the Noncentrality Parameter Distribution from a Large Number of t-statistics, with Applications to False Discovery Rate Estimation in Microarray Data Analysis. Biometrics, 68, 1178–1187.

parncpt, sparncpt, fitted.nparncpt, plot.nparncpt, summary.nparncpt, coef.ncpest, logLik.ncpest, vcov.ncpest, AIC, dncp

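## Not run: 
library(pi0)
data(simulatedTstat)
# Nonparametric fit
(npfit <- nparncpt(tstat=simulatedTstat, df=8))
# Parametric fits, without and with the zero-mean constraint
(pfit <- parncpt(tstat=simulatedTstat, df=8, zeromean=FALSE)); plot(pfit)
(pfit0 <- parncpt(tstat=simulatedTstat, df=8, zeromean=TRUE)); plot(pfit0)
# Combine the nonparametric and parametric fits
(spfit <- sparncpt(npfit, pfit)); plot(spfit)

## End(Not run)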

pi0= 0.7483634
mu.ncp= -0.02254265
sd.ncp= 1.523897
enp= 2.408478
lambda= 100
Warning message:
In nparncpt.sqp(tstat, df, ...) :
  Less than half of the estimated coefficients (betas) are less than 0.01. Your might want to try enlarging the `bounds` argument.
pi0 (proportion of null hypotheses) = 0.7483103
mu.ncp (mean of noncentrality parameters) = -0.03791745
sd.ncp (SD of noncentrality parameters) = 1.624555
pi0 (proportion of null hypotheses) = 0.7486391
mu.ncp (mean of noncentrality parameters) = 0
sd.ncp (SD of noncentrality parameters) = 1.626181
pi0= 0.7483134
mu.ncp= -0.03704109
sd.ncp= 1.534416
rho= 0.943
enp= 3.966283