episelect: Model selection
In epistasis: Detecting Epistatic Selection with Partially Observed Genotype Data


Estimates the optimal regularization parameter at EM convergence, based on one of several information criteria.

episelect(epi.object, criteria = NULL, ebic.gamma = 0.5, loglik_Y = FALSE, ncores = NULL)

`epi.object`	An object with S3 class "epi"
`criteria`	Model selection criterion. "ebic" and "aic" are available. BIC model selection is obtained by using `criteria = "ebic"` and fixing `ebic.gamma = 0`.
`ebic.gamma`	The tuning parameter for ebic. Setting `ebic.gamma = 0` results in bic model selection. The default value is 0.5.
`loglik_Y`	Model selection based on either the log-likelihood of the observed data (`loglik_Y = TRUE`) or the joint log-likelihood of the observed and latent variables (`loglik_Y = FALSE`).
`ncores`	The number of cores to use for the calculations. Using `ncores = NULL` automatically detects the number of available cores and runs the computations in parallel.

This function computes the extended Bayesian information criterion (ebic), the Bayesian information criterion (bic), and the Akaike information criterion (aic) at EM convergence, based on either the observed or the joint log-likelihood. The observed log-likelihood can be obtained through

\ell_Y(\widehat{Θ}_λ) = Q(\widehat{Θ}_λ | \widehat{Θ}^{(m)}) - H (\widehat{Θ}_λ | \widehat{Θ}^{(m)}),

where Q can be calculated from the epistasis function and the H function is

H(\widehat{Θ}_λ | \widehat{Θ}^{(m)}_λ) = E_z[\ell_{Z | Y}(\widehat{Θ}_λ) | Y; \widehat{Θ}_λ] = E_z[\log f(z)| Y ;\widehat{Θ}_λ ] - \log p(y).
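
The decomposition of the observed log-likelihood into Q and H is the standard EM identity, sketched here in the notation above. The symbol \ell_{Y,Z}, denoting the joint log-likelihood of observed and latent variables, is introduced only for this illustration. Since the joint density factorizes as f(y, z) = p(y) f(z | y), for any Θ

\ell_Y(Θ) = \ell_{Y,Z}(Θ) - \ell_{Z | Y}(Θ).

The left-hand side does not depend on z, so taking the expectation over z given Y under \widehat{Θ}^{(m)} leaves it unchanged and gives

\ell_Y(Θ) = E_z[\ell_{Y,Z}(Θ) | Y; \widehat{Θ}^{(m)}] - E_z[\ell_{Z | Y}(Θ) | Y; \widehat{Θ}^{(m)}] = Q(Θ | \widehat{Θ}^{(m)}) - H(Θ | \widehat{Θ}^{(m)}).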

The "ebic" and "aic" model selection criteria can be obtained as follows:

ebic(λ) = -2 \ell(\widehat{Θ}_λ) + ( \log n + 4 γ \log p) df(λ)

aic(λ) = -2 \ell(\widehat{Θ}_λ) + 2 df(λ)

where df refers to the number of non-zero off-diagonal elements of \widehat{Θ}_λ, and γ \in [0, 1]. A typical value for ebic.gamma is 1/2, but it can also be tuned by experience. Fixing ebic.gamma = 0 results in bic model selection.
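
The two criteria above can be evaluated directly in base R. The following is a minimal sketch, not the package's implementation: `ebic_score`, `aic_score`, and `count_df` are hypothetical helpers, the log-likelihood value is made up, and counting each symmetric pair of off-diagonal entries once is an assumption about how df is defined.

```r
# Hypothetical helpers for illustration; not part of the epistasis package.
# loglik: log-likelihood at EM convergence for a given lambda
# n: sample size, p: number of variables, df: non-zero off-diagonal entries
ebic_score <- function(loglik, n, p, df, gamma = 0.5) {
  stopifnot(gamma >= 0, gamma <= 1)
  -2 * loglik + (log(n) + 4 * gamma * log(p)) * df
}

aic_score <- function(loglik, df) {
  -2 * loglik + 2 * df
}

# Count non-zero off-diagonal entries of a symmetric matrix,
# one per symmetric pair (an assumption, see the lead-in).
count_df <- function(theta) {
  sum(theta[upper.tri(theta)] != 0)
}

# Toy 3 x 3 precision matrix with two non-zero off-diagonal pairs
theta <- matrix(c(2.0, 0.5, 0.0,
                  0.5, 2.0, 0.3,
                  0.0, 0.3, 2.0), nrow = 3, byrow = TRUE)
df <- count_df(theta)

# Made-up log-likelihood value, for illustration only
ebic <- ebic_score(loglik = -120, n = 100, p = 3, df = df)
bic  <- ebic_score(loglik = -120, n = 100, p = 3, df = df, gamma = 0)
aic  <- aic_score(loglik = -120, df = df)
```

With `gamma = 0` the `4 γ log p` term vanishes, so `ebic_score` reduces to the usual bic penalty of `log(n)` per degree of freedom.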

An object with S3 class "episelect" is returned:

`opt.path`	The optimal graph selected from the graph path
`opt.theta`	The optimal precision matrix from the graph path
`opt.Sigma`	The optimal covariance matrix from the graph path
`ebic.scores`	Extended BIC scores for regularization parameter selection at EM convergence.
`opt.index`	The index of optimal regularization parameter.
`opt.rho`	The selected regularization parameter.

and anything else that is included in the input epi.object.
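
How `opt.index` and `opt.rho` relate to the scores can be sketched in base R. This sketch assumes the optimal model is the one that minimizes the chosen criterion, which is the usual convention for information criteria but is not stated explicitly here; the vectors `rho` and `scores` are made-up values, not output from the package.

```r
# Hypothetical regularization path and criterion scores (made-up values)
rho    <- c(0.40, 0.30, 0.20, 0.10, 0.05)
scores <- c(512.3, 498.7, 490.1, 495.6, 507.9)

# Assumption: the optimal model minimizes the information criterion
opt_index <- which.min(scores)  # position of the smallest score
opt_rho   <- rho[opt_index]     # regularization parameter at that position
```

Under the same assumption, the matching graph, precision matrix, and covariance matrix would be the elements at position `opt_index` of their respective paths.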

Pariya Behrouzi and Ernst C. Wit
Maintainer: Pariya Behrouzi pariya.behrouzi@gmail.com

1. P. Behrouzi and E. C. Wit. Detecting Epistatic Selection with Partially Observed Genotype Data Using Copula Graphical Models. arXiv, 2016.
2. Ibrahim, Joseph G., Hongtu Zhu, and Niansheng Tang. "Model selection criteria for missing-data problems using the EM algorithm." Journal of the American Statistical Association (2012).
3. D. Witten and J. Friedman. New insights and faster computations for the graphical lasso. Journal of Computational and Graphical Statistics, to appear, 2011.
4. J. Friedman, T. Hastie and R. Tibshirani. Sparse inverse covariance estimation with the lasso, Biostatistics, 2007.
5. Foygel, R. and M. Drton (2010). Extended bayesian information criteria for Gaussian graphical models. In Advances in Neural Information Processing Systems, pp. 604-612.

epistasis

## Not run: 
#simulate data
D <- episim(p=50, n=100, k= 3, adjacent = 3, alpha = 0.06 , beta = 0.06)
plot(D)

#detect epistatic selection path
out  <-  epistasis(D$data, method="gibbs", n.rho= 5, ncores= 1)

#different graph selection methods
sel.ebic1 <- episelect(out, criteria="ebic")
plot(sel.ebic1)

sel.ebic2 <- episelect(out, criteria="ebic", loglik_Y=TRUE)
plot(sel.ebic2)

sel.aic <- episelect(out, criteria="aic")
plot(sel.aic)

sel.bic <- episelect(out, criteria="ebic", ebic.gamma = 0)
plot(sel.bic)

## End(Not run)