kl.compreg: Divergence based regression for compositional data

View source: R/kl.compreg.R

Divergence based regression for compositional dataR Documentation

Divergence based regression for compositional data

Description

Regression for compositional data based on the Kullback-Leibler the Jensen-Shannon divergence and the symmetric Kullback-Leibler divergence.

Usage

kl.compreg(y, x, con = TRUE, B = 1, ncores = 1, xnew = NULL, tol = 1e-07, maxiters = 50)
js.compreg(y, x, con = TRUE, B = 1, ncores = 1, xnew = NULL)
tv.compreg(y, x, con = TRUE, B = 1, ncores = 1, xnew = NULL)
symkl.compreg(y, x, con = TRUE, B = 1, ncores = 1, xnew = NULL)

Arguments

y

A matrix with the compositional data (dependent variable). Zero values are allowed.

x

The predictor variable(s), they can be either continnuous or categorical or both.

con

If this is TRUE (default) then the constant term is estimated, otherwise the model includes no constant term.

B

If B is greater than 1 bootstrap estimates of the standard error are returned. If B=1, no standard errors are returned.

ncores

If ncores is 2 or more parallel computing is performed. This is to be used for the case of bootstrap. If B=1, this is not taken into consideration.

xnew

If you have new data use it, otherwise leave it NULL.

tol

The tolerance value to terminate the Newton-Raphson procedure.

maxiters

The maximum number of Newton-Raphson iterations.

Details

In the kl.compreg() the Kullback-Leibler divergence is adopted as the objective function. In case of problematic convergence the "multinom" function by the "nnet" package is employed. This will obviously be slower. The js.compreg() uses the Jensen-Shannon divergence and the symkl.compreg() uses the symmetric Kullback-Leibler divergence. The tv.compreg() uses the Total Variation divergence. There is no actual log-likelihood for the last three regression models.

Value

A list including:

runtime

The time required by the regression.

iters

The number of iterations required by the Newton-Raphson in the kl.compreg function.

loglik

The log-likelihood. This is actually a quasi multinomial regression. This is bascially half the negative deviance, or - \sum_{i=1}^ny_i\log{y_i/\hat{y}_i}.

be

The beta coefficients.

covbe

The covariance matrix of the beta coefficients, if bootstrap is chosen, i.e. if B > 1.

est

The fitted values of xnew if xnew is not NULL.

Author(s)

Michail Tsagris.

R implementation and documentation: Michail Tsagris mtsagris@uoc.gr and Giorgos Athineou <gioathineou@gmail.com>.

References

Murteira, Jose MR, and Joaquim JS Ramalho 2016. Regression analysis of multivariate fractional data. Econometric Reviews 35(4): 515-552.

Tsagris, Michail (2015). A novel, divergence based, regression for compositional data. Proceedings of the 28th Panhellenic Statistics Conference, 15-18/4/2015, Athens, Greece. https://arxiv.org/pdf/1511.07600.pdf

Endres, D. M. and Schindelin, J. E. (2003). A new metric for probability distributions. Information Theory, IEEE Transactions on 49, 1858-1860.

Osterreicher, F. and Vajda, I. (2003). A new class of metric divergences on probability spaces and its applicability in statistics. Annals of the Institute of Statistical Mathematics 55, 639-653.

See Also

diri.reg, ols.compreg, comp.reg

Examples

library(MASS)
x <- as.vector(fgl[, 1])
y <- as.matrix(fgl[, 2:9])
y <- y / rowSums(y)
mod1<- kl.compreg(y, x, B = 1, ncores = 1)
mod2 <- js.compreg(y, x, B = 1, ncores = 1)

Compositional documentation built on Oct. 9, 2024, 5:10 p.m.