phyloglm | R Documentation |
Fits the phylogenetic logistic regression described in Ives and Garland (2010) and the Poisson regression described in Paradis and Claude (2002). The computation uses an algorithm that is linear in the number of tips in the tree.
phyloglm(formula, data, phy, method = c("logistic_MPLE","logistic_IG10", "logistic_MLE",
"poisson_GEE"), btol = 10, log.alpha.bound = 4,
start.beta=NULL, start.alpha=NULL,
boot = 0, full.matrix = TRUE, save = FALSE)
formula |
a model formula. |
data |
a data frame containing variables in the model. If not
found in |
phy |
a phylogenetic tree of type phylo with branch lengths. |
method |
The "logistic_IG10" method optimizes a GEE approximation to the penalized likelihood of the logistic regression. "logistic_MPLE" maximizes the penalized likelihood of the logistic regression. In both cases, the penalty is Firth's correction. The "poisson_GEE" method solves the generalized estimating equations (GEE) for Poisson regression. |
btol |
(logistic regression only) bound on the linear predictor to bound the searching space. |
log.alpha.bound |
(logistic regression only) bound for the log of the parameter alpha. |
start.beta |
starting values for beta coefficients. |
start.alpha |
(logistic regression only) starting values for alpha (phylogenetic correlation). |
boot |
number of independent bootstrap replicates, |
full.matrix |
if |
save |
if |
This function uses an algorithm that is linear in the number of tips in the tree.
Bootstrapping can be parallelized using the future
package on any future
compatible back-end. For example, run library(future); plan(multiprocess))
,
after which bootstrapping will automatically occur in parallel. See
plan
for options.
coefficients |
the named vector of coefficients. |
alpha |
(logistic regression only) the phylogenetic correlation parameter. |
scale |
(Poisson regression only) the scale parameter which estimates the overdispersion. |
sd |
standard deviation for the regression coefficients. |
vcov |
covariance matrix for the regression coefficients. |
logLik |
(logistic regression only) log likelihood. |
aic |
(logistic regression only) AIC. |
penlogLik |
(logistic regression only) penalized log likelihood, using Firth's penalty for coefficients. |
y |
response. |
n |
number of observations (tips in the tree). |
d |
number of dependent variables. |
formula |
the model formula. |
call |
the original call to the function. |
method |
the estimation method. |
convergence |
An integer code with '0' for successful optimization. With logistic_MPLE, this is the convergence code from the |
alphaWarn |
(logistic regression only) An interger code with '0' for the estimate of alpha is not near the lower and upper bounds, code with '1' for the estimate of alpha near the lower bound, code with '2' for the estimate of alpha near the upper bound. |
X |
a design matrix with one row for each tip in the phylogenetic tree. |
bootmean |
( |
bootsd |
( |
bootconfint95 |
( |
bootmeanAlog |
( |
bootsdAlog |
( |
bootnumFailed |
( |
bootstrap |
( |
bootdata |
( |
The tip labels in the tree are matched to the data names (row names in
the data frame). If no data frame is provided through the argument data
,
taxon labels in the tree are matched to taxon labels in the response
variable based on the row names of the response vector, and the taxa are
assumed to come in the same order for all variables in the model.
The logistic regression method of Ives and Garland (2010) uses alpha to estimate the level of phylogenetic correlation. The GEE method for Poisson regression does not estimate the level of phylogenetic correlation but takes it from the existing branch lengths in the tree.
The standard deviation and the covariance matrix for the coefficients of logistic regression are conditional on the estimated value of the phylogenetic correlation parameter \alpha
.
The default choice btol=10
constrains the fitted values, i.e. the probability of "1" predicted by the model, to lie within
1/(1+e^{ 10})=0.000045
and
1/(1+e^{-10})=0.999955
.
The log of \alpha
is bounded in the interval
-\log(T) \pm \mathrm{log.alpha.bound}
where T
is the mean of the distances from the root to tips. In
other words, \alpha T
is constrained to lie within
\exp(\pm\mathrm{log.alpha.bound})
.
Lam Si Tung Ho, Robert Lachlan, Rachel Feldman and Cecile Ane
Ho, L. S. T. and Ane, C. 2014. "A linear-time algorithm for Gaussian and non-Gaussian trait evolution models". Systematic Biology 63(3):397-408.
Ives, A. R. and T. Garland, Jr. 2010. "Phylogenetic logistic regression for binary dependent variables". Systematic Biology 59:9-26.
Paradis E. and Claude J. 2002. "Analysis of Comparative Data Using Generalized Estimating Equations". Journal of Theoretical Biology 218:175-185.
compar.gee
.
set.seed(123456)
tre = rtree(50)
x = rTrait(n=1,phy=tre)
X = cbind(rep(1,50),x)
y = rbinTrait(n=1,phy=tre, beta=c(-1,0.5), alpha=1 ,X=X)
dat = data.frame(trait01 = y, predictor = x)
fit = phyloglm(trait01~predictor,phy=tre,data=dat,boot=100)
summary(fit)
coef(fit)
vcov(fit)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.