phyloglmstep: Stepwise model selection for Phylogenetic Generalized Linear...

Description Usage Arguments Details Value Author(s) See Also Examples

View source: R/phyloglmstep.R

Description

Performs stepwise model selection for phylogenetic generalized linear models, using the criterion -2*log-likelihood + k*npar, where npar is the number of estimated parameters and k=2 for the usual AIC.

Usage

1
2
3
4
5
6
phyloglmstep(formula, starting.formula = NULL, data=list(), phy, 
       method = c("logistic_MPLE","logistic_IG10"),
       direction = c("both", "backward", "forward"), trace = 2,
       btol = 10, log.alpha.bound = 4, start.beta=NULL, 
       start.alpha=NULL, boot = 0, full.matrix = TRUE,
       k=2, ...)

Arguments

formula

formula of the full model.

starting.formula

optional formula of the starting model.

data

a data frame containing variables in the model. If not found in data, the variables are taken from current environment.

phy

a phylogenetic tree of type phylo with branch lengths.

method

The "logistic_IG10" method optimizes a GEE approximation to the penalized likelihood of the logistic regression. "logistic_MPLE" maximizes the penalized likelihood of the logistic regression. In both cases, the penalty is Firth's correction.

direction

direction for stepwise search, can be both, forward, and backward.

trace

if positive, information on each searching step is printed. Larger values may give more detailed information.

btol

bound on the linear predictor to bound the searching space.

log.alpha.bound

bound for the log of the parameter alpha.

start.beta

starting values for beta coefficients.

start.alpha

starting values for alpha (phylogenetic correlation).

boot

number of independent bootstrap replicates, 0 means no bootstrap.

full.matrix

if TRUE, the full matrix of bootstrap estimates (coefficients and alpha) will be returned.

k

optional weight for the penalty.

...

further arguments to be passed to the function optim.

Details

The default k=2 corresponds to the usual AIC penalty. Use k=log(n) for the usual BIC, although it is unclear how BIC should be defined for phylogenetic regression.

See phyloglm for details on the possible phylogenetic methods for the error term, for default bounds on the phylogenetic signal parameters, or for matching tip labels between the tree and the data.

Value

A phyloglm object correponding to the best model is returned.

Author(s)

Rutger Vos

See Also

phyloglm.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
set.seed(123456)
tre = rcoal(60)
taxa = sort(tre$tip.label)
b0=0; b1=1;
x1 = rTrait(phy=tre,model="BM",
           parameters=list(ancestral.state=0,sigma2=10))
x2 = rTrait(phy=tre,model="BM",
            parameters=list(ancestral.state=0,sigma2=10))
x3 = rTrait(phy=tre,model="BM",
            parameters=list(ancestral.state=0,sigma2=10))
X = cbind(rep(1,60), x1)
y = rbinTrait(n=1,phy=tre, beta=c(-1,0.5), alpha=1 ,X=X)
dat = data.frame(trait=y[taxa],pred1=x1[taxa],pred2=x2[taxa],pred3=x3[taxa])
fit = phyloglmstep(trait~pred1+pred2+pred3,data=dat,phy=tre,method="logistic_MPLE",direction="both")
summary(fit)

Example output

Loading required package: ape
----------
Starting model: trait ~ 1 + pred1 + pred2 + pred3
Direction: both
AIC(k=2): 18.3022809094149
	Proposed: trait ~ 1 + pred2 + pred3
	AIC(k=2): 15.5581166122006
	Proposed: trait ~ 1 + pred1 + pred3
	AIC(k=2): 10.3573773706303
	Proposed: trait ~ 1 + pred1 + pred2
	AIC(k=2): 10.3737026748776
----------
Step 1
Current model: trait ~ 1 + pred1 + pred3
AIC(k=2): 10.3573773706303
	Proposed: trait ~ 1 + pred3
	AIC(k=2): 13.6212547118843
	Proposed: trait ~ 1 + pred1 + pred2 + pred3
	AIC(k=2): 18.3022809094149
	Proposed: trait ~ 1 + pred1
	AIC(k=2): 7.90572591676469
----------
Step 2
Current model: trait ~ 1 + pred1
AIC(k=2): 7.90572591676469
	Proposed: trait ~ 1
	AIC(k=2): 11.3907754373768
	Proposed: trait ~ 1 + pred1 + pred2
	AIC(k=2): 10.3737026748776
	Proposed: trait ~ 1 + pred1 + pred3
	AIC(k=2): 10.3573773706303
----------
Step 3
Current model: trait ~ 1 + pred1
AIC(k=2): 7.90572591676469
---END
There were 23 warnings (use warnings() to see them)

Call:
phyloglm(formula = create.formula(plm), data = data, phy = phy, 
    method = method, btol = btol, log.alpha.bound = log.alpha.bound, 
    start.beta = start.beta, start.alpha = start.alpha, boot = boot, 
    full.matrix = full.matrix)
       AIC     logLik Pen.logLik 
    7.9057    -0.9529     0.2199 

Method: logistic_MPLE
Mean tip height: 1.650307
Parameter estimate(s):
alpha: 33.06484 

Coefficients:
            Estimate   StdErr z.value  p.value   
(Intercept) -3.87195  1.48910 -2.6002 0.009317 **
pred1        0.73438  0.27636  2.6573 0.007876 **
---
Signif. codes:  0***0.001**0.01*0.05.’ 0.1 ‘ ’ 1

Note: Wald-type p-values for coefficients, conditional on alpha=33.06484

phylolm documentation built on July 2, 2020, 3:44 a.m.