sis_iterate: Iterate tests trying a variety of cutoff values

View source: R/dotest.R

sis_iterate    R Documentation

Description

This function shows how the choice of cutoff value affects the sister group comparisons. Do clades with a higher trait value have more species than their sisters, and is this result robust to the cutoff value used? At the extremes (the minimum and maximum trait values) it almost certainly is not, unless many taxa share the same maximum or minimum value.
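To see why the extreme cutoffs are uninformative, here is a minimal base-R sketch. It assumes the cutoffs are evenly spaced percentiles of the trait and that a taxon counts as "high" when its value is strictly above the cutoff; the actual implementation in sis_iterate() may differ on both points. The trait vector is simulated and purely hypothetical.

```r
set.seed(42)
trait <- rnorm(20)   # hypothetical continuous trait values for 20 taxa
nsteps <- 11

# Assumption: cutoffs are evenly spaced percentiles between 0% and 100%
probs <- seq(0, 1, length.out = nsteps)
cutoffs <- quantile(trait, probs = probs)

# Count how many taxa fall strictly above each cutoff
n_above <- sapply(cutoffs, function(cutoff) sum(trait > cutoff))

split_table <- data.frame(
  percentile = probs,
  cutoff = unname(cutoffs),
  n_above = unname(n_above),
  n_at_or_below = length(trait) - unname(n_above)
)
print(split_table)
```

At the 100% cutoff no taxon lies above the threshold, so there is nothing left to compare, and near either extreme one of the two groups is very small.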

Usage

sis_iterate(
  x,
  nsteps = 11,
  phy,
  sisters = sis_get_sisters(phy),
  drop_matches = TRUE
)

Arguments

x

Vector of continuous trait values

nsteps

Number of thresholds to try

phy

A phylo object

sisters

Data.frame from sis_get_sisters()

drop_matches

Drop sister group comparisons with equal numbers of taxa

Details

This is a very dangerous function to use. Someone could use it to search for the cutoff value that yields a significant result, which is one of the many forms of p-hacking. So, if you use this function and then report significance at some cutoff, you MUST state somewhere in your manuscript that you tried a variety of cutoff values, and discuss why you chose the cutoff you report. Ideally, you should also have some biological intuition about what cutoff value is reasonable before using this function.
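The risk of picking the best cutoff can be illustrated with a small base-R simulation. This is a generic sketch, not a sister group analysis: it uses a two-sample t-test on simulated data where the trait and the response are independent, so any "significant" result is a false positive. The variable names and simulation settings are hypothetical.

```r
set.seed(1)
n_sims <- 500
n_taxa <- 40
probs <- seq(0.1, 0.9, length.out = 9)   # interior percentiles only

# Each column holds the p-values from one simulated data set,
# one row per cutoff percentile
p_values <- replicate(n_sims, {
  trait <- rnorm(n_taxa)
  response <- rnorm(n_taxa)   # independent of trait, so the null is true
  cutoffs <- quantile(trait, probs = probs)
  sapply(cutoffs, function(cutoff) {
    t.test(response[trait > cutoff], response[trait <= cutoff])$p.value
  })
})

# False positive rate when the cutoff is fixed in advance (the median, row 5)
rate_single <- mean(p_values[5, ] < 0.05)

# False positive rate when the smallest p-value across all cutoffs is reported
rate_best <- mean(apply(p_values, 2, min) < 0.05)

print(c(fixed_cutoff = rate_single, best_cutoff = rate_best))
```

Because the smallest p-value across cutoffs can never exceed the p-value at any single cutoff, the "best cutoff" rate is always at least as high as the fixed-cutoff rate. This is why the set of cutoffs tried should be reported.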

Value

A data.frame in which each column corresponds to a different cutoff percentile and each row holds one of the values returned by sis_test()

Examples

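library(sisters)
# Load the Geospiza example tree and trait data from the geiger package
data(geospiza, package="geiger")
# Prepare the tree and trait data for the sister group analysis
cleaned <- sis_clean(geospiza$phy, geospiza$dat)
phy <- cleaned$phy
# Use the first trait column as the continuous trait
trait <- cleaned$traits[,1]
# Run the tests across the default number of cutoffs (nsteps = 11)
sis_iterate(trait, phy=phy)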

bomeara/sisters documentation built on Oct. 11, 2023, 12:14 a.m.