humboldt.background.stat: Niche background statistic
In jasonleebrown/humboldt: Analysis of Species in Environmental Space

Description

Niche background statistic

Usage

humboldt.background.stat(
  g2e,
  rep = 100,
  sim.dir = 1,
  env.reso,
  kern.smooth = 1,
  R = 100,
  correct.env = F,
  thresh.espace.z = 0.001,
  force.equal.sample = F,
  run.silent.bak = F,
  ncores = 1
)

Arguments

`g2e`	an espace file output from humboldt.g2e
`rep`	the number of iterations. Values higher than 200 are recommended for final analyses. Default value is 100
`sim.dir`	if sim.dir=1, z2 is randomly shifted and the resulting niche is compared to z1 (which is unchanged). If sim.dir=2, z1 is randomly shifted and the resulting niche is compared to z2 (which is unchanged)
`env.reso`	the resolution of the environmental data grid (typically in decimal degrees)
`kern.smooth`	scale at which kernel smoothing occurs on environmental data. Larger values (e.g., 2) increase the scale (making espace transitions smoother and typically larger) and smaller values (e.g., 0.5) decrease the scale (making occupied espace clusters denser and more irregular). Default value is 1. You can also input "auto", which estimates the kernel parameter as the standard deviation of rescaled PC1 and PC2 coordinates divided by the sixth root of the number of locations. This method can be unreliable on multimodal espace distributions because it results in over-smoothing. Multimodal espace occupancy can be fairly common when a species occupies an extreme aspect of habitat or when espace is not broadly accessible in both dimensions of espace (PCs 1 & 2)
`R`	resolution of grid in environmental space (RxR). This needs to match the value input into humboldt.g2e. Default value is 100
`correct.env`	if correct.env=T, the analysis corrects occurrence densities of each species by the prevalence of the environments in their range. If correct.env=F, the overlap measure does not correct occurrence densities of each species by the prevalence of the environments in their range. Default value is FALSE
`thresh.espace.z`	an experimental parameter that sets the kernel density (z) threshold used when creating areas of analogous environmental space. Values above the threshold are retained and values below it are removed. Higher values raise the threshold, so more low-density areas are removed from the environmental space of z1 and z2. Default=0.001
`force.equal.sample`	Occasionally points are shifted into areas without environment data. If force.equal.sample=T, the points without environment data are shifted iteratively. Each round, if environment data are present in the new location, the environment is sampled and that point is added back to the original dataset. This is repeated until all points have sampled areas with existing environment data. In practice, when clusters of points are shifted to areas of no environmental data, the entire cluster is subsequently shifted back into an area with data. If force.equal.sample=F, the points shifted into areas without environmental data are excluded from niche quantification.
`run.silent.bak`	if run.silent.bak=T, text boxes displaying progress will not be displayed
`ncores`	number of CPUs to use for tests. If you are unsure of the number of cores and want to use all but one CPU, input ncores="All". Default value is 1
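
The "auto" option of kern.smooth described above can be illustrated with a short base R sketch. This is not the package's internal code: the helper name auto_bandwidth, the 0-1 rescaling of each axis, and the pooling of the two standard deviations are assumptions made for illustration only. The sketch shows why a bimodal espace cloud yields a wide kernel relative to the width of its clusters.

```r
# Hypothetical helper (not part of humboldt): reference-style bandwidth,
# i.e. a standard deviation divided by the sixth root of the sample size.
auto_bandwidth <- function(pc1, pc2) {
  stopifnot(length(pc1) == length(pc2), length(pc1) > 1)
  # Assumed rescaling: map each axis to the 0-1 range
  rescale01 <- function(v) (v - min(v)) / (max(v) - min(v))
  x <- rescale01(pc1)
  y <- rescale01(pc2)
  # Assumed pooling: root mean of the two axis variances
  sd_xy <- sqrt(0.5 * (var(x) + var(y)))
  sd_xy / length(x)^(1 / 6)
}

set.seed(1)
# Unimodal cloud of 200 locations
h_unimodal <- auto_bandwidth(rnorm(200), rnorm(200))

# Bimodal cloud: two tight clusters far apart inflate the pooled SD,
# so the kernel is wide compared with the width of each cluster
pc1_bi <- c(rnorm(100, -5, 0.3), rnorm(100, 5, 0.3))
pc2_bi <- c(rnorm(100, -5, 0.3), rnorm(100, 5, 0.3))
h_bimodal <- auto_bandwidth(pc1_bi, pc2_bi)

c(unimodal = h_unimodal, bimodal = h_bimodal)
```

When the estimated value looks too large for a clustered dataset, a fixed kern.smooth value (for example the default of 1 or a smaller value) may be the safer choice.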

Value

Runs a modified niche background statistic (see Warren et al. 2008) based on two species' occurrence density grids. The function compares the observed niche similarity between z1 and z2 (created by humboldt.grid.espace) to the overlap between z1 and a simulated niche (called z2.sim). The simulated niche is obtained by randomly shifting the spatial distribution of z2 in geographic space and then measuring how that shift in geography changes the occupied environmental space. This test maintains the spatial structure of all the localities and thereby retains all nuances associated with each dataset's spatial autocorrelation (vs. random sampling). This test asks whether the two populations/species are more different than would be expected given the underlying environmental differences between the regions in which they occur.
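
The shifting step can be pictured with a toy base R sketch. It is only an illustration under simplifying assumptions: the environment is a single-variable matrix on an integer grid rather than geographic coordinates with several variables, the helper shift_and_sample and its max_shift argument are hypothetical, and shifted points that land on cells without data are simply dropped (the behaviour described for force.equal.sample=F).

```r
set.seed(42)
# Toy environmental layer on a 20 x 20 grid (rows = y, columns = x)
env <- matrix(runif(400), nrow = 20, ncol = 20)
env[1:5, ] <- NA  # a block of cells without environmental data

# Toy occurrence points (grid coordinates) of the species being shifted
pts <- data.frame(x = sample(6:15, 30, replace = TRUE),
                  y = sample(8:15, 30, replace = TRUE))

# Hypothetical helper: move all points by one shared random offset, so the
# spatial structure of the localities is preserved, then re-sample env
shift_and_sample <- function(pts, env, max_shift = 5) {
  dx <- sample(seq(-max_shift, max_shift), 1)
  dy <- sample(seq(-max_shift, max_shift), 1)
  x_new <- pts$x + dx
  y_new <- pts$y + dy
  inside <- x_new >= 1 & x_new <= ncol(env) &
    y_new >= 1 & y_new <= nrow(env)
  vals <- rep(NA_real_, nrow(pts))
  vals[inside] <- env[cbind(y_new[inside], x_new[inside])]
  keep <- !is.na(vals)  # points without environment data are excluded
  list(x.shift = dx, y.shift = dy, n.pts = sum(keep), env = vals[keep])
}

# Ten simulated shifts; n.pts varies because some shifts hit the NA block
sims <- lapply(1:10, function(i) shift_and_sample(pts, env))
n_pts <- sapply(sims, function(s) s$n.pts)
n_pts
```

In the real test, each shifted set of points would be converted to an occurrence density grid in espace and compared with the unshifted species.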

If the observed values of the niche similarity measures obtained from the two original populations are significantly higher than expected from this null distribution, then the null hypothesis that similarity between species is solely due to differences in habitat availability is rejected. In other words, a significant value suggests that the two species are more divergent than expected based solely on habitat availability (pending a significant equivalence statistic). A non-significant test suggests that most occupied environments are similar among environments.

IMPORTANT. This test measures the power of the equivalence statistic to detect differences based on the available e-space. If both the equivalence statistic and the background statistic are non-significant, the two species' occupied environmental spaces are not significantly different, and the resulting niche 'equivalence' is likely a result of the limited environmental space present in the habitat(s). In these situations, the equivalence statistic has limited power to detect any significant differences among taxa, even if they exist. Conversely, this result provides no evidence that the taxa are in fact not equivalent; there is simply little power to detect a difference in the input environmental data.

If both the background statistic and the equivalence statistic are non-significant, try increasing the spatial extent of the input climate data. If the equivalence statistic is significant and the background statistic is not, the two species' occupied environmental spaces are significantly different despite the existence of largely similar habitats. This is strong evidence of niche divergence.
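
The interpretation rules above can be summarised in a small helper. This is a reading aid only: the function interpret_tests is hypothetical, the 0.05 cut-off is an assumed significance level, and the combination that the text does not discuss is flagged as such rather than interpreted.

```r
# Hypothetical helper (not part of humboldt): combine the p-values of the
# equivalence and background statistics into the interpretations above.
interpret_tests <- function(p_equiv, p_bg, alpha = 0.05) {
  equiv_sig <- p_equiv < alpha  # alpha = 0.05 is an assumption
  bg_sig <- p_bg < alpha
  if (!equiv_sig && !bg_sig) {
    "Limited power: try increasing the spatial extent of the climate data"
  } else if (equiv_sig && !bg_sig) {
    "Strong evidence of niche divergence despite largely similar habitats"
  } else if (equiv_sig && bg_sig) {
    "More divergent than expected from habitat availability alone"
  } else {
    "Background significant, equivalence not: not covered by the text above"
  }
}

interpret_tests(p_equiv = 0.40, p_bg = 0.30)
interpret_tests(p_equiv = 0.01, p_bg = 0.30)
```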

Output:

`$sim`	simulation values for D and I, plus: nDg (number of times D.sim is greater than D.obs), nDl (number of times D.sim is less than D.obs), nIg (number of times I.sim is greater than I.obs), nIl (number of times I.sim is less than I.obs), n.pts (number of points in the random sample; when the spatial distribution of z2 is shifted in geographic space, points are often moved to areas with no environmental data and are thereby excluded from the simulated niche calculation), x.shift (shift in longitude), y.shift (shift in latitude)
`$obs`	D and I values in the observed datasets
`$p.D`	one-tailed p-value of Schoener's D (simulation vs. observed)
`$p.I`	one-tailed p-value of Hellinger's I (simulation vs. observed)
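
For orientation, the two overlap metrics and a one-tailed p-value can be sketched in base R. The formulas for Schoener's D and the Hellinger-based I follow the standard definitions on normalised density grids. The helpers niche_overlap and one_tailed_p are hypothetical, and the (count + 1) / (n + 1) convention and the choice of tail are assumptions, not necessarily what the function does internally.

```r
# Hypothetical helper: overlap between two occurrence density grids
niche_overlap <- function(z1, z2) {
  p1 <- z1 / sum(z1)  # normalise each grid to sum to 1
  p2 <- z2 / sum(z2)
  D <- 1 - 0.5 * sum(abs(p1 - p2))             # Schoener's D
  I <- 1 - 0.5 * sum((sqrt(p1) - sqrt(p2))^2)  # Hellinger-based I
  c(D = D, I = I)
}

# Hypothetical helper: one-tailed permutation-style p-value.
# Assumed convention: (number of extreme simulations + 1) / (n + 1)
one_tailed_p <- function(sim, obs, tail = c("greater", "less")) {
  tail <- match.arg(tail)
  n_extreme <- if (tail == "greater") sum(sim >= obs) else sum(sim <= obs)
  (n_extreme + 1) / (length(sim) + 1)
}

# Identical grids overlap completely; disjoint grids do not overlap
z <- matrix(c(1, 2, 3, 4), nrow = 2)
same <- niche_overlap(z, z)
disjoint <- niche_overlap(c(1, 0), c(0, 1))

# Mock simulated D values (made up for illustration) vs. an observed D
sim_D <- seq(0.10, 0.50, length.out = 99)
p_D <- one_tailed_p(sim_D, obs = 0.60, tail = "greater")
```

With real output, the simulated values would come from the `$sim` component and the observed values from `$obs`.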

Examples

library(humboldt)

## load environmental variables for all sites of study area 1 (env1). Column names should be x,y,X1,X2,...,Xn
env1<-read.delim("env1.txt",h=T,sep="\t")

## load environmental variables for all sites of study area 2 (env2). Column names should be x,y,X1,X2,...,Xn
env2<-read.delim("env2.txt",h=T,sep="\t")

## remove NAs and make sure all variables are imported as numbers
env1<-humboldt.scrub.env(env1)
env2<-humboldt.scrub.env(env2)

##load occurrence sites for the species at study area 1 (env1). Column names should be 'sp', 'x','y'
occ.sp1<-na.exclude(read.delim("sp1.txt",h=T,sep="\t"))

##load occurrence sites for the species at study area 2 (env2). Column names should be 'sp', 'x','y'
occ.sp2<-na.exclude(read.delim("sp2.txt",h=T,sep="\t"))

##convert geographic space to espace
zz <- humboldt.g2e(
  env1 = env1, env2 = env2, sp1 = occ.sp1, sp2 = occ.sp2,
  reduce.env = 0, reductype = "PCA", non.analogous.environments = "NO",
  ## e.var is not defined outside the call, so col.env lists the same columns explicitly
  env.trim = T, e.var = c(3:21), col.env = c(3:21),
  trim.buffer.sp1 = 200, trim.buffer.sp2 = 200,
  rarefy.dist = 50, rarefy.units = "km",
  env.reso = 0.41666669, kern.smooth = 1, R = 100, run.silent = F
)

##perform background statistics
## sim.dir = 1: z2 is shifted and compared to z1
bg.sp1tosp2 <- humboldt.background.stat(
  g2e = zz, rep = 100, sim.dir = 1, env.reso = 0.41666669,
  kern.smooth = 1, correct.env = F, R = 100, run.silent.bak = F
)
## sim.dir = 2: z1 is shifted and compared to z2
bg.sp2tosp1 <- humboldt.background.stat(
  g2e = zz, rep = 100, sim.dir = 2, env.reso = 0.41666669,
  kern.smooth = 1, correct.env = F, R = 100, run.silent.bak = F
)

jasonleebrown/humboldt documentation built on Jan. 4, 2024, 7:46 a.m.