clusterise_sites_large_dataframe: Cluster Occurrence Data (large dataframe)


clusterise_sites_large_dataframe    R Documentation

Cluster Occurrence Data (large dataframe)

Description

Clusters a large occurrence dataframe by date, optionally grouping the resulting clusters by a specified radius distance. Each cluster represents a site, and a pair of centred coordinates is generated for each site.

Usage

clusterise_sites_large_dataframe(
  dataframe,
  cluster_min_length,
  day_split_min_length = 10,
  group_radius = 40075000
)

Arguments

dataframe

A dataframe with occurrence data for the chosen taxon and location.

cluster_min_length

The minimum number of observations in each cluster.

day_split_min_length

The minimum number of observations a day must have to be kept. Days with fewer observations are filtered out. Defaults to 10.

group_radius

An optional radius, in metres, within which sites are grouped together. Defaults to 40075000.
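To illustrate what grouping by a radius in metres can mean, here is a minimal base R sketch. It is not the package's implementation: the helper names `haversine_m` and `assign_groups`, the greedy grouping rule, the spherical Earth radius of 6371 km, and the sample coordinates are all assumptions made for illustration. The default of 40075000 m is roughly the Earth's equatorial circumference, which exceeds any possible great-circle distance, so under this sketch the default places every site in a single group.

```r
# Great-circle distance in metres between two points (haversine formula),
# assuming a spherical Earth of radius 6371 km.
haversine_m <- function(lat1, lon1, lat2, lon2, r = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Hypothetical site centres (not taken from the package data)
sites <- data.frame(
  latitude  = c(7.10, 7.11, 6.00),
  longitude = c(-73.10, -73.11, -73.00)
)

# Greedy grouping (an assumed rule): each site joins the first group whose
# seed site lies within `radius` metres; otherwise it seeds a new group.
assign_groups <- function(sites, radius) {
  group <- integer(nrow(sites))
  seeds <- integer(0)
  for (i in seq_len(nrow(sites))) {
    joined <- FALSE
    for (g in seq_along(seeds)) {
      s <- seeds[g]
      d <- haversine_m(sites$latitude[i], sites$longitude[i],
                       sites$latitude[s], sites$longitude[s])
      if (d <= radius) {
        group[i] <- g
        joined <- TRUE
        break
      }
    }
    if (!joined) {
      seeds <- c(seeds, i)
      group[i] <- length(seeds)
    }
  }
  group
}

groups_5km  <- assign_groups(sites, radius = 5000)      # two nearby sites share a group
groups_wide <- assign_groups(sites, radius = 40075000)  # everything in one group
```

With a 5 km radius the first two sites, which lie under 2 km apart, share a group while the third, more than 100 km away, forms its own. With the wide radius all three fall into one group.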

Value

The function returns a 'clusterised object': a list of two elements. The first element is a list of data clusters. The second element is a dataframe containing the centred coordinates, group number, and date for each site.
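The shape of such a two-element result can be sketched in base R with mock data. Everything here is an assumption for illustration rather than the package's actual code or output: the Darwin Core style column names (`eventDate`, `decimalLatitude`, `decimalLongitude`), the output column names, taking the centre as the mean coordinate, and the single placeholder group number.

```r
# Hypothetical occurrence records; column names are assumed, not confirmed.
occ <- data.frame(
  eventDate = as.Date(c(rep("2020-01-01", 3), rep("2020-01-02", 2),
                        "2020-01-03")),
  decimalLatitude  = c(7.10, 7.12, 7.14, 6.50, 6.52, 5.00),
  decimalLongitude = c(-73.10, -73.12, -73.14, -73.50, -73.52, -72.00)
)

min_per_day <- 2  # plays the role of a minimum-observations threshold

# Count observations per day and keep only days that meet the threshold
day_counts <- table(occ$eventDate)
keep_days  <- names(day_counts)[day_counts >= min_per_day]
occ_kept   <- occ[as.character(occ$eventDate) %in% keep_days, ]

# Element 1: a list of data clusters, one per remaining date
clusters <- split(occ_kept, occ_kept$eventDate)

# Element 2: one row per site with its centre (mean coordinate here),
# a placeholder group number, and the date
sites <- do.call(rbind, lapply(clusters, function(d) {
  data.frame(
    latitude  = mean(d$decimalLatitude),
    longitude = mean(d$decimalLongitude),
    group     = 1L,
    date      = d$eventDate[1]
  )
}))

mock_result <- list(clusters, sites)
print(mock_result[[2]])
```

In this mock, `mock_result[[1]]` holds the per-date clusters and `mock_result[[2]]` holds the site summary, mirroring the `[[2]]` indexing used in the Examples section.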

Examples

# Clusterise sites for the entire Santander department of Colombia
Colombia_Santander_dataframe <- subset(Colombia, stateProvince == "Santander")

clusterised_Santander <- clusterise_sites_large_dataframe(
  dataframe = Colombia_Santander_dataframe,
  cluster_min_length = 30
)

print(clusterised_Santander[[2]])



DivInsight documentation built on Aug. 12, 2023, 9:06 a.m.