circleclust: Cluster coordinates based on circular variance

Description

circleclust calculates the circular variance of consecutive latitude and longitude coordinates within a moving window and identifies spatiotemporal clusters based on a set of flexible parameters. These parameters include the size of the window in which to calculate circular variance, the threshold value (circular variance) to identify clustering, and the minimum number of observations allowed in a cluster.
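As a rough illustration of the quantity described above, the following base R sketch computes circular variance as one minus the mean resultant length of a set of bearings and applies it over a trailing window of rows. The helper names `circ_var()` and `rolling_circ_var()`, the trailing window alignment, and the toy azimuth values are assumptions made for this example; they are not taken from the package source.

```r
# Circular variance of bearings given in degrees: 1 - mean resultant length.
# Values near 0 mean a consistent heading; values near 1 mean scattered headings.
circ_var <- function(azimuth_deg) {
  theta <- azimuth_deg[!is.na(azimuth_deg)] * pi / 180
  if (length(theta) == 0) return(NA_real_)
  1 - sqrt(mean(cos(theta))^2 + mean(sin(theta))^2)
}

# Hypothetical helper: circular variance over a trailing window of `window` rows
# (fewer rows are used at the start of the series).
rolling_circ_var <- function(azimuth_deg, window) {
  vapply(seq_along(azimuth_deg), function(i) {
    idx <- max(1, i - window + 1):i
    circ_var(azimuth_deg[idx])
  }, numeric(1))
}

# Toy data: a steady eastward heading followed by erratic headings
azimuth <- c(rep(90, 10), 0, 90, 180, 270, 0, 90, 180, 270)
cv <- rolling_circ_var(azimuth, window = 4)

cv[10]  # close to 0: the last four headings are identical
cv[18]  # close to 1: the last four headings cancel out
```

Rows whose windowed circular variance exceeds a chosen threshold would then be candidates for a cluster.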

Usage

circleclust(
  df,
  dt_field = NULL,
  circvar_threshold = 0.7,
  window = 60,
  show_circvar = FALSE,
  rspeed_threshold = NULL,
  pl_dist_threshold = NULL,
  cluster_threshold = NULL
)

Arguments

df

a data frame with columns speed and azimuth created by move(). The data frame must also include datetime, longitude, and latitude columns.

dt_field

character; name of the datetime (POSIXct) column in df. Default = NULL.

circvar_threshold

numeric; circular variance threshold to determine clustering. Default = 0.7.

window

numeric; size of the window (number of rows) in which to calculate circular variance. Default = 60.

show_circvar

logical; if TRUE, a column will be added to the output data frame listing the circular variance (circvar) for each observation. Default = FALSE.

rspeed_threshold

numeric; if a value is supplied, the 1-minute rolling median speed (m/s) is calculated, and only observations with a circular variance above 'circvar_threshold' and a rolling median speed below 'rspeed_threshold' are assigned to a cluster. Setting this parameter can reduce incidental clustering of coordinates caused by stop-and-go transit. Default = NULL.

pl_dist_threshold

numeric; distance threshold (meters) used to aggregate clusters. If the distance between consecutive clusters is below this threshold, the clusters and the coordinates between them are combined into a single grouping of coordinates. Setting this parameter may be useful if the location data include low-speed transit with frequent stops (e.g., walking). Default = NULL.

cluster_threshold

numeric; the minimum allowable number of observations in each cluster. If an identified cluster contains fewer observations than this threshold, the observations are retained but are not assigned to a cluster. Default = NULL.
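The way the threshold arguments interact can be sketched in base R. The example below flags rows with high circular variance and low rolling median speed, then discards runs of flagged rows that are shorter than a minimum size. All values, variable names, and the 3-row median window are assumptions chosen for illustration; the package's actual implementation may differ.

```r
# Toy inputs (hypothetical values, not produced by circleclust)
circvar <- c(0.1, 0.8, 0.9, 0.2, 0.8, 0.9, 0.9, 0.8, 0.9, 0.1)
speed   <- c(5.0, 0.5, 0.4, 6.0, 0.3, 0.2, 0.4, 0.3, 0.2, 7.0)  # m/s

circvar_threshold <- 0.7
rspeed_threshold  <- 1    # m/s
cluster_threshold <- 3    # minimum observations per cluster

# Rolling median speed over 3 rows; runmed() requires an odd window width
rspeed <- stats::runmed(speed, k = 3, endrule = "keep")

# Candidate rows: high circular variance and low rolling median speed
candidate <- circvar > circvar_threshold & rspeed < rspeed_threshold

# Find consecutive runs of candidates and keep only runs that are long enough
runs <- rle(candidate)
keep <- runs$values & runs$lengths >= cluster_threshold

# Number the retained runs in temporal order; everything else stays NA
run_id  <- ifelse(keep, cumsum(keep), NA_integer_)
cluster <- rep(run_id, runs$lengths)

cluster
```

In this toy series, rows 2 and 3 are flagged but form a run of only two observations, so they are left unassigned, while rows 5 through 9 form the first cluster.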

Details

Observations belonging to a cluster are assigned to a cluster group; cluster groups are ordered temporally. In addition, each observation is marked as either static or mobile.

The circleclust() function has been optimized to detect changes in activity pattern using a 5-minute moving window and a circular variance threshold of 0.7. These parameters can be adjusted if desired.
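Because window is expressed in rows, the time it spans depends on the sampling interval of the data. The sketch below estimates that interval from a datetime vector and converts a target duration into a row count. The helper `window_rows()` and the 5-second spacing of the toy timestamps are assumptions made for this example and are not stated on this page.

```r
# Hypothetical helper: number of rows needed to span a target duration
window_rows <- function(duration_sec, interval_sec) {
  ceiling(duration_sec / interval_sec)
}

# Toy timestamps spaced 5 seconds apart (assumed sampling interval)
dt <- as.POSIXct("2024-01-01 00:00:00", tz = "UTC") + seq(0, by = 5, length.out = 10)

# Typical spacing between consecutive observations, in seconds
interval_sec <- median(as.numeric(diff(dt), units = "secs"))

# Rows needed for a 5-minute (300-second) window at this spacing
window_rows(300, interval_sec)  # 60
```

With data recorded once per second, the same 5-minute span would instead require a window of 300 rows.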

Imputing lon/lat values using impute_coords() is recommended if GPS coordinates are missing.

Value

a data frame: the input data frame with new columns sp_temporal_cluster and activity_status appended. If show_circvar = TRUE, a circvar column is also added.

Examples

## Not run: 

circleclust(df,
  dt_field = NULL, circvar_threshold = .7, window = 60,
  show_circvar = FALSE, rspeed_threshold = NULL,
  pl_dist_threshold = NULL, cluster_threshold = NULL
)


zoo_trip %>%
  impute_coords("Date_Time") %>%
  dt_aggregate("Date_Time") %>%
  move("Date_Time") %>%
  circleclust("Date_Time", pl_dist_threshold = 25, show_circvar = TRUE)

## End(Not run)

wolfeclw/circleclust documentation built on Aug. 13, 2024, 3:33 a.m.