Use get_clusters() to cluster a dataframe of GPS coordinates into places.
get_clusters(
  df,
  max.accu = 165,
  max.speed = 2.6,
  min.time = 3,
  max.time = 15,
  max.distance = 150,
  var.segment = NULL
)
df
A dataframe of GPS coordinates as described below.
max.accu
A number in meters. A row's accuracy value is the radius within which the true location lies with 68% probability. The default is 165 m. Any GPS rows with an accuracy value higher than this will be dropped.
max.speed
A number in meters/sec. It is the threshold value that distinguishes a row as Static or Moving. The default is 2.6 meters/sec.
min.time
A number in minutes. It is the minimum amount of time between two points for the pair to be considered a stationary cluster. The default is 3 minutes.
max.time
A number in minutes. It is the maximum amount of time between two points for the pair to be considered a stationary cluster. The default is 15 minutes.
max.distance
A number in meters. It is the maximum distance between two points for the pair to be labelled a cluster. The default is 150 m.
var.segment
If this variable is not set (the default, NULL), clusters will be created based on the participant's entire dataset. If this variable is set, clusters will be segmented on the variable. A list can be provided.
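The effect of max.accu and max.speed can be illustrated with a minimal base R sketch. This is not the package's implementation: the data are made up, and whether a speed exactly equal to the threshold counts as Static or Moving is an assumption (the sample values avoid that boundary).

```r
# Thresholds taken from the documented defaults
max.accu  <- 165   # meters
max.speed <- 2.6   # meters/sec

# Hypothetical GPS rows (values invented for illustration)
pts <- data.frame(
  accu  = c(10, 200, 50, 165),
  speed = c(0.5, 1.0, 3.0, 2.0)
)

# Drop rows whose accuracy value is higher than max.accu
kept <- pts[pts$accu <= max.accu, ]

# Label the remaining rows; ">" at the boundary is an assumption
kept$state <- ifelse(kept$speed > max.speed, "Moving", "Static")
print(kept)
```

Here the second row is dropped because its accuracy radius (200 m) exceeds 165 m, and the row with a speed of 3.0 meters/sec is the only one labelled Moving.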
A list containing two named objects. PLACES is a dataframe of named clusters with the latitude and longitude coordinates of each named cluster, computed as a weighted average of the original GPS datapoints found within the cluster. The PLACES dataframe identifies moving clusters as 999999. CLUSTERS is a list of dataframes, one per participant, that contain the named cluster and coordinates for each original GPS datapoint. Unlike the PLACES dataframe, the CLUSTERS list labels "moving" clusters as NA.
The dataframe needs to have the following named columns:
user_id = participant id
lat = latitude coordinates
lon = longitude coordinates
start_time = time of GPS coordinates as POSIXct
The dataframe may, but does not need to, have the following named columns:
tz_olson_id = local timezone (only needed if running "get_home")
accu = GPS accuracy. This number means there’s a 68% probability that the true location is within this radius. If this is not available, an accu column will be created and set to 0 so all rows are kept.
speed = Speed in meters/sec at which the phone sensing data indicates an individual was moving. If this is not available, speed will be calculated as distance / time between two coordinates.
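The column requirements above, and the distance / time fallback for speed, can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's own code: the sample data, the helper haversine_m(), the Earth radius of 6371000 m, and the choice to attach each speed to the later point of a pair are all hypothetical.

```r
# Hypothetical input with the four required columns
gps <- data.frame(
  user_id = c("p01", "p01", "p01"),
  lat = c(40.000, 40.000, 40.001),
  lon = c(-75.000, -75.000, -75.000),
  start_time = as.POSIXct(
    c("2020-01-01 08:00:00", "2020-01-01 08:05:00", "2020-01-01 08:06:00"),
    tz = "UTC"
  ),
  stringsAsFactors = FALSE
)

# Check the required columns and the POSIXct time type
required <- c("user_id", "lat", "lon", "start_time")
stopifnot(length(setdiff(required, names(gps))) == 0)
stopifnot(inherits(gps$start_time, "POSIXct"))

# Great-circle (haversine) distance in meters; hypothetical helper
haversine_m <- function(lat1, lon1, lat2, lon2, r = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Speed between consecutive points = distance / elapsed time
n <- nrow(gps)
dist_m <- haversine_m(gps$lat[-n], gps$lon[-n], gps$lat[-1], gps$lon[-1])
dt_s <- as.numeric(difftime(gps$start_time[-1], gps$start_time[-n],
                            units = "secs"))
gps$speed <- c(NA, dist_m / dt_s)  # first row has no previous point
print(gps)
```

With several participants in one dataframe, the same calculation would need to be done within each user_id so that speeds are not computed across participants.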
get_home
to predict which cluster is an individual's home
get_places
to label each cluster's place type as identified by Google Places API
## Prepare the dataset "places_gps" and run get_clusters()
## Not run:
places_gps$time_local <- as.POSIXct(strptime(places_gps$time_local, "%m/%d/%y %H:%M"), tz="UTC")
colnames(places_gps)[c(2,4)] <- c("start_time", "lon")
clusters <- get_clusters(places_gps)
## End(Not run)