# Guide to spotoroo: Spatiotemporal Clustering of Satellite Hot Spot Data

    head(result$ignition, 2)

The hotspots dataset contains information about each hot spot. In particular, the membership column holds the cluster membership label, where -1 represents noise.

The ignition dataset contains information about each cluster. Here too, the membership column holds the membership label, and lon and lat give the coordinates of the ignition point.

### Extract a subset of clusters

To extract a subset of clusters from the results, or to merge the hotspots and ignition datasets, you could use the function extract_fire().

You could extract all clusters along with the noise points by setting cluster = "all" and noise = TRUE. This merges the hotspots and ignition datasets.

    # Merge the hotspots and ignition dataset
    merged_result <- extract_fire(result, cluster = "all", noise = TRUE)

You could also extract only a subset of clusters, without any noise, by passing a vector of membership labels to the cluster argument and setting noise = FALSE. This merges the hotspots and ignition datasets, then filters out the noise points and keeps only the selected clusters.

    # Merge the hotspots and ignition dataset
    # Select cluster 2 and 3 and filter out noise
    cluster_2_and_3 <- extract_fire(result, cluster = c(2, 3), noise = FALSE)

## Additional topic: Choice of parameters

In principle, the parameters activeTime and adjDist should be chosen using expert knowledge of fire behaviour, but in practice we generally don't know much about them. The rest of this section shows one method for choosing suitable values for these two parameters.

We first set minPts = 4 and minTime = 3. You could set different values for minPts and minTime if you like. We can then do a grid search for activeTime and adjDist over a sensible range. Here, we set adjDist $\in \{500, 1000, 1500, 2000, 2500, 3000, 3500, 4000\}$ and activeTime $\in \{6, 12, 18, 24, 30, 36, 42, 48\}$.

For each pair of activeTime and adjDist, we record the proportion of noise as the metric for comparison. The following code does the calculation. It may take around 10 minutes to run. You could try it if you like.

    # NOT RUN
    # NOTICE: MAY TAKE AROUND 10 MINS TO RUN THIS CODE BLOCK
    noise_prop <- c()
    for (adjDist in seq(500, 4000, 500)) {
      for (activeTime in seq(6, 48, 6)) {
        result <- suppressMessages(hotspot_cluster(hotspots = hotspots,
                                                   lon = "lon",
                                                   lat = "lat",
                                                   obsTime = "obsTime",
                                                   activeTime = activeTime,
                                                   adjDist = adjDist,
                                                   minPts = 4,
                                                   minTime = 3,
                                                   ignitionCenter = "mean",
                                                   timeUnit = "h",
                                                   timeStep = 1))
        # Proportion of hot spots labelled as noise for this parameter pair
        noise_prop <- c(noise_prop, mean(result$hotspots$noise))
      }
    }
    # activeTime varies fastest, matching the order of the inner loop
    tab <- expand.grid(activeTime = seq(6, 48, 6),
                       adjDist = seq(500, 4000, 500))
    tab$noise_prop <- noise_prop
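The nested loop above grows noise_prop one element at a time and relies on the loop order matching expand.grid(). A minimal sketch of an alternative, using only base R, is to build the grid first and then fill in one noise proportion per row. The names grid_noise_prop() and fake_cluster() are hypothetical and not part of spotoroo; fake_cluster() is a made-up stand-in so that the block runs without the package or the data.

```r
# Generic grid search. `cluster_fun(activeTime, adjDist)` must return a list
# holding a `hotspots` data frame with a logical `noise` column.
grid_noise_prop <- function(cluster_fun, activeTime, adjDist) {
  tab <- expand.grid(activeTime = activeTime, adjDist = adjDist)
  tab$noise_prop <- mapply(
    function(at, ad) mean(cluster_fun(at, ad)$hotspots$noise),
    tab$activeTime, tab$adjDist
  )
  tab
}

# Hypothetical stand-in for the real clustering call (made-up rule):
# a point counts as noise when its score exceeds activeTime * adjDist.
fake_cluster <- function(activeTime, adjDist) {
  score <- c(2000, 10000, 30000, 90000)
  list(hotspots = data.frame(noise = score > activeTime * adjDist))
}

tab_demo <- grid_noise_prop(fake_cluster,
                            activeTime = c(6, 12),
                            adjDist = c(500, 1000))
tab_demo
```

With the package loaded, cluster_fun could instead wrap the hotspot_cluster() call shown above, passing its two arguments on as activeTime and adjDist.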

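The recorded proportions can also be entered directly. The 48 values listed below match a grid of eight activeTime values by the first six adjDist values (500 to 3000).

tab <- expand.grid(activeTime = seq(6, 48, 6),
                   adjDist = seq(500, 3000, 500))

tab$noise_prop <- c(0.320560748, 0.282242991, 0.235514019, 0.133644860,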
0.129906542, 0.129906542, 0.126168224, 0.118691589,
0.154205607, 0.134579439, 0.109345794, 0.026168224,
0.026168224, 0.026168224, 0.026168224, 0.021495327,
0.086915888, 0.075700935, 0.055140187, 0.011214953,
0.011214953, 0.011214953, 0.011214953, 0.011214953,
0.081308411, 0.070093458, 0.049532710, 0.009345794,
0.009345794, 0.009345794, 0.009345794, 0.009345794,
0.081308411, 0.070093458, 0.049532710, 0.009345794,
0.009345794, 0.009345794, 0.009345794, 0.009345794,
0.079439252, 0.061682243, 0.049532710, 0.009345794,
0.009345794, 0.009345794, 0.009345794, 0.009345794)


With the proportion of noise, we could make two line plots to reveal the relationships between proportion of noise, adjDist and activeTime.

It works like the scree plot used in principal component analysis. We want to keep clusters separate without introducing too much noise.

In the first plot, most of the significant drops in the proportion of noise occur when adjDist is less than 2500 metres. Therefore, adjDist = 2500 is a reasonable choice.

library(ggplot2)

ggplot(tab) +
  geom_line(aes(adjDist, noise_prop, color = as.factor(activeTime))) +
  ylab("Noise Proportion") +
  labs(col = "activeTime") +
  theme_minimal() +
  scale_x_continuous(breaks = seq(500, 4000, 500))


In the second plot, most of the significant drops in the proportion of noise occur when activeTime is less than 24 hours. Therefore, activeTime = 24 is a reasonable choice.

ggplot(tab) +
  geom_line(aes(activeTime, noise_prop, color = as.factor(adjDist))) +
  ylab("Noise Proportion") +
  labs(col = "adjDist") +
  theme_minimal() +
  scale_x_continuous(breaks = seq(6, 48, 6))
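As a numerical companion to the two plots, the sketch below computes how much the noise proportion falls, on average, with each step up in adjDist and in activeTime. It uses base R only. The values are the recorded proportions rounded to four decimals, and the block assumes that the 48 listed values correspond to adjDist = 500 to 3000; the object names are chosen here for illustration.

```r
# Noise proportions, rounded to four decimals.
# Rows: activeTime = 6, 12, ..., 48 (hours). Columns: adjDist (metres).
# Assumption: the 48 recorded values cover adjDist = 500 to 3000.
active_time <- seq(6, 48, 6)
adj_dist <- seq(500, 3000, 500)
noise <- matrix(
  c(0.3206, 0.2822, 0.2355, 0.1336, 0.1299, 0.1299, 0.1262, 0.1187,
    0.1542, 0.1346, 0.1093, 0.0262, 0.0262, 0.0262, 0.0262, 0.0215,
    0.0869, 0.0757, 0.0551, 0.0112, 0.0112, 0.0112, 0.0112, 0.0112,
    0.0813, 0.0701, 0.0495, 0.0093, 0.0093, 0.0093, 0.0093, 0.0093,
    0.0813, 0.0701, 0.0495, 0.0093, 0.0093, 0.0093, 0.0093, 0.0093,
    0.0794, 0.0617, 0.0495, 0.0093, 0.0093, 0.0093, 0.0093, 0.0093),
  nrow = length(active_time),
  dimnames = list(activeTime = active_time, adjDist = adj_dist)
)

# Average drop in noise proportion for each step up in adjDist;
# each element is named after the value being stepped up to
drop_adj_dist <- -diff(colMeans(noise))

# Average drop in noise proportion for each step up in activeTime
drop_active_time <- -diff(rowMeans(noise))

round(drop_adj_dist, 4)
round(drop_active_time, 4)
```

Steps with a drop close to zero are the flat parts of the curves, so parameter values beyond them buy little further reduction in noise.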


# Exploring the spatiotemporal clustering results

The package provides some useful functions to explore the clustering results.

## Summary

You could make a brief summary of the clustering results.

summary_spotoroo(result)


Or make a brief summary of a subset of clusters by providing a vector of membership labels to the cluster argument.

summary_spotoroo(result, cluster = c(1, 3, 4))


### Called by summary()

For convenience, summary_spotoroo() can also be called through the summary() function.

summary(result)
summary(result, cluster = c(1, 3, 4))
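A generic such as summary() can forward to a package function when a method is registered for the class of the object. The sketch below illustrates that mechanism with R's S3 dispatch, on the assumption that the package relies on it. The class toy_result and the function summary_toy() are made up for illustration and are not part of spotoroo.

```r
# Hypothetical summary function: counts observations per membership label,
# optionally restricted to the labels given in `cluster`
summary_toy <- function(object, cluster = "all") {
  labels <- object$membership
  if (!identical(cluster, "all")) {
    labels <- labels[labels %in% cluster]
  }
  table(labels)
}

# A method named summary.<class> lets summary() dispatch to it,
# forwarding any extra arguments such as `cluster`
summary.toy_result <- function(object, ...) {
  summary_toy(object, ...)
}

# Made-up object carrying the hypothetical class
toy <- structure(list(membership = c(1, 1, 2, 3, 3, 3)),
                 class = "toy_result")

counts <- summary(toy, cluster = c(1, 3))
counts
```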


## Plot

You could produce a plot of the clustering results. Three plot types are available: "def" (default), "mov" (fire movement) and "timeline" (timeline).

### Default

plot_spotoroo(result, type = "def")


### Timeline

plot_spotoroo(result, type = "timeline")


### Fire movement

The fire movement is computed by the get_fire_mov() function.

plot_spotoroo(result, type = "mov", step = 6)
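To give a feel for what a step size could mean for a movement track, the sketch below bins the hot spots of a single made-up fire into windows of step time indexes and takes the centroid of each window. This is an illustration in base R under those assumptions, not the implementation used by get_fire_mov(); the data and the column name window are invented.

```r
# Made-up hot spots for one fire: time index and coordinates
toy <- data.frame(
  timeID = c(1, 2, 3, 7, 8, 13, 14, 15),
  lon = c(147.00, 147.02, 147.01, 147.10, 147.12, 147.20, 147.22, 147.21),
  lat = c(-37.00, -37.01, -37.02, -37.05, -37.06, -37.10, -37.11, -37.12)
)

step <- 6

# Assign each observation to a window of `step` time indexes:
# 1 to 6, 7 to 12, 13 to 18
toy$window <- (toy$timeID - 1) %/% step + 1

# The centroid of the hot spots in each window gives one point of the track
path <- aggregate(cbind(lon, lat) ~ window, data = toy, FUN = mean)
path
```

A larger step yields fewer, smoother track points; a smaller step follows the hot spots more closely.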


If you have a background ggplot object, you can have the function plot onto it.

if (requireNamespace("sf", quietly = TRUE)) {
  plot_spotoroo(result, bg = plot_vic_map())
}

if (requireNamespace("sf", quietly = TRUE)) {
  plot_spotoroo(result, type = "mov", bg = plot_vic_map(), step = 6)
}


More details about the usage of this function can be found by running help(plot_spotoroo).

### Called by plot()

For convenience, plot_spotoroo() can also be called through the plot() function.

plot(result)
plot(result, type = "timeline")
plot(result, type = "mov")
plot(result, bg = plot_vic_map())
plot(result, type = "mov", bg = plot_vic_map())


