SaTScan is a powerful stand-alone software program that runs spatio-temporal scan statistics. It is carefully optimized and contains many tricks to reduce the computational burden of the approach, which is doubly computationally intensive. First, scanning itself can be costly, particularly in spatio-temporal settings. Second, and more costly still, significance testing involves resampling (Monte Carlo hypothesis testing). While SaTScan is not open source, it is distributed free of charge.
However, use of SaTScan can be cumbersome. There are two means of access: a GUI and a batch file. The GUI allows complete control, but precludes automated or repeated operation. The batch file allows this, but may be difficult to integrate into other analyses. The rsatscan package contains a set of functions and defines a class and methods to make it easy to work with SaTScan from R. This should allow easy automation and integration.
The functions in the package can be grouped into three sets: SaTScan parameter functions that set parameters for SaTScan or write them in a file to the OS; write functions that write R data frames to the OS in SaTScan-readable formats; and the satscan() function, which calls out into the OS, runs SaTScan, and returns a satscan class object. Successful use of the package requires a fairly precise understanding of the SaTScan parameter file, for which users are referred to the SaTScan manual.
Basic usage of the package involves four steps:
Use the ss.options() function to set SaTScan parameters; these are saved in R.
Use the write.ss.prm() function to write the SaTScan parameter file.
Use the satscan() function to run SaTScan.
Inspect the resulting satscan object and proceed to analyze the results from SaTScan in R.
The New York City fever data, which are distributed with SaTScan, are also included with the package.
For good style, an analysis would begin by resetting the parameter file:
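Judging from the later examples in this document, the reset is presumably done as follows. This is a sketch that assumes the rsatscan package is installed; the guard simply skips the call otherwise.

```r
# Skip quietly when rsatscan is not installed.
have_rsatscan <- requireNamespace("rsatscan", quietly = TRUE)

if (have_rsatscan) {
  library(rsatscan)
  # Restore the default parameter set. invisible() suppresses printing
  # of the returned parameters.
  invisible(ss.options(reset = TRUE))
}
```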
Then, one would change parameters as desired. This can be done in as many or as few steps as you like; the previous state of the parameter set is retained, as with options(). Here, the parameters used in the example from the SaTScan manual are replicated:
ss.options(list(CaseFile="NYCfever.cas", PrecisionCaseTimes=3))
ss.options(c("StartDate=2001/11/1","EndDate=2001/11/24"))
ss.options(list(CoordinatesFile="NYCfever.geo", AnalysisType=4, ModelType=2, TimeAggregationUnits=3))
ss.options(list(UseDistanceFromCenterOption="y", MaxSpatialSizeInDistanceFromCenter=3, NonCompactnessPenalty=0))
ss.options(list(MaxTemporalSizeInterpretation=1, MaxTemporalSize=7))
ss.options(list(ProspectiveStartDate="2001/11/24", ReportGiniClusters="n", LogRunToHistoryFile="n"))
Note that the second call to ss.options() uses the character vector format, while the others use the list format; either works.
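To make the relationship between the two formats concrete, here is a minimal base R sketch. The helper as_name_value() is hypothetical and is not part of rsatscan, nor a description of how the package handles its input internally; it only shows that a named list carries the same information as a vector of "Name=value" strings.

```r
# Hypothetical helper (not part of rsatscan): flatten a named list of
# parameter values into "Name=value" strings.
as_name_value <- function(params) {
  stopifnot(is.list(params),
            !is.null(names(params)),
            all(nzchar(names(params))))
  values <- unname(vapply(params, as.character, character(1)))
  paste0(names(params), "=", values)
}

from_list   <- as_name_value(list(StartDate = "2001/11/1", EndDate = "2001/11/24"))
from_vector <- c("StartDate=2001/11/1", "EndDate=2001/11/24")

# Both spellings describe the same two parameter settings.
identical(from_list, from_vector)
```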
It might be reasonable at this point to check what the parameter file looks like:
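One way to do this is sketched below. It assumes that ss.options() called with no arguments returns the current parameter set as a character vector; that behaviour is an assumption of this sketch rather than something stated above. It also assumes rsatscan is installed.

```r
# Skip quietly when rsatscan is not installed.
have_rsatscan <- requireNamespace("rsatscan", quietly = TRUE)

if (have_rsatscan) {
  library(rsatscan)
  # Assumption: with no arguments, ss.options() returns the current
  # parameter set, one parameter-file line per element.
  current <- ss.options()
  # Show only the first few lines; the full set is long.
  print(head(current, 3))
}
```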
Then, we write the parameter file, the case file, and the geometry file to some writeable location in the OS, using the functions in the package. These ensure that SaTScan-readable formats are used.
td = tempdir()
write.ss.prm(td, "NYCfever")
write.cas(NYCfevercas, td, "NYCfever")
write.geo(NYCfevergeo, td, "NYCfever")
The write.??? functions append the appropriate file extensions to the files they save to the OS.
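The following base R sketch illustrates the idea of writing a data frame as a plain, space-delimited file with an extension appended to the supplied name. The helper write_ss_file() and the toy data are hypothetical; this is not the package's own code, only an approximation of what such a write function might do.

```r
# Hypothetical stand-in for the write.??? family (not the package's code):
# write a data frame without header, row names, or quotes, and append
# the extension to the file name.
write_ss_file <- function(x, location, filename, ext) {
  path <- file.path(location, paste0(filename, ".", ext))
  write.table(x, file = path, row.names = FALSE,
              col.names = FALSE, quote = FALSE)
  invisible(path)
}

# Work in a scratch directory so nothing else is touched.
demo_dir <- tempfile("ss_demo_")
dir.create(demo_dir)

# Made-up case data: location id, case count, date.
toy_cas <- data.frame(loc   = c("A", "B"),
                      cases = c(1, 2),
                      date  = c("2001/11/1", "2001/11/2"))

cas_path <- write_ss_file(toy_cas, demo_dir, "toy", "cas")
written  <- readLines(cas_path)

basename(cas_path)  # the extension was appended to "toy"
written             # one space-delimited line per row

unlink(demo_dir, recursive = TRUE)
```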
Then we're ready to run SaTScan. The location of the SaTScan executable may well differ on your disk, particularly if you do not use Windows. In a later release of the package, it may be possible to detect the location of the executable.
# This step omitted in compliance with CRAN policies
# Please install SaTScan and run the vignette with this and following code uncommented
# SaTScan can be downloaded from www.satscan.org, free of charge
# you will also find there fully compiled versions of this vignette with results
## NYCfever = satscan(td, "NYCfever", sslocation="C:/Program Files (x86)/SaTScan")
The rsatscan package provides a summary method for satscan objects. A satscan object has a slot for each possible output file that SaTScan creates, and contains whatever output files your call happened to generate.
If SaTScan generated a shapefile, satscan() reads it, by way of readOGR() if it's available, into a class defined in the sp package. You can use the plot methods defined in the sp package to plot it, or use one of the many packages that build on the sp package for further processing.
It might be interesting to examine the scan statistics from the Monte Carlo steps.
## hist(unlist(NYCfever$llr), main="Monte Carlo")
# Let's draw a line for the clusters in the observed data
## abline(v=NYCfever$col[,c("TEST_STAT")], col = "red")
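This shows why none of the observed clusters had small p-values: their test statistics fall well within the range of the statistics from the Monte Carlo replicates.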
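The link between the histogram and the p-values can be sketched with the standard rank-based Monte Carlo p-value. The numbers below are made up for illustration and are not SaTScan output: sim_llr stands in for the statistics from the simulated data sets and obs_llr for the statistic of one observed cluster.

```r
set.seed(1)

# Made-up stand-ins for the simulated test statistics and for the
# statistic of one observed cluster (not SaTScan output).
sim_llr <- rchisq(999, df = 3)
obs_llr <- 4.2

# Rank-based Monte Carlo p-value: the share of statistics, counting the
# observed one, that are at least as large as the observed statistic.
n_as_extreme <- sum(sim_llr >= obs_llr)
p_value <- (1 + n_as_extreme) / (1 + length(sim_llr))
p_value
```

An observed statistic that sits in the bulk of the simulated distribution is exceeded by many replicates, so its p-value cannot be small; with 999 replicates the smallest attainable value is 1/1000.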
#clean up!
file.remove(paste0(td,"/NYCfever.prm"))
file.remove(paste0(td,"/NYCfever.cas"))
file.remove(paste0(td,"/NYCfever.geo"))
This is another data set included with SaTScan. It differs from the NYC fever example in that denominators are available; these are provided in a population file. The analysis uses the Poisson model rather than the spatio-temporal permutation model.
write.cas(NMcas, td,"NM")
write.geo(NMgeo, td,"NM")
write.pop(NMpop, td,"NM")
Again, replicating the examples from the SaTScan user guide, we set up and then write the parameter file, then run SaTScan.
invisible(ss.options(reset=TRUE))
ss.options(list(CaseFile="NM.cas",StartDate="1973/1/1",EndDate="1991/12/31", PopulationFile="NM.pop", CoordinatesFile="NM.geo", CoordinatesType=0, AnalysisType=3))
ss.options(c("NonCompactnessPenalty=0", "ReportGiniClusters=n", "LogRunToHistoryFile=n"))
write.ss.prm(td,"testnm")
## testnm = satscan(td,"testnm", sslocation="C:/Program Files (x86)/SaTScan")
Note that the parameter file need not have the same name as the case and other input files, which also need not share a name, though it may be helpful in keeping things organized.
One of the elements of a satscan class object is the parameter set that was used to call SaTScan. This may be useful later.
#clean up!
file.remove(paste0(td,"/testnm.prm"))
file.remove(paste0(td,"/NM.pop"))
file.remove(paste0(td,"/NM.cas"))
file.remove(paste0(td,"/NM.geo"))
A third data set included with SaTScan is also included with the package. This one has cases and controls, and uses the Bernoulli model. We replicate the parameters from the SaTScan manual again.
write.cas(NHumbersidecas, td, "NHumberside")
write.ctl(NHumbersidectl, td, "NHumberside")
write.geo(NHumbersidegeo, td, "NHumberside")
invisible(ss.options(reset=TRUE))
ss.options(list(CaseFile="NHumberside.cas", ControlFile="NHumberside.ctl"))
ss.options(list(PrecisionCaseTimes=0, StartDate="2001/11/1", EndDate="2001/11/24"))
ss.options(list(CoordinatesFile="NHumberside.geo", CoordinatesType=0, ModelType=1))
ss.options(list(TimeAggregationUnits = 3, NonCompactnessPenalty=0))
ss.options(list(ReportGiniClusters="n", LogRunToHistoryFile="n"))
write.ss.prm(td, "NHumberside")
## NHumberside = satscan(td, "NHumberside", sslocation="C:/Program Files (x86)/SaTScan")
## summary(NHumberside)
#clean up!
file.remove(paste0(td,"/NHumberside.cas"))
file.remove(paste0(td,"/NHumberside.ctl"))
file.remove(paste0(td,"/NHumberside.geo"))
file.remove(paste0(td,"/NHumberside.prm"))