Forecast Verification with Cluster Analysis: The Variation
Description
A variation on cluster analysis for forecast verification as proposed by Marzban and Sandgathe (2008).
Usage
CSIsamples(x, ...)
## Default S3 method:
CSIsamples(x, ..., xhat, nbr.csi.samples = 100, threshold = 20,
k = 100, width = 25, stand = TRUE, z.mult = 0, hit.threshold = 0.1,
max.csi.clust = 100, diss.metric = "euclidean", linkage.method = "average",
verbose = FALSE)
## S3 method for class 'SpatialVx'
CSIsamples(x, ..., time.point = 1, model = 1, nbr.csi.samples = 100,
threshold = 20, k = 100, width = 25, stand = TRUE, z.mult = 0,
hit.threshold = 0.1, max.csi.clust = 100, diss.metric = "euclidean",
linkage.method = "average", verbose = FALSE)
## S3 method for class 'CSIsamples'
summary(object, ...)
## S3 method for class 'CSIsamples'
plot(x, ...)
## S3 method for class 'summary.CSIsamples'
plot(x, ...)
## S3 method for class 'CSIsamples'
print(x, ...)

Arguments
x,xhat 
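default method: matrices giving the verification and forecast fields, resp. “SpatialVx” method: x is an object of class “SpatialVx” and xhat is not used. plot and print methods: x is an object of class “CSIsamples” (or “summary.CSIsamples” for the plot method of that class).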

object 
list object of class “CSIsamples”. 
nbr.csi.samples 
integer giving the number of samples to take at each level of the cluster analysis (CA). 
threshold 
numeric giving the value above which a grid point is considered an event. 
k 
numeric giving the value for 
width 
numeric giving the size of the samples for each cluster sample. 
stand 
logical, should the data first be standardized before applying CA? 
z.mult 
numeric giving a value by which to multiply the z component. If zero, then the CA is performed on locations only. Can be used to give more or less weight to the actual values at these locations. 
hit.threshold 
numeric between zero and one giving the threshold for the proportion of a cluster that comes from the verification field (vs the forecast field). It is used to determine whether the cluster constitutes a hit rather than a false alarm or a miss. 
max.csi.clust 
integer giving the maximum number of clusters allowed. 
diss.metric 
character giving which 
linkage.method 
character giving the name of a linkage method acceptable to the 
time.point 
numeric or character indicating which time point from the “SpatialVx” verification set to select for analysis. 
model 
numeric indicating which forecast model to select for the analysis. 
verbose 
logical, should progress information be printed to the screen? 
... 
Not used by
Not used by the 
Details
This function carries out the procedure described in Marzban and Sandgathe (2008) for verifying forecasts. Effectively, it combines the verification and forecast fields (keeping track of which values belong to which field) and applies cluster analysis (CA) to the combined field. Each cluster is then classified as a hit, a miss or a false alarm according to the proportion of its values that belong to the verification field, with the cut-off set by the hit.threshold argument. From this information, the critical success index (CSI) is calculated for each number of clusters (i.e., each scale). A sampling scheme is used to speed up the process.
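The steps above can be illustrated with a minimal base-R sketch. It is not the package's implementation: it skips the sampling scheme, uses small synthetic fields, and assumes a particular classification rule (a cluster is a false alarm if fewer than hit.threshold of its points come from the verification field, a miss if more than 1 - hit.threshold do, and a hit otherwise). The helper names event_points and csi_at are hypothetical.

```r
set.seed(1)

# Synthetic verification and forecast fields (illustration only).
obs  <- matrix(rexp(100), 10, 10)
fcst <- matrix(rexp(100), 10, 10)
threshold <- 1

# Hypothetical helper: coordinates and values of the event points in a field.
event_points <- function(field, threshold) {
  idx <- which(field > threshold, arr.ind = TRUE)
  cbind(x = idx[, 1], y = idx[, 2], z = field[idx])
}

# Combine both fields, remembering which field each point came from.
pts.obs  <- event_points(obs, threshold)
pts.fcst <- event_points(fcst, threshold)
combined <- rbind(pts.obs, pts.fcst)
is.obs   <- rep(c(TRUE, FALSE), times = c(nrow(pts.obs), nrow(pts.fcst)))

# Standardize, then weight the value component (z.mult = 0: locations only).
z.mult <- 0
combined <- scale(combined)
combined[, "z"] <- z.mult * combined[, "z"]

# Hierarchical cluster analysis on the combined points.
tree <- hclust(dist(combined, method = "euclidean"), method = "average")

# Hypothetical helper: CSI for a given number of clusters.
# The classification rule here is an assumption, not taken from the package.
csi_at <- function(n.clust, hit.threshold = 0.1) {
  cluster <- cutree(tree, k = n.clust)
  p <- tapply(is.obs, cluster, mean)     # share of verification points
  false.alarms <- sum(p < hit.threshold) # mostly forecast points
  misses <- sum(p > 1 - hit.threshold)   # mostly verification points
  hits <- length(p) - false.alarms - misses
  hits / (hits + misses + false.alarms)
}

n.clusters <- 1:10
csi <- sapply(n.clusters, csi_at)
```

Plotting csi against n.clusters gives a curve of CSI by number of clusters for this single toy run; CSIsamples reports the same kind of curve, computed over repeated samples.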
The plot and summary functions all give the same information, but in different formats: i.e., CSI by number of clusters (scale).
Value
A list is returned by CSIsamples with components:
data.name 
character vector giving the names of the verification and forecast fields analyzed, resp. 
call 
an object of class “call” giving the function call. 
results 
max.csi.clust by nbr.csi.samples matrix giving the calculated CSI for each sample and iteration of CA. 
The summary method function invisibly returns the same list, but with the additional component:
csi 
vector of length max.csi.clust giving the sample average CSI for each iteration of CA. 
The plot method functions do not return anything. Plots are created.
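As a rough illustration of how the csi component relates to results, the sketch below averages a stand-in results matrix across samples (columns) for each number of clusters (rows). The random matrix and the use of na.rm = TRUE are assumptions made for the example; the summary method may treat missing values differently.

```r
set.seed(42)

max.csi.clust   <- 5
nbr.csi.samples <- 4

# Stand-in for the "results" component: one row per number of clusters,
# one column per sample (random values, illustration only).
results <- matrix(runif(max.csi.clust * nbr.csi.samples),
                  nrow = max.csi.clust, ncol = nbr.csi.samples)

# Sample-average CSI for each number of clusters.
csi <- rowMeans(results, na.rm = TRUE)
```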
Note
Special thanks to Caren Marzban, marzban “at” u.washington.edu, for making the CSIsamples (originally called csi.samples) function available for use with this package.
Author(s)
Hillary Lyons, h.lyons “at” comcast.net, and modified by Eric Gilleland
References
Marzban, C. and Sandgathe, S. (2008) Cluster Analysis for Object-Oriented Verification of Fields: A Variation. Mon. Wea. Rev., 136 (3), 1013–1025.
See Also
hclust, kmeans, clusterer
Examples
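## Not run:
library(SpatialVx)
library(fields)  # provides Exp.image.cov and sim.rf

# Simulate two random fields on a 100 x 100 grid.
grid <- list(x = seq(0, 5, length.out = 100), y = seq(0, 5, length.out = 100))
obj <- Exp.image.cov(grid = grid, theta = 0.5, setup = TRUE)
look <- sim.rf(obj)
look2 <- sim.rf(obj)

# nbr.csi.samples is named explicitly: it follows '...' in the argument
# list, so an unnamed 10 would not be matched to it.
res <- CSIsamples(x = look, xhat = look2, nbr.csi.samples = 10, threshold = 0,
    k = 100, width = 2, z.mult = 0, hit.threshold = 0.25, max.csi.clust = 75)
plot(res)
y <- summary(res)
plot(y)
## End(Not run)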
## Not run:
data(UKfcst6)
data(UKobs6)
data(UKloc)

hold <- make.SpatialVx(UKobs6, UKfcst6, thresholds = 0,
    loc = UKloc, map = TRUE, field.type = "Rainfall", units = "mm/h",
    data.name = c("Nimrod", "obs 6", "fcst 6"))

res <- CSIsamples(hold, threshold = 0, k = 200, z.mult = 0.3,
    hit.threshold = 0.2, max.csi.clust = 150, verbose = TRUE)
plot(res)
summary(res)
y <- summary(res)
plot(y)
## End(Not run)
