validClimR computes indices for cluster validation, and an objective tree cut for the regional linkage clustering method.
y 
a dendrogram tree produced by HiClimR.
k 

minSize 
minimum cluster size. The 
alpha 
confidence level: the default is 
verbose 
logical to print processing information if 
plot 
logical to call the plotting method if 
colPalette 
a color palette or a list of colors such as that generated
by 
pch 
Either an integer specifying a symbol or a single character to
be used as the default in plotting points. See 
cex 
A numerical value giving the amount by which plotting symbols should
be magnified relative to the 
The validClimR function validates a dendrogram tree produced by HiClimR by computing detailed statistical information for each cluster: cluster means, sizes, intra- and intercluster correlations, and an overall summary. It requires the preprocessed data matrix and the tree from the HiClimR function as inputs. The optional parameter k validates the clustering for a selected number of clusters. If k = NULL (the default, which supports only the regional linkage method), the tree is cut objectively to find the optimal number of clusters, based on a user-specified significance level (the alpha parameter).
In the regional linkage method, noisy spatial elements are isolated in very small clusters or as individuals, since they do not correlate well with any other elements. They can be excluded from the validation indices (interCor, intraCor, diffCor, and statSum) based on the minimum cluster size, minSize. The excluded clusters are identified in the clustFlag component of the validClimR output, which takes a value of 1 for selected clusters or 0 for excluded clusters. The sum of the clustFlag elements is the selected number of clusters. This should be followed by a quality control step before repeating the analysis.
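The significance threshold behind an objective tree cut can be illustrated with a standard result: for n samples, a Pearson correlation r is significant at level alpha (two-sided) when its t-statistic with n - 2 degrees of freedom exceeds the critical value, which gives a closed-form minimum |r|. The sketch below is an assumption about the kind of computation involved, not the package's implementation. The name min_sig_cor_sketch is hypothetical, and n = 41 is an arbitrary sample size chosen for illustration. In practice, the package's own minSigCor function (listed under See Also) is the one to use.

```r
# Hypothetical helper (not part of HiClimR): smallest |r| that differs
# significantly from zero for n samples, using the two-sided t-test for a
# Pearson correlation with n - 2 degrees of freedom.
min_sig_cor_sketch <- function(n, alpha = 0.05) {
  stopifnot(n > 2, alpha > 0, alpha < 1)
  t_crit <- qt(1 - alpha / 2, df = n - 2)  # critical t value
  # Solve t = r * sqrt((n - 2) / (1 - r^2)) for r
  t_crit / sqrt(n - 2 + t_crit^2)
}

# Illustrative values only: 41 time steps, alpha = 0.01
r_min <- min_sig_cor_sketch(n = 41, alpha = 0.01)
print(r_min)
```

A smaller alpha raises the threshold, so a stricter significance level demands stronger correlations before two clusters are treated as significantly related.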
An object of class HiClimR which produces indices for validating the tree produced by the clustering process. The object is a list with the following components:
cutLevel 
the minimum significant correlation used for objective tree cut together with the corresponding confidence level. 
clustMean 
the cluster means which are the region's mean timeseries for all selected regions. 
clustSize 
cluster sizes for all selected regions. 
clustFlag 
a flag 
interCor 
intercluster correlations for all selected regions. It is the intercluster correlations between cluster means. The maximum intercluster correlation is a measure for separation or contiguity, and it is used for objective tree cut (to find the "optimal" number of clusters). 
intraCor 
intracluster correlations for all selected regions. It is the intracluster correlations between the mean of each cluster and its members. The average intracluster correlation is a weighted average for all clusters, and it is a measure for homogeneity. 
diffCor 
difference between intracluster correlation and maximum intercluster correlation for all selected regions. 
statSum 
overall statistical summary for i 
region 
ordered regions vector of size 
regionID 
ordered regions ID vector of length equals the selected number
of clusters, after excluding the small clusters defined by 
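The relationship between these components can be sketched in base R on toy data. This is an illustration under stated assumptions, not the code used by validClimR: the data matrix, the cluster membership vector, and the rule that clusters with fewer than minSize members are flagged 0 are all assumed here for demonstration.

```r
set.seed(1)

# Toy data (assumed): 6 spatial elements (rows) x 50 time steps (columns).
# Elements 1-3 share one signal, 4-5 share another, 6 is pure noise.
n_time <- 50
s1 <- rnorm(n_time)
s2 <- rnorm(n_time)
x <- rbind(
  s1 + 0.3 * rnorm(n_time), s1 + 0.3 * rnorm(n_time), s1 + 0.3 * rnorm(n_time),
  s2 + 0.3 * rnorm(n_time), s2 + 0.3 * rnorm(n_time),
  rnorm(n_time)
)
region  <- c(1, 1, 1, 2, 2, 3)  # assumed cluster membership per element
minSize <- 2

# Cluster sizes and flags (assumption: sizes below minSize are excluded)
clustSize <- as.vector(table(region))
clustFlag <- as.integer(clustSize >= minSize)
keep <- which(clustFlag == 1)

# Cluster means: one mean time series per cluster (clusters in rows)
clustMean <- t(sapply(sort(unique(region)), function(g) {
  colMeans(x[region == g, , drop = FALSE])
}))

# Intercluster correlations between the means of the selected clusters
interCorMat <- cor(t(clustMean[keep, , drop = FALSE]))
interCor <- interCorMat[upper.tri(interCorMat)]

# Intracluster correlations: each member against its own cluster mean
intraCor <- lapply(keep, function(g) {
  apply(x[region == g, , drop = FALSE], 1, cor, y = clustMean[g, ])
})

# Size-weighted average intracluster correlation (homogeneity measure)
avgIntraCor <- weighted.mean(sapply(intraCor, mean), clustSize[keep])

# Difference between intracluster and maximum intercluster correlation
diffCor <- sapply(intraCor, mean) - max(interCor)

print(sum(clustFlag))  # selected number of clusters
```

In this toy setup the single-element noise cluster is excluded, so only the two coherent clusters contribute to the correlation indices.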
Hamada Badr, Ben Zaitchik, and Amin Dezfuli. The HiClimR function is a modification of the hclust function, which is based on Fortran code contributed to STATLIB by F. Murtagh.
Hamada S. Badr, Zaitchik, B. F. and Dezfuli, A. K. (2015): A Tool for Hierarchical Climate Regionalization, Earth Science Informatics, 1-10, http://dx.doi.org/10.1007/s12145-015-0221-7.
Hamada S. Badr, Zaitchik, B. F. and Dezfuli, A. K. (2014): Hierarchical Climate Regionalization, CRAN, http://cran.r-project.org/package=HiClimR.
HiClimR, validClimR, geogMask, fastCor, grid2D, and minSigCor.
require(HiClimR)

## Load test case data
x <- TestCase$x

## Generate longitude and latitude mesh vectors
xGrid <- grid2D(lon = unique(TestCase$lon), lat = unique(TestCase$lat))
lon <- c(xGrid$lon)
lat <- c(xGrid$lat)

## Hierarchical Climate Regionalization
y <- HiClimR(x, lon = lon, lat = lat, lonStep = 1, latStep = 1, geogMask = FALSE,
    continent = "Africa", meanThresh = 10, varThresh = 0, detrend = TRUE,
    standardize = TRUE, nPC = NULL, method = "regional", hybrid = FALSE,
    kH = NULL, members = NULL, validClimR = TRUE, k = NULL, minSize = 1,
    alpha = 0.01, plot = TRUE, colPalette = NULL, hang = -1, labels = FALSE)

## Validation of Hierarchical Climate Regionalization
z <- validClimR(y, k = NULL, minSize = 1, alpha = 0.01, plot = TRUE)

## Use a specified number of clusters (k = 12)
z <- validClimR(y, k = 12, minSize = 1, alpha = 0.01, plot = TRUE)

## Apply minimum cluster size (minSize = 25)
z <- validClimR(y, k = NULL, minSize = 25, alpha = 0.01, plot = TRUE)

## The optimal number of clusters, including small clusters
k <- length(z$clustFlag)

## The selected number of clusters, after excluding small clusters (if minSize > 1)
ks <- sum(z$clustFlag)

PROCESSING STARTED
Checking MultiVariate Clustering (MVC)...
> x is a matrix
> single-variate clustering: 1 variable
Checking data...
> Checking dimensions...
> Checking row names...
> Checking column names...
Data filtering...
> Computing mean for each row...
> Checking rows with mean bellow meanThresh...
> 4678 rows found, mean ≤ 10
> Computing variance for each row...
> Checking rows with near-zero-variance...
> 3951 rows found, variance ≤ 0
Data preprocessing...
> Applying mask...
> Checking columns with missing values...
> Removing linear trend...
> Standardizing data...
Agglomerative Hierarchical Clustering...
> Computing correlation/dissimilarity matrix...
> Starting clustering process...
> Constructing dendrogram tree...
Calling cluster validation...
> Cutting tree based on minimum significant correlation...
> Computing minimum significant correlation coefficient...
> Computing cluster means...
> Computing intercluster correlations...
> Computing intracluster correlations...
> Computing summary statistics...
Generating region map...
dev.new(): using pdf(file="Rplots1.pdf")
PROCESSING COMPLETED
Running Time:
user system elapsed
1.333 0.068 1.417
Time difference of 1.417521 secs
> Cutting tree based on minimum significant correlation...
> Computing minimum significant correlation coefficient...
> Computing cluster means...
> Computing intercluster correlations...
> Computing intracluster correlations...
> Computing summary statistics...
Generating region map...
dev.new(): using pdf(file="Rplots2.pdf")
> Computing cluster means...
> Computing intercluster correlations...
> Computing intracluster correlations...
> Computing summary statistics...
Generating region map...
dev.new(): using pdf(file="Rplots3.pdf")
> Cutting tree based on minimum significant correlation...
> Computing minimum significant correlation coefficient...
> Computing cluster means...
> Computing intercluster correlations...
> Computing intracluster correlations...
> Computing summary statistics...
Generating region map...
dev.new(): using pdf(file="Rplots4.pdf")