geoFold | R Documentation |
This function assigns geographically divided k-folds ("g-folds") using the partitioning around medoids (PAM) algorithm. The user can specify the number of folds to create and, optionally, the minimum size of any fold and the minimum number of sites NOT in any given fold (useful for ensuring each fold has enough sites for both testing and training).
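The core idea, clustering sites by geographic distance with PAM, can be sketched as follows. This is only an illustration under stated assumptions, not the function's actual implementation: `haversineMatrix` is a hypothetical helper written for this sketch, the distance is a plain haversine great-circle distance, and the minimum-size constraints and swap step are omitted. It relies on `pam` from the cluster package.

```r
library(cluster)  # provides pam()

# Hypothetical helper (not part of the documented function): pairwise
# great-circle distances in km between longitude/latitude points,
# computed with the haversine formula.
haversineMatrix <- function(lon, lat, radius = 6371) {
  toRad <- pi / 180
  lon <- lon * toRad
  lat <- lat * toRad
  dLon <- outer(lon, lon, "-")
  dLat <- outer(lat, lat, "-")
  a <- sin(dLat / 2)^2 + outer(cos(lat), cos(lat)) * sin(dLon / 2)^2
  2 * radius * asin(pmin(sqrt(a), 1))
}

# Two well-separated groups of 20 sites each (longitude x, latitude y)
set.seed(17)
sites <- data.frame(
  x = c(rnorm(20, -80), rnorm(20, -100)),
  y = c(rnorm(20, 37), rnorm(20, 37))
)

# PAM on the precomputed dissimilarities; each cluster acts as one fold
d <- as.dist(haversineMatrix(sites$x, sites$y))
foldsSketch <- pam(d, k = 2, diss = TRUE)$clustering
table(foldsSketch)
```

Passing a precomputed dissimilarity object keeps the choice of distance separate from the clustering, which is presumably why a distance function can be supplied as an argument.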
geoFold( x, k, minIn = NULL, minOut = NULL, longLat = NULL, distFunct = NULL, swaps = NULL, ... )
x |
A data frame, matrix, |
k |
Positive integer. Number of k-folds to create. |
minIn |
Positive integer or |
minOut |
Positive integer or |
longLat |
Two-element character list or two-element integer list. If |
distFunct |
Either a function or |
swaps |
Positive integer. Sometimes the routine generates folds that aren't minimally compact; i.e., points from some folds are spatially inside other folds. To correct this, a random swap procedure is performed at the end, in which pairs of points from different folds exchange fold assignments. A swap is kept if it decreases the mean distance to the (new) centroid of each fold; otherwise it is discarded. This procedure is performed |
... |
Arguments to pass to |
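The random swap procedure described under swaps might look roughly like the sketch below. It is not the function's own code: `meanDistToCentroid` and `swapFolds` are hypothetical names, and the sketch assumes planar Euclidean distance to the fold centroid, whereas the actual routine may use a geographic distance.

```r
# Mean Euclidean distance from each point to the centroid of its own fold.
# Assumption: planar coordinates, used here only for illustration.
meanDistToCentroid <- function(xy, folds) {
  xy <- as.matrix(xy)
  total <- 0
  for (f in unique(folds)) {
    pts <- xy[folds == f, , drop = FALSE]
    centroid <- colMeans(pts)
    total <- total + sum(sqrt(rowSums(sweep(pts, 2, centroid)^2)))
  }
  total / nrow(xy)
}

# Hypothetical helper: try `swaps` random exchanges of fold assignment
# between two points and keep only those that make folds more compact.
swapFolds <- function(xy, folds, swaps = 100) {
  best <- meanDistToCentroid(xy, folds)
  for (s in seq_len(swaps)) {
    pair <- sample(length(folds), 2)
    if (folds[pair[1]] == folds[pair[2]]) next  # same fold, nothing to swap
    candidate <- folds
    candidate[pair] <- folds[rev(pair)]
    score <- meanDistToCentroid(xy, candidate)
    if (score < best) {  # keep the swap only if compactness improves
      folds <- candidate
      best <- score
    }
  }
  folds
}

# Two tight clusters, with one point per fold deliberately misassigned
set.seed(1)
xy <- rbind(cbind(rnorm(10, 0, 0.1), rnorm(10, 0, 0.1)),
            cbind(rnorm(10, 5, 0.1), rnorm(10, 5, 0.1)))
start <- rep(1:2, each = 10)
start[c(1, 11)] <- c(2L, 1L)
fixed <- swapFolds(xy, start, swaps = 500)
```

Because every accepted change is an exchange between two folds, fold sizes are preserved, so any minimum-size constraints that held before the swaps still hold afterwards.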
An integer vector, one element for each row of x, with values 1 through k indicating which fold a site is located in.
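A vector of this form can be used to split sites into training and test sets, one fold at a time. The sketch below uses a hand-made `folds` vector as a stand-in for the function's output, so the fold assignments here are illustrative only.

```r
set.seed(17)
sites <- data.frame(x = runif(12, -100, -80), y = runif(12, 35, 42))

# Stand-in for the vector returned by geoFold(sites, k = 3)
folds <- rep(1:3, each = 4)

# For each fold, hold its sites out for testing and train on the rest
splits <- lapply(sort(unique(folds)), function(i) {
  list(test = sites[folds == i, ], train = sites[folds != i, ])
})

for (i in seq_along(splits)) {
  cat("fold", i, ":", nrow(splits[[i]]$train), "training sites,",
      nrow(splits[[i]]$test), "test sites\n")
}
```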
distCosine, pam
# Make three groups, one with two points and two with 20 points apiece.
# Naturally these should group into 3 groups with 2, 20, and 20 points apiece.
# By setting minIn and minOut to non-NULL values, we can increase/decrease
# the size of the groups.

# define plot function
pointPlot <- function(x, folds, ...) {
  plot(x, pch=16, cex=2, col='white', ...)
  for (i in sort(unique(folds))) points(x[folds==i, ], bg=i + 1, pch=20 + i, cex=2)
  legend('bottomright', legend=paste('fold', sort(unique(folds))),
    pt.bg=sort(unique(folds)) + 1, pch=20 + sort(unique(folds)), cex=1.4)
}

set.seed(17)
group1 <- data.frame(x=c(-90, -90), y=c(40, 41))
group2 <- data.frame(x=rep(-80, 20), y=rep(37, 20))
group3 <- data.frame(x=rep(-100, 20), y=rep(37, 20))
group2 <- group2 + cbind(rnorm(20), rnorm(20))
group3 <- group3 + cbind(rnorm(20), rnorm(20))
sites <- rbind(group1, group2, group3)

# simple g-folds
folds <- geoFold(sites, k=3)
pointPlot(sites, folds, main='Simple G-folds')

# g-folds with >= 5 sites per fold
folds <- geoFold(sites, k=3, minIn=5)
pointPlot(sites, folds, main='G-folds with >=5 sites in each')

# g-folds with >= 10 sites in and out of each fold
folds <- geoFold(sites, k=3, minIn=10, minOut=10)
pointPlot(sites, folds, main='G-folds with >=10\nsites in/outside each')

# g-folds with >=14 sites in and >= 20 sites out of each fold
folds <- geoFold(sites, k=3, minIn=14, minOut=20)
pointPlot(sites, folds, main='G-folds >=14 in\nand >=20 outside')