evaluateGeoDist: Evaluate matching with geographic distances

View source: R/evaluateMatching.R

Description

Calculates (1) the geographic distance between each Target cell and its matched Subset cell, and (2) the distance between the Subset cell matched to each Target cell and the Subset cells matched to the eight adjacent neighbors of that Target cell.
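
The neighbor distance described above can be illustrated with a minimal base R sketch. This is not the package's implementation: the 3 x 3 grid, the column names, the helper avg_neighbor_dist, and the use of planar Euclidean distance are all assumptions made for illustration, and the handling of edge cells here (averaging over whichever neighbors exist) may differ from what evaluateGeoDist does.

```r
# Hypothetical 3 x 3 grid of Target cells, numbered row-wise 1..9.
# sx, sy: planar coordinates of the Subset cell matched to each Target cell.
grid <- data.frame(
  cell = 1:9,
  row  = rep(1:3, each = 3),
  col  = rep(1:3, times = 3),
  sx   = c(0, 0, 0, 0, 3, 0, 0, 0, 0),
  sy   = c(0, 0, 0, 0, 4, 0, 0, 0, 0)
)

# Average distance from the Subset cell matched to cell i to the Subset
# cells matched to its (up to eight) adjacent Target cells.
# Assumption: planar Euclidean distance, not great-circle distance.
avg_neighbor_dist <- function(i, g) {
  is_nb <- abs(g$row - g$row[i]) <= 1 &
    abs(g$col - g$col[i]) <= 1 &
    g$cell != g$cell[i]
  mean(sqrt((g$sx[is_nb] - g$sx[i])^2 + (g$sy[is_nb] - g$sy[i])^2))
}

grid$avgdistance_to_neighbors <- vapply(
  seq_len(nrow(grid)), avg_neighbor_dist, numeric(1), g = grid
)
```

In this toy grid only the center cell is matched to a distant Subset cell, so its average neighbor distance is large while the corner cells, which each have a single differing neighbor, get a smaller value. A high value flags a Target cell whose match is geographically far from the matches of its surroundings.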

Usage

evaluateGeoDist(
  matches,
  subsetcells,
  subsetcells_id = "site_id",
  subset_in_target = TRUE,
  quality_name = "matching_quality",
  exclude_poor_matches = TRUE,
  matching_distance = 1.5,
  longlat = TRUE,
  raster_template = NULL,
  map_distances = TRUE,
  map_neighbor_distances = TRUE,
  which_distance = "both",
  saverasters = FALSE,
  filepath = getwd(),
  overwrite = FALSE
)

Arguments

matches

data frame output from the multivarmatch function.

subsetcells

if subset_in_target is TRUE, a data frame of coordinates for the Subset cells (coordinates are expected in columns named 'x' and 'y'). It may be extracted from the output of the kpoints function or provided separately. Row names should be unique identifiers for each point (no repeats in the row names of subsetcells). If subset_in_target is FALSE, a data frame of Subset cells with column names corresponding exactly to those in matchingvars; row names should be unique identifiers (no repeats among all row names in targetcells and matchingvars). See subset_in_target.

subsetcells_id

character or numeric, but must be composed of numbers and convertible to numeric. Refers to the column in subsetcells that provides the unique identifiers for Subset cells. When subset_in_target is TRUE, these ids must be unique from matchingvars_ids. Note that if there are repeats between the matchingvars_ids and the subsetcells_ids, you can paste "00" before the subsetcells_ids to ensure they are unique from the matchingvars_ids. Defaults to "site_id".

subset_in_target

boolean. Indicates whether Subset cells have been selected from Target cells using the kpoints function. Defaults to TRUE.

quality_name

character. Name of the column in the matches data frame that contains the matching quality variable used to evaluate matching: 'matching_quality' or 'matching_quality_secondary'. Defaults to 'matching_quality'.

exclude_poor_matches

boolean. Indicates whether poor matches (those with weighted Euclidean distance > matching_distance) should be excluded from the geographic distance calculation. Defaults to TRUE.

matching_distance

numeric. Gives the maximum allowable matching quality value (weighted Euclidean distance) between Target and Subset cells. Default value is 1.5.

longlat

boolean. Passed to the pointDistance function. Indicates whether the coordinates are in longitude and latitude format for calculating distances between points. Defaults to TRUE; coordinates need to be provided in this format.

raster_template

one of the raster layers used for input data.

map_distances

boolean. Indicates whether a map of distances between Target and matched Subset cells should be plotted. Defaults to TRUE.

map_neighbor_distances

boolean. Indicates whether a map of the average distance between the Subset cell matched to each Target cell and the Subset cells matched to the eight adjacent neighbors of that Target cell should be plotted. Defaults to TRUE.

which_distance

character. One of 'both', 'simple', or 'neighbor'. Determines which distance(s) will be calculated. 'simple' calculates the distance between Target and matched Subset cells. 'neighbor' calculates the distance between the Subset cell matched to each Target cell and the Subset cells matched to the eight adjacent neighbors of that Target cell. 'both' calculates both simple and neighbor distances. Defaults to 'both'.

saverasters

boolean. Indicates whether to save rasters of the calculated distance metrics. Defaults to FALSE.

filepath

path to the location where rasters will be saved. Defaults to the working directory.

overwrite

boolean. Indicates whether writeRaster should overwrite existing files with the same name in filepath. Defaults to FALSE.
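
How exclude_poor_matches, quality_name, and matching_distance interact can be sketched in base R. This is a sketch under stated assumptions, not the function's internal code: the matches table and its target_cell and subset_cell columns are hypothetical, and treating a value exactly equal to matching_distance as acceptable is an assumption drawn from the "maximum allowable" wording.

```r
# Hypothetical match table; only the quality column name follows the
# documented default for quality_name.
matches <- data.frame(
  target_cell      = c(101, 102, 103, 104),
  subset_cell      = c(7, 7, 12, 12),
  matching_quality = c(0.4, 1.5, 2.1, 0.9)
)

matching_distance <- 1.5
quality_name <- "matching_quality"

# Keep matches at or below the threshold; larger values are poor matches.
# Assumption: the boundary value itself counts as an acceptable match.
good <- matches[matches[[quality_name]] <= matching_distance, ]
good$target_cell
```

Here target cell 103 is dropped because its matching quality of 2.1 exceeds the threshold, so it would not contribute to the geographic distance metrics.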

Value

Data frame with the distance between Target and matched Subset cells ('target_to_subset_distance') and the average distance between the Subset cell matched to each Target cell and the Subset cells matched to the eight adjacent Target cells ('avgdistance_to_neighbors'). The first column and the rownames correspond to the unique identifiers for the Target cells, and columns 2 and 3 correspond to the 'x' and 'y' coordinates of the Target cells.
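
As a rough illustration of working with the returned data frame, the sketch below builds a mock object with the two documented distance columns and summarizes them. All values are made up, the first column name cellnumbers is hypothetical, and the use of NA for cells without a distance is an assumption rather than documented behavior.

```r
# Mock of the documented return structure; every value here is invented.
geodist_mock <- data.frame(
  cellnumbers = c(1, 2, 3, 4),            # hypothetical id column name
  x = c(-110.5, -110.4, -110.3, -110.2),
  y = c(41.0, 41.0, 41.0, 41.0),
  target_to_subset_distance = c(0, 12.5, 40.0, NA),
  avgdistance_to_neighbors  = c(3.2, 8.1, NA, NA),
  row.names = c("1", "2", "3", "4")
)

dist_cols <- c("target_to_subset_distance", "avgdistance_to_neighbors")

# Median of each distance column, ignoring cells without a value.
medians <- vapply(geodist_mock[dist_cols], median, numeric(1), na.rm = TRUE)
medians
```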

Author(s)

Rachel R. Renne

Examples

# Load targetcells data for Target Cells
data(targetcells)

# Create raster_template
raster_template <- targetcells[[1]]

# Create data frame of potential matching variables for Target Cells
allvars <- makeInputdata(targetcells)

# Restrict data to matching variables of interest
matchingvars <- allvars[,c("cellnumbers","x","y","bioclim_01","bioclim_04",
                       "bioclim_09","bioclim_12","bioclim_15","bioclim_18")]

# Create vector of matching criteria
criteria <- c(0.7,42,3.3,66,5.4,18.4)

# Find solution for k = 200
# Note: n_starts should be >= 10, it is 1 here to reduce run time.
results1 <- kpoints(matchingvars, criteria = criteria, klist = 200,
                    n_starts = 1, min_area = 50, iter = 50,
                    raster_template = raster_template)


###################################
# First an example where subset_in_target = TRUE
# Get points from solution to kpoints algorithm
subsetcells <- results1$solutions[[1]]

# Find matches and calculate matching quality
quals <- multivarmatch(matchingvars, subsetcells,
                       criteria = criteria,
                       matchingvars_id = "cellnumbers",
                       raster_template = raster_template,
                       subset_in_target = TRUE)

# Look at geographic distances
geodist <- evaluateGeoDist(matches = quals, subsetcells = subsetcells,
                           subset_in_target = TRUE,
                           quality_name = "matching_quality",
                           exclude_poor_matches = TRUE,
                           matching_distance = 1.5,
                           longlat = TRUE,
                           raster_template = raster_template)


###################################
# Now an example where subset_in_target is FALSE
# Remove previous subsetcells
rm(subsetcells)

# Load the subsetcells example data
data(subsetcells)

# Remove duplicates (representing cells with same climate but different
# soils--we want to match on climate only)
subsetcells <- subsetcells[!duplicated(subsetcells$site_id),]

# Pull out matching variables only, with site_id that identifies unique climate
subsetcells <- subsetcells[,c("site_id","X_WGS84","Y_WGS84","bioclim_01",
                           "bioclim_04","bioclim_09","bioclim_12",
                           "bioclim_15","bioclim_18")]

# Ensure that site_id will be values unique to subsetcells
subsetcells$site_id <- paste0("00",subsetcells$site_id)

# Find matches and calculate matching quality
quals <- multivarmatch(matchingvars, subsetcells=subsetcells,
                       criteria = criteria,
                       matchingvars_id = "cellnumbers",
                       subsetcells_id = "site_id",
                       raster_template = raster_template,
                       subset_in_target = FALSE)

# Prepare subsetcells site_ids
subsetcells$site_id <- as.character(as.numeric(subsetcells$site_id))

# Look at geographic distances
geodist <- evaluateGeoDist(matches = quals, subsetcells = subsetcells,
                           subsetcells_id = 'site_id',
                           subset_in_target = FALSE,
                           exclude_poor_matches = TRUE,
                           matching_distance = 1.5,
                           longlat = TRUE, quality_name = "matching_quality",
                           raster_template = raster_template)
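
The second example prefixes the site ids with "00" before matching and converts them back to plain numeric strings before calling evaluateGeoDist. The mechanics of that round trip can be seen in isolation with the base R sketch below; the id vectors are hypothetical and are not taken from the package data.

```r
target_ids <- c(12, 345, 678)  # hypothetical matchingvars ids
subset_ids <- c(12, 99)        # hypothetical subsetcells ids; 12 collides

# Prefixing makes the subset ids distinct from the target ids as strings.
prefixed <- paste0("00", subset_ids)

collisions_before <- intersect(as.character(target_ids),
                               as.character(subset_ids))
collisions_after  <- intersect(as.character(target_ids), prefixed)

# Converting to numeric drops the leading zeros again.
restored <- as.character(as.numeric(prefixed))
```

Because the prefix consists only of zeros, the ids remain convertible to numeric, which is the requirement stated for subsetcells_id.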

DrylandEcology/rMultivariateMatching documentation built on Dec. 17, 2021, 5:30 p.m.