This function conducts a series of spatial joins for Geographic Information Systems (GIS) data. It integrates three of R's most commonly used GIS data classes: polygons, points and rasters. It offers flexible options for assignment rules and can calculate spatial and temporal lags. geomerge returns a spatial (panel) dataset in the form of a SpatialPolygonsDataFrame that users may import into any predictive statistical analysis.
... 
input datasets and, if provided, optional arguments. See Details. 
target 

time 
temporal window for dynamic temporal binning of point data. Format 
time.lag 
Boolean indicating whether or not first and second order temporal lag values of all variables are returned. Only affects dynamic point data integration. Default = TRUE. 
spat.lag 
Boolean indicating whether or not first and second order spatial lag values of all variables are returned. Default = TRUE. 
zonal.fun 
object of class function applied to values of 
assignment 
identification of either population- or area-weighting assignment rules when handling 
population.data 
specifies data used for weighting if a population-based 
point.agg 
specification of aggregation format for data of type 
t_unit 
temporal unit used for dynamic point aggregation. Default = "days". 
silent 
Boolean switch to suppress any (non-critical) warnings and messages. Default = FALSE. 
geomerge accepts any number of data inputs of classes SpatialPolygonsDataFrame, SpatialPointsDataFrame, and RasterLayer. The target they are merged to may be of any shape but must be a SpatialPolygonsDataFrame. The extent of each data input should at least match the extent of the target; if not, the package issues a warning. In order to perform accurate area calculations at any scale, geomerge projects any data geometry into WGS84. Input data (including target) not in WGS84 are automatically reprojected.
geomerge assumes that all inputs of class SpatialPolygonsDataFrame and RasterLayer are static and contemporary. If polygons or rasters change over time, we advise rerunning geomerge for each interval in which the data are static and contemporary. The package allows for dynamic integration of all inputs that are a SpatialPointsDataFrame, i.e., one can, for example, automatically generate the counts of events that occur within a specific unit of target within a specific time period. Further details are given below.
If SpatialPolygonsDataFrame data are joined to target, they must have only one column containing the data of interest. RasterLayer inputs are single-valued by default. These data may be of class factor or numeric. If SpatialPointsDataFrame inputs are joined to target, they must have one column coding the variable of interest and, if points carry timestamps, dates must be given in a second column date and formatted as a UTC date string with format "YYYY-MM-DD" or "YYYY-MM-DD hh:mm:ss". In practice, this implies that if more than one variable of interest is to be merged to target, each has to be entered as a separate argument. Note that variable names in target derive from the names of the input data, so it is advisable to use meaningful labels for input data.
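To illustrate the timestamp format described above, here is a minimal base R sketch that parses date strings in either accepted layout and interprets them as UTC. The helper name parse_utc_date is hypothetical and not part of geomerge; how the package parses dates internally is not documented here.

```r
# Hypothetical helper (not part of geomerge): parse strings given as
# "YYYY-MM-DD hh:mm:ss" or "YYYY-MM-DD" and interpret them as UTC.
parse_utc_date <- function(x) {
  # Try the full timestamp layout first
  out <- as.POSIXct(x, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
  # Fall back to the date-only layout where the first attempt failed
  date_only <- is.na(out)
  out[date_only] <- as.POSIXct(x[date_only], format = "%Y-%m-%d", tz = "UTC")
  out
}

dates  <- c("2011-01-15", "2011-03-02 14:30:00")
parsed <- parse_utc_date(dates)
format(parsed, "%Y-%m-%d %H:%M:%S", tz = "UTC")
```

Strings that match neither layout come back as NA, which makes malformed timestamps easy to spot before merging.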
In merging SpatialPolygonsDataFrame values to units of analysis given by target, users have a choice among a number of different assignment rules based on area overlap and population size. Area-based assignment can take the values "max(area)" or "min(area)", i.e., the value assigned to a given unit in target comes from the polygon in the SpatialPolygonsDataFrame with maximal or minimal area overlap, respectively. If the value of interest is of class numeric, the user may also choose "weighted(area)", i.e., the value is assigned as the area-weighted average of the values in all polygons intersecting a given unit in target.
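The three area-based rules can be sketched in base R for a single target unit. This only illustrates the arithmetic, not the package's implementation; the overlap table and all numbers are made up.

```r
# Toy overlap table for ONE target unit: each row is an input polygon that
# intersects the unit. All numbers are invented for illustration.
overlap <- data.frame(
  polygon = c("A", "B", "C"),
  value   = c(10, 20, 40),   # numeric variable of interest
  area    = c(50, 30, 20)    # overlap area with the target unit
)

# "max(area)": value of the polygon with the largest overlap
max_area <- overlap$value[which.max(overlap$area)]

# "min(area)": value of the polygon with the smallest overlap
min_area <- overlap$value[which.min(overlap$area)]

# "weighted(area)": area-weighted average (numeric variables only)
weighted_area <- weighted.mean(overlap$value, w = overlap$area)

c(max = max_area, min = min_area, weighted = weighted_area)
```

With these numbers, polygon A dominates the unit (value 10), polygon C has the smallest overlap (value 40), and the area-weighted average is 19.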
The assignment rules "max(pop)", "min(pop)" and "weighted(pop)" (the latter again for numeric variables only) analogously use the population value given by population.data in overlapping areas as the basis for assignment. If any of them is selected in the assignment argument, users must provide population.data as a RasterLayer. The geographical resolution of population.data should be the same as or better than that of target. The zonal statistic used for population within overlapping polygons is sum.
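The population-based rules follow the same logic, with summed population in place of overlap area. The base R sketch below covers a single target unit; the cell table, the polygon values and the variable names are assumptions made for illustration and do not reflect the package internals.

```r
# Toy example for ONE target unit. Each row is a population raster cell
# inside the unit; `polygon` names the input polygon covering that cell.
cells <- data.frame(
  polygon    = c("A", "A", "B", "B", "B"),
  population = c(100, 50, 400, 300, 150)
)
values <- c(A = 2, B = 6)   # numeric variable carried by each input polygon

# Zonal statistic: total population within each overlapping polygon (sum)
pop <- tapply(cells$population, cells$polygon, sum)

# "max(pop)" / "min(pop)": value of the most / least populous overlap
max_pop <- values[[names(pop)[which.max(pop)]]]
min_pop <- values[[names(pop)[which.min(pop)]]]

# "weighted(pop)": population-weighted average (numeric variables only)
weighted_pop <- weighted.mean(values[names(pop)], w = pop)

c(max = max_pop, min = min_pop, weighted = weighted_pop)
```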
When a SpatialPointsDataFrame is merged to target, one of two operations can be performed. For point.agg = "cnt" the function counts the number of locations that fall within each unit of target. For numerical variables of interest, point.agg = "sum" returns the sum of all values associated with points within each unit of target. If different aggregation formats are to be applied to different SpatialPointsDataFrame inputs, these have to be specified as a character vector, i.e., point.agg = c("sum", "cnt"), in the order of inputs.
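The difference between the two aggregation formats can be shown with a small base R sketch. It assumes the points have already been matched to target units; the data frame and its column names are made up.

```r
# Toy point data: `unit` is the target unit each point falls in
pts <- data.frame(
  unit       = c("u1", "u1", "u2", "u3", "u3", "u3"),
  fatalities = c(3, 0, 5, 1, 2, 4)
)

# point.agg = "cnt": number of points per unit
cnt <- table(pts$unit)

# point.agg = "sum": sum of the numeric variable per unit
total <- tapply(pts$fatalities, pts$unit, sum)

cnt
total
```

Note that a point with value 0 still counts as one location under "cnt" but contributes nothing under "sum".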
Values for inputs of type SpatialPointsDataFrame are either calculated statically across the entire frame if time = NA or dynamically within a given time period defined by time = c(start_date, end_date, interval_length). Here interval_length is a numeric value giving the number of t_unit units per interval; the default is t_unit = "days". The package also accepts inputs of "secs", "mins", "hours", "months" or "years".
Zonal statistics are applied to objects of class RasterLayer that are joined to target. The specific operations are defined in the function call using the argument zonal.fun and each is added to the result. Any zonal statistic compatible with the extract function in raster is accepted. Note that geomerge does not accept raster stacks. If you have raster stacks, they must be separated and the layers passed to the function individually.
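Conceptually, a zonal statistic applies one summary function to all raster cells that fall within a target unit. The base R sketch below mimics this with plain vectors; zonal_stat is a hypothetical helper, and the package itself relies on the extract function in raster rather than on this code.

```r
# Toy raster cells: `zone` is the target unit each cell falls in
cell_values <- c(1, 4, 2, 8, 6, NA)
zone        <- c("u1", "u1", "u2", "u2", "u2", "u3")

# Hypothetical helper: apply a summary function per zone, passing extra
# arguments such as na.rm through to that function
zonal_stat <- function(values, zone, fun, ...) tapply(values, zone, fun, ...)

zonal_sum  <- zonal_stat(cell_values, zone, sum,  na.rm = TRUE)
zonal_mean <- zonal_stat(cell_values, zone, mean, na.rm = TRUE)

zonal_sum
zonal_mean
```

Unit u3 contains only a missing cell, so with na.rm = TRUE its sum is 0 while its mean is NaN; the choice of function therefore matters for units with sparse raster coverage.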
If spat.lag = TRUE, spatial lags of all numeric variables from a SpatialPolygonsDataFrame or RasterLayer joined to target polygons are returned using first and second order neighboring weights matrices. The package assigns target polygons the mean value of units within each neighborhood. When dynamic point aggregation is run and time.lag = TRUE, geomerge returns the values of every target polygon, as well as its first and second order neighboring unit averages, separately, at times t-1 and t-2 defined by the interval in the argument time.
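The lag calculations can be sketched in base R on a toy layout of four units in a row. The neighbor lists are written by hand, and treating second order neighbors as "neighbors of neighbors, excluding the unit and its first order neighbors" is an assumption of this sketch; the package may define its weights matrices differently.

```r
# Toy setup: four target units on a line, u1 - u2 - u3 - u4
value <- c(u1 = 10, u2 = 20, u3 = 30, u4 = 40)

# First order neighbors (units sharing a border), written by hand
nb1 <- list(u1 = "u2", u2 = c("u1", "u3"), u3 = c("u2", "u4"), u4 = "u3")

# Second order neighbors (assumption): neighbors of neighbors, excluding
# the unit itself and its first order neighbors
nb2 <- lapply(names(nb1), function(u) {
  setdiff(unique(unlist(nb1[nb1[[u]]])), c(u, nb1[[u]]))
})
names(nb2) <- names(nb1)

# Spatial lag: mean value of the units in each neighborhood
lag_mean <- function(nb) {
  vapply(nb, function(ids) mean(value[ids]), numeric(1))
}
lag1 <- lag_mean(nb1)
lag2 <- lag_mean(nb2)

# Temporal lags for one unit: values at t-1 and t-2 for each period t
series <- c(5, 7, 2, 9)
lag_t1 <- c(NA, head(series, -1))
lag_t2 <- c(NA, NA, head(series, -2))
```

The first periods have no earlier observations, so their temporal lags are NA.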
Returns an object of class "geomerge".
The functions summary, print and plot overload the standard outputs for objects of type geomerge, providing summary information and visualizations specific to the output object. An object of class "geomerge" is a list containing the following three components:
data 

inputData 
List containing the spatial objects used as input. 
parameters 
List containing information on all input parameters used during integration. 
geomerge exclusively merges data using the global WGS84 coordinate reference system (CRS) to ensure that areal statistics are accurate at all scales. If data are entered that use a different and/or projected CRS, the tool automatically transforms the data first. This on-the-fly transformation, however, may be very slow, so it is advisable to always enter inputs in WGS84.
Karsten Donnay and Andrew M. Linke.
Andrew M. Linke, Karsten Donnay. (2017). "Scale Variability Misclassification: The Impact of Spatial Resolution on Effect Estimates in the Geographic Analysis of Foreign Aid and Conflict." Paper presented at the International Studies Association Annual Meeting, February 22-25, 2017, Baltimore.
geomerge-package, print.geomerge, plot.geomerge, summary.geomerge, generateGrid
data(geomerge)

# 1) Simple static integration of polygon data
output <- geomerge(geoEPR,target=counties,silent=TRUE)
summary(output)

## Not run:
# 2) Static integration for point, polygon, raster data
ACLED.events <- ACLED[,names(ACLED)
AidData.projects <- AidData[,names(AidData)
output <- geomerge(ACLED.events,AidData.projects,geoEPR,
                   gpw,na.rm=TRUE,target=counties)
summary(output)
plot(output)

# 3) Dynamic point data integration for numeric variables
ACLED.fatalities <- ACLED[,names(ACLED)
AidData.commitment <- AidData[,names(AidData)
output <- geomerge(ACLED.fatalities,AidData.commitment,geoEPR,
                   target=counties,time=c("2011-01-01", "2011-12-31","1"),
                   t_unit='months',point.agg='sum')
summary(output)
plot(output)

# 4) Population weighted assignment
output <- geomerge(geoEPR,target=counties,assignment='max(pop)',
                   population.data = gpw)
summary(output)
plot(output)
## End(Not run)
