In pbiecek/SPAG: Indexes of spatial agglomeration

title: "SPAG package tutorial (vignette)" date: "r Sys.Date()" output: rmarkdown::html_vignette: toc: true vignette: > %\VignetteIndexEntry{SPAG tutorial} %\VignetteEngine{knitr::rmarkdown} %\VignetteDepends{Cairo} %\VignetteEncoding{UTF-8} \usepackage[utf8]{inputenc}

R tool for measuring economic activity

knitr::opts_chunk$set(echo = TRUE)

This package provides a method for calculating and visualizing the Index of Spatial Agglomeration (SPAG) used for measuring the territorial integrity of industries.

Installation

The package can be downloaded and installed from GitHub:

devtools::install_github("pbiecek/SPAG")

Use the library() function to load the package.

library("SPAG")

Data sets

In order to calculate the SPAG Index two arguments have to be provided - information about the area and companies for which the index is supposed to be calculated. The area should be provided as a SpatialPolygonsDataFrame. An exemplary map in the format of a SpatialPolygonsDataFrame is provided with the package:

?MapPoland
MapPoland

The information about the companies should be provided in the form of a data frame with four columns: the geographical coordinates of the companies (longitude and latitute), the number of people working for the company (numeric) and a categorical value defining the field of work of the company. A data frame with such data is provided in the package:

# ?CompaniesPoland
head(CompaniesPoland)

Calculating the index

The package implements function SPAG responsible for calculating the index along with its components. The function takes up to 8 parameters - but only the data frame with companies and a spatial data frame with the map is necessary. The additional parameters are:

companiesProjection - by default the function assumes that the geographical coordinates in the data frame with information about the companies is given in the same projection as the map. If the map is given in other projection, then this string argument modifies the data frame accordingly.
CRSProjection - this string argument modifies the geographical coordinates of the map and data frame.
theoreticalSample - the time and space compexity of the algorithm used for calculating the average distance between uniformly distributed companies is too big, therefore a sample of size theoreticalSample is used as an basis for estimating the real theoretical distance. By default the size of the sample is 1000.
empiricalSample - this argument plays the same role as the former one but is used for estimating the real empirical distance.
numberOfSamples - in order to improve the accuracy of the calculated distance index boosting is applied. The numberOfSamples defines how many times the distance index should be calculated (to get the average of results).
columnAreaName - by default the SPAG index is calculated for the whole area but the index can be also calculated for each area within the spatial data frame separately. In order to enable such calculation the name of the column which specifies the names of subareas should be specified.
companiesProjection - by default the data frame with information about the companies uses the same projection as the map. If this is not the case this parameter will be used to set a different projection.
CRSProjection - the projection of the map determines the way the map will be plotted. This parameter is used to change this projection.
totalOnly - by default SPAG functions calculates the index for each category as well as the total. This can be changed by setting this parameter to TRUE.

# ?SPAG
SPAGIndex <- SPAG(companiesDF = CompaniesPoland, shp = MapPoland, theoreticalSample=100, empiricalSample=200)
print(SPAGIndex)

The SPAG function returns an SPAG object: a dataframe containing SPAG index and its components calculated for every category as well as the data required for visualizing the index.

#setwd("C:/Users/Max/Desktop/TestEmpiryczny")
#dane<-read.csv("geoloc data.csv", header=TRUE, sep=";", dec=".")
#dane$zatr<-ifelse(dane$GR_LPRAC==1, 5, ifelse(dane$GR_LPRAC==2, 30, ifelse(dane$GR_LPRAC==3,150, ifelse(dane$GR_LPRAC==4, #600, 1500))))
#CompaniesPoland<-dane[dane$SEK_PKD7 %in% c("B","C","D","E"), c(23,24,25,26) ]

The last two parameters - theoreticalSample and empiricalSample limit the number of companies used in the calculation of the Distance Index. This has to be done as the algorithm has time of $O(n^2)$. For bigger data sets such algorithm would be too time consuming. Below an example of limiting the calculation for a slightly biger data set is shown. As seen only the distance index changes:

startTime <- Sys.time()
SPAGIndexFULL <- SPAG(companiesDF = CompaniesPoland, shp = MapPoland, theoreticalSample=10000, empiricalSample=10000)
endTime <- Sys.time()
print(SPAGIndexFULL)
print(endTime-startTime)

startTime <- Sys.time()
SPAGIndexSAM <- SPAG(companiesDF = CompaniesPoland, shp = MapPoland, theoreticalSample=100, empiricalSample=100)
endTime <- Sys.time()
print(SPAGIndexSAM)
print(endTime-startTime)

Parameter numberOfSamples defines the number of times the distance index will be calculated. This increases the total time of calculations but gives more accurate results:

startTime <- Sys.time()
SPAGIndexSAM <- SPAG(companiesDF = CompaniesPoland, shp = MapPoland, theoreticalSample=100, empiricalSample=100,
                     numberOfSamples=1)
endTime <- Sys.time()
print(SPAGIndexSAM)
print(endTime-startTime)

The SPAG function also provides a method for calculating the SPAG Index with regards to regions, which simplifies the analysis:

SPAGIndex <- SPAG(companiesDF = CompaniesPoland, shp = MapPoland, theoreticalSample=100, empiricalSample=100, numberOfSamples=1, columnAreaName = "jpt_nazwa_")
SPAGIndex

Plotting the Index

SPAG package allows plotting the index using innate plot functions as well as the ggplot2 and ggmap packages. The first just prints out the map with circles:

plot.SPAG = function(x, category="Total", addCompanies=TRUE, circleUnion=FALSE){

  currentMargain <- par()$mar

  if(category=="Total"){
    companies <- attr(x,"companies")
  } else {
    companies <- attr(x,"companies")[attr(x,"companies")[,4]==category,]
  }

  if(circleUnion){
    polygonArea <- attr(x,"unionAreaList")[[category]]
  } else {
    polygonArea <- attr(x,"circles")[[category]]
  }

  par(mar = rep(0, 4))
  plot(attr(x,"map"), border='#808080')
  plot(polygonArea, add=TRUE)
 #plot(x@unionAreaList[["Total"]])
 #plot(x@map, border='#808080', add=TRUE)
 ##points(companies[,c(1,2)], add=TRUE)
 if(addCompanies){points(companies[,c(1,2)],pch=16,cex=0.2)}

 par(mar=currentMargain)
}

SPAGIndex <- SPAG(companiesDF = CompaniesPoland, shp = MapPoland)
plot(SPAGIndex)

The ggplot function used on an object of class SPAG returns a a ggplot object that can be plotted or modified:

ggSPAG <- ggplot(SPAGIndex)
ggSPAG

By default both plotting function plot the index for all the data, but this can be modified by setting the category parameter to match the category for which we want to plot the SPAG Index:

plot(SPAGIndex, category="gimn.")
plot(SPAGIndex, category="gimn.", addCompanies = FALSE)

By default the circles representing the overlap index are plotted separately, but the plot functions also provide an interface to plot the union of the areas:

plot(SPAGIndex, circleUnion=TRUE)

By default the index is presented as a map on a white background. This can be changed by adding a theme from ggplot2 package. More information about ggplot2 themes can be found here :

ggSPAG + theme_bw()

There are different ways to present a map on a 2 dimensional canvas. By default SPAG Index is plotted using mercator projection but it can be changed by using the coord_map() function from package ggplot2:

ggSPAG + coord_map("ortho", orientation = c(-10, 5, 0))
ggSPAG + coord_map("conic", lat=20) + coord_flip()

Instead of manipulating with the map after calculating the SPAG Index it is possible to change the coordinates of the map and companies before the calculation. This functionality is provided with the parameter CRSProjection:

SPAGIndex <- SPAG(companiesDF = CompaniesPoland, shp = MapPoland, CRSProjection="+proj=longlat +datum=WGS84")
ggplot(SPAGIndex)

pbiecek/SPAG documentation built on May 24, 2019, 10:36 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

pbiecek/SPAG
Indexes of spatial agglomeration

In pbiecek/SPAG: Indexes of spatial agglomeration

R tool for measuring economic activity

Installation

Data sets

Calculating the index

Plotting the Index

R Package Documentation

Browse R Packages

We want your feedback!

pbiecek/SPAG Indexes of spatial agglomeration

In pbiecek/SPAG: Indexes of spatial agglomeration

R tool for measuring economic activity

Installation

Data sets

Calculating the index

Plotting the Index

R Package Documentation

Browse R Packages

We want your feedback!

pbiecek/SPAG
Indexes of spatial agglomeration