In avisserquinn/OSMtidy: OSMtidy is an R package for anyone who needs to tidy Open Street Map data into a concise yet complex geospatial dataset with a consistent naming convention.

library(OSMtidy)

knitr::opts_chunk$set(echo = TRUE, out.height = "75%", out.width = "75%")

This vignette describes the OSMtidy workflow. The workflow consists of six steps which are intended to be simple and easy to follow:

Input: A shapefile outlining the location
Extract: Spatial data inside the shapefile's bounding box is extracted from the OpenStreetMap servers via the R package osmdata
Cut: The extracted data is 'cookie cutter'-ed to the shapefile extent
Wrangle: The data is transformed into a suitable format for filtering
Filter: The physical objects are filtered and renamed to follow a simple naming convention
Tidy: The outputs are collated to form a streamlined database of physical objects

This vignette also introduces two helper functions which can be applied at any of these six steps: dataSummary() and dataExport().

To get started, you'll need:

An up-to-date install of the OSMtidy package. You can install or update OSMtidy using the package devtools: devtools::install_github("avisserquinn/OSMtidy")
A shapefile (a single polygon) outline of the location you want to extract OSM data for

1. Input - Using dataShapefile()

Using the function dataShapefile() we can import the shapefile for which data is to be extracted. There are two input arguments:

filename: The filename or filepath of the shapefile
crs: Coordinate reference system of the input data (optional). The function automatically converts the shapefile to EPSG:4326 (https://epsg.io/4326)

shp <- dataShapefile(filename = "exampleEdinburgh.shp", crs = 4326) # or dataShapefile("exampleEdinburgh.shp")
shp

1.1 dataSummary()

At each step, you can print a summary of the OSMtidy outputs using the function dataSummary().

dataSummary(shp)

1.2 dataExport()

You can also export any OSMtidy output using the function dataExport(). The export file names have the following convention:

Location name
Step number (e.g. 1)
Step name (e.g. dataShapefile)
A timestamp of when the output was exported

dataExport(shp, "exampleEdinburgh")
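
To make the naming convention concrete, here is a minimal base R sketch that assembles a file name from the four parts listed above. The helper exampleExportName() is hypothetical and not part of OSMtidy; the underscore separator, the timestamp format and the .RDS extension are assumptions made for illustration, so the names dataExport() actually writes may differ.

```r
# Hypothetical helper, NOT part of OSMtidy: separator, timestamp format
# and file extension are assumptions made for illustration only
exampleExportName <- function(location, stepNumber, stepName,
                              time = Sys.time(), ext = "RDS") {
  stamp <- format(time, "%Y%m%d-%H%M%S")   # timestamp of the export
  paste0(paste(location, stepNumber, stepName, stamp, sep = "_"), ".", ext)
}

# A fixed time keeps the example reproducible
fixedTime <- as.POSIXct("2023-06-03 07:30:00", tz = "UTC")
exampleName <- exampleExportName("exampleEdinburgh", 1, "dataShapefile",
                                 time = fixedTime)
print(exampleName)
```

Keeping the step number and step name in the file name means exports from different steps for the same location sort together and remain distinguishable.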

2. Extract - Using dataExtract()

The OpenStreetMap data is extracted, via the R package osmdata and the overpass server, using the function dataExtract().

2.1 Features

The data is extracted by feature. A vector of the names of features consisting of physical objects can be loaded via data("features"). A vector of all the available features in osmdata can be accessed via the function osmdata::available_features().

data("features")
features <- features[c(2,18,19,22,24)]; features # For this example we'll select a subset of 5 features

osmdata::available_features() # A vector of all 209 available features

2.2 dataExtract()

The function dataExtract() has three input arguments in addition to features, the vector of feature names from section 2.1:

dataShapefile: The shapefile output from step 1
timeout: The time in seconds before the query to the overpass server times out (default: 300 seconds); see Details below
memsize: The memory size for the overpass server (default: 1073741824 bytes); see Details below

Details from the R package osmdata

timeout: It may be necessary to increase this value for large queries, because the server may time out before all data are delivered.
memsize: The default memory size for the 'overpass' server in bytes; may need to be increased in order to handle large queries.
See https://wiki.openstreetmap.org/wiki/Overpass_API#Resource_management_options_.28osm-script.29 for explanation of timeout and memsize (or maxsize in overpass terms). Note in particular the comment that queries with arbitrarily large memsize are likely to be rejected.
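
As a rough illustration of the units involved, the default memsize of 1073741824 bytes equals 1024^3 bytes (1 GiB). The base R sketch below computes the defaults and a pair of larger values. The doubled values are arbitrary assumptions, not recommended settings, and the commented-out call only indicates where such values would be passed, using the argument names listed above.

```r
# Default values as documented above
timeoutDefault <- 300      # seconds
memsizeDefault <- 1024^3   # bytes; equals 1073741824 (1 GiB)

# Hypothetical larger values for a big query (assumed, not recommended settings)
timeoutLarge <- 2 * timeoutDefault
memsizeLarge <- 2 * memsizeDefault

# format() avoids scientific notation when printing large byte counts
format(memsizeDefault, scientific = FALSE)
format(memsizeLarge, scientific = FALSE)

# The values would then be passed via the arguments listed above, e.g.:
# dlExtract <- dataExtract(dataShapefile = shp, features = features,
#                          timeout = timeoutLarge, memsize = memsizeLarge)
```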

Note this function may take some time to run. Timestamps and progress are printed while the function is running. It is recommended that you execute the function once to avoid flooding the overpass server. The example below extracts 5 of the 47 features.

dlExtract <- dataExtract(dataShapefile = shp, features = features)

dataSummary(dlExtract)

dataExport(dlExtract, "exampleEdinburgh")

3. Cut - Using dataCut()

In step 2 the data was extracted as a "bounding box" (a rectangle). In step 3, the data is cut to the shapefile using the function dataCut().
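
The difference between a bounding-box extract and a cut can be illustrated with a small base R sketch. It is not how dataCut() is implemented, which this vignette does not describe. It uses made-up coordinates, a triangle standing in for the shapefile outline, and a simple ray-casting point-in-polygon test written for this example.

```r
# Triangle standing in for the shapefile outline (hypothetical coordinates)
polyX <- c(0, 4, 0)
polyY <- c(0, 0, 4)

# Points standing in for extracted OSM objects (hypothetical)
pts <- data.frame(id = c("a", "b", "c"),
                  x  = c(1, 3, 5),
                  y  = c(1, 3, 1),
                  stringsAsFactors = FALSE)

# Step 2 analogue: keep everything inside the bounding box of the outline
inBox <- pts$x >= min(polyX) & pts$x <= max(polyX) &
         pts$y >= min(polyY) & pts$y <= max(polyY)

# Ray casting: toggle 'inside' for each polygon edge crossed by a
# horizontal ray running from the point towards +x
pointInPolygon <- function(px, py, vx, vy) {
  n <- length(vx)
  inside <- FALSE
  j <- n
  for (i in seq_len(n)) {
    crosses <- (vy[i] > py) != (vy[j] > py)
    if (crosses) {
      xAtY <- vx[i] + (py - vy[i]) * (vx[j] - vx[i]) / (vy[j] - vy[i])
      if (px < xAtY) inside <- !inside
    }
    j <- i
  }
  inside
}

# Step 3 analogue: keep only the points that fall inside the outline itself
inPoly <- mapply(pointInPolygon, pts$x, pts$y,
                 MoreArgs = list(vx = polyX, vy = polyY))

boxed        <- pts$id[inBox]           # objects returned by the bounding box
keptAfterCut <- pts$id[inBox & inPoly]  # objects remaining after the cut
print(boxed)
print(keptAfterCut)
```

Point b lies inside the bounding box but outside the triangle, so it survives the extract and is dropped by the cut.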

The function dataCut() has two input arguments:

dataExtracted: Output from step 2
dataShapefile: Output from step 1

Timestamps and progress are printed when the function is running.

dlCut <- dataCut(dataExtracted = dlExtract, dataShapefile = shp)

dataSummary(dlCut)

dataExport(dlCut, "exampleEdinburgh")

4. Wrangle - Using dataWrangle()

Using the function dataWrangle() we can tidy up (or wrangle) the data before filtering.

There is one input argument:

dataCut: Output from step 3

Timestamps and progress are printed when the function is running.

dlWrangle <- dataWrangle(dataCut = dlCut)

dataSummary(dlWrangle)

dataExport(dlWrangle, "exampleEdinburgh")

5. Filter - Using dataFilter()

5.1 Filters

The main function of OSMtidy is dataFilter(). Here, the data is filtered based on rules set out in an Excel spreadsheet. Default filters, generated as part of the Water Resilient Cities project, can be accessed using the data() function: data("filters").

data("filters")
filters
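
The general idea of rule-based filtering and renaming can be sketched in base R. This is only an illustration under stated assumptions: the column names, tag strings, rule patterns and tidy names below are all hypothetical and do not reflect the structure of the OSMtidy filters spreadsheet or the internals of dataFilter().

```r
# Hypothetical wrangled objects: one row per physical object,
# with its tags collapsed into a single text field
objects <- data.frame(
  osmId = c(101, 102, 103),
  tags  = c("amenity=hospital; name=Example Hospital",
            "amenity=school; name=Example School",
            "barrier=fence"),
  stringsAsFactors = FALSE
)

# Hypothetical rules: a text pattern to match and the tidy name to assign
rules <- data.frame(
  pattern = c("amenity=hospital", "amenity=school"),
  name    = c("Medical; Hospital", "Education; School"),
  stringsAsFactors = FALSE
)

# Apply each rule in turn; the first matching rule names the object
objects$tidyName <- NA_character_
for (i in seq_len(nrow(rules))) {
  hit <- is.na(objects$tidyName) &
    grepl(rules$pattern[i], objects$tags, fixed = TRUE)
  objects$tidyName[hit] <- rules$name[i]
}

# Objects matched by a rule versus objects left over
filtered   <- objects[!is.na(objects$tidyName), ]
unfiltered <- objects[is.na(objects$tidyName), ]
print(filtered[, c("osmId", "tidyName")])
print(unfiltered$osmId)
```

Keeping the rules in a table rather than in code is what makes it practical to maintain them in a spreadsheet and adjust them without touching the functions.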

To see the spreadsheet the default filters are based on, or to access a template to create your own filters, generate the filepaths using the code below.

system.file("extdata", "filters.xlsx", package = "OSMtidy")
system.file("extdata", "filtersTemplate.xlsx", package = "OSMtidy")

See vignette 3 for further details.

5.2 filterOverview()

A filter overview can be generated using the function filterOverview(). The input can be either a filepath (as a string) or an R object. Examples of both are provided below.

filterOverview() with a filepath

filepath <- system.file("extdata", "filters.xlsx", package = "OSMtidy")
filepath
filterOverview(filepath)

filterOverview() with an R object

data("filters")
filters
filterOverview(filters)

5.3 Application

There are three input arguments to dataFilter():

dataWrangle: Output from step 4
filters: The filters, given either as a filepath (string) or as an R object (see filterOverview() above)
rows: The rows in the filters object to apply, intended for troubleshooting and adjusting filters (default: NULL, i.e. all rows)

Timestamps and progress are printed when the function is running.

Depending on the location size, the number of filters and computer performance, filtering can take anything from a couple of minutes (the example ward) to multiple hours (City of London and Boroughs).

dlFilter <- dataFilter(dataWrangle = dlWrangle, filters = filters)
dataSummary(dlFilter)
dataExport(dlFilter, "exampleEdinburgh")

6. Tidy - Using dataTidy()

Note that several outputs from dataWrangle() and dataFilter() are spreadsheets (.xlsx extension). These spreadsheets may be manually adjusted; this is covered in the third vignette.

The function dataTidy() generates a single tidied output based on any combination (R objects and/or spreadsheets) of the filtered, validated, unfiltered and no-detail data.

There is one input argument to dataTidy():

dataList: A list of R objects and/or spreadsheets

In this vignette, the input is the list of outputs from steps 4 and 5. In vignettes 3 and 4 a number of alternative approaches are described.

The tidied geotagged dataset is saved as .RDS and .csv for use in a range of applications. To export it as a shapefile, it is necessary to first split the geotagged dataset by geometry type. Note: how the final output works and runs is still to be updated.

dlTidy <-
  dataTidy(dataList = 
             list(dlWrangle$noDetail,
                  dlFilter$unfiltered,
                  dlFilter$filtered,
                  dlFilter$validate))
dataSummary(dlTidy)
dataExport(dlTidy, "exampleEdinburgh")
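
Splitting by geometry type before a shapefile export can be sketched in base R as follows. The data frame, its column names and the output file names are hypothetical stand-ins for the tidied dataset; with real spatial objects the grouping vector would typically come from a function such as sf::st_geometry_type(), and the writing step is left out here.

```r
# Hypothetical stand-in for the tidied dataset: one row per object,
# with the geometry type recorded in a column
tidied <- data.frame(
  desc = c("Education; School", "Transport; Road",
           "Medical; Hospital", "Transport; Bus stop"),
  geometryType = c("POLYGON", "LINESTRING", "POLYGON", "POINT"),
  stringsAsFactors = FALSE
)

# One data frame per geometry type, since a shapefile holds a single type
byType <- split(tidied, tidied$geometryType)

# Report the (hypothetical) file each subset would be written to
for (type in names(byType)) {
  fileName <- paste0("exampleEdinburgh_", type, ".shp")
  message(fileName, ": ", nrow(byType[[type]]), " object(s)")
}
```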

avisserquinn/OSMtidy documentation built on June 3, 2023, 7:30 a.m.
