# Introduction to wicket: Utilities to Handle WKT Spatial Data

`wicket` is a little package that makes certain kinds of geospatial data manipulation easier in R - specifically, validating and generating Well-Known Text (WKT) data, including from `sp` objects. At the moment the functionality consists of:

1. Generating bounding boxes from WKT data and from ordinary R data
2. Validating WKT data
3. Converting `sp` objects into WKT data, and
4. Extracting coordinates and centroids from WKT data

Let's step through each in turn.

## Bounding boxes

A bounding box is a very simple concept: a representation of the smallest rectangle, aligned with the coordinate axes, that contains every point in a dataset. In WKT, bounding boxes look like:

```
POLYGON((10 14,10 16,12 16,12 14,10 14))
```

Sometimes you've got WKT data like this - a Polygon, a LineString, whatever - and you want a bounding box in a format R can understand. The answer is `wkt_bounding`, which takes a vector of valid WKT objects and produces a data.frame or matrix of R representations, whichever you'd prefer:

```
library(wicket)

wkt <- c("POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10))",
         "LINESTRING (30 10, 10 90, 40 40)")
wkt_bounding(wkt)
#   min_x min_y max_x max_y
# 1    10    10    40    40
# 2    10    10    40    90
```
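To make the idea concrete, here is a minimal base-R sketch of what a bounding box calculation amounts to. It is not how `wkt_bounding` is implemented: `bbox_sketch` is a hypothetical helper that pulls every number out of each WKT string with a regular expression, so it assumes plain two-dimensional coordinates written without exponents.

```r
# Hypothetical helper, for illustration only: not part of wicket.
bbox_sketch <- function(wkt) {
  boxes <- lapply(wkt, function(w) {
    # Pull out every number, then pair them up as (x, y) rows
    nums <- as.numeric(regmatches(w, gregexpr("-?[0-9]+\\.?[0-9]*", w))[[1]])
    xy <- matrix(nums, ncol = 2, byrow = TRUE)
    data.frame(min_x = min(xy[, 1]), min_y = min(xy[, 2]),
               max_x = max(xy[, 1]), max_y = max(xy[, 2]))
  })
  do.call(rbind, boxes)
}

wkt <- c("POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10))",
         "LINESTRING (30 10, 10 90, 40 40)")
boxes <- bbox_sketch(wkt)
boxes
```

For these two inputs the sketch gives the same minimum and maximum values as the `wkt_bounding` output above. Unlike a real parser, it does nothing to check that the input is valid WKT.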

Alternatively, you might want to go in the other direction and turn R bounding boxes into WKT objects. You can do that with, appropriately, `bounding_wkt`:

```
bounding_wkt(min_x = 10, min_y = 10, max_x = 40, max_y = 40)
#  "POLYGON((10 10,10 40,40 40,40 10,10 10))"
```

This accepts either a series of vectors, one for each min or max value, or a list of length-4 vectors. Either way, it produces a nice WKT representation of the R data you give it.
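The WKT that comes out is just the four corners of the box, with the first corner repeated to close the ring. A base-R sketch of that construction follows. `bbox_to_wkt_sketch` is a hypothetical name, and the corner order is an assumption copied from the example output above rather than from the package source.

```r
# Hypothetical helper, for illustration only: not part of wicket.
# Walks the corners as (min_x min_y), (min_x max_y), (max_x max_y),
# (max_x min_y), then repeats the first corner to close the ring.
bbox_to_wkt_sketch <- function(min_x, min_y, max_x, max_y) {
  sprintf("POLYGON((%s %s,%s %s,%s %s,%s %s,%s %s))",
          min_x, min_y, min_x, max_y, max_x, max_y,
          max_x, min_y, min_x, min_y)
}

box <- bbox_to_wkt_sketch(min_x = 10, min_y = 10, max_x = 40, max_y = 40)
box
```

Because `sprintf` is vectorised, passing vectors of equal length for the four arguments produces one polygon per bounding box.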

## WKT validation

The two greatest challenges in computer science are naming things, cache invalidation, and off-by-one errors. The two greatest challenges in data science are naming things and other people's data. And off-by-one errors.

`wicket` contains a validator for WKT, `validate_wkt`, which takes a vector of WKT objects and spits out a data.frame containing whether each object is valid, and any comments the parser has in the case that it isn't:

```
wkt <- c("POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10))",
         "ARGHLEFLARFDFG",
         "LINESTRING (30 10, 10 90, 40 out of cheese error redo universe from start)")
validate_wkt(wkt)
# 1    FALSE The WKT object has a different orientation from the default
# 2    FALSE Object could not be recognised as a supported WKT type
# 3    FALSE bad lexical cast: source type value could not be interpreted as target at 'out' in 'linestring (30 10, 10 90, 40 out of cheese error redo universe from start)'
```

With this you can check and clean your data up front, rather than relying on it and watching all your code fall down in a heap.
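The first result in the output above may look surprising, since that polygon is well-formed WKT. The comment points at ring orientation, meaning the direction in which the vertices are listed. Here is a base-R sketch of how orientation can be checked with the shoelace (signed area) formula. `ring_signed_area` is a hypothetical helper, and the idea that the validator expects clockwise outer rings is an assumption inferred from its message, not something confirmed here.

```r
# Hypothetical helper, for illustration only: not part of wicket.
# Signed area of a closed ring (first vertex repeated at the end):
# positive means counter-clockwise, negative means clockwise.
ring_signed_area <- function(x, y) {
  n <- length(x)
  sum(x[-n] * y[-1] - x[-1] * y[-n]) / 2
}

# Vertices of POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10))
x <- c(30, 40, 20, 10, 30)
y <- c(10, 40, 40, 20, 10)

area <- ring_signed_area(x, y)
area                                      # positive: counter-clockwise

# Reversing the vertex order flips the orientation, not the shape
flipped <- ring_signed_area(rev(x), rev(y))
flipped                                   # negative: clockwise
```

If orientation is the only complaint about an object, listing its vertices in reverse order describes the same shape with the opposite winding.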

## WKT generation from `sp` objects

`sp` objects - particularly SpatialPolygons and SpatialPolygonsDataFrames - are the standard way of representing geodata in R. They're also entirely unique to R and really difficult to use elsewhere. Enter `sp_convert`, which takes a list of SP/SPDF objects (or a single one) and turns the coordinate sets within them into WKT. If an object contains multiple coordinate sets and the `group` argument is TRUE, a single MultiPolygon is generated for that entry; if `group` is FALSE, you get a vector of Polygons instead:

```
library(sp)
Sr1 <- Polygon(cbind(c(2,4,4,1,2),c(2,3,5,4,2)))
Sr2 <- Polygon(cbind(c(5,4,2,5),c(2,3,2,2)))
Sr3 <- Polygon(cbind(c(4,4,5,10,4),c(5,3,2,5,5)))
Sr4 <- Polygon(cbind(c(5,6,6,5,5),c(4,4,3,3,4)), hole = TRUE)

Srs1 <- Polygons(list(Sr1), "s1")
Srs2 <- Polygons(list(Sr2), "s2")
Srs3 <- Polygons(list(Sr3, Sr4), "s3/4")
sp_object <- SpatialPolygons(list(Srs1,Srs2,Srs3), 1:3)

# With grouping
sp_convert(x = sp_object, group = TRUE)
#  "MULTIPOLYGON(((2 2,1 4,4 5,4 3,2 2)),((5 2,2 2,4 3,5 2)),((4 5,10 5,5 2,4 3,4 5)),((5 4,5 3,6 3,6 4,5 4)))"

# Without grouping
sp_convert(x = sp_object, group = FALSE)
# []
#  "POLYGON((2 2,1 4,4 5,4 3,2 2))"  "POLYGON((5 2,2 2,4 3,5 2))"      "POLYGON((4 5,10 5,5 2,4 3,4 5))"
#  "POLYGON((5 4,5 3,6 3,6 4,5 4))"
```
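The difference between the grouped and ungrouped forms is purely one of text layout, as a base-R sketch shows. `ring_text` is a hypothetical helper working on plain coordinate matrices, not `sp` objects, and the coordinates are typed in the order they appear in the output above.

```r
# Hypothetical helper, for illustration only: not part of wicket or sp.
# Turns a two-column coordinate matrix into "(x y,x y,...)".
ring_text <- function(coords) {
  paste0("(", paste(coords[, 1], coords[, 2], collapse = ","), ")")
}

rings <- list(
  cbind(c(2, 1, 4, 4, 2), c(2, 4, 5, 3, 2)),
  cbind(c(5, 2, 4, 5), c(2, 2, 3, 2))
)
ring_texts <- vapply(rings, ring_text, character(1))

# Ungrouped: one POLYGON per ring
polygons <- paste0("POLYGON(", ring_texts, ")")
polygons

# Grouped: every ring wrapped into a single MULTIPOLYGON
multipolygon <- paste0("MULTIPOLYGON(",
                       paste0("(", ring_texts, ")", collapse = ","),
                       ")")
multipolygon
```

The two strings in `polygons` match the first two ungrouped results above, and `multipolygon` matches the start of the grouped result.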

## Coordinate and centroid extraction

WKT POLYGONs are often used to store latitude and longitude coordinates - and you can use `wkt_coords` to get them:

```
wkt_coords("POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10))")
#   object  ring lng lat
# 1      1 outer  30  10
# 2      1 outer  40  40
# 3      1 outer  20  40
# 4      1 outer  10  20
# 5      1 outer  30  10
```

The result of a `wkt_coords` call is a data.frame of four columns: `object`, identifying which of the input WKT objects the row refers to; `ring`, identifying the ring within that object (such as `outer`); and then `lng` and `lat`.

Extracting centroids is also useful, and can be performed with `wkt_centroid`. Again, it's entirely vectorised and produces a data.frame:

```
wkt_centroid("POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10))")
#        lng     lat
# 1 25.45455 26.9697
```
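Note that the centroid is not simply the mean of the vertices. One standard way to compute a polygon centroid is the area-weighted formula built on the shoelace sum, sketched below in base R. Whether `wkt_centroid` uses exactly this method internally is an assumption, and `polygon_centroid_sketch` is a hypothetical helper, but for this polygon the numbers agree with the output above.

```r
# Hypothetical helper, for illustration only: not part of wicket.
# Area-weighted centroid of a closed ring (first vertex repeated last).
polygon_centroid_sketch <- function(x, y) {
  n <- length(x)
  cross <- x[-n] * y[-1] - x[-1] * y[-n]   # shoelace terms, one per edge
  area <- sum(cross) / 2                   # signed area of the ring
  c(lng = sum((x[-n] + x[-1]) * cross) / (6 * area),
    lat = sum((y[-n] + y[-1]) * cross) / (6 * area))
}

# Vertices of POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10))
centroid <- polygon_centroid_sketch(x = c(30, 40, 20, 10, 30),
                                    y = c(10, 40, 40, 20, 10))
round(centroid, 5)
```

The sketch handles a single ring only. Holes and multi-part geometries would need their areas combined, which it does not attempt.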

## New features and bugs

If you've got ideas for other features - or have found something in the existing featureset that is broken - throw them on the GitHub issues page!

wicket documentation built on May 2, 2019, 11:11 a.m.