Geographic Data Collection and Spatial data

knitr::opts_chunk$set(
  collapse = TRUE,
  message = FALSE,
  warning = FALSE,
  echo = TRUE,
  comment = "#>"
)

KoboToolbox allows the collection of spatial data through three questions types: geopoint, geotrace and geoshape.

Geopoint:
The geopoint question type captures a single geographic coordinate (latitude and longitude) including altitude and accuracy. This is useful for marking locations, such as homes, schools, or water sources.

Geotrace:
The geotrace question type collects a series of connected geographic coordinates, forming a line. This can be used to map routes, paths, or boundaries.

Geoshape:
A geoshape question type captures a series of geographic coordinates that form a closed polygon. This is useful for defining areas, such as land parcels, agricultural fields, or protected zones.

To utilize these data types, we need to parse them into a GIS friendly format. robotoolbox uses Well-Known Text (WKT), a standard markup language for representing vector geometry, to represent points (geopoint), lines (geotrace) and polygons (geoshape). WKT is chosen for its wide compatibility with GIS software and spatial analysis packages, making it easier to integrate KoboToolbox data with various spatial analysis workflows.

Spatial data

The following form provides a simple demonstration of how robotoolbox maps spatial field types.

library(robotoolbox)
library(dplyr)
library(sf)
library(mapview)
mapview::mapviewOptions(
    fgb = FALSE,
    basemaps = c(
      "Esri.WorldImagery",
      "Esri.WorldShadedRelief",
      "OpenTopoMap",
      "OpenStreetMap"),
    layers.control.pos = "topright")
l <- asset_list

Survey questions

|name |type |label | |:-------------------|:--------|:------------------------------| |point |geopoint |Record a location | |point_description |text |Describe the recorded location | |line |geotrace |Record a line | |line_description |text |Describe the recorded line | |polygon |geoshape |Record a polygon | |polygon_description |text |Describe the recorded polygon |

The form includes three spatial type columns: point, line and polygon.

Loading the project

The aforementioned form, named Spatial data, was uploaded to the server. You can load it from the asset_list of assets.

library(robotoolbox)
library(dplyr)

# Retrieve a list of all assets (projects) from your KoboToolbox server
asset_list <- kobo_asset_list()

# Filter the asset list to find the specific project and get its unique identifier (uid)
uid <- filter(asset_list, name == "Spatial data") |>
  pull(uid)

# Load the specific asset (project) using its uid
asset <- kobo_asset(uid)
asset
asset <- asset_spatial
asset

In this code:

We have a single submission, where we recorded one location using a geopoint question, mapped a portion of a road using a geotrace question, and outlined a stadium using a geoshape question.

Extracting the data

From the assets, we can proceed to extract the submissions.

df <- kobo_data(asset)
glimpse(df)
df <- data_spatial
glimpse(df)

We can see that we have all of our three columns point, line and polygon. For each of them, we have a corresponding WKT column.

The WKT format for a point is simply POINT (longitude latitude). For example, POINT (-17.446667 14.692778) represents a location in Dakar, Senegal.

pull(df, point)
pull(df, point_wkt)

For geopoint types, robotoolbox also offers columns for latitude, longitude, altitude, and precision.

df |>
  select(starts_with("point_"))

The line column, derived from the geotrace question, has a corresponding line_wkt column.

The WKT format for a line is LINESTRING (lon1 lat1, lon2 lat2, ...). Each pair of coordinates represents a point along the line. For example, LINESTRING (-17.4440 14.6937, -17.4502 14.7167) represents a line connecting two points in Dakar.

pull(df, line)
pull(df, line_wkt)

Lastly, polygon_wkt is the WKT column derived from the geoshape question labeled polygon.

The WKT format for a polygon is POLYGON ((lon1 lat1, lon2 lat2, ..., lon1 lat1)). Note that the first and last coordinate pairs are the same, closing the polygon. For example, POLYGON ((-17.4440 14.6937, -17.4502 14.7167, -17.4314 14.7145, -17.4440 14.6937)) represents a triangular area in Dakar.

pull(df, polygon)
pull(df, polygon_wkt)

Now that we understand how robotoolbox stores spatial question types, we can convert these columns into spatial objects suitable for spatial data analysis.

Geopoint

The standard approach to manipulate spatial vector data in R involves using the sf package. sf stands for Simple Features and it extends a data.frame by adding a geometry list-column. It creates a spatially enabled data.frame. It provides an interface to the popular GDAL, GEOS, PRØJ and S2 libraries. It can be used to efficiently manipulate and visualize spatial vector data.

Creating an sf object from a text column that contains WKT characters is straightforward. The sf::st_as_sf function can be used to turn the data.frame with a WKT column into an sf object.

point_sf <- st_as_sf(data_spatial,
                     wkt = "point_wkt", crs = 4326)
mapview(point_sf)

In this code, crs = 4326 specifies the Coordinate Reference System (CRS) for the spatial data. CRS 4326 refers to the WGS84 (World Geodetic System 1984) coordinate system, which is widely used in GPS and web mapping applications. It represents locations on the Earth using latitude and longitude in degrees. This is the standard CRS used by KoboToolbox for storing geographic coordinates.

Geotrace

We can also transform a data.frame with a column from a geotrace question to an sf object with a LINESTRING geometry. The WKT column is named line_wkt.

line_sf <- st_as_sf(data_spatial,
                     wkt = "line_wkt", crs = 4326)
mapview(line_sf)

Geoshape

The column polygon_wkt can be used to create an sf polygon object. It's a simple closed polygon.

poly_sf <- st_as_sf(data_spatial,
                    wkt = "polygon_wkt", crs = 4326)
mapview(poly_sf)

Conclusion

By combining robotoolbox with R spatial analysis tools, researchers and data analysts can efficiently process, analyze, and visualize geographic data collected through KoboToolbox, opening up a wide range of possibilities for spatial data analysis in various fields such as humanitarian work, environmental studies, and social sciences.

You can learn a lot about the sf packages and spatial data analysis with R from the excellent Geocomputation with R book and through the extensive sf package documentation.



Try the robotoolbox package in your browser

Any scripts or data that you put into this service are public.

robotoolbox documentation built on April 4, 2025, 12:21 a.m.