Introduction

About / Features

The R extension package sensorweb4R provides functions and classes to download data from sensor web services. These services contain timeseries of sensor data, such as temperature measurements of weather stations or pollution data from air quality stations. You can retrieve specific subsets of data from the services using plain R function calls. These datasets are loaded into your session as ready-to-use data structures.

Currently, the following web service APIs are supported:

A related package is sos4R (on CRAN), which provides similar functionality to download timeseries from a standardized OGC Sensor Observation Service.

Quick Start

The sensorweb4R package is not on CRAN yet, so please download and install the package manually, for example using devtools.

require(devtools)
devtools::install_github("52North/sensorweb4R", build_vignettes = TRUE)

Then load the package and take a look at the help and the vignette:

require(sensorweb4R)
?sensorweb4R
demo(package = "sensorweb4R")
vignette(package = "sensorweb4R")
vignette("<name of the vignette to open>")

The package contains several demos for different aspects of the package. The demo ircel-celine is a good starting point:

demo(package = "sensorweb4R")
demo("ircel-celine")

Some known API endpoints are built in so you can start exploring sensor web data on your own:

sensorweb4R::example.endpoints()

Terms and Definitions

Sensor Web and OGC Sensor Web Enablement (SWE): "The concept of the "sensor web" is a type of sensor network that is especially well suited for environmental monitoring. [...] OGC's Sensor Web Enablement (SWE) framework defines a suite of web service interfaces and communication protocols abstracting from the heterogeneity of sensor (network) communication." [1]

The following are abstract concepts from the Sensor Observation Service specification as well as the Timeseries API. Specific instances of Sensor Web services may use them differently. When getting started with a new service endpoint in sensorweb4R, stations, timeseries and phenomena are most relevant.

Examples

Hydrological sensor network

Procedures are the abstract processes (e.g. daily average) that generate the observations. They are the same across all measurement locations in the network, and each location is represented by a feature of interest. At these locations, sensors deliver values for observed properties, which are represented by phenomena. In this context, a station ties together a feature and all procedures and phenomena existing at that feature to produce a more unified view.
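The relationship between these concepts can be sketched with plain R data structures. This is only an illustration under assumed names: the gauges, phenomena and the `series` and `stations` objects below are hypothetical and do not come from any real service or from sensorweb4R itself.

```r
# Hypothetical metadata for a small hydrological network:
# one row per timeseries, i.e. one phenomenon observed at one feature
# by one procedure.
series <- data.frame(
  feature    = c("Gauge A", "Gauge A", "Gauge B"),
  phenomenon = c("water level", "discharge", "water level"),
  procedure  = c("daily average", "daily average", "daily average"),
  stringsAsFactors = FALSE
)

# The procedure is the same across all measurement locations.
unique(series$procedure)

# A station groups the phenomena and procedures available at one feature.
stations <- split(series[c("phenomenon", "procedure")], series$feature)
stations[["Gauge A"]]
```

In this sketch, asking for a station returns everything observed at one location, which is the "unified view" described above.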

Accessing the Timeseries API

For detailed information about the Timeseries API, please check the API documentation, which provides the normative definitions of terms and functions used or explained in this document.

General concept

To download data, a script must implement the following three steps:

  1. Connect to an endpoint
  2. Fetch content information (metadata) from the endpoint, i.e. available phenomena, timeseries, ...
  3. Download data

Common query parameters

http://sensorweb.demo.52north.org/sensorwebclient-webapp-stable/api-doc/index.html#general-common-query-parameters

Exploring Available Timeseries

Connecting to an endpoint

# connect
endpoint <- example.endpoints()[1]

# get all services
srv <- services(endpoint)

# get the names of the services
label(srv)

# subset services
srv <- srv[5]

# get all phenomena
phe <- phenomena(endpoint)

# get the names of the phenomena
head(label(phe))

Exploring stations of a service

# get all stations
sta <- stations(srv)
head(label(sta))

# filter by category
cat <- categories(srv)
sta <- stations(srv, category = cat[1])
head(label(sta))

Stations are spatial objects containing a geometry:

geom <- sp::geometry(sta)
head(geom)

Exploring timeseries of a service

# get all timeseries
ts <- timeseries(srv)
# filter by station
ts <- timeseries(sta[1])
# equivalent:
# ts <- timeseries(endpoint, station = sta[1])
# ts <- timeseries(endpoint, station = id(sta[1]))


# filter by category
ts <- timeseries(endpoint, category = cat[1])[1:2]
# equivalent:
# ts <- timeseries(endpoint, category = id(cat[1]))[1:2]

Accessing relations and attributes

Timeseries are complex classes with relations to nearly all other classes:

str(ts, max.level = 2)

To save bandwidth, most relations are not filled:

category(ts)

If you really need the metadata, you can fetch it:

ts <- fetch(ts)

Now you can access the relations using the respective getter, e.g. to get the procedures of the timeseries:

label(procedure(ts))

Searching by keyword

Not implemented yet.

Downloading data

# as the timespan of the series is quite large...
lubridate::as.duration(lubridate::interval(time(firstValue(ts)),
                                           time(lastValue(ts))))

# ... we should filter the data we want
# e.g. the last common week of data
last <- min(time(lastValue(ts)))
time <- lubridate::as.interval(lubridate::weeks(1), last - lubridate::weeks(1))
data <- getData(ts, timespan = time)
str(data)

Have a look at ?lubridate for further examples on how to express time intervals in R.
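As a minimal sketch of how such time intervals can be built with lubridate, the following uses a fixed reference time in place of `time(lastValue(ts))`. The date and the variable names are assumptions chosen for illustration.

```r
library(lubridate)

# A fixed reference time, standing in for the last value of a timeseries.
last <- ymd_hms("2015-03-01 12:00:00", tz = "UTC")

# One week ending at `last`, built from explicit start and end points.
week_a <- interval(last - weeks(1), last)

# The same week, built from a period and a start point.
week_b <- as.interval(weeks(1), start = last - weeks(1))

# The 24 hours before `last`.
day <- interval(last - hours(24), last)

# Both constructions describe the same span.
int_start(week_a) == int_start(week_b)
int_end(week_a) == int_end(week_b)

# Length of an interval as an exact duration.
as.duration(week_a)
```

An interval built this way is the kind of object passed as `timespan` to `getData` in the example above.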

Using data for further analysis

# coercion to timeseries 
ts <- ts[1]
data <- data[[1]]

xlab <- "Time"
ylab <- paste0(names(phenomenon(ts)), " (", uom(ts), ")")
main <- names(ts)

# convert to zoo
x <- zoo::as.zoo(data)
plot(x, main = main, xlab = xlab, ylab = ylab)

# convert to xts
x <- xts::as.xts(data)
plot(x, main = main, xlab = xlab, ylab = ylab)

# coercion to data.frame
x <- as.data.frame(data)
plot(x, main = main, xlab = xlab, ylab = ylab)

# summary and histogram
summary(data)
hist(data)

# coercion to Spatial stuff
as.SpatialPointsDataFrame(sta)

Currently unsupported features of the timeseries API

Caching

All requests to resources of the API are cached in a global list that can be accessed using get.cache and set.cache. The list contains the parsed JSON responses of the service for each resource:

get.cache.keys()
str(get.cache.value(get.cache.keys()[1]))

For more information consult ?cache.
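The idea behind such a cache can be sketched in base R. This is not the package's implementation: `cache`, `fetch_cached` and `fake_fetch` are hypothetical names, an environment is used here instead of a list, and the URL is a placeholder.

```r
# Hypothetical cache keyed by resource URL (not part of sensorweb4R).
cache <- new.env()

# Return the cached response for `url`, fetching it only on first use.
fetch_cached <- function(url, fetch_fun) {
  if (!exists(url, envir = cache, inherits = FALSE)) {
    assign(url, fetch_fun(url), envir = cache)
  }
  get(url, envir = cache, inherits = FALSE)
}

# Stand-in for a real HTTP request; counts how often it is called.
calls <- 0
fake_fetch <- function(url) {
  calls <<- calls + 1
  list(id = "1", label = paste("resource at", url))
}

a <- fetch_cached("http://example.org/api/v1/stations/1", fake_fetch)
b <- fetch_cached("http://example.org/api/v1/stations/1", fake_fetch)

identical(a, b)  # the second call is answered from the cache
calls            # number of actual fetches
ls(cache)        # keys currently stored
```

Keying by resource means repeated requests for the same station or timeseries do not hit the service again.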

Options

Logging

sensorweb4R uses the package futile.logger for logging and by default prints log statements only to the console. The default logging level is INFO and can be changed with flog.threshold(<level>, name = "sensorweb4R") to one of TRACE (most verbose), DEBUG, INFO, WARN, ERROR, FATAL (least verbose).

You can configure the logger's level, log files and much more; check the logging package documentation with ?futile.logger.
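A minimal sketch of such a configuration with futile.logger might look as follows. The log file location is an assumption chosen for illustration; the logger name "sensorweb4R" is the one mentioned above.

```r
library(futile.logger)

# Show more detail for the logger named "sensorweb4R".
flog.threshold(DEBUG, name = "sensorweb4R")

# Write log statements to a file in addition to the console.
# The file path is a placeholder; adjust it to your needs.
log_file <- file.path(tempdir(), "sensorweb4R.log")
flog.appender(appender.tee(log_file), name = "sensorweb4R")

# Emit a test message through the same logger to check the setup.
flog.debug("Logging configured", name = "sensorweb4R")
```

Using `appender.file` instead of `appender.tee` would send the statements to the file only.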

Source Code

sensorweb4R is open source software managed within the 52North Sensor Web Community. The code is available on GitHub: https://github.com/52North/sensorweb4R

Contribute

Please check the README.md on GitHub for developer documentation.

Support / Contact

Please direct support questions to the 52North Sensor Web Community mailing list/forum: http://sensorweb.forum.52north.org/ (and read the guidelines beforehand).

Add an issue/comment on GitHub if you find a bug or want to collaborate on new features.

Acknowledgements

This work was supported by Joaquin (Joint Air Quality Initiative).

License

This document is licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0).

This R extension package sensorweb4R is licensed under The Apache Software License, Version 2.0.

[1] http://en.wikipedia.org/wiki/Sensor_web



52North/sensorweb4R documentation built on March 30, 2020, 11:39 p.m.