Spatial smoothing with `btb` R package

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.height = 5,
  fig.width = 5,fig.align = 'center'


)

This document will show you :

Furthermore, it will introduce a way to map your results using mapsf package and how to save your smoothed spatial data using sf.

Install btb

btb is available on CRAN :

install.packages("btb")

To get a bug fix or to use a feature from the development version, you can install the development version of from GitHub :

install.packages("remotes")
remotes::install_github("InseeFr/btb")

Perform spatial smoothing

Warning with personal data

Spatial smoothing generally reduces individual data disclosure. However, smoothed data can contain individual information. Please remain cautious in any case.

Smoothing gas station prices

The data

btb package provides several data tables.

For every gas station in metropolitan France, the dfPrix_SP95_2016 table gives :

library(btb)
data(dfPrix_SP95_2016)
head(dfPrix_SP95_2016)

Let's visualize these stations :

library(sf)
sfPrix_SP95_2016 <- st_as_sf(dfPrix_SP95_2016,coords = c("x","y"), crs=2154)
plot(sfPrix_SP95_2016$geometry)

Optional step : from points to aggregate grids

To figure out your spatial distribution before smoothing you data, it can be interesting to aggregate your points inside a grid (e.g : number of gas stations in 20 km pixels grid), because the smoothing is less time-consuming

btb provides the btb_add_centroids and the btb_ptsToGrid functions to make it easy :

dfPrix_SP95_2016 <- btb_add_centroids(dfPrix_SP95_2016, 
                                      iCellSize = 20000,
                                      names_coords = c("x","y"))
head(dfPrix_SP95_2016)
library(dplyr)
centro_values <- dfPrix_SP95_2016 %>%
  group_by(x_centro, y_centro) %>%
  summarise(pricemean=mean(SP95, rm.na = TRUE))
grid_values <- btb_ptsToGrid(centro_values, sEPSG = 2154,
                             iCellSize = 20000, 
                             names_centro = c("x_centro","y_centro"))
nrow(grid_values)
head(grid_values)

Once you have your polygons and your aggregated data, you can map it. Here, we use the mapsf package to do so.

library(mapsf)

mapsf::mf_map(x = grid_values,
       type = "choro",
       var="pricemean",
       breaks = "quantile",
       nbreaks = 5,
       lwd=1,
       leg_val_rnd = 1)

This map represents your aggregated (mean price) but is not smoothed yet.

Despite its patchwork aspect, this map is a good first step to better understand your data.

First smoothing : the density of gas stations

On the example below, we smooth the density of gas stations using 5\~000 km pixels and a 100 km bandwidth. Note that we need to create a new dummy variable (equals to 1 for every station) to count the stations.

pts_density <- dfPrix_SP95_2016[,c("x","y")]
# Create dummy
pts_density$stations_density <- 1L
head(pts_density)

# Smoothing
smooth_density <- btb_smooth(
  pts = pts_density,
  sEPSG = 2154,
  iBandwidth = 100000,
  iCellSize = 5000)

head(smooth_density)

# Map
mapsf::mf_map(x = smooth_density,
       type = "choro",
       var="stations_density",
       breaks = "quantile",
       nbreaks = 5,
       border = NA,
       leg_val_rnd = 1)

Note that btb_smooth is conservative :

Smoothing means : gas mean price

Smoothing a ratio works almost the same way :

Note that the btb_smooth function smoothes by default all numeric variables in the input points table (parameter pts).

# Prepare your data
pts_meanprice <- dfPrix_SP95_2016[,c("x","y","SP95")]
pts_meanprice$stations_density <- 1L
head(pts_meanprice)

# Smooth both prices and station density
smooth_density <- btb_smooth(
  pts = pts_meanprice,
  sEPSG = 2154,
  iBandwidth = 100000,
  iCellSize = 5000)

head(smooth_density)

# Calculate the smoothed mean (from smoothed nominator and denominator)
smooth_density <- smooth_density %>% mutate(meanprice=SP95/stations_density)
mapsf::mf_map(x = smooth_density,
       type = "choro",
       var="meanprice",
       breaks = "quantile",
       nbreaks = 5,
       border = NA,
       leg_val_rnd = 1)
Cstack_info()

Quantile smoothing : smooth the distribution of gas prices

Quantile smoothing is a different methodology.

Its major benefits are :

pts_quantiles <- dfPrix_SP95_2016[,c("x","y","SP95")]
head(pts_quantiles)

smooth_quantiles <- btb_smooth(pts = pts_quantiles, 
                               sEPSG = 2154, iBandwidth = 100000,
                               iCellSize = 5000,vQuantiles = c(0.5,0.9))

head(smooth_quantiles)

# Median smoothing : 
mapsf::mf_map(x = smooth_quantiles,
       type = "choro",
       var="SP95_05",
       breaks = "quantile",
       nbreaks = 5,
       border = NA,
       leg_val_rnd = 1)
# Smooth the 9th decile :
mapsf::mf_map(x = smooth_quantiles,
       type = "choro",
       var="SP95_09",
       breaks = "quantile",
       nbreaks = 5,
       border = NA,
       leg_val_rnd = 1)

The iNeighbor parameter

Here, we use data which indicates the number of poor households in squared data (200 meters) of an island called "La Réunion".

Each point is the centroid of the grid used to publish aggregated data (1\ 000 meters pixels).

Let's smooth the proportion of poors among households with an automatic grid (iNeighbor parameter absent in btb_smooth function).

In the following example, note that the btb_smooth function accepts sf points in input (also the case with btb_ptsToGrid).

# Load data
data("reunion")
head(reunion)

# Optional : transform as sf points
sfreunion <- sf::st_as_sf(reunion,coords= c("x","y"), crs = 3727)
plot(sfreunion$geometry)

# btb_smooth with an automatic grid
smooth_reunion <- btb_smooth(sfreunion,iCellSize = 500,iBandwidth = 5000)

# Calculate the ratio
smooth_reunion <- smooth_reunion %>% mutate(prop_poors = 100 * phouhold / houhold)

# map
mapsf::mf_map(x = smooth_reunion,
       type = "choro",
       var="prop_poors",
       breaks = "quantile",
       nbreaks = 5,
       border = NA,
       leg_val_rnd = 1)

Now, let's smooth the same ratio, with the same smoothing specifications (iBandwidth and iCellSize) but with iNeighbor = 0. In this case, the automatic grid only uses pixels that contain at least one data point (here, at least one household). The result is quite different.

smooth_reunion <- btb_smooth(sfreunion,iCellSize = 500,iBandwidth = 5000, iNeighbor = 0)
smooth_reunion <- smooth_reunion %>% mutate(prop_poors = 100 * phouhold / houhold)

mapsf::mf_map(x = smooth_reunion,
       type = "choro",
       var="prop_poors",
       breaks = "quantile",
       nbreaks = 5,
       border = NA,
       leg_val_rnd = 1)

Inspire naming

Using the Inspire norm, btb_smooth and btb_ptsToGrid allow you to name your pixels in a proper international way. It could be useful for reuse purpose, merge operations, etc.

You just need to use inspire = TRUE :

smooth_reunion <- btb_smooth(sfreunion,iCellSize = 500,
                             iBandwidth = 2000, iNeighbor = 0, inspire = TRUE)
smooth_reunion <- smooth_reunion %>% mutate(prop_poors = 100 * phouhold / houhold)
head(smooth_reunion)

Then, to export your geometric data, you can use the sf::write_sf function.

sf::write_sf("MY/REPOSITORY/myfile.gpkg")

References :

Some interesting use cases



Try the btb package in your browser

Any scripts or data that you put into this service are public.

btb documentation built on Oct. 24, 2022, 5:10 p.m.