SITS - Satellite Image Time Series Analysis for Earth Observation Data Cubes
================
sits is an open source R package for satellite image time series
analysis. It enables users to apply machine learning techniques for
classifying image time series obtained from Earth observation data
cubes. The basic workflow in sits is:
Conceptual view of data cubes (source: authors)
Detailed documentation on how to use sits is available in the e-book
“Satellite Image Time Series Analysis on Earth Observation Data
Cubes”.
sits on Kaggle
Those who want to evaluate the sits package before installing it are
invited to run the examples available on Kaggle. If you are new to
Kaggle, please follow the instructions to set up your account. These
examples provide a fast-track introduction to the package. We recommend
running them in the following order:
The sits package relies on the geospatial packages sf, stars,
gdalcubes and terra, which depend on the external libraries GDAL and
PROJ. Please follow the instructions for installing sits from the
Setup chapter of the online sits book.
sits can be installed from CRAN:
install.packages("sits")
The latest supported version is available on GitHub. It may include fixes that are not yet in the CRAN version.
devtools::install_github("e-sensing/sits", dependencies = TRUE)
# load the sits library
library(sits)
#> SITS - satellite image time series analysis.
#> Loaded sits v1.5.1.
#> See ?sits for help, citation("sits") for use in publication.
#> Documentation avaliable in https://e-sensing.github.io/sitsbook/.
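Before loading the package in a script, it can be useful to confirm that it is installed. The following is a minimal sketch using only base R; the variable name `has_sits` is illustrative and not part of the package.

```r
# Check whether sits is installed without attaching it
has_sits <- requireNamespace("sits", quietly = TRUE)

if (has_sits) {
    # report the installed version
    message("sits version: ", as.character(utils::packageVersion("sits")))
} else {
    message('sits is not installed; run install.packages("sits")')
}
```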
Classification using torch-based deep learning models in sits uses
CUDA-compatible NVIDIA GPUs if available, which provides up to a 10-fold
speed-up compared to using CPUs only. Please see the installation
instructions for more information on how to install the required
drivers.
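To see whether a CUDA-capable GPU is visible from R, one option is to query the torch package directly. This is a sketch that assumes the torch R package is the backend in use, as stated above; the variable `gpu_available` is illustrative, and the check falls back to `FALSE` when torch or its backend libraries are not installed.

```r
# Check for a CUDA-capable GPU through the torch package
gpu_available <- FALSE

# torch_is_installed() confirms that the torch backend libraries are present
if (requireNamespace("torch", quietly = TRUE) && torch::torch_is_installed()) {
    gpu_available <- torch::cuda_is_available()
}

message("CUDA GPU available: ", gpu_available)
```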
Users create data cubes from analysis-ready data (ARD) image collections
available in cloud services. The collections accessible in sits 1.5.1
are:
Open data collections do not require payment of access fees. Except for those in the Brazil Data Cube, these collections are not regular. Irregular collections require further processing before they can be used for classification using machine learning models.
The following code defines an irregular data cube of Sentinel-2/2A
images available in the Microsoft Planetary Computer, using the open
data collection "SENTINEL-2-L2A". The geographical area of the data
cube is defined by the tiles "20LKP" and "20LLP", and the temporal
extent by a start and end date. Access to other cloud services works in
similar ways.
s2_cube <- sits_cube(
    source = "MPC",
    collection = "SENTINEL-2-L2A",
    tiles = c("20LKP", "20LLP"),
    bands = c("B03", "B08", "B11", "SCL"),
    start_date = as.Date("2018-07-01"),
    end_date = as.Date("2019-06-30"),
    progress = FALSE
)
This cube is irregular. The timelines of tiles "20LKP" and "20LLP"
differ, and so do the resolutions of the bands. Sentinel-2 bands "B03"
and "B08" have 10-meter resolution, while band "B11" and the cloud
band "SCL" have 20-meter resolution. Irregular collections need an
additional processing step to be converted to regular data cubes, as
described below.
Conceptual view of data cubes (source: authors)
After defining an irregular ARD image collection from a cloud service
using sits_cube(), users should run sits_regularize() to build a
regular data cube. This function uses the gdalcubes R package,
described in Appel and Pebesma, 2019.
gc_cube <- sits_regularize(
    cube = s2_cube,
    output_dir = tempdir(),
    period = "P15D",
    res = 60,
    multicores = 4
)
The above command builds a regular data cube with all bands interpolated
to 60 m spatial resolution and 15-day temporal resolution. Regular data
cubes are the input to the sits functions for time series retrieval,
building machine learning models, and classification of raster images
and time series.
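To make the idea of a 15-day temporal resolution concrete, the base R sketch below assigns irregular acquisition dates to regular 15-day periods over the same date range. The acquisition dates are hypothetical, and the code only illustrates the concept of temporal binning; it is not the algorithm used by sits_regularize() or gdalcubes.

```r
# Nominal start dates of the 15-day periods covering the cube's extent
start_date <- as.Date("2018-07-01")
end_date <- as.Date("2019-06-30")
period_starts <- seq(start_date, end_date, by = "15 days")

# Hypothetical acquisition dates for one tile (irregular spacing)
acquisitions <- as.Date(c(
    "2018-07-03", "2018-07-08", "2018-07-18",
    "2018-07-23", "2018-08-02", "2018-08-17"
))

# Index of the period in which each acquisition falls
period_index <- findInterval(
    as.numeric(acquisitions),
    as.numeric(period_starts)
)

# Group the acquisitions by the start date of their period
split(acquisitions, format(period_starts[period_index]))
```

With these dates, the six acquisitions fall into four consecutive periods. A regular cube would hold one composite image per period instead of the original acquisitions.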
sits has been designed to use satellite image time series to derive
machine learning models. After the data cube has been created, time
series can be retrieved individually or by using CSV or SHP files, as in
the following example. The example below uses a data cube in a local
directory, whose images have been obtained from the "MOD13Q1-6.1"
collection of the Brazil Data Cube.
library(sits)
# this data cube uses images from the Brazil Data Cube that have
# been downloaded to a local directory
data_dir <- system.file("extdata/raster/mod13q1", package = "sits")
# create a cube from downloaded files
raster_cube <- sits_cube(
    source = "BDC",
    collection = "MOD13Q1-6.1",
    data_dir = data_dir,
    delim = "_",
    parse_info = c("X1", "X2", "tile", "band", "date"),
    progress = FALSE
)
# obtain a set of samples defined by a CSV file
csv_file <- system.file("extdata/samples/samples_sinop_crop.csv",
    package = "sits"
)
# retrieve the time series associated with the samples from the data cube
points <- sits_get_data(raster_cube, samples = csv_file)
# show the time series
points[1:3, ]
#> # A tibble: 3 × 7
#>   longitude latitude start_date end_date   label    cube        time_series
#>       <dbl>    <dbl> <date>     <date>     <chr>    <chr>       <list>
#> 1     -55.8    -11.7 2013-09-14 2014-08-29 Cerrado  MOD13Q1-6.1 <tibble>
#> 2     -55.8    -11.7 2013-09-14 2014-08-29 Cerrado  MOD13Q1-6.1 <tibble>
#> 3     -55.7    -11.7 2013-09-14 2014-08-29 Soy_Corn MOD13Q1-6.1 <tibble>
After a time series has been obtained, it is loaded in a tibble. The first six columns contain the metadata: spatial and temporal location, label assigned to the sample, and the data cube from which the data has been extracted. The spatial location is given in longitude and latitude coordinates. In the output above, the first sample is labelled “Cerrado”, is located at approximately (-55.8, -11.7), and is considered valid for the period (2013-09-14, 2014-08-29). The seventh column, time_series, holds the values of each time series as a nested tibble.
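The nested structure can be explored with ordinary tibble operations. The sketch below uses the samples_modis_ndvi data set bundled with sits, so it does not depend on the cube created above; the variable name `first_ts` is illustrative.

```r
library(sits)
# sample set bundled with the package (NDVI time series from MODIS)
data("samples_modis_ndvi")

# metadata columns plus the nested time_series column
names(samples_modis_ndvi)

# time series of the first sample: a tibble with an Index (date) column
# followed by one column per band
first_ts <- samples_modis_ndvi$time_series[[1]]
head(first_ts, 3)

# helper functions that describe the sample set
sits_labels(samples_modis_ndvi) # labels present in the samples
sits_bands(samples_modis_ndvi) # bands available in each time series
```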
sits provides support for the classification of both individual time
series and data cubes. The following machine learning methods are
available in sits:
Support vector machines (sits_svm())
Random forests (sits_rfor())
Extreme gradient boosting (sits_xgboost())
Multi-layer perceptrons (sits_mlp())
Temporal convolutional neural networks (sits_tempcnn())
Temporal attention encoders (sits_tae())
Lightweight temporal attention encoders (sits_lighttae())
The following example illustrates how to train a model and classify an
individual time series. First we use the sits_train() function with
two parameters: the training dataset (described above) and the chosen
machine learning model (in this case, TempCNN). The trained model is
then used to classify a time series from the Brazilian state of Mato
Grosso, using sits_classify(). The results can be shown in text format
using the function sits_show_prediction() or graphically using plot.
# training data set
data("samples_modis_ndvi")
# point to be classified
data("point_mt_6bands")
# Train a deep learning model
tempcnn_model <- sits_train(
    samples = samples_modis_ndvi,
    ml_method = sits_tempcnn()
)
# Select NDVI band of the point to be classified
# Classify using TempCNN model
# Plot the result
point_mt_6bands |>
    sits_select(bands = "NDVI") |>
    sits_classify(tempcnn_model) |>
    plot()
Classification of NDVI time series using TempCNN
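Any of the methods listed earlier can be passed to sits_train() in the same way. As a sketch, the code below swaps TempCNN for a random forest with default parameters and prints the result in text form with sits_show_prediction(). The object names `rfor_model` and `point_class` are illustrative.

```r
library(sits)
data("samples_modis_ndvi")
data("point_mt_6bands")

# train a random forest model instead of TempCNN
rfor_model <- sits_train(
    samples = samples_modis_ndvi,
    ml_method = sits_rfor()
)

# classify the NDVI time series of the point
point_class <- point_mt_6bands |>
    sits_select(bands = "NDVI") |>
    sits_classify(ml_model = rfor_model)

# show the predicted label for each interval in text format
sits_show_prediction(point_class)
```

A random forest trains quickly on CPU, which makes it a convenient first check before moving to the deep learning models.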
The following example shows how to classify a data cube organized as a
set of raster images. The result can also be visualized interactively
using sits_view().
# Create a data cube to be classified
# Cube is composed of MOD13Q1 images from the Sinop region in Mato Grosso (Brazil)
data_dir <- system.file("extdata/raster/mod13q1", package = "sits")
sinop <- sits_cube(
    source = "BDC",
    collection = "MOD13Q1-6.1",
    data_dir = data_dir,
    delim = "_",
    parse_info = c("X1", "X2", "tile", "band", "date"),
    progress = FALSE
)
# Classify the raster cube, generating a probability file
probs_cube <- sits_classify(
    data = sinop,
    ml_model = tempcnn_model,
    output_dir = tempdir()
)
# apply Bayesian smoothing to remove outliers
bayes_cube <- sits_smooth(
    cube = probs_cube,
    output_dir = tempdir()
)
# generate a thematic map
label_cube <- sits_label_classification(
    cube = bayes_cube,
    output_dir = tempdir()
)
# plot the labelled cube
plot(label_cube,
    title = "Land use and Land cover in Sinop, MT, Brazil in 2018"
)
Land use and Land cover in Sinop, MT, Brazil in 2018
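Conceptually, the labelling step assigns to each pixel the class with the highest probability. The base R sketch below shows that idea on a small matrix of hypothetical probabilities. It is an illustration only, not the implementation used by sits_label_classification().

```r
# hypothetical class probabilities for four pixels (rows) and three classes
probs <- matrix(
    c(
        0.7, 0.2, 0.1,
        0.1, 0.6, 0.3,
        0.2, 0.3, 0.5,
        0.5, 0.4, 0.1
    ),
    ncol = 3, byrow = TRUE,
    dimnames = list(NULL, c("Cerrado", "Forest", "Pasture"))
)

# pick, for each pixel, the class with the highest probability
labels <- colnames(probs)[max.col(probs, ties.method = "first")]
labels
```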
If you use sits, please cite the following paper:
Additionally, the sample quality control methods that use self-organized maps are described in the following reference:
The authors are thankful for the contributions of Edzer Pebesma, Jakub
Nowosad, Marius Appel, Martin Tennekes, Robert Hijmans, Ron Wehrens, and
Tim Appelhans, respectively chief developers of the packages sf/stars,
supercells, gdalcubes, tmap, terra, kohonen, and leafem. The sits
package recognises the great work of the RStudio team, including the
tidyverse. Many thanks to Daniel Falbel for his great work in the torch
and luz packages. Charlotte Pelletier shared the Python code that has
been reused for the TempCNN machine learning model. We would like to
thank Maja Schneider for sharing the Python code that helped the
implementation of the sits_lighttae() and sits_tae() models. We
recognise the importance of the work by Chris Holmes and Matthias Mohr
on the STAC specification and API.
We acknowledge and thank the project funders that provided financial and material support:
Amazon Fund, established by the Brazilian government with financial contribution from Norway, through the project contract between the Brazilian Development Bank (BNDES) and the Foundation for Science, Technology and Space Applications (FUNCATE), for the establishment of the Brazil Data Cube, process 17.2.0536.1.
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior-Brasil (CAPES) and the Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), for providing MSc and PhD scholarships.
São Paulo Research Foundation (FAPESP) under eScience Program grant 2014/08398-6, for providing MSc, PhD and post-doc scholarships, equipment, and travel support.
International Climate Initiative of the German Federal Ministry for the Environment, Nature Conservation, Building and Nuclear Safety (IKI) under grant 17-III-084- Global-A-RESTORE+ (“RESTORE+: Addressing Landscape Restoration on Degraded Land in Indonesia and Brazil”).
Microsoft Planetary Computer under the GEO-Microsoft Cloud Computer Grants Programme.
Instituto Clima e Sociedade, under the project grant “Modernization of PRODES and DETER Amazon monitoring systems”.
The Open-Earth-Monitor Cyberinfrastructure project, which has received funding from the European Union’s Horizon Europe research and innovation programme under grant agreement No. 101059548.
FAO-EOSTAT initiative, which uses next generation Earth observation tools to produce land cover and land use statistics.
The sits project is released with a Contributor Code of Conduct.
By contributing to this project, you agree to abide by its terms.