knitr::opts_chunk$set( collapse = TRUE, comment = "#>", eval = TRUE, fig.width = 7, fig.height = 7 )
library(AIgeomorphologist) library(sen2r)
The labelled points data are stored in SSCT_data
. The function to_spatial()
transform the data.frame
into sp
object.
labelled_points <- SSCT_data %>% to_spatial() %>% sf::st_as_sf() one_random_point <- labelled_points %>% dplyr::sample_n(1) one_random_point
First, we retrieve the list of the tile id for the S2 satellites over the spatial_extent
of interest.
id_tiles <- s2_list( spatial_extent = labelled_points, time_period = "full", level = "L2A", availability = "online" ) %>% as.data.frame() %>% dplyr::pull(id_tile) %>% unique() saveRDS(id_tiles, file.path(out_dir, "id_tiles.Rds"))
id_tiles
Second, we retrieve the entire list of the sensed tiles for the period of interest. This command can take a few hours to run.
search_results <- s2_list( tile = id_tiles, time_interval = c(as.Date("2019-06-01"), as.Date("2020-10-01")), time_period = "full", level = "L2A", availability = "online" ) saveRDS(search_results, file.path(out_dir, "search_results.Rds"))
search_results <- search_results %>% as.data.frame() dim(search_results) colnames(search_results) head(search_results)
We can now analyze these search results so that we only download the tiles we need.
One paramount parameter is the amount of clouds identified in the imagery.
To define the threshold we defined min_SAFE
as the minimum number of sensed tiles across all tiles for a given cloud threshold and sum_SAFE
as the total of sensed tiles for a given cloud threshold.
search_stats <- search_results %>% dplyr::group_by(id_tile) %>% dplyr::summarise(dplyr::across(dplyr::contains("clouds"), list(min = min, mean = mean))) %>% dplyr::summarise(clouds_min = min(clouds_min), clouds_mean = mean(clouds_mean)) clouds_stats <- data.frame( clouds_threshold = seq(ceiling(search_stats$clouds_min), ceiling(search_stats$clouds_mean), by = 0.1) ) %>% dplyr::mutate( min_SAFE = sapply(clouds_threshold, function(clouds_threshold){ search_results %>% dplyr::group_by(id_tile) %>% dplyr::group_map(~ {dplyr::filter(., clouds <= clouds_threshold) %>% nrow()}) %>% unlist() %>% min() }), sum_SAFE = sapply(clouds_threshold, function(clouds_threshold){ search_results %>% dplyr::group_by(id_tile) %>% dplyr::group_map(~ {dplyr::filter(., clouds <= clouds_threshold) %>% nrow()}) %>% unlist() %>% sum() }) )
The graph below shows the trade-off between incorporating more cloud: the minimum number of sensed tiles goes up, but so does the total number of sensed tiles. Each tile is roughly 1 GB.
p <- ggplot2::ggplot(clouds_stats %>% reshape2::melt(id.vars = c("clouds_threshold")), ggplot2::aes(x = clouds_threshold, y = value)) + ggplot2::geom_point() + ggplot2::facet_wrap(. ~ variable, scales = "free") + ggpubr::theme_pubclean() plotly::ggplotly(p)
From these two graphs, we select the following threshold:
clouds_stats %>% dplyr::filter(min_SAFE >= 20) %>% head(1) clouds_threshold <- clouds_stats %>% dplyr::filter(min_SAFE >= 20) %>% head(1) %>% dplyr::pull(clouds_threshold)
which is appropriate as the cloud percentage remains low, at least 20 tiles are sensed for any tile, and less than 3 TB of SAFE data have to be downloaded.
The selection of the product for downloading is then done by:
selected_search_results <- search_results %>% dplyr::filter(clouds <= clouds_threshold) dim(selected_search_results)
Additionally, we can visualize the sensing dates with the following codes:
p <- ggplot2::ggplot(selected_search_results, ggplot2::aes(x = sensing_datetime, y = clouds, color = id_tile)) + ggplot2::geom_point() + ggpubr::theme_pubclean() plotly::ggplotly(p)
Visual inspection of the above graph tile by tile does not indicate that there is a significant skew towards a given period for any tile.
Example of a tile with discontinuous record is 10TGK.
The tile with the lowest average cloud cover is
r selected_search_results %>%
dplyr::group_by(id_tile) %>%
dplyr::summarise(mean = mean(clouds)) %>%
dplyr::filter(mean == min(mean)) %>%
dplyr::pull(id_tile)
.
The following graph shows the distribution of the number of sensed tiles:
count_df <- data.frame(count = selected_search_results %>% dplyr::group_by(id_tile) %>% dplyr::group_map(~ nrow(.)) %>% unlist() ) ggplot2::ggplot(count_df, ggplot2::aes(x = count)) + ggplot2::geom_density() + ggplot2::labs(x = "number of sensed tiles") + ggpubr::theme_pubclean()
Running s2_download()
then allows to retrieve all the tile imagery provided that a safelist
is passed as s2_prodlist
.
class(search_results) selected_search_results <- selected_search_results %>% as("safelist") class(selected_search_results)
s2_download(s2_prodlist = selected_search_results, outdir = "path/to/your/outdir")
- Level-0 is compressed raw data. The Level-0 product contains all the information required to generate the Level-1 (and upper) product levels.
- Level-1
- Level-1A is uncompressed raw data with spectral bands coarsely coregistered and ancillary data appended.
- Level-1B data is radiometrically corrected radiance data. The physical geometric model is refined using available ground control points and appended to the product, but not applied.
- Level-1C product provides orthorectified Top-Of-Atmosphere (TOA) reflectance, with sub-pixel multispectral registration. Cloud and land/water masks are included in the product.
- Level-2A product provides orthorectified Bottom-Of-Atmosphere (BOA) reflectance, with sub-pixel multispectral registration. A Scene Classification map (cloud, cloud shadows, vegetation, soils/deserts, water, snow, etc.) is included in the product.
The SAFE format has been designed to act as a common format for archiving and conveying data within ESA Earth Observation archiving facilities. The SAFE format wraps a folder containing image data in a binary data format and product metadata in XML. This flexibility allows the format to be scalable enough to represent all levels of SENTINEL products.
A SENTINEL-2 product refers to a directory folder that contains a collection of information (Figure 1). It includes:
- a manifest.safe file which holds the general product information in XML
- a preview image in JPEG2000 format
- subfolders for measurement datasets including image data (granules/tiles) in GML-JPEG2000 format
- subfolders for datastrip level information
- a subfolder with auxiliary data (e.g. International Earth Rotation & Reference Systems (IERS) bulletin)
- HTML previews
from S2 MSI data formats
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.