sits_cube: Create data cubes from image collections

View source: R/sits_cube.R

sits_cube R Documentation

Description

Creates a data cube based on spatial and temporal restrictions in collections available in cloud services or local repositories. The following cloud providers are supported, based on the STAC protocol: Amazon Web Services (AWS), Brazil Data Cube (BDC), Digital Earth Africa (DEAFRICA), Microsoft Planetary Computer (MPC), NASA Harmonized Landsat/Sentinel (HLS), USGS Landsat (USGS), and Swiss Data Cube (SDC). Data cubes can also be created using local files.

Usage

sits_cube(source, collection, ...)

## S3 method for class 'sar_cube'
sits_cube(
  source,
  collection,
  ...,
  orbit = "ascending",
  bands = NULL,
  tiles = NULL,
  roi = NULL,
  start_date = NULL,
  end_date = NULL,
  platform = NULL,
  multicores = 2,
  progress = TRUE
)

## S3 method for class 'stac_cube'
sits_cube(
  source,
  collection,
  ...,
  bands = NULL,
  tiles = NULL,
  roi = NULL,
  start_date = NULL,
  end_date = NULL,
  platform = NULL,
  multicores = 2,
  progress = TRUE
)

## S3 method for class 'local_cube'
sits_cube(
  source,
  collection,
  ...,
  data_dir,
  vector_dir = NULL,
  tiles = NULL,
  bands = NULL,
  vector_band = NULL,
  start_date = NULL,
  end_date = NULL,
  labels = NULL,
  parse_info = NULL,
  version = "v1",
  delim = "_",
  multicores = 2L,
  progress = TRUE
)

Arguments

source

Data source (one of "AWS", "BDC", "DEAFRICA", "HLS", "MPC", "SDC", "USGS"; character vector of length 1).

collection

Image collection in the data source (character vector of length 1). To find out which collections are supported, use sits_list_collections().

...

Other parameters to be passed for specific types.

orbit

Orbit name ("ascending", "descending") for SAR cubes.

bands

Spectral bands and indices to be included in the cube (optional - character vector). Use sits_list_collections() to find out the bands available for each collection.

tiles

Tiles from the collection to be included in the cube (character vector; see the Note section below).

roi

Region of interest: either an sf object, a shapefile, a numeric vector with named XY values ("xmin", "xmax", "ymin", "ymax"), or a numeric vector with named lat/long values ("lon_min", "lat_min", "lon_max", "lat_max").

start_date, end_date

Initial and final dates of the images from the collection to include in the cube (optional; dates in YYYY-MM-DD format).

platform

Optional parameter specifying the platform in case of collections that include more than one satellite (character vector of length 1).

multicores

Number of workers for parallel processing (integer, min = 1, max = 2048).

progress

Logical: show a progress bar?

data_dir

Local directory where images are stored (for local cubes - character vector of length 1).

vector_dir

Local directory where vector files are stored (for local vector cubes - character vector of length 1).

vector_band

Band for vector cube ("segments", "probs", "class")

labels

Labels associated to the classes (Named character vector for cubes of classes "probs_cube" or "class_cube").

parse_info

Parsing information for local files (for local cubes - character vector).

version

Version of the classified and/or labelled files (for local cubes - character vector of length 1).

delim

Delimiter for parsing local files (single character)

Value

A tibble describing the contents of a data cube.

Note

To create cubes from cloud providers, users need to provide:

  1. source: One of "AWS", "BDC", "DEAFRICA", "HLS", "MPC", "SDC" or "USGS";

  2. collection: Collection available in the cloud provider. Use sits_list_collections() to see which collections are supported;

  3. tiles: A set of tiles defined according to the collection tiling grid;

  4. roi: Region of interest. Either a named vector ("lon_min", "lat_min", "lon_max", "lat_max") in WGS84, or an sfc or sf object from the sf package in WGS84 projection.

Either tiles or roi must be provided. The parameters bands, start_date, and end_date are optional for cubes created from cloud providers.

GeoJSON geometries (RFC 7946) and shapefiles should be converted to sf objects before being used to define a region of interest. The roi parameter does not crop a region; it only selects the images that intersect the roi.
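The selection behaviour of roi can be pictured as a bounding-box overlap test. The following is a minimal sketch in base R, not the code sits runs internally; the helper bbox_intersects() and the two image extents are hypothetical, chosen only for illustration.

```r
# Hypothetical helper: TRUE when two lon/lat bounding boxes overlap.
# Both arguments are named numeric vectors using the roi naming convention.
bbox_intersects <- function(a, b) {
    a[["lon_min"]] <= b[["lon_max"]] && a[["lon_max"]] >= b[["lon_min"]] &&
        a[["lat_min"]] <= b[["lat_max"]] && a[["lat_max"]] >= b[["lat_min"]]
}

roi <- c(lon_min = -50.410, lon_max = -50.379,
         lat_min = -10.1910, lat_max = -10.1573)

# Made-up image extents (not real tile footprints)
image_a <- c(lon_min = -51.0, lon_max = -50.0, lat_min = -11.0, lat_max = -10.0)
image_b <- c(lon_min = -49.0, lon_max = -48.0, lat_min = -11.0, lat_max = -10.0)

# image_a overlaps the roi, so it would be selected in full (no cropping)
bbox_intersects(roi, image_a)
# image_b lies entirely east of the roi, so it would be skipped
bbox_intersects(roi, image_b)
```

An image that overlaps the roi is kept whole, which is why the resulting cube can extend beyond the requested region.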

To create a cube from local files, users need to provide:

  1. source: Provider from which the data has been downloaded (e.g., "BDC");

  2. collection: Collection where the data has been extracted from. (e.g., "SENTINEL-2-L2A" for the Sentinel-2 MPC collection level 2A);

  3. data_dir: Local directory where images are stored.

  4. parse_info: Parsing information for files. Default is c("X1", "X2", "tile", "band", "date").

  5. delim: Delimiter character for parsing files. Default is "_".

To create a cube from local files, all images should have the same spatial resolution and projection and each file should contain a single image band for a single date. Files can belong to different tiles of a spatial reference system and file names need to include tile, date, and band information. For example: "CBERS-4_WFI_022024_B13_2018-02-02.tif" and "SENTINEL-2_MSI_20LKP_B02_2018-07-18.jp2" are accepted names. The user has to provide parsing information to allow sits to extract values of tile, band, and date. In the examples above, the parsing info is c("X1", "X2", "tile", "band", "date") and the delimiter is "_", which are the default values.
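The way parse_info and delim map onto a file name can be illustrated in base R. This is a sketch of the idea only, not the parser sits uses internally; parse_file_name() is a hypothetical helper introduced for this example.

```r
# Hypothetical helper: split a file name into the fields named by parse_info.
parse_file_name <- function(file_name,
                            parse_info = c("X1", "X2", "tile", "band", "date"),
                            delim = "_") {
    # Drop the directory part and the file extension
    base <- tools::file_path_sans_ext(basename(file_name))
    # Split on the delimiter (fixed = TRUE treats it literally, not as a regex)
    fields <- strsplit(base, split = delim, fixed = TRUE)[[1]]
    # Each piece of the name must correspond to one entry of parse_info
    stopifnot(length(fields) == length(parse_info))
    names(fields) <- parse_info
    # Keep only the fields that describe the image
    fields[c("tile", "band", "date")]
}

# The two file names quoted above, parsed with the default values
parse_file_name("CBERS-4_WFI_022024_B13_2018-02-02.tif")
parse_file_name("SENTINEL-2_MSI_20LKP_B02_2018-07-18.jp2")
```

Entries such as "X1" and "X2" act as placeholders for parts of the name that are not needed, which is why only "tile", "band", and "date" are kept.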

It is also possible to create result cubes from local files produced by classification or post-classification algorithms. In this case, more parameters are required, and the parameter parse_info is specified differently, as follows:

  1. band: Band name associated to the type of result. Use "probs", for probability cubes produced by sits_classify(); "bayes", for smoothed cubes produced by sits_smooth(); "segments", for vector cubes produced by sits_segment(); "entropy" when using sits_uncertainty(), and "class" for cubes produced by sits_label_classification();

  2. labels: Labels associated to the classification results;

  3. parse_info: File name parsing information to deduce the values of "tile", "start_date", "end_date" from the file name. Default is c("X1", "X2", "tile", "start_date", "end_date", "band"). Unlike non-classified image files, cubes with results have both "start_date" and "end_date".
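As a sketch of how these parameters fit together, the call below builds a probability cube from local files. The directory, the labels, and the choice of source and collection are placeholders, not values taken from this documentation; the call is guarded so that it only runs when sits is installed and the directory exists.

```r
# Placeholders (assumptions): replace with your own directory and labels.
probs_dir <- "path/to/probability/files"
labels <- c("1" = "Forest", "2" = "Pasture")

# Result files carry both a start and an end date in their names
result_parse_info <- c("X1", "X2", "tile", "start_date", "end_date", "band")

if (requireNamespace("sits", quietly = TRUE) && dir.exists(probs_dir)) {
    probs_cube <- sits::sits_cube(
        source = "BDC",
        collection = "MOD13Q1-6.1",
        data_dir = probs_dir,
        bands = "probs",            # type of result stored in the files
        labels = labels,            # named character vector of class labels
        parse_info = result_parse_info
    )
}
```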

In MPC, sits can access two open data collections: "SENTINEL-2-L2A" for Sentinel-2/2A images, and "LANDSAT-C2-L2" for the Landsat-4/5/7/8/9 collection.

Sentinel-2/2A level 2A files in MPC are organized by sensor resolution. The bands in 10m resolution are "B02", "B03", "B04", and "B08". The 20m bands are "B05", "B06", "B07", "B8A", "B11", and "B12". Bands "B01" and "B09" are available at 60m resolution. The "CLOUD" band is also available.

All Landsat-4/5/7/8/9 images in MPC have bands with 30 meter resolution. To account for differences between the different sensors, Landsat bands in this collection have been renamed "BLUE", "GREEN", "RED", "NIR08", "SWIR16" and "SWIR22". The "CLOUD" band is also available.

In AWS, there are two types of collections: open data and requester-pays. Currently, sits supports collections "SENTINEL-2-L2A" (open data) and "LANDSAT-C2-L2" (requester-pays). There is no need to provide AWS credentials to access open data collections. For requester-pays data, users need to provide their access codes as environment variables, as follows:

Sys.setenv(
  AWS_ACCESS_KEY_ID = <your_access_key>,
  AWS_SECRET_ACCESS_KEY = <your_secret_access_key>
)

Sentinel-2/2A level 2A files in AWS are organized by sensor resolution. The AWS bands in 10m resolution are "B02", "B03", "B04", and "B08". The 20m bands are "B05", "B06", "B07", "B8A", "B11", and "B12". Bands "B01" and "B09" are available at 60m resolution.

For DEAFRICA, sits currently works with collections "S2_L2A" for Sentinel-2 level 2A and "LS8_SR" for the Landsat-8 ARD collection. Both are open data, so no payment for access is required. These collections are hosted in Africa (Cape Town) for faster access by African users.

For USGS, sits currently works with collection "LANDSAT-C2L2-SR", which corresponds to Landsat Collection 2 Level-2 surface reflectance data, covering the Landsat-8 dataset. This collection is requester-pays, so access is charged to the user.

All BDC collections are regularized. Accessing data in the BDC is free, but users need to provide their credentials using environment variables. To create your credentials, please see <brazil-data-cube.github.io/applications/dc_explorer/token-module.html>. After obtaining the BDC access key, set it as an environment variable, as follows:

Sys.setenv(
  BDC_ACCESS_KEY = <your_bdc_access_key>
)
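Before requesting data from a provider that needs credentials, it can help to check that the environment variables named in this note are set. The following base R sketch does that; missing_env_vars() is a hypothetical helper, not part of sits.

```r
# Hypothetical helper: return the names of the environment variables
# in `vars` that are unset or empty.
missing_env_vars <- function(vars) {
    values <- Sys.getenv(vars, unset = "")
    vars[!nzchar(values)]
}

# Variables needed for AWS requester-pays collections
missing_env_vars(c("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"))

# Variable needed for BDC collections
missing_env_vars("BDC_ACCESS_KEY")
```

An empty result (character(0)) means every listed variable has a non-empty value; it does not confirm that the credentials themselves are valid.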

Examples

if (sits_run_examples()) {
    # --- Access to the Brazil Data Cube
    # create a raster cube file based on the information in the BDC
    cbers_tile <- sits_cube(
        source = "BDC",
        collection = "CBERS-WFI-16D",
        bands = c("NDVI", "EVI"),
        tiles = "007004",
        start_date = "2018-09-01",
        end_date = "2019-08-28"
    )
    # --- Access to Digital Earth Africa
    # create a raster cube file based on the information about the files
    # DEAFRICA does not support definition of tiles
    cube_deafrica <- sits_cube(
        source = "DEAFRICA",
        collection = "SENTINEL-2-L2A",
        bands = c("B04", "B08"),
        roi = c(
            "lat_min" = 17.379,
            "lon_min" = 1.1573,
            "lat_max" = 17.410,
            "lon_max" = 1.1910
        ),
        start_date = "2019-01-01",
        end_date = "2019-10-28"
    )
    # --- Access to Digital Earth Australia
    cube_deaustralia <- sits_cube(
        source = "DEAUSTRALIA",
        collection = "GA_LS8CLS9C_GM_CYEAR_3",
        bands = c("RED", "GREEN", "BLUE"),
        roi = c(
            lon_min = 137.15991,
            lon_max = 138.18467,
            lat_min = -33.85777,
            lat_max = -32.56690
        ),
        start_date = "2018-01-01",
        end_date = "2018-12-31"
    )
    # --- Access to CDSE open data Sentinel 2/2A level 2 collection
    # --- remember to set the appropriate environment variables
    # It is recommended that `multicores` be used to accelerate the process.
    s2_cube <- sits_cube(
        source = "CDSE",
        collection = "SENTINEL-2-L2A",
        tiles = c("20LKP"),
        bands = c("B04", "B08", "B11"),
        start_date = "2018-07-18",
        end_date = "2019-01-23"
    )

    ## --- Sentinel-1 SAR from CDSE
    # --- remember to set the appropriate environment variables
    roi_sar <- c("lon_min" = 33.546, "lon_max" = 34.999,
                 "lat_min" = 1.427, "lat_max" = 3.726)
    s1_cube_open <- sits_cube(
       source = "CDSE",
       collection = "SENTINEL-1-RTC",
       bands = c("VV", "VH"),
       orbit = "descending",
       roi = roi_sar,
       start_date = "2020-01-01",
       end_date = "2020-06-10"
    )

    # --- Access to AWS open data Sentinel 2/2A level 2 collection
    s2_cube <- sits_cube(
        source = "AWS",
        collection = "SENTINEL-S2-L2A-COGS",
        tiles = c("20LKP", "20LLP"),
        bands = c("B04", "B08", "B11"),
        start_date = "2018-07-18",
        end_date = "2019-07-23"
    )

    # --- Creating Sentinel cube from MPC
    s2_cube <- sits_cube(
        source = "MPC",
        collection = "SENTINEL-2-L2A",
        tiles = "20LKP",
        bands = c("B05", "CLOUD"),
        start_date = "2018-07-18",
        end_date = "2018-08-23"
    )

    # --- Creating Landsat cube from MPC
    roi <- c("lon_min" = -50.410, "lon_max" = -50.379,
             "lat_min" = -10.1910 , "lat_max" = -10.1573)
    mpc_cube <- sits_cube(
        source = "MPC",
        collection = "LANDSAT-C2-L2",
        bands = c("BLUE", "RED", "CLOUD"),
        roi = roi,
        start_date = "2005-01-01",
        end_date = "2006-10-28"
    )

    ## Sentinel-1 SAR from MPC
    roi_sar <- c("lon_min" = -50.410, "lon_max" = -50.379,
                 "lat_min" = -10.1910, "lat_max" = -10.1573)

    s1_cube_open <- sits_cube(
       source = "MPC",
       collection = "SENTINEL-1-GRD",
       bands = c("VV", "VH"),
       orbit = "descending",
       roi = roi_sar,
       start_date = "2020-06-01",
       end_date = "2020-09-28"
    )
    # --- Access to World Cover data (2021) via Terrascope
    cube_terrascope <- sits_cube(
        source = "TERRASCOPE",
        collection = "WORLD-COVER-2021",
        roi = c(
            lon_min = -62.7,
            lon_max = -62.5,
            lat_min = -8.83,
            lat_max = -8.70
        )
    )
    # --- Create a cube based on local MODIS data
    data_dir <- system.file("extdata/raster/mod13q1", package = "sits")
    modis_cube <- sits_cube(
        source = "BDC",
        collection = "MOD13Q1-6.1",
        data_dir = data_dir
    )
}

sits documentation built on Sept. 11, 2024, 6:36 p.m.