    knitr::opts_chunk$set(
      collapse = TRUE,
      comment = "#>"
    )

    # Check that required packages are available without installing them
    required_pkgs <- c("dplyr", "leaflet", "withr", "sf")
    missing_pkgs <- required_pkgs[!vapply(
      required_pkgs, requireNamespace, logical(1), quietly = TRUE)]
    if (length(missing_pkgs) > 0) {
      stop(paste0("Missing packages: ", paste(missing_pkgs, collapse = ", "),
                  ".\nPlease install them manually before running this vignette."))
    }

    # Define a temporary path for SQLite database
    path_to_db <- withr::local_tempdir()

    # Load the necessary packages
    library(geeLite)
    library(withr)

    # Set chunk evaluation flag based on GEE authentication status
    run_chunks <- geeLite:::check_rgee_ready()

    # Set the time zone to UTC temporarily during vignette execution
    withr::local_timezone("UTC")

`geeLite` simplifies the process of building, managing, and updating local SQLite databases containing geospatial features extracted from Google Earth Engine (GEE). This vignette covers the installation, configuration, data collection, and analysis workflows using the `geeLite` package.
For more detailed information and updates, visit the geeLite GitHub repository.
To install `geeLite` from GitHub, use the following commands:

    # Install devtools if not installed
    # install.packages("devtools")

    # Install geeLite from GitHub
    devtools::install_github("mtkurbucz/geeLite")

    # Install Python dependencies and set up rgee
    geeLite::gee_install()

The workflow for setting up and using `geeLite` includes configuration, data collection, modification, and data analysis.

The first step in using `geeLite` is setting up a configuration file. This file specifies the regions of interest, the datasets to collect from GEE, and other parameters for data collection.

    # Define the path for the SQLite database
    path <- path_to_db

    # Create a configuration for Somalia (SO) and Yemen (YE) to collect NDVI data
    set_config(
      path = path,
      regions = c("SO", "YE"),
      source = list(
        "MODIS/061/MOD13A2" = list(
          "NDVI" = c("mean", "sd")
        )
      ),
      resol = 3,
      start = "2020-01-01"
    )

This call writes a JSON configuration file under the specified path. The file defines the parameters used to collect data from GEE.
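
The nested `source` argument maps each GEE dataset to its bands, and each band to the statistics to compute. As a minimal sketch of that structure in base R, the hypothetical helper `expand_source()` below (not part of `geeLite`) lists every dataset/band/statistic combination a specification requests. The slash-separated labels mirror the element name used later in this vignette when reading results (`MODIS/061/MOD13A2/NDVI/mean`); how `geeLite` builds those names internally is an assumption here.

```r
# The same nested specification passed to set_config() above
source_spec <- list(
  "MODIS/061/MOD13A2" = list(
    "NDVI" = c("mean", "sd")
  )
)

# Hypothetical helper (not a geeLite function): flatten the nested
# dataset -> band -> statistics list into "dataset/band/statistic" labels
expand_source <- function(spec) {
  keys <- character(0)
  for (dataset in names(spec)) {
    for (band in names(spec[[dataset]])) {
      stats <- spec[[dataset]][[band]]
      keys <- c(keys, paste(dataset, band, stats, sep = "/"))
    }
  }
  keys
}

print(expand_source(source_spec))
```

Listing the combinations this way is a quick check that a specification requests what you expect before any data is collected.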
Once the configuration is set, you can collect data from GEE using the `run_geelite()` function. This function retrieves data based on the configuration file and stores it in a local SQLite database.

    # Collect the data and store it in the SQLite database
    run_geelite(path = path)

The function will store the collected geospatial data in the SQLite database and log the progress.
If you want to modify the configuration (e.g., to add new statistics or datasets), you can use the `modify_config()` function to make updates without rebuilding the entire database.

    # Add more statistics to the NDVI band and include EVI data
    modify_config(
      path = path,
      keys = list(
        c("source", "MODIS/061/MOD13A2", "NDVI"),
        c("source", "MODIS/061/MOD13A2", "EVI")
      ),
      new_values = list(
        c("mean", "min", "max"),
        c("mean", "sd")
      )
    )

After modifying the configuration, run `run_geelite()` again to update the database with the new settings.
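
Each element of `keys` is a path into the nested configuration, and the element of `new_values` at the same position is the value stored at that path. The base R sketch below imitates that pairing on a plain list. Both `set_nested()` and the toy `config` list are hypothetical illustrations rather than `geeLite` internals, and the real configuration file contains more fields than shown.

```r
# Toy configuration holding only the `source` entry (an assumption;
# the real configuration has additional fields)
config <- list(
  source = list(
    "MODIS/061/MOD13A2" = list("NDVI" = c("mean", "sd"))
  )
)

# Hypothetical helper: walk down `key` one level at a time and
# store `value` at the end of the path, creating levels as needed
set_nested <- function(x, key, value) {
  if (length(key) == 1) {
    x[[key]] <- value
    return(x)
  }
  child <- x[[key[1]]]
  if (is.null(child)) child <- list()
  x[[key[1]]] <- set_nested(child, key[-1], value)
  x
}

# The same key paths and values used in the modify_config() call above
keys <- list(
  c("source", "MODIS/061/MOD13A2", "NDVI"),
  c("source", "MODIS/061/MOD13A2", "EVI")
)
new_values <- list(
  c("mean", "min", "max"),
  c("mean", "sd")
)

# Apply each (key, value) pair in order
for (i in seq_along(keys)) {
  config <- set_nested(config, keys[[i]], new_values[[i]])
}

str(config)
```

In this sketch the first pair replaces the statistics of an existing band, while the second adds a band that was not present before.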
Once the data has been collected and stored in the database, you can read and analyze it using the `read_db()` function. This function allows you to aggregate data at different frequencies (e.g., daily, monthly) and apply preprocessing functions.

    # Read the data from the database, aggregate to monthly frequency by default
    db <- read_db(path = path)

    # Read the data with custom aggregation functions
    db <- read_db(
      path = path,
      aggr_funs = list(
        function(x) mean(x, na.rm = TRUE),
        function(x) sd(x, na.rm = TRUE)
      )
    )

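
To make the effect of monthly aggregation concrete, the following base R sketch applies aggregation functions of the same shape to a short series for a single grid cell. The observation values are made up for illustration, and the code only imitates the idea of grouping by month; it is not how `read_db()` is implemented.

```r
# Made-up NDVI observations for one grid cell at 16-day spacing
obs <- data.frame(
  date = as.Date(c("2020-01-01", "2020-01-17", "2020-02-02",
                   "2020-02-18", "2020-03-05", "2020-03-21")),
  ndvi = c(0.21, 0.25, NA, 0.30, 0.28, 0.35)
)

# Label each observation with its calendar month
obs$month <- format(obs$date, "%Y-%m")

# Aggregation functions of the same shape as those passed to aggr_funs
aggr_funs <- list(
  mean = function(x) mean(x, na.rm = TRUE),
  sd   = function(x) sd(x, na.rm = TRUE)
)

# Apply each function within each month; tapply() orders groups by month label
monthly <- data.frame(
  month = sort(unique(obs$month)),
  mean  = as.vector(tapply(obs$ndvi, obs$month, aggr_funs$mean)),
  sd    = as.vector(tapply(obs$ndvi, obs$month, aggr_funs$sd))
)

print(monthly)
```

With `na.rm = TRUE`, a single missing observation does not turn the whole month into `NA`. The trade-off shows in February of this toy series: only one valid value remains, so the mean is still defined but the standard deviation is `NA`.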
This example demonstrates how to use `geeLite` to gather NDVI data for Somalia and Yemen, aggregate it monthly, and visualize the results using the `leaflet` package.

    # Define the path for the SQLite database
    path <- path_to_db

    # Set the configuration file for NDVI data collection
    set_config(
      path = path,
      regions = c("SO", "YE"),
      source = list(
        "MODIS/061/MOD13A2" = list(
          "NDVI" = c("mean", "sd")
        )
      ),
      resol = 3,
      start = "2020-01-01"
    )

    # Collect the data
    run_geelite(path = path)

    # Read the data from the database
    db <- read_db(path = path, freq = "month")

Once the data is collected, you can visualize it using the `leaflet` package. The following code shows how to plot the mean NDVI values for each region.

    # Load necessary packages
    library(leaflet)
    library(dplyr)
    library(sf)

    # Read database and merge grid with MODIS data
    sf <- merge(db$grid, db$`MODIS/061/MOD13A2/NDVI/mean`, by = "id")

    # Select the date to visualize
    ndvi <- sf$`2020-01-01`

    # Create a color palette function based on the values
    color_pal <- colorNumeric(palette = "viridis", domain = ndvi)

    # Create the leaflet map
    leaflet(data = sf) %>%
      addTiles() %>%                          # Add base tiles
      addPolygons(
        fillColor = color_pal(ndvi),          # Fill color
        color = "#BDBDC3",                    # Border color
        weight = 1,                           # Border weight
        opacity = 1,                          # Border opacity
        fillOpacity = 0.9                     # Fill opacity
      ) %>%
      addScaleBar(position = "bottomleft") %>%  # Add scale bar
      addLegend(
        pal = color_pal,                      # Color palette
        values = ndvi,                        # Data values to map
        title = "Mean NDVI",                  # Legend title
        position = "bottomright"              # Legend position
      )

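
The map code selects the `2020-01-01` column by name, and in R a `$` lookup on a column that does not exist returns `NULL` rather than raising an error. A small base R sketch of listing the available date columns first is shown below. The `toy` data frame is a made-up stand-in for the merged table, assuming one column per date named in `YYYY-MM-DD` form as the code above suggests.

```r
# Toy stand-in for the merged table: one row per grid cell, one column per date
toy <- data.frame(
  id = c("a", "b", "c"),
  `2020-01-01` = c(0.21, 0.34, 0.18),
  `2020-02-01` = c(0.25, 0.31, 0.22),
  check.names = FALSE  # keep the date-like column names unchanged
)

# Keep only the column names that look like ISO dates
date_cols <- grep("^[0-9]{4}-[0-9]{2}-[0-9]{2}$", names(toy), value = TRUE)

# ISO dates sort correctly as text, so max() picks the most recent one
latest <- max(date_cols)

# Fail loudly if the chosen column is missing instead of plotting NULL
stopifnot(latest %in% names(toy))
ndvi <- toy[[latest]]

print(date_cols)
print(ndvi)
```

Choosing the column from `date_cols` keeps the plotting code working when the configured `start` date or aggregation frequency changes.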
Additional features of the `geeLite` package include a `drive` mode for efficiently handling large data requests, as well as command-line interface (CLI) support for automation and integration with job scheduling systems like `cron`.
To efficiently handle large data requests, `drive` mode exports data in parallel batches to Google Drive before importing it into your local SQLite database. Ensure that adequate Google Drive storage is available before using `drive` mode.

    # Collect and store data using drive mode
    run_geelite(path = path, mode = "drive")

`geeLite` provides a CLI that allows you to run the main functions of the package directly from the command line. This is useful for automating workflows or integrating `geeLite` into larger systems.
```{bash cli-usage, eval = FALSE}
Rscript /path/to/geeLite/cli/set_cli.R --path "path/to/db"
cd "path/to/db"
Rscript cli/set_config.R --regions "SO YE" --source "list('MODIS/061/MOD13A2' = list('NDVI' = c('mean', 'min')))" --resol 3 --start "2020-01-01"
Rscript cli/run_geelite.R
Rscript cli/modify_config.R --keys "list(c('source', 'MODIS/061/MOD13A2', 'NDVI'), c('source', 'MODIS/061/MOD13A2', 'EVI'))" --new_values "list(c('mean', 'min', 'max'), c('mean', 'sd'))"
Rscript cli/run_geelite.R
```

### 5.3 Automating Data Collection with Cron {#cron}

You can automate database updates using the Linux `cron` job scheduler. Here is an example cron job that updates the database monthly:

```{bash cron-job, eval = FALSE}
# Open the cron jobs file
crontab -e

# Add the following line to schedule monthly updates (runs at 00:00 on the 1st of every month)
0 0 1 * * Rscript /path/to/db/cli/run_geelite.R
```