geeLite' R Package"

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
# Check that required packages are available without installing them
required_pkgs <- c("dplyr", "leaflet", "withr", "sf")
missing_pkgs <- required_pkgs[!vapply(
  required_pkgs, requireNamespace, logical(1), quietly = TRUE)]

if (length(missing_pkgs) > 0) {
  stop(paste0("Missing packages: ", paste(missing_pkgs, collapse = ", "),
              ".\nPlease install them manually before running this vignette."))
}

# Define a temporary path for SQLite database
path_to_db <- withr::local_tempdir()
# Load the necessary packages
library(geeLite)
library(withr)

# Set chunk evaluation flag based on GEE authentication status
run_chunks <- geeLite:::check_rgee_ready()

# Set the time zone to UTC temporarily during vignette execution
withr::local_timezone("UTC")

1. Introduction {#introduction}

geeLite simplifies the process of building, managing, and updating local SQLite databases containing geospatial features extracted from Google Earth Engine (GEE). This vignette covers the installation, configuration, data collection, and analysis workflows using the geeLite package.

For more detailed information and updates, visit the geeLite GitHub repository.

2. Installation {#installation}

To install geeLite from GitHub, use the following commands:

# Install devtools if not installed
# install.packages("devtools")

# Install geeLite from GitHub
devtools::install_github("mtkurbucz/geeLite")

# Install Python dependencies and setup rgee
geeLite::gee_install()

3. Workflow {#workflow}

The workflow for setting up and using geeLite includes configuration, data collection, modification, and data analysis.

3.1 Configuration {#configuration}

The first step in using geeLite is setting up a configuration file. This file specifies the regions of interest, the datasets to collect from GEE, and other parameters for data collection.

# Define the path for the SQLite database
path <- path_to_db

# Create a configuration for Somalia (SO) and Yemen (YE) to collect NDVI data
set_config(
  path = path,
  regions = c("SO", "YE"),
  source = list(
    "MODIS/061/MOD13A2" = list(
      "NDVI" = c("mean", "sd")
    )
  ),
  resol = 3,
  start = "2020-01-01"
)

This configuration will create a JSON file at the specified path that defines the parameters for collecting data from GEE.

3.2 Data Collection {#data-collection}

Once the configuration is set, you can collect data from GEE using the run_geelite() function. This function retrieves data based on the configuration file and stores it in a local SQLite database.

# Collect the data and store it in the SQLite database
run_geelite(path = path)

The function will store the collected geospatial data in the SQLite database and log the progress.

3.3 Modifying Configuration {#modifying-configuration}

If you want to modify the configuration (e.g., to add new statistics or datasets), you can use the modify_config() function to make updates without rebuilding the entire database.

# Add more statistics to the NDVI band and include EVI data
modify_config(
  path = path,
  keys = list(
    c("source", "MODIS/061/MOD13A2", "NDVI"),
    c("source", "MODIS/061/MOD13A2", "EVI")
  ),
  new_values = list(
    c("mean", "min", "max"),
    c("mean", "sd")
  )
)

After modifying the configuration, run run_geelite() again to update the database with the new settings.

3.4 Reading and Analyzing Data {#reading-and-analyzing-data}

Once the data has been collected and stored in the database, you can read and analyze it using the read_db() function. This function allows you to aggregate data at different frequencies (e.g., daily, monthly) and apply preprocessing functions.

# Read the data from the database, aggregate to monthly frequency by default
db <- read_db(path = path)

# Read the data with custom aggregation functions
db <- read_db(path = path, aggr_funs = list(
  function(x) mean(x, na.rm = TRUE),
  function(x) sd(x, na.rm = TRUE))
)

4. Usage Example {#usage-example}

This example demonstrates how to use geeLite to gather NDVI data for Somalia and Yemen, aggregate it monthly, and visualize the results using the leaflet package.

4.1. Collecting the Data {#collecting-data}

# Define the path for the SQLite database
path <- path_to_db

# Set the configuration file for NDVI data collection
set_config(
  path = path,
  regions = c("SO", "YE"),
  source = list(
    "MODIS/061/MOD13A2" = list(
      "NDVI" = c("mean", "sd")
    )
  ),
  resol = 3,
  start = "2020-01-01"
)

# Collect the data
run_geelite(path = path)

# Read the data from the database
db <- read_db(path = path, freq = "month")

4.2. Visualizing the Data {#visualizing-data}

Once the data is collected, you can visualize it using the leaflet package. The following code shows how to plot the mean NDVI values for each region.

# Load necessary packages
library(leaflet)
library(dplyr)
library(sf)

# Read database and merge grid with MODIS data
sf <- merge(db$grid, db$`MODIS/061/MOD13A2/NDVI/mean`, by = "id")

# Select the date to visualize
ndvi <- sf$`2020-01-01`

# Create a color palette function based on the values
color_pal <- colorNumeric(palette = "viridis", domain = ndvi)

# Create the leaflet map
leaflet(data = sf) %>%
  addTiles() %>%                            # Add base tiles
  addPolygons(
    fillColor = color_pal(ndvi),            # Fill color
    color = "#BDBDC3",                      # Border color
    weight = 1,                             # Border weight
    opacity = 1,                            # Border opacity
    fillOpacity = 0.9                       # Fill opacity
  ) %>%
  addScaleBar(position = "bottomleft") %>%  # Add scale bar
  addLegend(
    pal = color_pal,                        # Color palette
    values = ndvi,                          # Data values to map
    title = "Mean NDVI",                    # Legend title
    position = "bottomright"                # Legend position
  )

5. Extras {#extras}

Additional features of the geeLite package include a drive mode for efficiently handling large data requests, as well as command-line interface (CLI) support for automation and integration with job scheduling systems like cron.

5.1 Drive Mode {#drive}

To efficiently handle large data requests, drive mode exports data in parallel batches to Google Drive before importing it into your local SQLite database. Ensure that adequate Google Drive storage is available before using drive mode.

# Collect and store data using drive mode
run_geelite(path = path, mode = "drive")

5.2 Command-Line Interface (CLI) {#cli}

geeLite provides a CLI that allows you to run the main functions of the package directly from the command line. This is useful for automating workflows or integrating geeLite into larger systems.

```{bash cli-usage, eval = FALSE}

Setting the CLI files

Rscript /path/to/geeLite/cli/set_cli.R --path "path/to/db"

Change directory to where the database will be generated

cd "path/to/db"

Set up the configuration via CLI

Rscript cli/set_config.R --regions "SO YE" --source "list('MODIS/061/MOD13A2' = list('NDVI' = c('mean', 'min')))" --resol 3 --start "2020-01-01"

Collecting GEE data via CLI

Rscript cli/run_geelite.R

Modifying the configuration via CLI

Rscript cli/modify_config.R --keys "list(c('source', 'MODIS/061/MOD13A2', 'NDVI'), c('source', 'MODIS/061/MOD13A2', 'EVI'))" --new_values "list(c('mean', 'min', 'max'), c('mean', 'sd'))"

Updating the database via CLI

Rscript cli/run_geelite.R

### 5.3 Automating Data Collection with Cron {#cron}

You can automate database updates using the Linux `cron` job scheduler. Here’s an example cron job that updates the database monthly:

```{bash cron-job, eval = FALSE}
# Open the cron jobs file
crontab -e

# Add the following line to schedule monthly updates (runs on the 1st of every month)
0 0 1 * * Rscript /path/to/db/cli/run_geelite.R


Try the geeLite package in your browser

Any scripts or data that you put into this service are public.

geeLite documentation built on Aug. 9, 2025, 1:08 a.m.