Functions and Data Sources that Depend on SoilWeb

knitr::opts_chunk$set(
  echo = TRUE,
  eval = !as.logical(Sys.getenv("R_SOILDB_SKIP_LONG_EXAMPLES", unset = "TRUE")),
  message = FALSE,
  warning = FALSE,
  fig.height = 7,
  fig.width = 7,
  fig.retina = 2
)
# get source data for explanations
library(soilDB)
library(aqp)

x <- fetchOSD(c('miami', 'pierre', 'tristan', 'lucy', 'musick'), extended = TRUE)

Summary

The soilDB package relies on a number of data sources curated as part of the SoilWeb ecosystem of web applications, APIs, web coverage services (WCS), and internally-used lookup tables. These data are always updated shortly after the beginning of each fiscal year (October 1st) as part of the SSURGO refresh cycle. Within-cycle updates are typically performed quarterly.

Many of these curated data sources and their evolution through time have been documented in:

Note that this vignette as published on CRAN or from a local R package installation will not contain output from inline code sections. See the Articles tab within the soilDB package website for a more comprehensive version of this document.

Primary Data Sources

soilDB Functions

The following functions interact with SoilWeb APIs or Web Coverage Service (WCS) to request and download tabular or spatial data. Some functions return SoilProfileCollection objects, as defined in the aqp package.

Tabular data and/or SoilProfileCollection:

Spatial data:

WCS:

The fetchOSD() function, when specified with the extended = TRUE argument, will include the last updated date for curated sources.

# typical invocation
library(soilDB)
library(aqp)

x <- fetchOSD(c('lucy'), extended = TRUE)

# access metadata
x$soilweb.metadata

An abbreviated example of these metadata will look like this:

|product |last_update | |:-----------------------------|:-----------| |ISSR800 |2024-10-30 | |KSSL snapshot |2025-01-21 | |MLRA membership |2025-10-07 | |OSD fulltext |2026-01-02 | |OSD morphology |2026-01-02 | |SC database |2026-01-02 | |series climate summary |2025-10-08 | |series-ESID cross-tabulation |2025-10-07 | |SSURGO geomorphology |2025-10-06 | |SSURGO geomorphon |2026-01-11 | |SSURGO NCCPI Stats |2025-10-19 | |SSURGO parent material |2025-10-06 | |series extent |2025-10-04 | |taxa extent |2025-10-21 |

Details about these data sources are provided below.

Functions with Questionable Future

Some functions represent temporary solutions to data delivery problems that made sense in the past. fetchKSSL() relies on an older snapshot of the NCSS LDM (2020), fetchRaCA() relies on a pre-release of the current RaCA database, and SoilWeb_spatial_query() has been superseded by soilDB functions which interface with Soil Data Access (see SDA_query() and SDA_spatialQuery).

There are currently no replacements for full functionality provided by fetchKSSL() and fetchRaCA(). See fetchLDM() for a modern approach to downloading NCSS LDM snapshot data.

Related Web Applications

The OSDquery() function provides a convenient interface to the API behind the SoilWeb OSD Search applications:

The web-based version of this search is limited to 100 records but OSDquery() has no record limit.

The seriesExtent() function provides an interface to the data behind the SoilWeb Series Extent Explorer (SEE) application. The SEE website includes a short description of how the source data were created.

The taxaExtent() function provides an interface to the data behind the SoilWeb Soil Taxonomy Explorer (STE) application. the STE website includes a short description of how the source data were created.

Relevant Tutorials

SoilWeb Curated Data Sources

SC and OSD Derivatives

Competing (same family classification) soil series information are derived from the Soil Classification database. Geographically associated soils are derived from the OSD records. These data are available via fetchOSD().

Competing soil series.

# series names listed in "competing" have the family classification as "series"
head(x$competing)
knitr::kable(head(x$competing))

Geographically associated series. Series in the same region may have several geographically associated series in common, and can be modeled using directed graphs.

# series names listed in "gas" are geographically associated with "series"
head(x$geog_assoc_soils)
# series names listed in "gas" are geographically associated with "series"
knitr::kable(head(x$geog_assoc_soils))

SSURGO Derivatives

SSURGO components are often named for soil series. Soil series summaries derived from SSURGO are coordinated using a normalized form of component name and soil series:

See the Soil Survey Manual for more complete definitions of "map unit", "component", and "soil series".

The Querying Soil Series Data tutorial contains additional, relevant examples.

MLRA

Derived from the spatial intersection between MLRA and SSURGO polygons, with area computed on the ellipsoid from geographic coordinates. Membership values are area proportions by soil series.

MLRA "membership" for the LUCY soil series.

.mlra <- x$mlra[x$mlra$series == 'LUCY', ]
.mlra[order(.mlra$membership, decreasing = TRUE), ]
.mlra <- x$mlra[x$mlra$series == 'LUCY', ]
knitr::kable(.mlra[order(.mlra$membership, decreasing = TRUE), ], digits = 2, row.names = FALSE)

Soil series can be found in multiple MLRA, therefore MLRA membership can be modeled using directed graphs.

Geomorphic Summaries

Geomorphic summaries are computed from SSURGO component "geomorphology" tables. These represent a cross-tabulation of soil series name x geomorphic position in several landform and surface shape description systems. These systems are defined in the Field Book for Describing and Sampling Soils.

There are several associated functions in the sharpshootR package for visualizing these summaries (e.g. vizHillslopePosition()).

Note that the n column in each table is the number of component geomorphic data records associated with each soil series. It is possible for a single component to have multiple geomorphic positions defined.

# hillslope position
head(x$hillpos)
# geomorphic component: hills
head(x$geomcomp)
# geomorphic component: mountains
head(x$mtnpos)
# geomorphic component: terraces
head(x$terrace)
# geomorphic component: flats
head(x$flats)
# surface curvature across slope
head(x$shape_across)
# surface curvature down slope
head(x$shape_down)
knitr::kable(head(x$hillpos), digits = 2, caption = 'hillslope position')

knitr::kable(head(x$geomcomp), digits = 2, caption = 'geomorphic component: hills')

knitr::kable(head(x$mtnpos), digits = 2, caption = 'geomorphic component: mountains')

knitr::kable(head(x$terrace), digits = 2, caption = 'geomorphic component: terraces')

knitr::kable(head(x$flats), digits = 2, caption = 'geomorphic component: flats')

knitr::kable(head(x$shape_across), digits = 2, caption = 'surface curvature across-slope')

knitr::kable(head(x$shape_down), digits = 2, caption = 'surface curvature down-slope')

Siblings

SoilWeb defines the term "siblings" as those components or soil series that co-occur within map units. The siblings() function returns siblings for a single soil series, with a tabulation of how many times each sibling shares a common map unit.

Siblings of the PIERRE soil series, limited to just major components. The n column describes how many map units are shared between a sibling and the PIERRE series.

sib <- siblings('PIERRE', only.major = TRUE)
head(sib$sib)
sib <- siblings('PIERRE', only.major = TRUE)
knitr::kable(sib$sib)

Other

Percentiles of the National Commodity Crop Productivity Index are computed from SSURGO component records, by soil series name. These include both irrigated and non-irrigated versions of the NCCPI.

head(x$NCCPI)
knitr::kable(head(x$NCCPI), digits = 2)

Ecological classification membership are computed from map unit polygon area and component percentages.

head(x$ecoclassid)
knitr::kable(head(x$ecoclassid), digits = 2)

Parent Material Summaries

Parent material kind and origin, tabulated by soil series name. The n column is the number of component parent material records associated with a specific parent material kind or origin. The total column is the total number of component parent material records by soil series. The final column, P, is the associated proportion.

head(x$pmkind)
head(x$pmorigin)
knitr::kable(head(x$pmkind), digits = 2)
knitr::kable(head(x$pmorigin), digits = 2)

Series and Taxa Extent

Soil series extent vector data are available for all areas where SSURGO has been completed. Vector extents are described as polygons which generalize the underlying SSURGO map unit polygons (EPSG:4326). Soil series extent raster data are available as 800m grids (EPSG:5070) within CONUS only. Taxonomic classes and formative elements are available as 800m grids (EPSG:5070) within CONUS only. Grids are

Web Coverage Services

Web Coverage Services (WCS) provided by SoilWeb are still an experimental interface to snapshots of authoritative soil survey and derived data sources. The update cycle is slower than the typical SSURGO refresh. All WCS requests return data as compressed GeoTIFF (EPSG:5070).

Climate Data and Derivatives

These maps are derived from the daily, 800m resolution, PRISM data spanning 1981--2010.

Percentiles of each variable are computed by soil series, from a sampling of one point per SSURGO map unit polygon.

An example of annual and monthly climate percentiles.

head(x$climate.annual)

head(x$climate.monthly)
knitr::kable(head(x$climate.annual), digits = 0)

knitr::kable(head(x$climate.monthly), digits = 0)

Frost-Free Period

Number of days in the 50%, 80%, and 90% probability frost-free period, derived from daily minimum temperatures greater than 0 degrees C.

These maps are based on 50/80/90 percent probability estimates for the last spring frost and first fall frost (day of year). See the related algorithm documentation for details.

Values have been cross-checked with 300+ weather stations in CA.

Linear Regression Model

 ols(formula = ffd.50 ~ prism_ffd, data = z)
 

Model Likelihood
Ratio Test
Discrimination
Indexes
Obs 328 LR χ2 526.00 R2 0.799
σ 42.2446 d.f. 1 R2adj 0.798
d.f. 326 Pr(>χ2) 0.0000 g 95.935

Residuals

      Min       1Q   Median       3Q      Max 
 -278.344  -16.875    2.436   14.323  274.604 
 

β S.E. t Pr(>|t|)
Intercept 15.1397 5.3455 2.83 0.0049
prism_ffd 0.9407 0.0261 35.98 <0.0001

Frost-Free Days

Percentiles of frost-free days (FFD) at the 50% probability threshold.

Design Freeze Index

From NSSH Part 618.33 Frost Action, Potential:

Part 618, Subpart B, Exhibits, Section 618.85 is a map that shows the design freezing index values in the continental United States. The values are the number of degree days below 0 deg C for the coldest year in a period of 10 years . The values indicate duration and intensity of freezing temperatures. The 250 isoline is the approximate boundary below which frost action ceases to be a problem.

Methods:

Notes:

Growing Degree Days (C)

Mean (Celsius) growing degree days, derived from the 800m PRISM daily minimum/maximum temperature data over the interval of 1981--2010.

Calculation reference: http://agron-www.agron.iastate.edu/courses/Agron541/classes/541/lesson02b/2b.1.1.html

$$GDD_i = [ min(T_{max}, upper_{threshold}) + max(Tmin, lower_{threshold}) / 2 ] - T_{base}$$

$$GDD_i = max(GDD_i, 0)$$

Effective Precipitation

Annual sum of monthly (total) precipitation - monthly (estimated) evapotranspiration, averaged over the interval of 1981--2010. Potential evapotranspiration (PET) estimated via Thornthwaite's method of 1948. Input sources included:

Processing in GRASS GIS.

Fraction of Precipitation as Rain

This map contains estimates of the fraction of total (annual) precipitation as rain, derived from 800m daily PRISM Tmax and PPT grids (1981--2010). Calculations were performed using GRASS GIS, with methods and estimated parameters of the conditional snow probability function from Rajagopal and Harpold (2016).

Partition PPT into snow/rain:

$$rain = PPT - snow$$

$$snow = PPT * Pr(snow)$$

compute $Pr(snow)$ as a function of $Tmax$ using exponential identity for hyperbolic tangent function:

Evaluate conditional probability (fraction) of snow on a daily basis:

$$Pr(snow) = a * ( tanh(b * (Tmax - c) ) - d )$$

a:-0.5, b:0.21, c:0.5, d:1

$$tanh(x) = (1 - exp(-2x)) / (1 + exp(-2x))$$

$$Pr(snow) = -0.5 * ( (1 - exp(-2 * (0.21 * (Tmax - 0.5) ))) / (1 + exp(-2 * (0.21 * (Tmax - 0.5) ))) - 1 )$$

$$rain = PPT - (PPT * Pr(snow))$$

For each year($i$):

$$rain fraction_i = sum(rain_i) / sum(PPT_i)$$

Percentages have been converted to integers ranging from 0 to 100.

Rajagopal, S. and A.A. Harpold. 2016. Testing and Improving Temperature Thresholds for Snow and Rain Prediction in the Western United States. Journal of the American Water Resources Association, 52: 1142-1154.

DEM Derivatives

Geomorphons

A cross-tabulation of geomorphon by soil series name was computed from the current gSSURGO (30m) grid and a 30m grid of geomorphons. Currently, these data are only available within CONUS.

These maps were generated using the r.geomorphon GRASS GIS module, with the following parameters:

r.geomorphon --o dem=elev30_int forms=forms30 search=75 skip=5 flat=1.718

The source DEM was a 10m / 30m resolution compilation of USGS NED data, rounded to integers. The "flat" threshold (1.718 deg) is based on a 3% slope break.

Jasiewicz, J., Stepinski, T., 2013, Geomorphons - a pattern recognition approach to classification and mapping of landforms, Geomorphology, vol. 182, 147-156. (https://doi.org/10.1016/j.geomorph.2012.11.005)

Proportions are weighted by total soil series area (within CONUS) as informed by the component name and associated component percentage.

head(x$geomorphons)
knitr::kable(head(x$geomorphons), digits = 2, caption = 'geomorphon summary')


Try the soilDB package in your browser

Any scripts or data that you put into this service are public.

soilDB documentation built on April 3, 2026, 9:07 a.m.