The CAMELS data set (Catchment Attributes and Meteorology for Large-sample Studies) was produced by NCAR and close collaborators and consists of two data sets: hydrometeorological time series introduced in Newman et al. (2015) and the catchment attributes introduced in Addor et al. (2017).

The LSTM model is a novel way to apply machine learning principles to hydrologic prediction. The attributes presented in Addor et al. (2017) have proven useful for LSTM modeling. As part of the NextGen framework, we want to be able to test the performance of LSTM models for various catchments.

This means we need a way to quickly produce the needed catchment attributes for a given hydrofabric, which provides an ideal use case for geogrids and zonal. Here we document the attributes and the way they are processed for a NextGen hydrofabric covering HUC-01.

This process is being added to the hydroresolve workflow for creating and releasing refactored and aggregated catchment networks that conform to the HY_Features standard.

Define Hydrofabric


Rain

Mean precipitation

Mean daily precipitation (mm/day)

We use daily GridMet files from 2018-2020.


Snowfall Fraction

Fraction of precipitation falling as snow

We use the daily GridMet files for tmin_* and tmax_* to create tavg_*. We use a temperature threshold following Keith Jennings' paper and the data found here. With this temperature threshold, we partition all PPT values into daily snow/rain. Total snow and precipitation are computed for each basin, and the ratio of snow to PPT is computed.
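The partitioning step can be sketched as follows. This is a minimal illustration for a single basin, not the package's implementation: it assumes temperatures in degrees Celsius, takes the daily average as the simple mean of the minimum and maximum, and uses a placeholder threshold of 1.0 (`threshold_c` is a hypothetical parameter, not the value from the cited paper).

```python
def snow_fraction(tmin, tmax, ppt, threshold_c=1.0):
    """Fraction of total precipitation that falls on days classified as snow.

    threshold_c is a placeholder value chosen for illustration only.
    """
    snow_total = 0.0
    ppt_total = 0.0
    for lo, hi, p in zip(tmin, tmax, ppt):
        tavg = (lo + hi) / 2.0          # assumed: simple mean of tmin and tmax
        if tavg <= threshold_c:         # day counted as snow
            snow_total += p
        ppt_total += p
    return snow_total / ppt_total if ppt_total > 0 else float("nan")


# Four days of toy data for one basin
frac = snow_fraction(
    tmin=[-5.0, -2.0, 3.0, 8.0],
    tmax=[1.0, 2.0, 9.0, 15.0],
    ppt=[4.0, 2.0, 6.0, 0.0],
)
```

In this toy case the first two days fall at or below the threshold, so 6 of the 12 mm of precipitation are counted as snow.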


Seasonality

Seasonality and timing of precipitation, estimated using sine curves to represent the annual temperature and precipitation cycles. Positive [negative] values indicate that precipitation peaks in summer [winter]; values close to 0 indicate uniform precipitation (eq 14).

NOT SURE YET.

High Frequency

Frequency of high precipitation days (≥5 times mean daily precipitation) (days/year)

Basin-wide mean PPT is multiplied by 5 and used to classify each day as "intense" (TRUE) or not (FALSE). The sum of "intense" days is computed for each basin and divided by the number of years processed.
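A small sketch of this classification for one basin, assuming the daily series is a plain list of basin-mean values. The function name and the `factor` argument are illustrative, not part of geogrids or zonal.

```python
def high_precip_frequency(daily_ppt, n_years, factor=5.0):
    """Days per year with precipitation >= factor * mean daily precipitation."""
    mean_ppt = sum(daily_ppt) / len(daily_ppt)
    intense = [p >= factor * mean_ppt for p in daily_ppt]  # TRUE/FALSE per day
    return sum(intense) / n_years


# Toy record treated as spanning two "years"; the mean is 2 mm/day,
# so only days with at least 10 mm count as intense.
high_freq = high_precip_frequency([0, 0, 12, 0, 0, 0, 0, 0, 8, 0], n_years=2)
```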


Low Frequency

frequency of dry days (<1 mm/day) (days/year)

Daily PPT is classified as "low" (TRUE) or not (FALSE) based on whether it is less than 1 mm. The sum of "low" days is computed for each basin and divided by the number of years processed.


High Duration

average duration of high precipitation events (number of consecutive days ≥5 times mean daily precipitation) (days)

The average duration of intense rain is computed by transposing the intense PPT TRUE/FALSE matrix and computing a cumulative sum for each catchment, resetting the count with each FALSE. Only the max value in each intense period is kept and the rest are set to NA. The mean for each catchment is computed.
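The reset-on-FALSE cumulative sum amounts to measuring the length of each run of consecutive TRUE values. A minimal sketch for one catchment follows; it uses an explicit counter rather than the matrix operations described above, and the function name is hypothetical.

```python
def mean_run_length(flags):
    """Mean length of consecutive runs of True values in a daily flag series."""
    runs = []
    count = 0
    for flag in flags:
        if flag:
            count += 1            # running count within the current event
        elif count:
            runs.append(count)    # event ended: keep only its final (max) count
            count = 0
    if count:                     # series ended while an event was in progress
        runs.append(count)
    return sum(runs) / len(runs) if runs else float("nan")


# Runs of length 2, 1 and 3
duration = mean_run_length(
    [False, True, True, False, True, False, False, True, True, True]
)
```

The same function applies to the dry-day flags, since only the TRUE/FALSE series changes.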


Low duration

average duration of dry periods (number of consecutive days <1 mm/day) (days)

The average duration of low rain is computed by transposing the low PPT TRUE/FALSE matrix and computing a cumulative sum for each catchment, resetting the count with each FALSE. Only the max value in each low period is kept and the rest are set to NA. The mean for each catchment is computed.


Energy

Mean PET

Mean daily PET (estimated by N15 using Priestley-Taylor formulation calibrated for each catchment)

Mean daily PET is computed from the PET GridMet files.


Aridity

Aridity (PET/P), ratio of mean PET to mean precipitation. Daily GridMet files for pet_ and pr_ from 2018-2020 are used to compute annual mean PET and PPT.

An aridity index is computed as the mean annual PET divided by the mean annual PPT computed from the GridMet files.
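As a quick illustration of the ratio for one catchment, assuming the daily values are already grouped by year (the nested-list layout and function name are assumptions made for this sketch):

```python
def aridity_index(pet_by_year, ppt_by_year):
    """Mean annual PET divided by mean annual precipitation."""
    mean_annual_pet = sum(sum(year) for year in pet_by_year) / len(pet_by_year)
    mean_annual_ppt = sum(sum(year) for year in ppt_by_year) / len(ppt_by_year)
    return mean_annual_pet / mean_annual_ppt


# Two toy "years" of two days each: annual PET is 6 and 6, annual PPT is 3 and 5
aridity = aridity_index(
    pet_by_year=[[2.0, 4.0], [3.0, 3.0]],
    ppt_by_year=[[1.0, 2.0], [2.0, 3.0]],
)
```

Values above 1 indicate that potential evapotranspiration exceeds precipitation.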


Terrain

Mean elevation (meters)

Catchment mean elevation (meters)

The 30m National Elevation Dataset is used to compute mean elevation (meters).


Mean slope

Catchment mean slope (m/km)

The 30m National Elevation Dataset is used to derive a 30m slope grid that is used to compute a mean slope value for each catchment.


Soils

Geological permeability

Subsurface permeability (log10, m²) from GLHYMPS

Data from: Gleeson, T., Moosdorf, N., Hartmann, J., and van Beek, L.P.H. (2014), A glimpse beneath earth's surface: GLobal HYdrogeology MaPS (GLHYMPS) of permeability and porosity, Geophysical Research Letters, 41, 2014GL059856, doi:10.1002/2014gl059856.

Using the field logK_Ice_x100_INT from the supplied shapefile, the data is rasterized to the CONUS-Soil 1 km² grid.


Frac. carbonate sedimentary rock

Fraction of the catchment area characterized as “Carbonate sedimentary rocks” from GLiM

Data from: Hartmann, J., and Moosdorf, N. (2012), The new global lithological map database GLiM: A representation of rock properties at the Earth surface, Geochem. Geophys. Geosyst., 13, Q12004, doi:10.1029/2012GC004370.

Using the field xx from the supplied shapefile where polygons == 'sc', the data is rasterized to the CONUS-Soil 1 km² grid.


Clay fraction

clay fraction (of the soil material smaller than 2 mm, layers marked as organic material, water, bedrock and “other” were excluded) % (STATSGO)

Data from the CONUS Soil clay.bsq is built into a raster field and the percentage of clay is computed for the top 1 m of the soil column.


Silt Fraction

silt fraction (of the soil material smaller than 2 mm, layers marked as organic material, water, bedrock and “other” were excluded) % (STATSGO)

Data from the CONUS Soil silt.bsq is built into a raster field and the percentage of silt is computed for the top 1 m of the soil column.


Sand fraction

sand fraction (of the soil material smaller than 2 mm, layers marked as organic material, water, bedrock and “other” were excluded) % (STATSGO)

Data from the CONUS Soil sand.bsq is built into a raster field and the percentage of sand is computed for the top 1 m of the soil column.


Saturated hydraulic conductivity

saturated hydraulic conductivity (estimated using a multiple linear regression based on sand and clay fraction for the layers marked as USDA soil texture class and a default value [36 cm/hr] for layers marked as organic material, layers marked as water, bedrock and “other” were excluded) cm/hr Table 4

$-0.60 + (0.0126 \times \%\mathrm{sand}) - (0.0064 \times \%\mathrm{clay})$

Using the mean sand and clay percentages in the top 1m of soil, the above formula can be applied.
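A rough sketch of applying the regression to catchment-mean sand and clay percentages. The text gives only the right-hand side of the expression; treating its result as log10 of conductivity in inches per hour (as in the Cosby et al., 1984 regressions) and converting to cm/hr is an assumption of this example, and the function name is hypothetical.

```python
def ksat_cm_per_hr(sand_pct, clay_pct):
    """Saturated hydraulic conductivity from mean sand and clay percentages.

    Assumes the regression returns log10(Ks) with Ks in inches per hour.
    """
    log10_ks_in_hr = -0.60 + 0.0126 * sand_pct - 0.0064 * clay_pct
    return (10.0 ** log10_ks_in_hr) * 2.54   # inches/hr to cm/hr


# Catchment with 40 % sand and 20 % clay in the top 1 m
ksat = ksat_cm_per_hr(40.0, 20.0)
```

Under these assumptions, conductivity increases with sand content and decreases with clay content.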


Volumetric porosity

volumetric porosity (saturated volumetric water content estimated using a multiple linear regression based on sand and clay fraction for the layers marked as USDA soil texture class and a default value [0.9] for layers marked as organic material, layers marked as water, bedrock and “other” were excluded)

$50.5 - (0.142 \times \%\mathrm{sand}) - (0.037 \times \%\mathrm{clay})$

Using the mean sand and clay percentages in the top 1m of soil, the above formula can be applied.


Soil depth to bedrock

soil depth (maximum 1.5 m, layers marked as water and bedrock were excluded) (m)

Data from the CONUS Soil depth.bsq is built into a raster field to compute the catchment mean.


Max. water content

maximum water content (combination of porosity and soil_depth_statsgo, layers marked as water, bedrock and “other” were excluded)

The MWC is computed by multiplying the porosity and the soil depth.
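The product can be sketched as below, reusing the porosity regression given in the Volumetric porosity section. This example assumes the regression yields porosity in percent, so it is divided by 100 before being multiplied by soil depth in meters; both function names are illustrative.

```python
def porosity_fraction(sand_pct, clay_pct):
    """Volumetric porosity as a fraction (regression assumed to yield percent)."""
    return (50.5 - 0.142 * sand_pct - 0.037 * clay_pct) / 100.0


def max_water_content(sand_pct, clay_pct, soil_depth_m):
    """Maximum water content (m): porosity multiplied by soil depth."""
    return porosity_fraction(sand_pct, clay_pct) * soil_depth_m


# 40 % sand, 20 % clay, 1.2 m of soil
mwc = max_water_content(40.0, 20.0, 1.2)
```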


Land Cover / Veg

Fraction of Forest

forest fraction

The percent forest from the MODIS land cover is computed. MODIS land cover is represented on the 1 km Soil Grid, and forest is computed as the aggregate of Evergreen broadleaf forests (2), Deciduous needleleaf forests (3), Deciduous broadleaf forests (4), and Mixed forests (5).
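For a single catchment, this aggregation reduces to counting the cells whose class code is one of the listed forest classes. The sketch below assumes the catchment's cells are available as a flat list of integer class codes, with `None` marking missing cells; that layout and the function name are assumptions for illustration.

```python
# Class codes taken from the list above
FOREST_CLASSES = {2, 3, 4, 5}


def forest_fraction(cell_classes):
    """Share of valid land cover cells that belong to a forest class."""
    valid = [c for c in cell_classes if c is not None]
    if not valid:
        return float("nan")
    forest_cells = sum(1 for c in valid if c in FOREST_CLASSES)
    return forest_cells / len(valid)


# Seven valid cells, four of them forest
frac_forest = forest_fraction([2, 4, 5, 10, 12, None, 3, 16])
```

Multiplying the result by 100 gives the percent forest.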


Green Vegetative Fraction (GVF)

The Green Vegetative Fraction (GVF) is computed from NDVI. NDVI is taken from the MODIS Terra mission.

$GVF = \frac{NDVI_{i} - NDVI_{barren}}{ NDVI_{lc} - NDVI_{barren}}$

Where, $NDVI_{i}$ is a time varying NDVI observation (by pixel); $NDVI_{barren}$ is a global constant for barren land - here we use 0.05; and $NDVI_{lc}$ is a long term, landcover based constant for each pixel.
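The formula translates directly into a per-pixel function. This is a minimal sketch with a hypothetical function name; it applies no clipping to the result, since none is described here.

```python
def gvf(ndvi_i, ndvi_lc, ndvi_barren=0.05):
    """Green vegetation fraction for one pixel and one observation.

    ndvi_i: time-varying NDVI observation
    ndvi_lc: long-term, land cover based NDVI constant for the pixel
    ndvi_barren: global barren-land constant (0.05, as stated above)
    """
    return (ndvi_i - ndvi_barren) / (ndvi_lc - ndvi_barren)


# An observation halfway between the barren and land cover values
value = gvf(ndvi_i=0.45, ndvi_lc=0.85)
```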

We computed monthly GVF over the last 20 years. These values were reduced to month-based, pixel-wise averages, from which the maximum and the maximum-minimum difference were computed pixel-wise. From these, a catchment mean-maximum and mean-difference were computed.

maximum monthly mean of the green vegetation fraction (based on 12 monthly means)


difference between the maximum and minimum monthly mean of the green vegetation fraction (based on 12 monthly means)


Leaf Area Index (LAI) diff.

8-day MODIS LAI measurements were used to generate monthly rasters between 2000 and 2020. These were then used to compute month-based, pixel-wise averages, from which the maximum and the maximum-minimum difference were computed pixel-wise. From these, a catchment mean-maximum and mean-difference were computed.
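The reduction from a multi-year monthly series to the two catchment values can be sketched as follows. The example assumes each pixel is a list of (month, value) pairs and uses only two months to stay short; the data layout and function names are assumptions, not the package's raster workflow.

```python
def pixel_max_and_diff(series):
    """Max and max-minus-min of the month-based means for one pixel.

    series: list of (month, value) pairs spanning several years.
    """
    by_month = {}
    for month, value in series:
        by_month.setdefault(month, []).append(value)
    means = [sum(v) / len(v) for v in by_month.values()]  # one mean per month
    return max(means), max(means) - min(means)


def catchment_summary(pixels):
    """Catchment mean of the pixel-wise maxima and of the pixel-wise differences."""
    stats = [pixel_max_and_diff(s) for s in pixels]
    n = len(stats)
    return sum(s[0] for s in stats) / n, sum(s[1] for s in stats) / n


# Two pixels, two years, months 1 and 7 only
pixel_a = [(1, 1.0), (7, 4.0), (1, 2.0), (7, 6.0)]  # monthly means 1.5 and 5.0
pixel_b = [(1, 0.5), (7, 3.0), (1, 0.5), (7, 2.0)]  # monthly means 0.5 and 2.5
lai_max, lai_diff = catchment_summary([pixel_a, pixel_b])
```

Averaging by month before taking the maximum keeps a single unusual year from setting the pixel's peak value.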

maximum monthly mean of the leaf area index (based on 12 monthly means)

Will Use Monthly MODIS LAI.


difference between the maximum and minimum monthly mean of the leaf area index (based on 12 monthly means)




mikejohnson51/geogrids documentation built on June 16, 2022, 12:36 a.m.