cmgo: Deriving basic channel metrics from bank and long-profile...
In AntoniusGolly/cmgo: Derive principle Channel metrics from bank points

Sections: Description; Getting started; General information on the global data object; Parameters; Work flow; Run cmgo; Time series analyses; Technical fails and how to prevent them

Principal channel metrics, such as channel width or gradient, carry inherent information that can be exploited for geomorphic research. For example, a snapshot of the current local channel geometry can provide an integrated picture of the processes leading to its formation, if examined in a statistically sound manner. Repeated surveys, such as time series of channel gradients, can reveal local erosional characteristics that sharpen our understanding of the underlying processes and motivate further research. However, these geometrical metrics are not directly available. Typically, the measurable quantities are limited to the positions of features, such as the channel banks (or water surface) or the water flow path (thalweg), in two- or three-dimensional coordinates. This package derives principal channel metrics, such as channel width and slope, with a scale-free approach. It does so by first generating a reference line in the middle of the channel (the centerline), from which the channel width is then calculated. If multiple surveys are provided, it also allows the evolution of channel metrics over time to be analyzed. Furthermore, secondary spatial information (for example the occurrence of knickpoints or the abundance of certain species) can be projected onto the reference line, allowing different variables to be correlated spatially.

To start a new project you will need to go through the full stack of main functions of the package cmgo. To do this, read the instructions under Run cmgo. However, if you just want a quick demo run, type CM.run() into the console and execute it.

All the data and parameters used in cmgo are stored in one variable of type list: the global data object, in the following examples named cmgo.obj. Its structure is:

cmgo.obj = list(
  data = list(     # the data set(s), different surveys of the channel
    set1 = list(), # survey 1
    set2 = list()  # survey 2
    # ...
  ),
  par  = list() # all plotting and model parameters
)

The global data object then has to be passed to and is returned from all main functions of the package as in

cmgo.obj = CM.generatePolygon(cmgo.obj)
cmgo.obj = CM.calculateCenterline(cmgo.obj)
cmgo.obj = CM.processCenterline(cmgo.obj)
CM.writeData(cmgo.obj)
CM.plotPlanView(cmgo.obj)

The global data object is initialized with cmgo.obj = CM.ini() where CM.ini() will either create the object from input files, from a previously saved user workspace or from a demo data set. See the documentation of CM.ini() for detailed information on how to create the object.

The parameters are stored in the global data object under the sub-list par, i.e. cmgo.obj$par. They can be accessed or edited directly at any time, e.g.

cmgo.obj$par$plot.planview = TRUE

Alternatively, users can load custom parameters from files previously created (e.g. to save project settings or reproduce certain plots). Parameter files are loaded with

cmgo.obj$par = CM.par("custom_par_list.r")

See the documentation of CM.par() for detailed information on how to load parameters.

The program can be divided into three main parts, which you will go through when you start a project: 1. initialization (loading data and parameters), 2. data processing (calculating channel metrics) and 3. review of results (plotting or writing results to file). The initialization covers the loading of the parameters in CM.par() and the loading of the data in CM.ini(). See their documentation for details. The work flow of the data processing is shown in the plan view plots below.

Figure 1: A visualization of the work flow of the package, a) data input, b) polygon generation, c-e) centerline generation, f) transect generation, g) channel width calculation.

Channel bank points (Fig. 1a) are the required input data for the package. The algorithm first creates a polygon from these points (Fig. 1b), linearly interpolating between them to increase their spatial resolution. The maximum spacing between the interpolated points is defined by the parameter bank.interpolate.max.dist. From these points Voronoi polygons are calculated (Fig. 1c). The Voronoi polygon around a point denotes the area within which every location is closer to that point than to any other point. In Fig. 1c a centerline can already be seen emerging in the middle of the channel polygon. Fig. 1d shows the segments that the algorithm retains as the centerline. These centerline segments are connected into one consistent line and smoothed (Fig. 1e). The degree of smoothing can be adjusted through the parameter centerline.smoothing.width (defaults to the same value as bank.interpolate.max.dist).

This centerline is the reference line of the river, for which length, local width and slope are calculated next. Note that the length of the centerline depends on the smoothing in Fig. 1e. The pros and cons of the smoothing are explained in the documentation of the function CM.calculateCenterline().

To derive the local channel width, transects are calculated perpendicular to portions of the centerline (Fig. 1f). Each transect is a line perpendicular to a group of centerline points, where the size of that group is defined by the parameter transects.span. See CM.processCenterline() for detailed information on how the transects are generated. In the final step the intersections of the transects with the banks are calculated (Fig. 1g). The distance from the centerline point to the bank is stored separately for the left and the right bank. When a transect crosses a bank multiple times, the minimum distance is taken.
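The linear interpolation of bank points in step b can be illustrated with a small base-R sketch. It is not the code used by cmgo: the helper interpolate_max_dist() and its argument max.dist are hypothetical names, and max.dist only plays the role that bank.interpolate.max.dist has in the package.

```r
# Hypothetical helper (not part of cmgo): add evenly spaced points on every
# segment of a polyline so that neighbouring points are at most max.dist apart.
interpolate_max_dist = function(x, y, max.dist) {
  out.x = x[1]
  out.y = y[1]
  for (i in seq_len(length(x) - 1)) {
    seg.len = sqrt((x[i + 1] - x[i])^2 + (y[i + 1] - y[i])^2)
    n.parts = max(1, ceiling(seg.len / max.dist))  # number of sub-segments
    t       = seq_len(n.parts) / n.parts           # fractions along the segment, last one is 1
    out.x   = c(out.x, x[i] + t * (x[i + 1] - x[i]))
    out.y   = c(out.y, y[i] + t * (y[i + 1] - y[i]))
  }
  data.frame(x = out.x, y = out.y)
}

# made-up bank line: 10 units east, then 5 units north
bank    = interpolate_max_dist(x = c(0, 10, 10), y = c(0, 0, 5), max.dist = 2.5)
spacing = sqrt(diff(bank$x)^2 + diff(bank$y)^2)    # distance between neighbours
print(bank)
```

The original points are kept and new points are only added between them, so the shape of the bank line does not change. A smaller max.dist yields more points and therefore more, smaller Voronoi polygons in the next step.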

The described algorithm is hosted in the following functions:

load data points in CM.ini(), step a
generate a polygon from bank points in CM.generatePolygon(), step b
calculate centerline from the polygon in CM.calculateCenterline(), steps c-e
process the centerline in CM.processCenterline(), steps f-g

The main functions of cmgo (described in Work flow) should be executed in this order:

cmgo.obj = CM.ini(cmgo.obj, par="par/my_parameters.r")  # read data, optional: path to a parameter file, alternatively leave empty to use defaults
cmgo.obj = CM.generatePolygon(cmgo.obj)       # generate polygon
cmgo.obj = CM.calculateCenterline(cmgo.obj)   # get centerline (calculate or load)
cmgo.obj = CM.processCenterline(cmgo.obj)     # process centerline (calculate width)
CM.plotPlanView(cmgo.obj)          # plot results
CM.plotMetrics(cmgo.obj)           # plot channel width and bank retreat
CM.writeData(cmgo.obj)             # data to workspace and export data to csv-files (see par)

If the polygon generated by CM.generatePolygon() has more than 10,000 vertices, the execution time can be long. Thus CM.calculateCenterline(), like the other main CM functions, caches its data. If you call the function when the resulting data already exist, the data will not be processed again; you have to explicitly force their regeneration. However, when you change a parameter that affects the generation of the polygon (see CM.generatePolygon()), the program detects this change and recalculates the centerline without being forced.
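The caching behaviour described above follows a common pattern: store the result together with the parameters it was computed from, and recompute only if the result is missing, the parameters have changed, or the user forces it. The following base-R sketch shows that pattern only. The function cached_step(), the cache sub-list and the force argument are hypothetical names for illustration and do not reflect how cmgo implements or exposes its cache.

```r
# Hypothetical caching pattern (illustrative names, not cmgo internals)
cached_step = function(obj, step, par.names, compute, force = FALSE) {
  par.now   = obj$par[par.names]                 # parameters this step depends on
  cache     = obj$cache[[step]]                  # NULL if nothing is cached yet
  unchanged = !is.null(cache) && identical(cache$par, par.now)
  if (force || !unchanged) {
    obj$cache[[step]] = list(par = par.now, result = compute(obj))
  }
  obj
}

n.calls = 0                                      # counts how often we really compute
compute = function(obj) { n.calls <<- n.calls + 1; obj$par$max.dist * 2 }

obj = list(par = list(max.dist = 5), cache = list())
obj = cached_step(obj, "centerline", "max.dist", compute)  # computed
obj = cached_step(obj, "centerline", "max.dist", compute)  # cached result reused
obj$par$max.dist = 2
obj = cached_step(obj, "centerline", "max.dist", compute)  # parameter changed, recomputed
obj = cached_step(obj, "centerline", "max.dist", compute, force = TRUE)  # forced
print(n.calls)
```

Comparing the stored parameters with identical() is what makes the parameter change detectable without the user having to force anything.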

The package cmgo can handle time series of channel geometries and offers the opportunity to compare them. To do this, simply put an input file for each data set in the input directory (see CM.ini()). The function will create a data object for each file in the global data object under cmgo.obj$data$set1, cmgo.obj$data$set2, cmgo.obj$data$set3, etc. All functions will iterate over the data sets automatically.

If you want to address a data set via a variable, use cmgo.obj$data[[set]], where set is a string naming the data set, e.g. "set1". The order of the data sets is determined by the filenames, so make sure to name the files accordingly, e.g. "channelsurvey_a.csv", "channelsurvey_b.csv". The mapping of the filenames to data sets will be printed to the console. You can also view the file name belonging to a data set with print(cmgo.obj$data[[set]]$filename).
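As a minimal sketch of this naming scheme, the following base-R code builds a mock global data object by hand and iterates over its data sets. The file names are made up, and the alphabetical sort is an assumption about how "determined by the filenames" is resolved. In a real project the object comes from CM.ini().

```r
# Mock of the documented structure; file names are made up for illustration
files = c("channelsurvey_b.csv", "channelsurvey_a.csv")
files = sort(files)            # assumption: alphabetical order decides set1, set2, ...

cmgo.obj = list(data = list(), par = list())
for (i in seq_along(files)) {
  set = paste0("set", i)       # "set1", "set2", ...
  cmgo.obj$data[[set]] = list(filename = files[i])
}

# address each data set through a variable holding its name
for (set in names(cmgo.obj$data)) {
  print(paste(set, "<-", cmgo.obj$data[[set]]$filename))
}
```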

Reference centerline
The channel metrics are calculated based on a centerline. Normally, a river plan geometry has one centerline, and this is the basis for the metrics. However, when there are multiple surveys (a time series), two options exist. In the first, metrics are calculated for each channel geometry individually. This gives the most accurate representation of the channel metrics for that channel observation; channel width, for example, is measured most accurately. However, the different observations are then difficult to compare, since the basis for the calculations (the centerlines) differs. For comparing time series there is therefore a second option, in which one reference centerline is used for all metrics calculations. To do this set:

cmgo.obj$par$centerline.use.reference = TRUE
cmgo.obj$par$centerline.reference = "set1"

Now, all metrics for the different bank surveys will be calculated based on the centerline of the data set "set1". Use this option only if your bank surveys differ only slightly. Otherwise, the calculated channel metrics might not be representative (see Fig. 10).

Figure 10: For channel geometries that differ drastically, the usage of a reference centerline is not advised. The centerline of a data set (blue line) is not useful for calculating the metrics of the dashed channel geometry.
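The effect of the two reference parameters can be pictured as a simple selection rule: with the reference switched on, every data set takes its centerline from the reference set, otherwise from itself. The helper centerline_source() below is hypothetical and only returns the name of the data set that would supply the centerline. It is a sketch of the rule, not of the package's internal logic.

```r
# Hypothetical helper: which data set supplies the centerline for `set`?
centerline_source = function(cmgo.obj, set) {
  par = cmgo.obj$par
  if (isTRUE(par$centerline.use.reference)) par$centerline.reference else set
}

# mock global data object with two (empty) surveys
cmgo.obj = list(
  data = list(set1 = list(), set2 = list()),
  par  = list(centerline.use.reference = TRUE, centerline.reference = "set1")
)

sets     = names(cmgo.obj$data)
with.ref = sapply(sets, function(set) centerline_source(cmgo.obj, set))

cmgo.obj$par$centerline.use.reference = FALSE
without.ref = sapply(sets, function(set) centerline_source(cmgo.obj, set))

print(with.ref)      # every set uses the reference set
print(without.ref)   # every set uses its own centerline
```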

All parameters are stored in the global data object (see the section 'Global data object') under the sub-list $par. For example, if your global data object is named obj, the parameter $plot.planview is accessible via obj$par$plot.planview. The available parameters of the model are described in the documentation of the function CM.par(). Here we only describe a few cases in which editing the default parameters can be desirable.

There are certain geometrical cases in which the algorithm can fail with the default parametrization. Preventing this requires a careful parametrization of the model. The program will inform you during runtime if the generation of the centerline fails and will offer you ways to investigate the issue. The main cause of failure is a resolution of channel bank points (controlled via $bank.interpolate.max.dist) that is low relative to the channel width. The following image illustrates the problem:

Figure 2: A gap in the centerline occurs because the spacing of the bank points was too large.

As the figure shows, the centerline segments are too jagged and do not lie entirely within the bank polygon. Since the filter mechanism (steps c-d) first checks for segments within the polygon, this creates a gap in the centerline. The program cannot fill this gap automatically. Thus, if you experience problems with the calculation of the centerline, consider increasing the spatial resolution of the bank points. The following example illustrates how this fixes the problem.

Figure 3: The same location of the channel with two different bank point spacings.

Another problem can arise from an unsuitable setting of the span used for calculating the transects (step f). The transects are perpendicular to a line that goes through the outer points of a group of n centerline points. By default n equals 3 (parameter $transects.span). The following plan view plot illustrates how an unsuitable span can lead to a misinterpretation of the channel width:

Figure 4: Left: the transects (perpendiculars to the centerline) do not intersect with banks properly, thus channel width is overrepresented. Right: an increased transect span fixes the problem and channel width is now identified correctly.

It can be seen that one of the red transects does not touch the left bank of the channel, thus leading to an overestimated channel width at this location. To prevent this, you can increase the span of the transect calculation.
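Why a larger span helps can be seen in a small geometric sketch. The helper transect_normal() below is hypothetical and is not the cmgo implementation. It assumes that the group of points is centered on the centerline point in question and returns the unit vector perpendicular to the chord through the outer points of that group.

```r
# Hypothetical helper (not cmgo code): unit direction of the transect at
# centerline point i, perpendicular to the chord through the outer points
# of a group of `span` neighbouring points centred on i.
transect_normal = function(cl.x, cl.y, i, span = 3) {
  half = (span - 1) %/% 2
  a  = max(1, i - half)                  # first point of the group
  b  = min(length(cl.x), i + half)       # last point of the group
  dx = cl.x[b] - cl.x[a]
  dy = cl.y[b] - cl.y[a]
  c(-dy, dx) / sqrt(dx^2 + dy^2)         # chord direction rotated by 90 degrees
}

# made-up centerline: heads east overall but wiggles on a small scale
cl.x = 0:8
cl.y = c(0, 0.5, 0, -0.5, 0, 0.5, 0, -0.5, 0)

n3 = transect_normal(cl.x, cl.y, i = 5, span = 3)  # tilted by the local wiggle
n5 = transect_normal(cl.x, cl.y, i = 5, span = 5)  # points straight across the channel
print(rbind(span3 = n3, span5 = n5))
```

With a span of 3 the transect follows the local wiggle of the centerline and is tilted against the overall channel direction, so it can miss a bank or hit it at an oblique angle. With a span of 5 the wiggle averages out and the transect crosses the channel at a right angle.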

AntoniusGolly/cmgo documentation built on Sept. 24, 2021, 1:33 a.m.