R/subbasin.R

Defines functions calc_subbas

Documented in calc_subbas

# lumpR/subbasin.R
# Copyright (C) 2014-2018 Tobias Pilz
# 
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# 
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
# 
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.


#' Calculation of hydrological subbasins using GRASS GIS
#' 
#' Takes DEM from a GRASS location and a file of drainage locations to calculate 
#' hydrological subbasins for each drainage point using GRASS functions.
#' 
#' @param dem Digital Elevation Model in GRASS location used for delineation of
#'      subbasins. Should be larger than the expected catchment, otherwise artefacts
#'      close to boundaries may occur.
#' @param drain_points \code{SpatialPoints} object containing drainage locations in
#'      units of and compliant with the projection of your respective GRASS location.
#'      Can, e.g., be imported from GRASS with
#'      drain_points = read_VECT(vname = "subbas_outlets", layer=1).
#'      At least the watershed drainage point has to be given. If it contains column
#'      'subbas_id' in the attribute table, this will be used for numbering the subbasins.
#'      IDs of additionally delineated subbasins (if \code{thresh_sub != NULL}) will be appended.
#' @param river River vector map in GRASS location if available. If set to \code{NULL}
#'      (default value) river network will be calculated by GRASS function \emph{r.watershed}.
#' @param flowaccum (optional) Existing raster map of flow accumulation in GRASS location. Must correspond to map generated by \code{r.watershed}.
#'      If \code{drainage_dir} or \code{flowaccum} is set to \code{NULL} (default value), both will be calculated by GRASS function \emph{r.watershed}.
#' @param drainage_dir (optional) Existing raster map of drainage direction in GRASS location. Must correspond to map generated by \code{r.watershed}.
#'      If \code{flowaccum} or \code{flowaccum} is set to \code{NULL} (default value), both will be calculated by GRASS function \emph{r.watershed}.
#' @param disk_swap (optional) Only needed if memory requirements exceed available RAM (large DEMs): If set to \code{TRUE}, \code{r.watershed} uses the "-m"-flag (slow)
#' @param basin_out Output: Name of subbasins raster map exported into GRASS location.
#' @param stream Output: Prefix of calculated stream segments vector (<stream>_vect) and
#'      raster (<stream>_rast) maps exported into GRASS location. Only generated if
#'      \code{river} is not set. Default: \code{NULL}.
#' @param points_processed Output: Prefix of point vector files exported to GRASS location.
#'      \code{<points_processed>_snapped_t} are given \code{drain_points} snapped to river.
#'      \code{<points_processed>_calc_t} are internally calculated drain points (only if
#'      parameter \code{thresh_sub} is not \code{NULL}, see below).
#' @param outlet Integer (row number) defining the catchment outlet in \code{drain_points}.
#'      If there are \code{drain_points} outside the watershed delineated for the
#'      outlet point these will be omitted. If \code{NULL} (default) and \code{drain_points}
#'      contains only one point this will be assumed as catchment outlet.
#' @param thresh_stream Integer defining threshold for stream calculation. Raster
#'      cells in accumulation map with values greater than thresh_stream are
#'      considered as streams. Needs to be set only if \code{river} is not set.
#'      Default: \code{NULL}.
#' @param thresh_sub Integer defining threshold for subbasin calculation. Parameter for
#'      GRASS function \emph{r.watershed} defining the minimum size of an exterior
#'      watershed basin in number of grid cells. If \code{NULL} (default) only the
#'      given drainage points are used for subbasin delineation.
#' @param snap_dist Integer defining maximum distance for snapping of \code{drain_points}
#'      to stream segments in units of your GRASS location.
#' @param rm_spurious \code{numeric}. If greater zero, spurious subbasins will
#'      be removed, i.e. those subbasins being smaller than \code{rm_spurious} times \code{thresh_sub}.
#'      Spurious subbasins are 'interior' watersheds created by GRASS function
#'      \emph{r.watershed} around stream segments below multiple tributaries. If they
#'      are very small they induce unnecessary computational burden when used within a
#'      hydrological model. If removed, these areas will be related to the next upstream
#'      subbasins, respectively. If \code{thresh_sub = NULL} (default) \code{rm_spurious}
#'      will be automatically set to \code{0}. Default: 0.01.
#' @param keep_temp \code{logical}. Set to \code{TRUE} if temporary files shall be kept
#'      in the GRASS location, e.g. for debugging or further analyses. Default: \code{FALSE}.
#' @param overwrite \code{logical}. Shall output of previous calls of this function be
#'      deleted? If \code{FALSE} the function returns an error if output already exists.
#'      Default: \code{FALSE}.
#' @param silent \code{logical}. Shall the function be silent (also suppressing warnings
#'      of internally used GRASS functions)? Default: \code{FALSE}.
#'      
#' @return Function returns nothing. Various output is generated in the GRASS-location:
#'  \itemize{
#'    \item{stream segments}{If \code{river} is not supplied,  vector (<stream>_vect) and raster (<stream>_rast) will be generated.} 
#'    \item{Subbasin map}{raster map \code{basin_out}} 
#'  }  
#'  If \code{keep_temp=TRUE}, the following temporary maps are preserved and can be used for tracing errors:
#'  \itemize{
#'    \item{\code{accum_t}}{If \code{flowaccum} is not supplied, contains map of flow accumulation.} 
#'    \item{\code{drain_t}}{If \code{flowaccum} is not supplied, contains map of darinage direction.} 
#'    \item{\code{<points_processed>_t}}{Raw subbasin outlet points.} 
#'    \item{\code{<points_processed>_centered_t}}{Subbasin outlet points centered at cell centers.} 
#'    \item{\code{<points_processed>_shifted_t}}{Subbasin outlet points shifted slightly out of centers (for preventing pathological cases while snapping).} 
#'    \item{\code{<points_processed>_snapped_t}}{Raw subbasin outlet points snapped to closest river.} 
#'    \item{\code{<points_processed>_calc_t}}{internally calculated drain points (only if parameter thresh_sub is not NULL).} 
#'    \item{\code{<points_processed>_all_t}}{combined drain points as raster (easier to identify double drain points sharing one raster cell).} 
#'  }  
#'      
#' @details
#'  The function constructs subbasins based on \code{param drain_points} (if given) and \code{param thresh_sub}
#'  (if given). Both are merged, preserving the former outlet points, if present.
     
#' @note \bold{Prepare GRASS} location and necessary raster files in advance and start
#'      GRASS session in R using \code{\link[rgrass7]{initGRASS}}. Location
#'      should not contain any maps ending on *_t as these will be removed by
#'      calling the function to remove temporary maps.
#'      
#'      You should select your DEM \bold{sufficiently large}. Otherwise, the resulting
#'      catchment might be truncated or boundaries influence the calculation
#'      of stream segments.
#'      
#'      \bold{Check the results} (subbasins and snapped points). In case points have been snapped
#'      to the wrong stream segment, adjust point locations manually in GRASS and re-run
#'      the function with the updated locations (use \code{\link[rgrass7]{read_VECT}}
#'      to import the updated drainage points).
#'      
#'      Generated raster and vector stream \bold{maps might slightly deviate} from each other
#'      as the raster map is thinned (GRASS function \emph{r.thin}) prior to conversion
#'      to a vector map to ensure strictly linear features.
#'      
#'      If you run into \bold{memory issues}, consider argument \code{disk_swap} (see also 
#'      \link[GRASS homepage]{https://grass.osgeo.org/grass74/manuals/r.watershed.html#in-memory-mode-and-disk-swap-mode})
#'      and see discussion on \link[lumpR's github page]{https://github.com/tpilz/lumpR/issues/16}.
#'      
#' @references 
#'      lumpR package introduction with literature study and sensitivity analysis:\cr
#'      Pilz, T.; Francke, T.; Bronstert, A. (2017): lumpR 2.0.0: an R package facilitating
#'      landscape discretisation for hillslope-based hydrological models.
#'      \emph{Geosci. Model Dev.}, 10, 3001-3023, doi: 10.5194/gmd-10-3001-2017
#' 
#' @author Tobias Pilz \email{tpilz@@uni-potsdam.de}

calc_subbas <- function(
  ### INPUT ###
  dem=NULL,
  drain_points=NULL,
  river=NULL,
  flowaccum=NULL,
  drainage_dir=NULL,
  disk_swap=FALSE,
  
  ### OUTPUT ###
  basin_out=NULL,
  stream=NULL,
  points_processed=NULL,
  
  ### PARAMETER ###
  outlet=NULL,
  thresh_stream=NULL,
  thresh_sub=NULL,
  snap_dist=NULL,
  rm_spurious=0.01,
  keep_temp=F,
  overwrite=F,
  silent=F
) {
  
  ### PREPROCESSING ###----------------------------------------------------------
  
  if(!silent) message("%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%")
  if(!silent) message("% START calc_subbas()")
  if(!silent) message("%")
  if(!silent) message("% Initialise function...")
  
  # CHECKS #
  tryCatch(gmeta(), error = function(e) stop("Cannot execute GRASS commands. Maybe you forgot to run initGRASS()?"))
  if(is.null(dem) | dem=="")
    stop("The name of a DEM within the mapset of your initialised GRASS session has to be given!")
  if(is.null(drain_points) | !grepl("SpatialPoints", class(drain_points)))
    stop("drain_points has to be given as SpatialPoints* object with at least one catchment outlet point!")
  if(is.null(river) & (is.null(thresh_stream | is.null(stream))))
    stop("If no river object is given, stream as name prefix for the generated stream maps and the parameter thresh_stream have to be specified for internal calculation of the river network!")
  if(!is.null(river))
    check_vector(river, "river")
  if (identical(flowaccum,"")) flowaccum=NULL
  if (identical(drainage_dir,"")) drainage_dir=NULL
  if(xor(!is.null(flowaccum), !is.null(drainage_dir))) 
    stop("If using existing rasters flowaccum and drainage_dir, both have to be specified (or neither).")
  if(is.null(basin_out)  | basin_out=="")
    stop("You have to specify basin_out as name for the subbasin map to be generated!")
  if(is.null(points_processed))
    stop("You have to specify points_processed!")
  if(!is.numeric(snap_dist))
    stop("You have to specify snap_dist as a number!")
  if(is.null(outlet)) {
    if(nrow(drain_points@coords) > 1)
      stop("You have to give 'outlet' if the given number of drain_points is greater than one!")
    outlet <- 1
  }
  if(is.null(thresh_sub))
    rm_spurious <- 0
  if(!is.numeric(rm_spurious))
    stop("Argument 'rm_spurious' has to be numeric (behaviour changed in version 2.0.4)!")
  
  
  # CLEAN UP AND RUNTIME OPTIONS #  
  cmd_out <- execGRASS("g.version", intern=TRUE)
  if (cmd_out=="")
    stop("Couldn't connect to GRASS-session. Try removing any open sinks by calling 'sink()' repeatedly. Or restart R.")
  
  # suppress annoying GRASS outputs 
  tmp_file <- file(tempfile(), open="wt")
  sink(tmp_file, type="output")
  
  
  # also supress warnings in silent mode
  if(silent){
    tmp_file2 <- file(tempfile(), open="wt")
    sink(tmp_file2, type="message")
    oldw <- getOption("warn")
    options(warn = -1)
  }
  
  
  ### CALCULATIONS ###-----------------------------------------------------------
  tryCatch({
    
    # remove mask if there is any (and ignore error in case there is no mask)
    tryCatch(suppressWarnings(execGRASS("r.mask", flags=c("r","quiet"))), error=function(e){})
    
    # remove output of previous function calls if overwrite=T
    if (overwrite) {
      remove_pattern=paste0("*_t,", basin_out, ",", points_processed, "_*")
      # keep rivermap if prespecified
      if (is.null(river))
        remove_pattern <- paste0(remove_pattern, paste0(",",stream,"_*")) #keep rivermap if prespecified
      cmd_out <- execGRASS("g.remove", type="raster,vector", pattern=remove_pattern, flags=c("f", "b"), intern=T)
    } else {
      # remove temporary maps in any case
      cmd_out <- execGRASS("g.remove", type="raster,vector", pattern="*_t", flags=c("f", "b"), intern=T)
    }
    
    if(!is.null(flowaccum) )
    {
      # copy existing maps
      x <- execGRASS("g.copy", raster=paste0(flowaccum,",accum_t"), intern=TRUE, ignore.stderr = TRUE)
      x <- execGRASS("g.copy", raster=paste0(drainage_dir,",drain_t"), intern=TRUE, ignore.stderr = TRUE)
    }
    
    if(!silent) message("% OK")
    
    
    ### calc stream segments or use user defined input---------------------------
    
    if(disk_swap) {
      ws_flags <- c("overwrite","m", "s")
    } else {
      ws_flags <- c("overwrite", "s")
    }
    if(is.null(river)) {
      if(!silent) message("%")
      if(!silent) message("% Calculate drainage and river network...")
      # GRASS watershed calculation #
      # flags to use for r.watershed
      
      if(is.null(flowaccum) )
        execGRASS("r.watershed", elevation=dem, accumulation="accum_t", drainage="drain_t", flags = ws_flags)
      # check thresh_stream parameter
      cmd_out <- execGRASS("r.univar", map="accum_t", separator="comma", flags=c("t"), intern=TRUE, ignore.stderr = TRUE)
      
      cmd_out <- strsplit(cmd_out, ",")
      cmd_cols <- grep("^max$|^min$", cmd_out[[1]]) # ignore negative accumulation values (warning will be issued)
      max_acc <- max(abs(as.numeric(cmd_out[[2]][cmd_cols])))
      if(thresh_stream > max_acc)
        stop(paste0("Parameter 'thresh_stream' (", thresh_stream, ") is larger than the maximum flow accumulation within the study area (", max_acc, "). Choose a smaller parameter value!"))
      # calculate stream segments (don't use output of r.watershed as streams should be finer than generated therein)
      cmd_out <- execGRASS("r.mapcalc", expression=paste0(stream, "_rast = if(abs(accum_t)>", format(thresh_stream, scientific = F), ",1,null())"), intern=T)
      # thin
      cmd_out <- execGRASS("r.thin", input=paste0(stream, "_rast"), output=paste0(stream, "_thin_t"), iterations=10000, intern=T)
      # convert to vector
      cmd_out <- execGRASS("r.to.vect", input=paste0(stream, "_thin_t"), output=paste0(stream, "_vect"), type="line", flags="quiet", intern=T)
      river <- paste0(stream, "_vect")
      if(!silent) message("% OK")
      
    } else {
      
      if(is.null(flowaccum) )
      {
        if(!silent) message("%")
        if(!silent) message("% Calculate drainage...")
        cmd_out <- execGRASS("r.watershed", elevation=dem, drainage="drain_t", flags = ws_flags, intern = T)
        if(!silent) message("% OK")
      }
    }
    
    # check flowaccum raster for negative values
    cmd_out <- execGRASS("r.univar", map="accum_t", separator="comma", flags=c("t"), intern=TRUE, ignore.stderr = TRUE)
    cmd_out <- strsplit(cmd_out, ",")
    cmd_cols <- grep("^min$", cmd_out[[1]])
    min_acc <- as.numeric(cmd_out[[2]][cmd_cols])
    if(min_acc < 0) warning("Negative flow accumulation values detected! This happens if cells get runoff from regions outside the study area, i.e. the extension of your DEM might be too small. Check if this could be a problem!")
    
    
    
    ### calculate subbasins without given drainage points (optional)-------------
    if(is.numeric(thresh_sub)) {
      if(!silent) message("%")
      if(!silent) message("% Calculate subbasins based on given area threshold...")
      
      # calculate subbasins
      cmd_out <- execGRASS("r.watershed", elevation=dem, basin="basin_calc_t", threshold=thresh_sub, flags = ws_flags, intern=T)
      if(!silent) message("% OK")
    }
    
    
    ### snap given drainage points to streams------------------------------------
    if(!silent) message("%")
    if(!silent) message("% Snap given drainage points to streams...")
    
    
    ### ensure that given drainage points are NOT precisely at cell centres, because this may cause pathologic
    ### cases when snapping to streams (ending up at cell corners instead of cell interior)
    # add data slot and columns subbas_id and cat, if not given
    if(!any(slotNames(drain_points) == "data"))
      drain_points <- SpatialPointsDataFrame(drain_points, data=data.frame(subbas_id=1:length(drain_points)))
    
    if (any(duplicated(drain_points@data$subbas_id))) 
    {  
      warning("Duplicated subbas_id in drainage points. Ignoring IDs, using row numbers instead.")
      drain_points@data$subbas_id=NULL 
    }
    
    if(!any(colnames(drain_points@data) == "subbas_id"))
      drain_points@data <- cbind(drain_points@data, subbas_id=1:length(drain_points))
    if(!any(colnames(drain_points@data) == "cat"))
      drain_points@data <- cbind(drain_points@data, cat=1:length(drain_points))
    # force conversion to numeric 
    drain_points@data$subbas_id=as.numeric(as.character(drain_points@data$subbas_id)) 
    if (any(!is.finite(drain_points$subbas_id)))
      stop("The column 'subbasin_id' in drain_points contains non-numeric entries.")
    # write to GRASS
    suppressWarnings(proj4string(drain_points) <- CRS(getLocationProj()))
    suppressWarnings(writeVECT(drain_points, paste0(points_processed,"_t"), v.in.ogr_flags = c("o","quiet")))
    clean_temp_dir(paste0(points_processed, "_t"))
    
    # move drainage points to centers of raster cells
    x <- execGRASS("v.to.rast", input=paste0(points_processed,"_t"), output=paste0(points_processed,"_t"), use="attr", attribute_column="subbas_id", flags="overwrite", intern=T)
    x <- execGRASS("r.to.vect", input=paste0(points_processed,"_t"), output=paste0(points_processed,"_centered_t"), type="point", flags = c("overwrite", "quiet"))
    drain_points_centered <- read_VECT(vname = paste0(points_processed,"_centered_t"), layer=1)
    clean_temp_dir(paste0(points_processed,"_centered_t"))
    
    colnames(drain_points_centered@data) <- c("cat","subbas_id")
    
    # determine raster resolution
    res <- gmeta()
    res <- sum(c(res$nsres, res$ewres)) / 2
    
    # create shifted version of drainage points (shifted by 1/4 of resolution)
    drain_points_shifted <- drain_points_centered
    drain_points_shifted@coords <- drain_points_shifted@coords + res/4
    suppressWarnings(writeVECT(drain_points_shifted, paste0(points_processed,"_shifted_t"), v.in.ogr_flags = c("o","quiet")))
    clean_temp_dir(paste0(points_processed,"_shifted_t"))
    
    outlet_id <- drain_points@data$subbas_id[outlet] #remember outlet by ID, not by row
    drain_points <- drain_points_shifted
    rm(drain_points_shifted, drain_points_centered)
    
    # read stream vector
    streams_vect <- read_VECT(river)
    clean_temp_dir(river)
    
    # snap points to streams
    drain_points_snap <- suppressWarnings(snapPointsToLines1(drain_points, streams_vect, maxDist=snap_dist))
    drain_points_snap$nearest_line_id=NULL #we don't need this and this long field name causes trouble
    
    # export drain_points_snap to GRASS
    suppressWarnings(writeVECT(drain_points_snap, paste0(points_processed, "_snapped_t"), v.in.ogr_flags = c("o","quiet")))
    clean_temp_dir(paste0(points_processed, "_snapped_t"))
    
    if (length(drain_points_snap) < length(drain_points)) stop("Less points after snapping than in drain_points input!\nComputed stream segments are probably too coarse. Try a smaller value of thresh_stream to create a fine river network.")
    
    if(!silent) message("% OK") 
    
    
    ### calculate catchments for every drainage point----------------------------
    if(!silent) message("%")
    if(!silent) message("% Calculate catchments for every drainage point...")
    
    # update index to outlet, as its order may have changed during previous steps
    outlet <- which(drain_points_snap@data$subbas_id == outlet_id)
    outlet_coords <- coordinates(drain_points_snap)[outlet,]
    
    cmd_out <- execGRASS("r.water.outlet", input="drain_t", output=paste0("basin_outlet_t"), coordinates=outlet_coords, flags="overwrite", intern = T)
    cmd_out = execGRASS("r.stats", input=paste0("basin_outlet_t"), flag=c("c","n","quiet"), intern = TRUE)
    ncells = as.numeric(strsplit(cmd_out, split = " ")[[1]][2])
    if (!is.finite(ncells) | ncells < 100)
      stop(paste0("Number of cells in calculated catchment is very low (",ncells,"). Try using a filled DEM or correcting this point."))
    
    # get drainage points of calculated subbasins (optional)
    if(is.numeric(thresh_sub)) {
      
      # set watershed of outlet point as mask
      cmd_out <- execGRASS("g.copy", raster=paste0("basin_outlet_t,MASK"), flags = "overwrite", intern = T)
      
      # the following calculations only make sense if thresh_sub is small enough to produce more subbasins than determined by drain_points
      no_catch_calc <- length(as.numeric(execGRASS("r.stats", input="basin_calc_t", flags=c("n"), intern=T, ignore.stderr = T)))
      if(no_catch_calc > 1) {
        
        # read raster data from GRASS for processing
        basins <- raster(readRAST("basin_calc_t", ignore.stderr = T))
        basins = as.integer(basins)
        
        accum <- raster(readRAST("accum_t", ignore.stderr = T))
        accum <- abs(accum) # ignore negative accumulation values (warning will be issued)
        accum = as.integer(accum)
        
        # calculate zonal statistics: Maximum accumulation for every subbasin (=outlet)
        stats <- zonal(accum, basins, fun="max")
     
        # remove calculated watershed outlet (point of maximum flow accumulation) as this has been given as input
        stats <- stats[-which(stats[2] == max(stats[2])), ]
        stats = apply(FUN=as.integer, MAR=2, X=stats) #convert to integer
        
        # get coordinates of outlets
        outs <- apply(stats, 1, function(x) {
          
          #system.time(
          cell_no <- Which(accum==x[2], cells=T)  

          # if there is more than one cell, get the one in the right subbasin
          if(length(cell_no) > 1) {
            cell_bas <- Which(basins==x[1], cells=T)
            cell_no <- cell_no[which(cell_no %in% cell_bas)]
          }
          #)                                       #74 sec
        
          # Test, if joint command can save time -> no, takes longer
            # system.time(
            # cell_no <- Which(accum==x[2] & basins==x[1], cells=T)) #145 (float accum & basins, float stats)
            #                                                       #130 (integer accum, float x, integer stats)
            #                                                      #110  (all integer)

          # get coordinates of cell_no
          if (length(cell_no) > 1) warning("Outlet cell could not be found inequivocally (R/GRASS bug). Please check")          
          res <- round(xyFromCell(accum, cell_no),0)[1,] #ignore any potential multiple cells (bug in raster comparison for rasters with large numbers/scientific notation)
          res <- c(res, x[1])
          return(res)
        })
        
     
        # delete raster objects
        rm(accum, basins)
        gc(verbose = F); gc(verbose = F)
        
        # re-arrange data
        outs <- t(outs)
        colnames(outs) <- c("x", "y", "cat")
        outs <- as.data.frame(outs)
        # subbas_id, make sure they are distinct from drain_points_snap
        new_ids = 1:(nrow(outs) + nrow(drain_points_snap@data)) # potential new IDs
        new_ids = setdiff(new_ids, drain_points_snap@data$subbas_id) #remove IDs that are alredy in use
        new_ids = new_ids[1:nrow(outs)]  #use only as many as needed
        outs <- cbind(outs, subbas_id = new_ids)
        coordinates(outs) <- c("x", "y")
        
        # as SPDF
        drain_points_calc <- SpatialPointsDataFrame(coordinates(outs), outs@data, proj4string = CRS(getLocationProj()))
        
        # write to GRASS location
        writeVECT(drain_points_calc, paste0(points_processed, "_calc_t"), ignore.stderr = T, v.in.ogr_flags = c("o","quiet"))
        
        # drain_points_snap df requires columns 'cat' and 'subbas_id' as drain_points_calc_t
        drain_points_snap@data <- drain_points_snap@data[,c("cat", "subbas_id")]
        suppressWarnings(proj4string(drain_points_snap) <- CRS(getLocationProj()))
        
        # merge with existing drain points object (snapped points first as there the outlet is identified)
        drain_points_snap <- rbind(drain_points_snap, drain_points_calc)
        
      } # more than one subbasin
      
      # remove mask
      cmd_out <- execGRASS("r.mask", flags=c("r"), intern = T)
    }
    
    # combine drain points as raster (easier to identify double drain points sharing one raster cell)
    suppressWarnings(writeVECT(drain_points_snap, paste0(points_processed, "_all_t"), v.in.ogr_flags = c("o","quiet")))
    clean_temp_dir(paste0(points_processed,"_all_t"))
    
    x <- execGRASS("v.to.rast", input=paste0(points_processed, "_all_t"), output=paste0(points_processed, "_all_t_t"), use="attr", attribute_column="subbas_id", flags="overwrite", intern=T)
    #? how do we ensure duplicate drainage points keep the manually assigned subbas_id? i.e., not being overwritten by automatically generated ID.
    x <- execGRASS("r.mapcalc", expression=paste0(points_processed, "_all_t=round(", points_processed, "_all_t_t)"), flags = c("overwrite"), intern=T) #convert raster map to integer
    x <- execGRASS("g.remove", type="raster", pattern=paste0(points_processed, "_all_t_t"), flags="f", intern=T) #remove temporary map required in previous line
    
    # get coordinates of drain point cells
    drainp_coords <- execGRASS("r.stats", input = paste0(points_processed, "_all_t"), flags=c("n", "g", "quiet"), intern=T)
    drainp_coords <- matrix(as.numeric(unlist(strsplit(drainp_coords, " "))), ncol = 3, byrow = T)
    
    # loop over drainage points of subbasins; TODO: This step is slow!
    for (p in 1:nrow(drainp_coords)) {
      
      # outlet coordinates
      outlet_coords <- drainp_coords[p,c(1,2)]
      
      # drain points subbasin id (optionally defined in 'drain_points' input)
      id <- drainp_coords[p,3]
      
      # basin
      cmd_out <- execGRASS("r.water.outlet", input="drain_t", output=paste0("basin_", id, "_t"), coordinates=outlet_coords, intern = T)
      # TODO: better clip any already existing subbasins from the automatically generated map
      
      cmd_out = execGRASS("r.stats", input=paste0("basin_", id, "_t"), flag=c("c","n","quiet"), intern = TRUE)
      ncells = as.numeric(strsplit(cmd_out, split = " ")[[1]][2])
      if (!is.finite(ncells) | ncells < 100)
        warning(paste0("Number of cells in calculated catchment ",id," is very low (",ncells,"). Try using a filled DEM and check outlet points."))
      
      # reclass (for crossing later on)
      # ii: can this be done with reclass to conserve space?
      cmd_out <- execGRASS("r.mapcalc", expression=paste0("basin_recl_", id, "_t = if(basin_", id, "_t,", id, ")"), intern=T)
      
    }
    
    no_catch <- nrow(drainp_coords)
    
    if(!silent) message(paste("% -> Identified", no_catch, "subbasins."))
    if(!silent) message("% OK")
    
    
    
    ### merge all sub-catchments-------------------------------------------------
    if(!silent) message("%")
    if(!silent) message("% Merge calculated catchments...")
    
    # put sub-catchments together
    subcatch_rasts <- paste0("basin_recl_",drainp_coords[,3], "_t")
    
    iteration_nr= 0 #initialise counting variable; number of iterations needed to remove spurious subbas
    
    # if more than one sub-catchment
    if(no_catch > 1) {
      
      # iterate until configuration without 'spurious' sub-catchments is found (if rm_spurious > 0) TODO: This step is slow in case many iterations are needed!
      while (TRUE) {
        
        # max 30 maps at once, create multiple cross products if necessary
        x <- execGRASS("g.remove", type="raster", pattern="basin_cross_*", flags="f", intern=T) # remove old basin_cross_*
        iterations <- ceiling(length(subcatch_rasts)/30) 
        for (j in 1:iterations){
          if (j == iterations) {
            if(length(subcatch_rasts) %% 30 == 1) { #special case: only one subbasin remaining
              x <- execGRASS("g.copy", raster=paste(subcatch_rasts[((j-1)*30+1):length(subcatch_rasts)], paste0("basin_cross_", j, "_t"), sep=","), 
                             intern=T, ignore.stderr=T)
            } else { #general case: more than one remaining
              x <- execGRASS("r.cross", input=paste(subcatch_rasts[((j-1)*30+1):length(subcatch_rasts)], collapse=","),
                             output=paste0("basin_cross_", j, "_t"), flags = c("overwrite"), intern=T, ignore.stderr=T)
            }
          } else {
            x <- execGRASS("r.cross", input=paste(subcatch_rasts[((j-1)*30+1):(j*30)], collapse=","),
                           output=paste0("basin_cross_", j, "_t"), flags = c("overwrite"), intern=T, ignore.stderr=T)
          }
        }
        
        # merge cross products
        cross_rasts <- execGRASS("g.list", type="raster", pattern=paste0("basin_cross_[0-9]*_t"), intern=T)
        
        if(length(cross_rasts) == 1) {
          x <- execGRASS("g.rename", raster=paste(cross_rasts, "basin_all_t", sep=","), intern=T, ignore.stderr=T)
        } else {
          x <- execGRASS("r.cross", input=paste(cross_rasts,collapse=","), output="basin_all_t",
                         flags = c("overwrite"), intern=T, ignore.stderr=T)
        }
  
  #Former Problem: if thresh_sub and rm_spurious are rather high, 
  #        subbas with specified outlets (drain_points) are removed, while
  #        larger subbas (according to thresh_sub) with "artificial" outlets are kept
  #Aim: keep manually specified outlets, even if produced subbas < rm_spurious
  #     if specified subbas too small, remove upstream neighbours instead and merge with specified subbas
        
        #find out subbas combinations / upstream subbas
        cmd_out <- execGRASS("r.stats", input="basin_all_t", flags=c("n","l"), intern=T, ignore.stderr = T)
        #reformat list data
        cmd_out=  gsub(x=cmd_out, pattern = "^(\\d*) ", repl="\\1;") # seperate 1st ID in line with ;
        cmd_out = gsub(x=cmd_out, pattern = "category", repl="")     # remove "category"
        cmd_out = gsub(x=cmd_out, pattern = " *", repl="")           # remove blank space
        cmd_out = gsub(x=cmd_out, pattern = "NULL", repl="")         # remove "NULL"
        cmd_out = strsplit(cmd_out, split = ";")                     # split entries
        
        #get maximum subbas_id in entire list
        max_subbas_id=max(as.numeric(sapply(FUN=max, cmd_out))) #number of columns
        
        #n_comb = max(sapply(FUN=length, cmd_out))  #get number of columns to produce
        #subbas_combinations = array(NA, c(length(cmd_out), n_comb))
        
        #create array of subbas combinations, 
        subbas_combinations = array(NA, c(length(cmd_out), max_subbas_id)) #array with n_lines = length(cmd_out) and n_cols = max_subbas_id
        map_id = array(NA, length(cmd_out))
        cmd_out2 = list()
        for(i in 1:length(cmd_out))
        {
          col_indices=as.numeric(cmd_out[[i]])
          map_id[i] = col_indices [1]
          cmd_out2[[i]] =  na.omit(as.numeric(cmd_out[[i]][-1]))
          subbas_combinations[i, na.omit(col_indices [-1])] = TRUE
        }
        subbas_combinations[is.na(subbas_combinations)] = FALSE #replace NAs with FALSE
        
        # check for and correct error in r.cross, see https://lists.osgeo.org/pipermail/grass-user/2018-February/077934.html
        if(any(map_id == 0)) {
          cmd_out <- execGRASS("r.mapcalc", expression="basin_all_t2=basin_all_t+1", flags=c("overwrite"), intern=T)
          cmd_out <- execGRASS("g.rename", raster="basin_all_t2,basin_all_t", flags=c("overwrite"), intern=T)
          map_id = map_id + 1
        }
        
        # check size of sub-catchments and identify and remove 'spurious' sub-catchments
        if(rm_spurious>0) {
          if(!silent) message(paste("% Iteratively remove spurious subcatchments ...(Iteration:",iteration_nr,")"))
          iteration_nr=iteration_nr+1
          # get sub-catchments and sizes (cell counts) and identify spurious ones
          cmd_out <- execGRASS("r.stats", input="basin_all_t", flags=c("n", "c"), intern=T, ignore.stderr = T) #internal subbas IDs of basin_all_t
          sub_sizes <- matrix(as.numeric(unlist(strsplit(cmd_out, " "))), ncol=2, byrow=T)
          #sub_sizes <- sub_sizes[-which(sub_sizes[,1] == 0),]
          sub_rm <- sub_sizes[which(sub_sizes[,2] < rm_spurious*thresh_sub),1] #internal subbas IDs below removal threshold
          if(length(sub_rm)>0) {
          # get external IDs in basin_recl_* to be removed (not identical with internal raster values of basin_all_t!)
            #cmd_out <- execGRASS("r.univar", map=paste0(points_processed, "_all_t"), zones="basin_all_t", separator="comma", flags=c("t"), intern=T, ignore.stderr = T)
            cmd_out <- execGRASS("r.stats", input=paste0("basin_all_t,", points_processed, "_all_t"), flags=c("n"), separator="comma", intern=T, ignore.stderr = T)
            #todo: do this only once, not in all iterations
            cmd_out <- strsplit(cmd_out, ",") #conversion table internal "ID of r.cross" to external "ID of subbasins" 
            #cmd_cols <- grep("zone|^mean$", cmd_out[[1]])
            #basins_points <- do.call(rbind, cmd_out)[-1,cmd_cols, drop=F]
            
          # convert output to matrix; this is the translation of internal temp. subbas IDs (1st column) to external drain point subbas IDs (2nd column)
            basins_points = matrix(unlist(cmd_out), ncol=2, byrow = TRUE) 
          # get external subbas IDs to be removed
            sub_rm_f <- as.numeric(basins_points[which(as.numeric(basins_points[,1]) %in% sub_rm),2]) 
            
            manually_specified = intersect(sub_rm_f, drain_points@data$subbas_id) #these points are manually specified outlets (=external IDs) - don't remove them
            remove_instead = NULL
            if (length(manually_specified)>0) #if manually specified external subbas ID detected, remove their upstream neighbours instead
            {
              for (cur_sub in manually_specified) #cur_sub = external ID
              {
                cross_id = basins_points[basins_points[,2] == cur_sub, 1]   #cross_ID = internal ID in temporary basin map
                #curr_line = subbas_combinations[ as.numeric(cross_id),] #configuration of current subbasin
                #curr_line = na.omit(as.numeric(cmd_out[[ cross_id ]][-1])) #configuration of current subbasin
                
                curr_line = cmd_out2[[ as.numeric(cross_id) ]] #configuration of current subbasin, contains temporary internal subbas IDs of all encompassing higher hierarchy basins
                
                #tt =apply(subbas_combinations, MARGIN = 1, FUN= function(x,y){sum(x!=y)}, y=curr_line)
                tt =unlist(lapply(cmd_out2, FUN= function(x,y){length(c (setdiff(x, y), setdiff(y,x)) )}, y=curr_line)) #number of differences in cofiguration; 0 = currently active subbas; 1 = nearest upstream or downstream neighbour
                
                #FUN(x=cmd_out2[[13]], y= curr_line)
                
                upstream_neighbours =  which(tt == 1 ) #internal subbas ID
                for (j in upstream_neighbours)   # remove first downstream internal subbas ID (only preserve upstream neighbours)
                {
                  if (!cur_sub %in% cmd_out2[[j]])
                    upstream_neighbours = setdiff(upstream_neighbours, j)
                }
                
                #& subbas_combinations[ map_id == cur_sub,]
                #cmd_out2[as.numeric(c(cross_id, upstream_neighbours))]
                #subbas_combinations[ c(cross_id, upstream_neighbours),]
                remove_instead = c(remove_instead, upstream_neighbours) #upstream internal subbas IDs to be removed

              }
              
              sub_rm_f = setdiff(sub_rm_f, manually_specified) #external ID; don't remove the manually specified ones
              # translate internal ID of subbas to remove into external ID 
              remove_instead_ext = basins_points[basins_points[,1] %in% remove_instead, 2]
              sub_rm_f = unique(c(sub_rm_f, remove_instead_ext)) #external ID; instead, use their upstream neighbours
              
            }  
            
        # remove this temporary map from processing and drain points raster map and try again (back to start of while loop)
                  
            subcatch_rasts <- grep(paste0("basin_recl_", sub_rm_f, "_t", collapse="|"), subcatch_rasts, invert = T, value = T)
            x <- execGRASS("g.remove", type="raster", name="basin_all_t", flags = "f", intern=T)
            tmp_file <- tempfile()
            stats_t <- as.integer(execGRASS("r.stats", input = paste0(points_processed, "_all_t"), flags=c("n", "quiet"), intern=T)) #get external drain point subbas IDs
            stats_t <- stats_t[!(stats_t %in% sub_rm_f)]  #!!without IDs to be removed
            write(paste(paste(sub_rm_f, "NULL", sep = " = ", collapse = "\n"), paste(stats_t, stats_t, sep=" = ", collapse = "\n"), sep="\n"), file=tmp_file)
            x <- execGRASS("r.reclass", input=paste0(points_processed, "_all_t"), output=paste0(points_processed, "_all2_t"),
                           rules = tmp_file, flags = "overwrite")
            x <- execGRASS("r.mapcalc", expression=paste0(points_processed, "_all_t=", points_processed, "_all2_t"), flags="overwrite", intern=T) # convert to regular map
            # update no_catch
            no_catch <- no_catch - length(sub_rm_f)
          } else {
            break
          }
        } else {
          break # exit while loop
        }
        
      } # end while-loop
      
      # constrain to catchment of outlet point
      cmd_out <- execGRASS("r.mapcalc", expression=paste0(basin_out, "_t = basin_outlet_t * basin_all_t"), intern=T, flags="overwrite")
      
      
    } else { # only one sub-catchment
      
      cmd_out <- execGRASS("g.copy", raster=paste0("basin_outlet_t,", basin_out, "_t"), intern=T)
      
      if(no_catch == 0)
        stop("Number of identified sub-catchments is zero. Check input data!")
    }
    
    # assign correct ids (from 'subbas_id') to basin_out
    cmd_out <- execGRASS("r.to.vect", input = paste0(points_processed, "_all_t"), output = paste0(points_processed, "_all_t"),
                         type = "point", column = "subbas_id", flags = c("overwrite", "quiet"), intern=T)
    cmd_out <- execGRASS("v.db.addcolumn", map=paste0(points_processed, "_all_t"), columns="temp_id int", intern=TRUE) 
    cmd_out <- execGRASS("v.what.rast", raster=paste0(basin_out, "_t"), map=paste0(points_processed, "_all_t"), column="temp_id" ,intern=T, ignore.stderr = T)
    drain_points_snap <- read_VECT(paste0(points_processed, "_all_t"))
    nas = which(is.na(drain_points_snap@data$temp_id))
    if (any(nas))
    {  
      warning("Drainage point(s) ", paste0(nas, collapse=", "), " seem to lie outside catchment, please check.")
      drain_points_snap@data = drain_points_snap@data[-nas,]
    }  
    dat_rules <- paste(drain_points_snap@data$temp_id, "=", drain_points_snap@data$subbas_id, collapse = "\n") 
    tmp_file <- tempfile()
    write(dat_rules, file=tmp_file) # GRASS Gis reclass file: Old_ID = New_ID, temp_id is changed to subbas_id
    cmd_out <- execGRASS("r.reclass", input = paste0(basin_out, "_t"), output = paste0(basin_out, "2_t"), rules = tmp_file)
    cmd_out <- execGRASS("r.mapcalc", expression = paste0(basin_out, "=", basin_out, "2_t"), intern = T)
    
    no_cross <- length(execGRASS("r.stats", input=basin_out, flags=c("n"), intern=T, ignore.stderr = T))
    if(no_catch != no_cross) warning(paste0("\nNumber of categories in ", basin_out, " not equal to number of drainage points!\nThis might be because there are drainage points outside the catchment of the defined outlet or due to small inconsistencies between calculated and manually defined (and snapped) drainage points. However, you should check the output with the GRASS GUI and consider the help pages of this function. 
                                            Try correcting the drainage points manually by running 'v.digit map=drain_points_snap bgcmd=d.rast stream_accum_rast'"))
    
    
    
    # remove temporary maps
    if(keep_temp == FALSE)
      execGRASS("g.remove", type="raster,vector", pattern="*_t", flags=c("f"))
    
    
    if(!silent & rm_spurious > 0) message(paste("% -> Checked for spurious subbasins; ", no_catch, "subbasins left."))
    if(!silent) message("% OK")
    if(!silent) message("%")
    if(!silent) message("% -> Check the results for plausibility (e.g. inaccuracies at snapping of drain_points to streams may occur).")
    if(!silent) message("% -> If manual adjustments are necessary re-run this function. Existing grids of flow accumulation and direction may be used for speed-up.")
    if(!silent) message("% -> Note: Subbasin IDs of the calculated map 'subbas' might not be consecutively numbered! Please check.")
    if(!silent) message("%")
    if(!silent) message("% DONE!")
    if(!silent) message("%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%")
    
    
    # stop sinking
    closeAllConnections()
    
    # restore original warning mode
    if(silent)
      options(warn = oldw)
    
    
    
    
    # exception handling
  }, error = function(e) {
    
    # stop sinking
    closeAllConnections()
    
    # restore original warning mode
    if(silent)
      options(warn = oldw)
    
    # remove mask if there is any (and ignore error in case there is no mask)
    cmd_out <-tryCatch(suppressWarnings(execGRASS("r.mask", flags=c("r"), intern = T)), error=function(e){})
    
    if(keep_temp == FALSE)
      cmd_out <- execGRASS("g.remove", type="raster,vector", pattern=paste0("*_t,",stream,"_*,", basin_out, ",", points_processed, "_*"), flags=c("f", "b"), intern = T)
    
    stop(paste(e))
  })
  
  
} # EOF
tpilz/lumpR documentation built on Aug. 5, 2023, 1:31 a.m.