R/MK_dPCIIC_links.R

Defines functions connections_dPC connections_dIIC MK_dPCIIC_links

Documented in connections_dIIC connections_dPC MK_dPCIIC_links

#' @name MK_dPCIIC_links
#'
#' @title Estimate the differential of the connectivity indexes IIC and PC to prioritize links.
#'
#' @description This function calculates the dPC or dIIC index to estimate the link importance for conservation and restoration. It calculates the contribution of each individual link to \bold{maintain} (mode: link removal) or \bold{improve} (mode: link change) the overall connectivity under one or several distance thresholds.
#'
#' @param nodes Object containing nodes (e.g., habitat patches or fragments) information. It can be of the following classes:\cr
#' -   \code{Data.frame} with at least two columns: the first for node IDs and the second for attributes.\cr
#' -   Spatial data of type vector (class \code{sf, SpatVector, SpatialPolygonsDataFrame}). It must be in a projected coordinate system.\cr
#' -   Raster (class \code{RasterLayer, SpatRaster}). It must be in a projected coordinate system. The values must be integers representing the ID of each habitat patch or node, with non-habitat areas represented by NA values (see \link[raster]{clump} or \link[terra]{patches}).
#' @param attribute \code{Character or vector}. If \code{NULL} (only applicable when \code{nodes} is of spatial data of vector or raster type) the area of the nodes will be used as the node attribute. The unit of area can be selected using the \code{area_unit} parameter. To use an alternative attribute, consider the class type of the object in the \code{nodes} parameter: \cr
#' -   If \code{nodes} is a spatial vector or data.frame, specify \bold{the name of the column} containing the attribute for the nodes. \cr
#' -   If \code{nodes} is a raster layer then it must be a numeric vector with the node's attribute. The length of the vector must be equal to the number of nodes. If the parameter \bold{weighted} is \code{TRUE} then the numeric vector is multiplied by the area of each node to obtain a weighted habitat index.
#' @param LA \code{numeric}. (\emph{optional, default = } \code{NULL}). The maximum landscape attribute, which is the attribute value that would correspond to a hypothetical habitat patch covering all the landscape with the best possible habitat, in which IIC and PC would be equal to 1. For example, if nodes attribute corresponds to the node area, then LA equals total landscape area. If nodes attribute correspond to a quality-weighted area and the quality factor ranges from 0 to 100, LA will be equal to 100 multiplied by total landscape area. The value of LA does not affect at all the importance of the nodes and is only used to calculate the overall landscape connectivity. If no LA value is entered (default) and  \code{overall = TRUE} or \code{onlyoverall = TRUE}, the function will only calculate the numerator of the global connectivity indices and the equivalent connected ECA or EC index.
#' @param area_unit \code{character}. (\emph{optional, default = } \code{"m2"}) \cr. A \code{character} indicating the area units when \code{attribute} is \code{NULL}. Some options are "m2" (the default), "km2", "cm2", or "ha";  See \link[Makurhini]{unit_convert} for details.
#' @param distance  A \code{matrix} or \code{list} to establish the distance between each pair of nodes. Distance between nodes may be Euclidean distances (straight-line distance) or effective distances (cost distances) by considering the landscape resistance to the species movements. \cr
#' - If it is a \code{matrix}, then the number of columns and rows must be equal to the number of nodes. This distance matrix could be generated by the \link[Makurhini]{distancefile} function.\cr
#' - If it is a \code{list} of parameters, then it must contain the distance parameters necessary to calculate the distance between nodes. For example, two of the most important parameters: \code{“type”} and \code{“resistance”}. For \code{"type"} choose one  of the distances:  \bold{"centroid" (faster), "edge", "least-cost" or "commute-time"}. If the type is equal to \code{"least-cost"} or \code{"commute-time"}, then you must use the \code{"resistance"} argument. For example: \code{distance(type = "least-cost", resistance = raster_resistance)}. \cr
#' To see more arguments see the \link[Makurhini]{distancefile} function.
#' @param metric A \code{character} indicating the connectivity metric to use: \code{"PC"} (the default and recommended) to calculate the probability of connectivity index, and \code{"IIC"} to calculate the binary integral index of connectivity.
#' @param probability A \code{numeric} value indicating the probability that corresponds to the distance specified in the \code{distance_threshold}. For example, if the \code{distance_threshold} is a median dispersal distance, use a probability of 0.5 (50\%). If the \code{distance_threshold} is a maximum dispersal distance, set a probability of 0.05 (5\%) or 0.01 (1\%). Use in case of selecting the "PC" metric. If \code{probability = NULL}, then a probability of 0.5 will be used.
#' @param distance_thresholds A \code{numeric} indicating the dispersal distance or distances (meters) of the considered species. If \code{NULL} then distance is estimated as the median dispersal distance between nodes. Alternatively, the \link[Makurhini]{dispersal_distance} function can be used to estimate the dispersal distance using the species home range.
#' @param threshold \code{numeric}. Pairs of nodes with a distance value greater than this threshold will be discarded in the analysis which can speed up processing.
#' @param removal \code{logical} indicating whether the link removal mode should be used to calculate link importance (see details). Default is TRUE.
#' @param change (\emph{optional, default} \code{NULL}). A \code{numeric} indicating the new distances  used to calculate link importance under the link change mode should be (see details). By default, link change is not calculated.
#' @param overall \code{logical}. If \code{TRUE}, then the EC index will be added to the result which is transformed into a list. Default equal to FALSE
#' @param parallel  (\emph{optional, default =} \code{NULL}).
#' A \code{numeric} specifying the number of cores to parallelize the index estimation of the PC or IIC index and its deltas.Particularly useful when you have more than 1000 nodes. By default the analyses are not parallelized.
#' @param parallel_mode (\emph{optional, default =} \code{1}).
#' A \code{numeric} indicating the mode of parallelization: Mode \bold{1} (the default option, and recommended for less than 1000 nodes) parallelizes on the connectivity delta estimate, while Mode \bold{2} (recommended for more than 1000 nodes)
#' parallelizes on the shortest paths between vertices estimate.
#' @param write \code{Character} indicating the path and initial prefix of the objects to save, for example, "C:/example". By default, nothing is saved. The saved objects are: \cr
#' -   The importance of each link. The format of the output file is a csv.\cr
#' -   Overall landscape connectivity table (if the \code{overall} argument is \code{TRUE}). The format of the output file is a csv.
#' @param intern \code{logical}. Show the progress of the process, \code{default = TRUE}. Sometimes the advance process does not reach 100 percent when operations are carried out very quickly.
#' @references
#' -   Saura, S. & Torné, J. 2012. Conefor 2.6 user manual (May 2012). Universidad Politécnica de Madrid. Available at \url{www.conefor.org}.\cr
#' -   Pascual-Hortal, L. & Saura, S. 2006. Comparison and development of new graph-based landscape connectivity indices: towards the priorization of habitat patches and corridors for conservation. Landscape Ecology 21 (7): 959-967.\cr
#' -   Saura, S. & Pascual-Hortal, L. 2007. A new habitat availability index to integrate connectivity in landscape conservation planning: comparison with existing indices and application to a case study. Landscape and Urban Planning 83 (2-3): 91-103.\cr
#' -   Hanski, I. and Ovaskainen, O. 2000. The metapopulation capacity of a fragmented landscape. Nature 404: 755–758.
#' @export
#' @examples
#' \dontrun{
#' library(Makurhini)
#'
#' #Link removal option
#' data("habitat_nodes_spain", package = "Makurhini")
#' nrow(habitat_nodes_spain) # Number of patches
#' #For this example we only select first 50 nodes
#' nodes_test <- habitat_nodes_spain[1:50,]
#'
#' #Distance
#' data("dist_original", package = "Makurhini")
#' #select the distances between the first 50 nodes
#' dist_test <- dist_original[1:50,1:50]
#'
#' #We previusly estimate an Effective median dispersal distance
#' #of 980,966.64 m*cost. That is:
#' #mean resistance x 10 km of Euclidean median dispersal distance
#' delta <- MK_dPCIIC_links(nodes = nodes_test,
#'                          attribute = "attribute",
#'                          area_unit = "ha",
#'                          distance = dist_test,
#'                          removal = TRUE,
#'                          metric = "PC",
#'                          probability = 0.5,
#'                          distance_thresholds = 980966.64,
#'                          parallel = 4,
#'                          parallel_mode = 1,
#'                          intern = TRUE)
#' #Link change option
#' data("dist_restoration", package = "Makurhini")
#' #select the new distances between the first 50 nodes
#' dist_test_change <- dist_restoration[1:50,1:50]
#'
#' delta <- MK_dPCIIC_links(nodes = nodes_test,
#'                          attribute = "attribute",
#'                          area_unit = "ha",
#'                          distance = dist_test,
#'                          change = distance_change
#'                          removal = TRUE,
#'                          metric = "PC",
#'                          probability = 0.5,
#'                          distance_thresholds = 980966.64,
#'                          parallel = 4,
#'                          parallel_mode = 1,
#'                          intern = TRUE)
#' delta
#' }
#' @note The link importance analysis can be much more time consuming than the node importance analysis (i.e., \link[Makurhini]{MK_dPCIIC}). Sometimes the advance process does not reach 100 percent when operations are carried out very quickly.
#' @details
#' This function calculates the \bold{importance or contribution of each link} to the overall landscape connectivity. The importance of a link is calculated as the variation in the value of the \bold{PC} or \bold{IIC} indices after certain changes affecting that link.
#' The link importance analysis can be performed under the following two different modes:\cr
#' -   If \code{removal = TRUE}, the function removes one by one each of the links existing in the landscape network and calculates the impact of that link loss on landscape connectivity with the dPC or dIIC metrics. This mode is useful to identify the priority links to conserve: the ones with the highest contribution to the overall landscape connectivity (highest dPC or dIIC value).\cr
#' -   If \code{change! = NULL}, the function replaces one by one each of the links existing in the landscape network and calculates the impact of that link change on landscape connectivity with the dPC or dIIC metrics. This mode is useful to identify the priority links to both conserve and restore. Positive dPC or dIIC values correspond to links losses or degradation, and the priority links to conserve correspond to those with the highest positive values. Negative dPC or dIIC values correspond to links improvements, and the priority links to restore correspond to those with the smallest negative values. This mode requires additional information, a \bold{distance matrix} with the new distance values between all pairs of nodes. These new distance values will be in general different than the ones in the \code{distance} parameter. A smaller distance corresponds to an increase in the quality or strength of the link between two patches in a given change or restoration scenario. A higher distance means that the connection between those two patches gets weaker corresponding to a degradation scenario. All types of combinations and different types of changes are possible for each of the links. For example, some connections may be improved, some others may decrease their quality or even disappear completely (i.e., new distance = NA), and some other links may suffer no change at all in the same analysis, depending on the particular new distance values for each link.
#' @returns
#'-   If only \bold{one distance} was used in the parameter \code{distance_thresholds} then return an object of class \code{data.frame} with the link removal or/and change values.\cr
#'-   If you add \bold{\code{overall = TRUE}}, then a list containing the \code{data.frame} class object with the link removal or/and change values and a \code{data.frame} with the overall connectivity values will be returned.\cr
#'-   If you use \bold{multiple distance thresholds} (e.g, \code{distance_thresholds = c(1000, 5000, 80000)}), the resulting data should be returned in the form of a \code{list}, wherein each \code{list} item contains the resulting objects for each distance threshold.
#' @importFrom utils txtProgressBar setTxtProgressBar write.csv
#' @importFrom raster unique
#' @importFrom sf st_as_sf
#' @importFrom stats median

MK_dPCIIC_links <- function(nodes,
                            attribute = NULL,
                            LA = NULL,
                            area_unit = "m2",
                            distance = list(type = "centroid", resistance = NULL),
                            metric = c("IIC","PC"),
                            probability = NULL,
                            distance_thresholds = NULL,
                            threshold = NULL,
                            removal = TRUE,
                            change = NULL,
                            overall = FALSE,
                            parallel = NULL,
                            parallel_mode = NULL,
                            write = NULL,
                            intern = TRUE) {
  . = NULL
  if (missing(nodes)) {
    stop("error missing file of nodes")
  } else {
    if (is.numeric(nodes) | is.character(nodes)) {
      stop("error missing file of nodes")
    }
  }

  if (isTRUE(unique(c("IIC", "PC") %in% metric))) {
    metric = "IIC"; instr_dist <- TRUE
  } else {
    if (metric == "PC") {
      if (!is.null(probability) & !is.numeric(probability)) {
        stop("error missing probability")
      }
      instr_dist <- FALSE
    } else if (metric == "IIC"){
      instr_dist <- TRUE
    } else {
      stop("Type must be either 'IIC', or 'PC'")
    }
  }

  if(missing(write)){
    write = NULL
  } else {
  if(!is.null(write)) {
    if (!dir.exists(dirname(write))) {
      stop("error, output folder does not exist")
    }
   }
  }

  if(isFALSE(parallel)){
    parallel <- NULL
  } else if (isTRUE(parallel)){
    message(paste0("The number of available cores is ", as.numeric(availableCores()),
                   ", so ", as.numeric(availableCores()), " cores will be used."))
    parallel <- as.numeric(availableCores())-2
    parallel_mode <- 1
  } else if((!is.null(parallel))){
    if(!is.numeric(parallel)){
      stop("if you use parallel argument then you need a numeric value")
    }
    if(is.null(parallel_mode)){
      parallel_mode <- 1
    }
  } else {
    parallel <- NULL
  }

  #Nodes
  if(class(nodes)[1] == "SpatialPolygonsDataFrame" | class(nodes)[1] == "sf" | class(nodes)[1] == "SpatVector"){
    if (nrow(nodes) < 2) {
      stop("error, you need more than 2 nodes")
    }

    if(class(nodes)[1] == "SpatialPolygonsDataFrame"){
      nodes <- st_as_sf(nodes)
    }

    nodes$IdTemp <- 1:nrow(nodes); idT <- "IdTemp"
    attribute_1 <- nodesfile(nodes, id = idT, attribute = attribute,
                             area_unit = area_unit,
                             restoration = NULL,
                             write = NULL)
  } else if (class(nodes)[1] == "RasterLayer"){
    if(length(raster::unique(nodes)) < 2){
      stop("error, you need more than 2 nodes")
    }
    idT <- NULL
    attribute_1 <- nodesfile(nodes, id = idT, attribute = attribute,
                             area_unit = area_unit,
                             restoration = NULL,
                             write = NULL)
    attribute_1[,1] <- 1:nrow(attribute_1)
  } else if (class(nodes)[1] == "data.frame"){
    attribute_1 <- nodes; idT <- NULL
    attribute_1[,1] <- 1:nrow(attribute_1); names(attribute_1)[1:2] <- c("IdTemp", "Area")
    if(dim(nodes)[2] < 2){
      stop("You need two or three columns, see the ?MK_dPCIIC")
    }
  } else {
    stop("error missing file of nodes")
  }

  #Distance
  if(is.matrix(distance)){
    dist <- distance

    if(unique(dim(dist)) != nrow(nodes)){
      stop("nrow(dist) != nrow(nodes)")
    }

    if(isFALSE(unique(rownames(dist) %in% nodes[[1]]))){
      stop("Distance matrix and nodes have different id")
    }

    rownames(dist) <- 1:nrow(dist); colnames(dist) <- 1:ncol(dist)
  } else {
    if(isTRUE(intern)){
      if(!is.null(distance$resistance)){
        message("Estimating distances. This may take several minutes depending on the number of nodes and raster resolution")
      } else {
        if(nrow(attribute_1) > 1000){
          message("Estimating distances. This may take several minutes depending on the number of nodes")
        }
      }
    }

    dist <- tryCatch(distancefile(nodes = nodes,
                                  id = idT,
                                  type = distance$type,
                                  distance_unit = distance$distance_unit,
                                  keep = distance$keep,
                                  resistance = distance$resistance,
                                  resist.units = distance$resist.units,
                                  CostFun = distance$CostFun,
                                  ngh = distance$ngh,
                                  mask = distance$mask,
                                  threshold = distance$threshold,
                                  geometry_out = distance$geometry_out,
                                  bounding_circles = distance$bounding_circles,
                                  parallel = distance$parallel,
                                  ActiveParallel = FALSE,
                                  least_cost.java = distance$least_cost.java,
                                  cores.java = distance$cores.java,
                                  ram.java = distance$ram.java,
                                  pairwise = FALSE,
                                  write = NULL), error = function(err)err)
    #dist <- distance
    if(inherits(dist, "error")){
      close_multiprocess()
      stop(dist)
    }
    if(is.null(idT)){
      rownames(dist) <- 1:nrow(dist); colnames(dist) <- 1:ncol(dist)
    }
  }

  if(is.null(distance_thresholds)){
    distance_thresholds <- median(dist)
  }

  if(isTRUE(intern)){
    message(paste0("Estimating ", metric, " link index. This may take several minutes depending on the number of nodes"))
  }

  if(length(distance_thresholds) > 1 & isTRUE(intern)){
    pb <- txtProgressBar(0,length(distance_thresholds), style = 3)
  }

  if(length(threshold) > 1){dist.i <- dist}

  result_d <- lapply(1:length(distance_thresholds), function(x) {
    x.1 <- distance_thresholds[x]

    if(isTRUE(threshold)){
      if(length(threshold)>1){
        dist <- dist.i
        dist[which(dist <= threshold[x])] <- NA
      } else{
        dist[which(dist <= threshold)] <- NA
      }
    }
    mat1 <- tryCatch(get_sdist(dist_nodes = dist,
                               attr_nodes = attribute_1[,2],
                               metric = metric,
                               probability = probability,
                               distance_threshold = x.1,
                               igraph_Dijkstra = instr_dist,
                               parallel = parallel,
                               return_graph = TRUE,
                               loop = TRUE, G1 = 1000,
                               intern = intern), error = function(err)err)

    if(inherits(mat1, "error")){
      stop("Error in short distance estimation")
    } else {
      attribute_2 <- attribute_1[,2]
      num <- sum(mat1[[1]], na.rm = TRUE)
    }

    if(isTRUE(overall)){
      if(!is.null(LA)){
        if(LA > sum(attribute_2)){
          overall.2 <- data.frame(Index = c(paste0(metric, "num"),
                                            paste0("EC(", metric, ")"),
                                            metric),
                                  Value = c(num, sqrt(num), num / (LA^2)))
          if(overall.2$Value[3] > 1){
            overall.2$Value[3] <- 1
          }

          if(overall.2$Value[2] > LA){
            overall.2$Value[2] <- LA
          }

        } else {
          stop("LA must be greater than the sum of the attributes of the nodes")
        }
      } else {
        overall.2 <- data.frame(Index = c(paste0(metric, "num"),
                                          paste0("EC(", metric, ")")),
                                Value = c(num, sqrt(num)))
      }

      if (!is.null(write)){
        write.csv(overall.2, file = paste0(write, "_Overall","_", "d", x.1,  ".csv"), row.names = FALSE)
      }

      resultp <- list(overall.2); names(resultp) <- paste0("overall_d", x.1)
    }

    connections <- which(col(dist) < row(dist))

    if(isTRUE(removal)) {
      if(metric == "IIC"){
        delta <- connections_dIIC(connections = connections,
                                  rows = nrow(dist),
                                  cols = ncol(dist),
                                  graph_n = mat1[[2]],
                                  attrib_n = attribute_2,
                                  num = num,
                                  G1 = 1000,
                                  min_nodes = 2000,
                                  parallel = parallel,
                                  parallel_mode = if(is.null(parallel_mode)){0}else{parallel_mode},
                                  intern = intern)
      } else {
        delta <- connections_dPC(connections = connections,
                                 rows = nrow(dist),
                                 cols = ncol(dist),
                                 graph_n = mat1[[2]],
                                 attrib_n = attribute_2,
                                 num = num,
                                 G1 = 1000,
                                 min_nodes = 2000,
                                 parallel = parallel,
                                 parallel_mode = if(is.null(parallel_mode)){0}else{parallel_mode},
                                 intern = intern)
      }

      row.i<-(connections - 1) %% nrow(dist) + 1; col.i<-(connections - 1) %/% ncol(dist) + 1

      metric_conn <- data.frame(Id = 1:length(delta),
                                Source = row.i,
                                Destination = col.i,
                                delta = round(delta, 7),
                                check.names = FALSE)
      names(metric_conn)[ncol(metric_conn)] <- paste0("d", metric, "_removal")

      if(isTRUE(overall)){
        resultp[[2]] <- metric_conn; names(resultp)[2] <- paste0("Link_removal_importances_d", x.1)
      } else {
        if(!is.null(change)){
          resultp <- list(metric_conn); names(resultp) <- paste0("Link_removal_importances_d", x.1)
        } else {
          resultp <- metric_conn
        }
      }

      if(!is.null(write)){
        write.csv(metric_conn, file = paste0(write, "_Link_removal_d", x.1,  ".csv"), row.names = FALSE)
      }
    }

    if(!is.null(change)){
      if(metric == "IIC"){
        delta <- connections_dIIC(connections = connections,
                                  rows = nrow(dist),
                                  cols = ncol(dist),
                                  graph_n = mat1[[2]],
                                  attrib_n = attribute_2,
                                  change = change,
                                  dist = dist,
                                  distance_threshold = x.1,
                                  num = num,
                                  G1 = 1000,
                                  min_nodes = 2000,
                                  parallel = parallel,
                                  parallel_mode = if(is.null(parallel_mode)){0}else{parallel_mode},
                                  intern = intern)
      } else {
        delta <- connections_dPC(connections = connections,
                                 rows = nrow(dist),
                                 cols = ncol(dist),
                                 graph_n = mat1[[2]],
                                 dist = dist,
                                 attrib_n = attribute_2,
                                 change = change,
                                 distance_threshold = x.1,
                                 probability= probability,
                                 num = num,
                                 G1 = 1000,
                                 min_nodes = 2000,
                                 parallel = parallel,
                                 parallel_mode = if(is.null(parallel_mode)){0}else{parallel_mode},
                                 intern = intern)
      }

      row.i<-(connections - 1) %% nrow(dist) + 1;
      col.i<-(connections - 1) %/% ncol(dist) + 1

      metric_conn <- data.frame(Id = 1:length(delta),
                                Source = row.i,
                                Destination = col.i,
                                delta = round(delta, 7),
                                check.names = FALSE)
      names(metric_conn)[ncol(metric_conn)] <- paste0("d", metric,"_change")

      if(is.list(resultp)){
        resultp[[length(resultp)+1]] <- metric_conn; names(resultp)[length(resultp)] <- paste0("Link_change_importances_d", x.1)
      } else {
        resultp <- metric_conn
      }

      if (!is.null(write)){
        write.csv(metric_conn, file = paste0(write, "_Link_change_d", x.1,  ".csv"), row.names = FALSE)
      }
    }

    if(length(distance_thresholds) > 1 & isTRUE(intern)) {
      setTxtProgressBar(pb, x)
    }
    return(resultp)
  })
  if(isTRUE(intern)){message("")}
  if(length(distance_thresholds) == 1){
    result_d <- result_d[[1]]
  } else {
    names(result_d) <- paste0("Link_importances_d", distance_thresholds)
  }
  if(isTRUE(intern)){
    message("Done!")
  }
  return(result_d)
}


#' Estimate dIIC link
#'
#' @param connections numeric
#' @param rows numeric
#' @param cols numeric
#' @param change matrix. New distance matrix or connections values
#' @param dist matrix. First distance matrix or connections values
#' @param distance_threshold numeric
#' @param attrib_n numeric. Nodes attribute
#' @param num numeric. numPC ID patches = 1
#' @param graph_n Graph from igraph package
#' @param G1 numeric
#' @param min_nodes numeric
#' @param parallel numeric
#' @param parallel_mode numeric
#' @param intern logical
#' @importFrom utils object.size
#' @importFrom future multicore multisession plan availableCores
#' @importFrom furrr future_map_dbl future_map_dfc
#' @importFrom purrr map_dbl map_dfc
#' @importFrom igraph distances as_ids V delete_edges add_edges
#' @keywords internal

connections_dIIC <- function(connections = NULL,
                             rows = NULL,
                             cols = NULL,
                             change = NULL,
                             dist = NULL,
                             distance_threshold = NULL,
                             attrib_n = NULL,
                             num = NULL,
                             graph_n = NULL,
                             G1 = 1000,
                             min_nodes = 2000,
                             parallel = NULL,
                             parallel_mode = 0,
                             intern = FALSE){
  if(parallel_mode == 0){
    delta <- map_dbl(connections, function(i) {
      graph_nodes.i <- graph_n
      row.i <- (i - 1) %% rows + 1; col.i <- (i - 1) %/% cols + 1

      if(is.null(change)){
        delet.i <- tryCatch(delete_edges(graph_nodes.i, paste0(row.i, "|", col.i)),
                            error = function(err) err)
        if(!inherits(delet.i, "error")){
          graph_nodes.i <- delet.i
        }
      } else {
        d.i <- change[row.i,col.i]

        if(d.i != dist[row.i,col.i]){
          if(d.i< distance_threshold){
            add.i <- add_edges(graph_nodes.i, c(row.i,col.i))
            if(!inherits(add.i, "error")){
              graph_nodes.i <- add.i
            }
          } else {
            return(0)
          }
        } else {
          return(0)
        }
      }

      smat.i <- tryCatch(igraph::distances(graph_nodes.i,
                                        weights = NULL),
                         error = function(err) err)

      if (inherits(smat.i, "error")) {
        stop("Error in short distance estimation")
      } else {
        smat2 <- outer(attrib_n, attrib_n) / (1 + smat.i); smat2[which(is.infinite(smat.i))] <- 0
        num.i <- sum(smat2, na.rm = TRUE); dC <- (num - num.i)/num * 100
        return(dC)
      }
    }, .progress = intern)
    return(delta)
  } else {
    if(parallel_mode == 1){
      works <- as.numeric(availableCores())-1; works <- if(parallel > works){works}else{parallel}

      if(.Platform$OS.type == "unix") {
        strat <- future::multicore
      } else {
        strat <- future::multisession
      }

      m <- as.numeric(object.size(graph_n))* 0.001

      if(m > 600){
        m <- (m + round(m/3)) *1024^2
        options(future.globals.maxSize = m)
      }

      plan(strategy = strat, gc = TRUE, workers = works)
      delta <- future_map_dbl(connections, function(i) {
        graph_nodes.i <- graph_n
        row.i <- (i - 1) %% rows + 1; col.i <- (i - 1) %/% cols + 1

        if(is.null(change)){
          delet.i <- tryCatch(delete_edges(graph_nodes.i, paste0(row.i, "|", col.i)),
                              error = function(err) err)
          if(!inherits(delet.i, "error")){
            graph_nodes.i <- delet.i
          }
        } else {
          d.i <- change[row.i,col.i]

          if(d.i != dist[row.i,col.i]){
            if(d.i< distance_threshold){
              add.i <- add_edges(graph_nodes.i, c(row.i,col.i))
              if(!inherits(add.i, "error")){
                graph_nodes.i <- add.i
              }
            } else {
              return(0)
            }
          } else {
            return(0)
          }
        }

        smat.i <- tryCatch(igraph::distances(graph_nodes.i,
                                          weights = NULL),
                           error = function(err) err)

        if (inherits(smat.i, "error")) {
          stop("Error in short distance estimation")
        } else {
          smat2 <- outer(attrib_n, attrib_n) / (1 + smat.i); smat2[which(is.infinite(smat.i))] <- 0
          num.i <- sum(smat2, na.rm = TRUE); dC <- (num - num.i)/num * 100
          return(dC)
        }
      }, .progress = intern); close_multiprocess(works)
      return(delta)
    } else {
      works <- as.numeric(availableCores())-1;works <- if(parallel > works){works}else{parallel}

      if(.Platform$OS.type == "unix") {
        strat <- future::multicore
      } else {
        strat <- future::multisession
      }

      m <- as.numeric(object.size(graph_n))* 0.001

      if(m > 600){
        m <- (m + round(m/3)) *1024^2
        options(future.globals.maxSize= m)
      }

      plan(strategy = strat, gc = TRUE, workers = works)

      delta <- map_dbl(connections, function(i) {
        graph_nodes.i <- graph_n
        row.i <- (i - 1) %% rows + 1; col.i <- (i - 1) %/% cols + 1

        if(is.null(change)){
          delet.i <- tryCatch(delete_edges(graph_nodes.i, paste0(row.i, "|", col.i)),
                              error = function(err) err)
          if(!inherits(delet.i, "error")){
            graph_nodes.i <- delet.i
          }
        } else {
          d.i <- change[row.i,col.i]

          if(d.i != dist[row.i,col.i]){
            if(d.i< distance_threshold){
              add.i <- add_edges(graph_nodes.i, c(row.i,col.i))
              if(!inherits(add.i, "error")){
                graph_nodes.i <- add.i
              }
            } else {
              return(0)
            }
          } else {
            return(0)
          }
        }
        toV <- as_ids(V(graph_nodes.i))

        if(length(toV) > min_nodes){
          seq_n <- seq(1,length(toV), G1)
          if(!is.null(parallel)){
            smat.i <- tryCatch(future_map_dfc(seq_n, function(y){
              toV.i <- toV[y:(y + G1-1)];y.1 <- igraph::distances(graph_nodes.i,
                                                               to = toV.i[which(!is.na(toV.i))],
                                                               weights = NULL)
              return(y.1)}, .progress = intern),
              error = function(err)err)|> as.matrix(x = _)
            if(inherits(smat.i, "error")){
              close_multiprocess(works)
              message("error probably due to the lack of memory in the parallel process. Makurhini will try using sequential process or you can change allocated memory (e.g., options(future.globals.maxSize= 2118123520)), before run again this function")
              smat.i <- map_dfc(seq_n, function(y){
                toV.i <- toV[y:(y + G1-1)];y.1 <- igraph::distances(graph_nodes.i,
                                                                 to = toV.i[which(!is.na(toV.i))],
                                                                 weights = NULL)
                return(y.1)}, .progress = intern) |> as.matrix(x = _)
            }
          } else {
            smat.i <- tryCatch(map_dfc(seq_n, function(y){
              toV.i <- toV[y:(y + G1-1)];y.1 <- igraph::distances(graph_nodes.i,
                                                               to = toV.i[which(!is.na(toV.i))],
                                                               weights = NULL)
              return(y.1)}, .progress = intern),
              error = function(err)err)|> as.matrix(x = _)
          }
        } else {
          smat.i <- tryCatch(igraph::distances(graph_nodes.i,
                                            weights = NULL),
                             error = function(err) err)
        }

        if (inherits(smat.i, "error")) {
          stop("Error in short distance estimation")
        } else {
          smat2 <- outer(attrib_n, attrib_n) / (1 + smat.i); smat2[which(is.infinite(smat.i))] <- 0
          num.i <- sum(smat2, na.rm = TRUE); dC <- (num - num.i)/num * 100
          return(dC)
        }
      }, .progress = intern); close_multiprocess(works)
      return(delta)
    }
  }
}


#' Estimate dPC link
#'
#' @param connections numeric
#' @param rows numeric
#' @param cols numeric
#' @param change matrix. New distance matrix or connections values
#' @param dist matrix. First distance matrix or connections values
#' @param distance_threshold numeric
#' @param probability numeric
#' @param attrib_n numeric. Nodes attribute
#' @param num numeric. numPC ID patches = 1
#' @param graph_n Graph from igraph package
#' @param G1 numeric
#' @param min_nodes numeric
#' @param parallel numeric
#' @param parallel_mode numeric
#' @param intern logical
#' @importFrom utils object.size
#' @importFrom future multicore multisession plan availableCores
#' @importFrom furrr future_map_dbl future_map_dfc
#' @importFrom purrr map_dbl map_dfc
#' @importFrom cppRouting get_distance_matrix
#' @keywords internal
connections_dPC <- function(connections = NULL,
                            rows = NULL,
                            cols = NULL,
                            change = NULL,
                            dist = NULL,
                            distance_threshold = NULL,
                            probability = NULL,
                            graph_n = NULL,
                            attrib_n = NULL,
                            num = NULL,
                            G1 = 1000,
                            min_nodes = 2000,
                            parallel = NULL,
                            parallel_mode = 0,
                            intern = FALSE){
  if(parallel_mode == 0){
    delta <- map_dbl(connections, function(i) {
      graph_nodes.i <- graph_n
      row.i <- (i - 1) %% rows; col.i <- (i - 1) %/% cols
      toV <- as.numeric(graph_nodes.i$dict$ref)
      if(is.null(change)){
        graph_nodes.i$data <- graph_nodes.i$data[-which(graph_nodes.i$data$from == row.i &
                                                          graph_nodes.i$data$to == col.i |
                                                          graph_nodes.i$data$from == col.i &
                                                          graph_nodes.i$data$to == row.i),]

      } else {
        change.i <- change[row.i+1, col.i+1]
        if(change.i != dist[row.i+1,col.i+1]){
          if(is.null(probability)){
            k <- (1 / distance_threshold); d.i <- -log(exp(-k * change.i))
          } else {
            d.i <- -log(exp((change.i * log(probability))/distance_threshold))
          }
          graph_nodes.i$data$dist[which(graph_nodes.i$data$from == row.i &
                                          graph_nodes.i$data$to == col.i |
                                          graph_nodes.i$data$from == col.i &
                                          graph_nodes.i$data$to == row.i)] <- d.i

        } else {
          return(0)
        }
      }

      smat.i <- get_distance_matrix(Graph = graph_nodes.i,
                                    from = toV, to = toV, algorithm = "phast")

      if (inherits(smat.i, "error")) {
        stop("Error in short distance estimation")
      } else {
        smat.i <- exp(-smat.i); smat.i[is.infinite(smat.i)] <- 0
        smat.i <- outer(attrib_n, attrib_n) * smat.i
        num.i <- sum(smat.i, na.rm = TRUE); dC <- (num - num.i)/num * 100
        #dC <- (num.i - num)/num.i * 100 seria así? de esta forma si hay ganancias son porcentajes positivos y no negativos
        return(dC)
      }
    }, .progress = intern)
  } else {
    if(parallel_mode == 1){
      works <- as.numeric(availableCores())-1; works <- if(parallel > works){works}else{parallel}

      if(.Platform$OS.type == "unix") {
        strat <- future::multicore
      } else {
        strat <- future::multisession
      }

      m <- as.numeric(object.size(graph_n))* 0.001

      if(m > 600){
        m <- (m + round(m/3)) *1024^2
        options(future.globals.maxSize = m)
      }

      plan(strategy = strat, gc = TRUE, workers = works)
      delta <- future_map_dbl(connections, function(i) {
        graph_nodes.i <- graph_n
        row.i <- (i - 1) %% rows; col.i <- (i - 1) %/% cols
        toV <- as.numeric(graph_nodes.i$dict$ref)
        if(is.null(change)){
          graph_nodes.i$data <- graph_nodes.i$data[-which(graph_nodes.i$data$from == row.i &
                                                            graph_nodes.i$data$to == col.i |
                                                            graph_nodes.i$data$from == col.i &
                                                            graph_nodes.i$data$to == row.i),]

        } else {
          change.i <- change[row.i+1, col.i+1]
          if(change.i != dist[row.i+1,col.i+1]){
            if(is.null(probability)){
              k <- (1 / distance_threshold); d.i <- -log(exp(-k * change.i))
            } else {
              d.i <- -log(exp((change.i * log(probability))/distance_threshold))
            }
            graph_nodes.i$data$dist[which(graph_nodes.i$data$from == row.i &
                                            graph_nodes.i$data$to == col.i |
                                            graph_nodes.i$data$from == col.i &
                                            graph_nodes.i$data$to == row.i)] <- d.i

          } else {
            return(0)
          }
        }

        smat.i <- get_distance_matrix(Graph = graph_nodes.i,
                                      from = toV, to = toV, algorithm = "phast")

        if (inherits(smat.i, "error")) {
          stop("Error in short distance estimation")
        } else {
          smat.i <- exp(-smat.i); smat.i[is.infinite(smat.i)] <- 0
          smat.i <- outer(attrib_n, attrib_n) * smat.i
          num.i <- sum(smat.i, na.rm = TRUE); dC <- (num - num.i)/num * 100
          #dC <- (num.i - num)/num.i * 100 seria así? de esta forma si hay ganancias son porcentajes positivos y no negativos
          return(dC)
        }
      }, .progress = intern); close_multiprocess(works)
      return(delta)
    } else {
      works <- as.numeric(availableCores())-1;works <- if(parallel > works){works}else{parallel}

      if(.Platform$OS.type == "unix") {
        strat <- future::multicore
      } else {
        strat <- future::multisession
      }

      m <- as.numeric(object.size(graph_n))* 0.001

      if(m > 600){
        m <- (m + round(m/3)) *1024^2
        options(future.globals.maxSize= m)
      }

      plan(strategy = strat, gc = TRUE, workers = works)
      delta <- map_dbl(connections, function(i) {
        graph_nodes.i <- graph_n
        row.i <- (i - 1) %% rows; col.i <- (i - 1) %/% cols
        toV <- as.numeric(graph_nodes.i$dict$ref)

        if(is.null(change)){
          graph_nodes.i$data <- graph_nodes.i$data[-which(graph_nodes.i$data$from == row.i &
                                                            graph_nodes.i$data$to == col.i |
                                                            graph_nodes.i$data$from == col.i &
                                                            graph_nodes.i$data$to == row.i),]

        } else {
          change.i <- change[row.i+1, col.i+1]
          if(change.i != dist[row.i+1,col.i+1]){
            if(is.null(probability)){
              k <- (1 / distance_threshold); d.i <- -log(exp(-k * change.i))
            } else {
              d.i <- -log(exp((change.i * log(probability))/distance_threshold))
            }
            graph_nodes.i$data$dist[which(graph_nodes.i$data$from == row.i &
                                            graph_nodes.i$data$to == col.i |
                                            graph_nodes.i$data$from == col.i &
                                            graph_nodes.i$data$to == row.i)] <- d.i

          } else {
            return(0)
          }

        }

        if(length(toV) >= min_nodes){
          seq_n <- seq(1,length(toV), G1)
          if(!is.null(parallel)){
            smat.i <- tryCatch(future_map_dfc(seq_n, function(y){
              toV.i <- toV[y:(y + G1-1)];y.1 <- get_distance_matrix(Graph = graph_nodes.i,
                                                                    from = toV, to = toV.i[which(!is.na(toV.i))],
                                                                    allcores=FALSE, algorithm = "phast")
              invisible(gc())
              return(y.1)
            }, .progress = intern),
            error = function(err)err) |> as.matrix(x = _)

            if(inherits(smat.i, "error")){
              close_multiprocess(works)
              message("error probably due to the lack of memory in the parallel process. Makurhini will try using sequential process or you can change allocated memory (e.g., options(future.globals.maxSize= 2118123520)), before run again this function")
              smat.i <- map_dfc(seq_n, function(y){
                toV.i <- toV[y:(y + G1-1)];y.1 <- get_distance_matrix(Graph = graph_nodes.i,
                                                                      from = toV,
                                                                      to = toV.i[which(!is.na(toV.i))],
                                                                      allcores = FALSE,
                                                                      algorithm = "phast")
                return(y.1)
              }, .progress = intern) |> as.matrix(x = _)
            }
          } else {
            smat.i <- tryCatch(map_dfc(seq_n, function(y){
              toV.i <- toV[y:(y + G1-1)];y.1 <- get_distance_matrix(Graph = graph_nodes.i,
                                                                    from = toV,
                                                                    to = toV.i[which(!is.na(toV.i))],
                                                                    allcores = FALSE,
                                                                    algorithm = "phast")
              return(y.1)
            }, .progress = intern),
            error = function(err)err) |> as.matrix(x = _)
          }
        } else {
          smat.i <- tryCatch(get_distance_matrix(Graph = graph_nodes.i,
                                                 from = toV, to = toV,
                                                 algorithm = "phast"),
                             error = function(err)err)
        }

        if (inherits(smat.i, "error")) {
          stop("Error in short distance estimation")
        } else {
          smat.i <- exp(-smat.i); smat.i[is.infinite(smat.i)] <- 0
          smat.i <- outer(attrib_n, attrib_n) * smat.i
          num.i <- sum(smat.i, na.rm = TRUE); dC <- (num - num.i)/num * 100
          return(dC)
        }
      }, .progress = intern)
    }
  }
}
OscarGOGO/Makurhini documentation built on Jan. 9, 2025, 1:20 p.m.