#' Parallel computation in lidR
#'
#' This document explains how to process point clouds by taking advantage of parallel processing in the
#' lidR package. The lidR package has two levels of parallelism, which can make it difficult to
#' understand how it works. This page aims to give users a clear overview of how to take advantage
#' of multicore processing, even if they are not comfortable with the concept of parallelism.
#'
#' @section Algorithm-based parallelism:
#' When processing a point cloud we are applying an algorithm to data. This algorithm may or may not be
#' natively parallel. In lidR some algorithms are fully computed in parallel, some are not because they are
#' not parallelizable, and some are only partially parallelized, meaning that some portions of the code
#' are computed in parallel and some are not. When an algorithm is natively parallel in lidR it is always
#' a C++-based parallelization with OpenMP. The advantage is that the computation is faster without any
#' consequence for memory usage because the memory is shared between the processors. In short,
#' algorithm-based parallelism provides a significant gain without any cost for your R session or
#' your system (but obviously with a greater workload for the processors). By default lidR uses
#' half of your cores, but you can control this with \link{set_lidr_threads}. For example, the \link{lmf}
#' algorithm is natively parallel. The following code is computed in parallel:
#' \preformatted{
#' las <- readLAS("file.las")
#' tops <- locate_trees(las, lmf(2))
#' }
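#' As a quick sketch of how the number of OpenMP threads can be controlled (assuming a machine with at
#' least four cores; `get_lidr_threads()` simply reports the current setting):
#' \preformatted{
#' set_lidr_threads(4L)   # request 4 OpenMP threads
#' get_lidr_threads()     # check the current setting
#' las  <- readLAS("file.las")
#' tops <- locate_trees(las, lmf(2))
#' }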
#' However, as stated above, not all algorithms are parallelized or even parallelizable. For example,
#' \link{li2012} is not parallelized. The following code is computed in serial:
#' \preformatted{
#' las <- readLAS("file.las")
#' trees <- segment_trees(las, li2012())
#' }
#' To know which algorithms are parallelized, users can refer to the documentation or use the
#' function \link{is.parallelised}.
#' \preformatted{
#' is.parallelised(lmf(2)) #> TRUE
#' is.parallelised(li2012()) #> FALSE
#' }
#' @section Chunk-based parallelism:
#' When processing a `LAScatalog`, the internal engine splits the dataset into chunks and each chunk is
#' read and processed sequentially in a loop. This loop can be parallelized with the
#' `future` package. By default the chunks are processed sequentially, but they can be processed
#' in parallel by registering an evaluation strategy. For example, the following code is evaluated
#' sequentially:
#' \preformatted{
#' ctg <- readLAScatalog("folder/")
#' out <- pixel_metrics(ctg, ~mean(Z))
#' }
#' But this one is evaluated in parallel with two cores:
#' \preformatted{
#' library(future)
#' plan(multisession, workers = 2L)
#' ctg <- readLAScatalog("folder/")
#' out <- pixel_metrics(ctg, ~mean(Z))
#' }
#' With chunk-based parallelism any algorithm can be parallelized by processing several subsets of
#' a dataset simultaneously. However, there is a significant cost associated with this type of parallelism:
#' when processing several chunks at a time, the computer must load the corresponding point clouds into
#' memory. Assuming the user processes one square kilometer chunks in parallel with 4 cores, then 4 chunks
#' are loaded into memory at once. This may be too much, and the speed-up is not guaranteed since there is
#' some overhead involved in reading several files at a time. Once this point is understood, chunk-based
#' parallelism is very powerful, since every algorithm can be parallelized whether or not it is natively
#' parallel. It also makes it possible to parallelize the computation across several machines on a network
#' or to work on an HPC.
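#' For example, a hedged sketch of limiting the memory used per worker by reducing the chunk size
#' before processing in parallel (`opt_chunk_size()` is part of the LAScatalog engine options; the
#' 500 m value is only an illustration):
#' \preformatted{
#' library(future)
#' plan(multisession, workers = 2L)
#' ctg <- readLAScatalog("folder/")
#' opt_chunk_size(ctg) <- 500   # process 500 x 500 m chunks to limit memory per worker
#' out <- pixel_metrics(ctg, ~mean(Z))
#' }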
#'
#' @section Nested parallelism - part 1:
#' Previous sections stated that some algorithms are natively parallel, such as \link{lmf}, and some are
#' not, such as \link{li2012}. Either way, users can split the dataset into chunks and process them
#' simultaneously with the LAScatalog processing engine. Assuming the user's computer has four cores,
#' what happens in this case?
#' \preformatted{
#' library(future)
#' plan(multisession, workers = 4L)
#' set_lidr_threads(4L)
#' ctg <- readLAScatalog("folder/")
#' out <- locate_trees(ctg, lmf(2))
#' }
#' Here the catalog will be split into chunks that will be processed in parallel, and each computation
#' itself involves a parallelized task. This is nested parallelism and it is dangerous! Fortunately,
#' the lidR package handles such cases and by default gives precedence to chunk-based
#' parallelism. In this case the chunks will be processed in parallel and the points will be processed
#' serially because OpenMP is disabled.
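#' A hedged sketch of the equivalent explicit configuration, shown only to illustrate what the engine
#' does by default in this situation (it is not additional required code):
#' \preformatted{
#' library(future)
#' plan(multisession, workers = 4L)
#' set_lidr_threads(1L)  # points processed serially within each worker
#' ctg <- readLAScatalog("folder/")
#' out <- locate_trees(ctg, lmf(2))
#' }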
#'
#' @section Nested parallelism - part 2:
#' The previous section explained the rules of precedence, but the user can tune the engine more finely. Let's
#' define the following function:
#' \preformatted{
#' myfun <- function(las, ws, ...)
#' {
#'   las  <- normalize_height(las, tin())
#'   tops <- locate_trees(las, lmf(ws))
#'   return(tops)
#' }
#'
#' out <- catalog_map(ctg, myfun, ws = 5)
#' }
#' This function uses two algorithms, one that is partially parallelized (`tin`) and one that is fully
#' parallelized (`lmf`). The user can manually use both OpenMP and future. By default the engine
#' gives precedence to chunk-based parallelism because it works in all cases, but the user can
#' impose something else. In the following, 2 workers are attributed to future and 2 threads are
#' attributed to OpenMP.
#' \preformatted{
#' plan(multisession, workers = 2L)
#' set_lidr_threads(2L)
#' catalog_map(ctg, myfun, ws = 5)
#' }
#' The rule is simple: if the number of workers needed (future workers times OpenMP threads) is greater
#' than the number of available cores, then OpenMP is disabled. Let us suppose we have 4 cores:
#' \preformatted{
#' # 2 chunks 2 threads: OK
#' plan(multisession, workers = 2L)
#' set_lidr_threads(2L)
#'
#' # 4 chunks 1 thread: OK
#' plan(multisession, workers = 4L)
#' set_lidr_threads(1L)
#'
#' # 1 chunk 4 threads: OK
#' plan(sequential)
#' set_lidr_threads(4L)
#'
#' # 3 chunks 2 threads: NOT OK
#' # Needs 6 workers; OpenMP threads are set to 1, i.e. sequential processing
#' plan(multisession, workers = 3L)
#' set_lidr_threads(2L)
#' }
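#' A hedged sketch of checking the available resources before choosing a split (`availableCores()`
#' comes from the future package; the half-and-half split is only one possible choice):
#' \preformatted{
#' library(future)
#' n <- availableCores()                  # e.g. 4 on a quad-core machine
#' plan(multisession, workers = floor(n / 2))
#' set_lidr_threads(floor(n / 2))
#' }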
#'
#' @section Complex computing architectures:
#' For more complex computing architectures, such as multiple computers controlled remotely
#' or an HPC, finer tuning might be necessary. Using
#' \preformatted{
#' options(lidR.check.nested.parallelism = FALSE)
#' }
#' lidR will no longer check for nested parallelism and will never automatically disable OpenMP.
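#' For example, a hypothetical sketch of combining a remote cluster with OpenMP on each machine (the
#' host names are placeholders and the 4-thread setting assumes each node has at least 4 cores):
#' \preformatted{
#' library(future)
#' options(lidR.check.nested.parallelism = FALSE)
#' plan(cluster, workers = c("localhost", "192.168.0.2"))
#' set_lidr_threads(4L)
#' ctg <- readLAScatalog("folder/")
#' out <- locate_trees(ctg, lmf(2))
#' }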
#' @md
#' @name lidR-parallelism
NULL