R/fCover.R

Defines functions fCover

Documented in fCover

#' Fractional Cover Analysis
#' 
#' fCover takes a classified high resolution image, e.g. vegetation and non-vegetation based on Landsat and calculates cover fractions for
#' pixels of a coarser resolution, e.g. MODIS.
#' 
#' @param classImage high resolution RasterLayer containing a landcover classification, e.g. as obtained by \link{superClass}.
#' @param predImage coarse resolution RasterLayer for which fractional cover will be estimated.
#' @param nSamples Integer. Number of pixels to sample from predImage to train the regression model
#' @param classes Integer. Classes for which fractional cover should be estimated (one or more). 
#' @param model Character. Which model to fit for image regression. See \link[caret]{train} for options. Defaults to randomForest ('rf')
#' @param maxNA Numeric. Maximal proportion of NAs allowed in training pixels. 
#' @param tuneLength Integer. Number of levels for each tuning parameters that should be generated by train. See Details and \code{\link[caret]{train}}. 
#' @param method Character. Resampling method for parameter tuning. Defaults to 10 fold cross-validation. See \code{\link[caret]{trainControl}} for options.
#' @param verbose Logical. Print progress information.
#' @param clamp Logical. Enforce results to stay within [0,1] interval. Values <0 are reset to 0 and values >1 to 1. 
#' @param filename Character. Filename of the output raster file (optional).
#' @param ... further arguments to be passed to \link[caret]{trainControl} and \code{\link[raster]{writeRaster}}
#' @details 
#' fCover gets the pixel values in a high resolution classified image that correspond to 
#' randomly selected moderate resolution pixels and then calculates the percent of 
#' the classified image pixels that represent your cover type of interest. In other words, if 
#' your high resolution image has a pixel size of 1m and your moderate resolution image has a 
#' pixel size of 30m the sampling process would take a block of 900 of the 1m resolution pixels 
#' that correspond to a single 30m pixel and calculate the percentage of the 1m pixels that 
#' are forest. For example, if there were 600 forest pixels and 300 non-forest pixels the value 
#' given for the output pixel would be 0.67 since 67% of the block of 1m pixels were forest. 
#' 
#' fCover relies on the train() function from the caret package which provides access to a huge number of classifiers.
#' Please see the available options at \link[caret]{train}. The default classifier (randomForest) we chose has been shown
#' to provide very good results in image regression and hence it is recomended you start with this one. If you choose a different
#' classifier, make sure it can run in regression mode.
#' 
#' Many models require tuning of certain parameters. Again, this is handled by \link[caret]{train} from the caret package.
#' With the tuneLength argument you can specify how many different values of each tuning parameter should be tested. The Random Forest
#' algorithm for example can be tuned by varying the mtry parameter. Hence, by specifying tuneLength = 10, ten different levels
#' of mtry will be tested in a cross-validation scheme and the best-performing value will be chosen for the final model.
#' 
#' If you register a parallel backend before, model fitting will run in parallel.
#' 
#' If the total no-data values for a block of high resolution pixels is greater than maxNA then 
#' it will not be included in the training data set since there is too much missing data to provide 
#' a reliable cover percentage. If the no-data proporton is less then maxNA the no-data pixels are removed 
#' from the total number of pixels when calculating the percent cover. 
#' 
#' @return 
#' Returns a list with two elements: models contains the fitted models evaluated in tenfold cross-validation (caret train objects); 
#' fCover contains a RasterStack with a fractional cover layer for each requested class.
#' @export 
#' @seealso \link{superClass}
#' @examples
#' \donttest{
#' library(raster)
#' library(caret)
#' ## Create fake input images
#' data(rlogo)
#' lsat <- rlogo
#' agg.level <- 9
#' modis <- aggregate(lsat, agg.level)
#' 
#' ## Perform classification
#' lc      <- unsuperClass(lsat, nClass=2)
#' 
#' ## Calculate the true cover, which is of course only possible in this example, 
#' ## because the fake corse resolution imagery is exactly res(lsat)*9
#' trueCover <- aggregate(lc$map, agg.level, fun = function(x, ...){sum(x == 1, ...)/sum(!is.na(x))})
#' 
#' ## Run with randomForest and support vector machine (radial basis kernel)
#' ## Of course the SVM is handicapped in this example due to poor tuning (tuneLength)
#' par(mfrow=c(2,3))
#' for(model in c("rf", "svmRadial")){
#'    fc <- fCover(
#'            classImage = lc$map ,
#'            predImage = modis,
#'            classes=1,
#'            model=model,
#'            nSample = 50,
#'            number = 5,
#'            tuneLength=2
#'    )           
#'    
#'    ## How close is it to the truth?
#'    compare.rf <- trueCover - fc$map
#'    plot(fc$map, main = paste("Fractional Cover: Class 1\nModel:", model))
#'    plot(compare.rf, main = "Diffence\n true vs. predicted")
#'    plot(trueCover[],fc$map[],  xlim = c(0,1), ylim =c(0,1),
#'            xlab = "True Cover", ylab = "Predicted Cover" )
#'    abline(coef=c(0,1))
#'    rmse <- sqrt(cellStats(compare.rf^2, sum))/ncell(compare.rf)
#'    r2 <- cor(trueCover[], fc$map[], "complete.obs")
#'    text(0.9,0.1,paste0(paste(c("RMSE:","R2:"),
#'                            round(c(rmse, r2),3)),collapse="\n"), adj=1)
#' }
#'
#' ## Reset par
#' par(mfrow=c(1,1))
#' }
fCover <- function(classImage, predImage, nSamples = 1000, classes = 1, model = "rf", tuneLength = 3, 
        method = "cv",  maxNA = 0, clamp = TRUE, filename = NULL, verbose, ...){
    
	predImage <- .toRaster(predImage)
	classImage <- .toRaster(classImage)
    if(!missing("verbose")) .initVerbose(verbose)
    
    ## Resolution check
    r1         <- res(predImage)
    r2         <- res(classImage)[1]
    if(r2 >= max(r1)) stop("Resolution of classImage must be smaller than the resolution of predImage")
    
    if(nSamples > ncell(predImage)) {
        nSamples <- ncell(predImage)
        warning("nSamples > ncell(predImage). Resetting nSamples to ncell(predImage)")
    }
    
    ## Spit ellipsis into caret::trainControl and raster::writeRaster
    frmls_train <- names(formals(caret::trainControl))
    args  <- c(list(...), method = method)
    args_trainControl  <- args[names(args) %in% frmls_train]
    args_writeRaster   <- args[!names(args) %in% frmls_train]
    args_writeRaster$filename <- if(length(classes) == 1) filename else NULL ## write raster here already during predict if only one layer is output
    
    ## Draw random sample from coarse res image
    .vMessage("Collecting random samples")
    dummy   <- raster(predImage) 
    dummyEx <- extent(crop(dummy, classImage, snap = "in")) ## crop properly to avoid sampling in marginal pixels (otherwise sampleRandom uses sanp='near', potentially resulting in incomple pixels)
    ranSam  <- sampleRandom(predImage, size = nSamples, ext = dummyEx*0.8, xy = TRUE, na.rm = TRUE)
    
    ## Extract classified (high res) values
    .vMessage("Extracting classified pixels")
    d      <- ranSam[,c("x","y")]
    exts <- apply(cbind(d[,1] - r1[1]/2, d[,1] + r1[1]/2, d[,2] - r1[2]/2, d[,2] + r1[2]/2), 1, extent) ## tried this with SpatialPolygons but thats even slower
    vals <- .parXapply(X = exts, XFUN = "lapply", FUN = function(ext) {
                extract(x = classImage, y = ext, na.rm=FALSE)
            }, envir=environment())
    
    ## Calculate fractional cover
    .vMessage("Calculating fractional cover")
    
    tabl <- lapply(vals, function(x) table(x, useNA = "always") / length(x))    
    
    sampledClasses <- unique(unlist(lapply(tabl, names)))
    missingClasses <- setdiff(as.character(classes), sampledClasses)
    if(length(missingClasses)) {
        stop(sprintf("One or more classes are not represented in the sampled pixels. Missing class/classes: %s\nEither this class does not occur in classImage or nSample was too small for a representative sample.", 
                        paste(missingClasses,collapse=", ")), call.=FALSE)
    }
    
    
    fCov <- do.call("rbind", lapply(tabl, "[", as.character(classes)))
    fCov[is.na(fCov)] <- 0
    colnames(fCov)    <- classes
    
    ## NA handling
    ##
    ## This means_ that if there is a NA pixel, it will be weighted proportionately and
    ## to the area of the occuring classes. This is a design decision. Alternatively we
    ## could count it as non-class pixel for either class, however, then these don't sum up 
    ## to unity anymore. ==> In general we should not make use of this option because of the
    ## required assumptions and simply ignore samples with NAs ==> default argument maxNA=0
    ## Maybe we should ditch the maxNA argument alltogether
    fCovNA  <- lapply(tabl, tail, 1)
    include <- unlist(fCovNA <= maxNA)
    # Normalize by 1-NAfrequency
    fCov    <- fCov/vapply(fCovNA, function(x) 1-x, numeric(1))
    if(!any(include)) stop("No non-NA samples!")
    
    .registerDoParallel()
    ## Fit regression model and predict
    fCL <- lapply(1:length(classes), function(cl){
                .vMessage("Fitting regression model for class ", cl)
                
                ## Assemble training data (and remove cells exceeding maxNA)
                trainingData <- data.frame(response = fCov[include, cl], ranSam[include, -c(1:2)])
                
                ## Fit model
                modelFit      <- train(response ~ ., data = trainingData, method = model, tuneLength = tuneLength,  
                        trControl = do.call("trainControl", args_trainControl))
                
                ## Predict  
                .vMessage(paste0("Predicting fractional cover for class ", cl))               
                out <- .paraRasterFun(predImage, rasterFun = raster::predict, args = list(model = modelFit, na.rm = TRUE), wrArgs =  args_writeRaster)              
                list(modelFit, out)                
            })
    
    ## Prepare output and return
    out     <- stack(lapply(fCL, "[[", 2))
    models  <- lapply(fCL, "[[", 1)
    atts     <- attr(classImage@data, "attributes")
    if(clamp) out <- clamp(out, 0, 1, useValues = TRUE)
    names(models) <- names(out) <- paste0("fC_", if(length(atts) > 0) atts[[1]][classes,"value"] else paste0("class",classes))
    
    if(!is.null(filename) & length(classes) > 1) out <- writeRaster(out, filename, ...)
    structure(list(model = models, trainData = d[include,], map = out), class = c("fCover", "RStoolbox"))
    
}

Try the RStoolbox package in your browser

Any scripts or data that you put into this service are public.

RStoolbox documentation built on March 18, 2022, 5:37 p.m.