Background

This R-package contains utility functions to easily and consistently perform several important tasks required to prepare data for uploading to the R&R Webtool GitLab repositories. The R&R Webtool Maintenance Manual gives full documentation of the design of the webtool, describes how each data element for a webtool taxon is prepared, and gives procedures to upload the output from functions in this package to the GitLab repositories which generate the publicly visible webtool. Please consult that document for both an overview and precise details of tasks to be performed by each function in this package.

A copy of the R&R Webtool Maintenance Manual is always available on the N-drive here: N:\Evolutionary Ecology\Restore & Renew\Webtool R&R\Documentation

The functions and associated tasks included in this package are listed in the table below. See the section for each function later in this document for a detailed description of the function and examples of its use.

Function | Task(s) -----------------|------------------------------------------- formatInfoImages | Scale and/or crop jpg-format images for the species information section showModelCovars | Show the current GDM model covariates and paths to covariate data files setModelCovars | Set GDM model covariate paths to structure used on the device hosting the R&R webtool R-package. GDM model threshold may also be set to a new value update_occFiles | Update filtered ALA herbarium data files for a set of taxa makeOccPlot | Make leaflet maps as png images showing filtered ALA herbarium record distribution for a set of taxa makeBRECIplots | Generate png files of BRECI plots for a set of taxa trim_GDM_domain | Trim an R&R GDM model domain polygon using special zone information

The functions

formatInfoImages

The species information section of the webtool provides a set of images highlighting important species characteristics. A set of images for each taxon needs to be sourced in JPEG format as described the R&R Webtool Maintenance Manual. Collated images are stored on the N-drive in the folder ???? but in their original state covers a wide spectrum of image sizes and aspect ratios. This function will process a set of jpg files in the source folder and output a set of jpg images with consistent target height and width dimensions specified in the script which renders an output html document for an end user of the R&R Webtool. Processed images are written to the specified destination folder.

The target height of output images is 500 pixels and target width is 576 pixels giving a target aspect ratio of 1.152. The algorithm developed to perform the re-scaling ensures that the original aspect ratio is conserved to avoid distortion. This may result in a image area embedded within the the target image frame. If this occurs, the background is set to black to pad the re-sized image centering it within the target frame.

An example may make this clearer. The first image shows the original format has height of 499 pixels and width of 750 pixels, and therefore an aspect ratio of 1.503006. To fit the target image frame but maintain the original aspect ratio to avoid distortion, we end up with a height of 384 pixels and a width of 576 pixels (second image). So the re-scaled image is padded symmetrically so the shortest dimension matches the target image frame (see third image).

![An original image with height of 499 pixels, height of 750 pixels and aspect ratio of about 1.5.](Angophora hispida buds flowers and leaves.jpg)

![Image after the re-sizing pahse of processing. Aspect ratio has been maintained ("respected") giving dimensions of 384 px (height) and 576 px (width).](Angophora_hispida_buds_ flowers_and_leaves_rescaled.jpg)

![Padding is added to the dimension shorter than target to reach target dimensions and hence aspect ratio.](Angophora_hispida_buds_ flowers_and_leaves_processed.jpg)

Small images (those with height and/or width less than the target dimensions) are some times encountered. No rescaling of small images (to avoid image quality loss) is performed, only padding on height and width to fill the target image frame.

showModelCovars

During the process of setting up a new batch of taxa for uploading to the webtool data repository it is necessary to review the list of covariates required by the GDM models for the set of species. This can be done species by species by using the load() function in the R-console, and inspecting the objects internal structure and data content. However, this is very laborious and potentially error-prone with more than a few species, so this function allows the user to list important information for a given GDM model object produced by Jason Bragg's toolkit. The model takes as its only parameter the full path to an R&R GDM model object and outputs a formatted object content summary.

The function is designed to be called with the full path to a single R&R GDM object file. The R&R GDM object for Hakea teretifolia is called "Hakea_teretifolia_genetic_model.Rd". If you wish to examine the model and it is currently stored in the folder "C:\Stuff\GDM", then you might call the function in the this way:

fileName <- "C:\\Stuff\\GDM\\Hakea_teretifolia_genetic_model.Rd"
showModelCovars(fileName)

setModelCovars

The R&R webtool allows the use of two types of covariate files: Environmental covariates, and ancestry coefficients (Q-files). When an R&R GDM object is produced by Jason's R-functions, paths to covariate rasters are saved as they appear in his development environment. This function performs a few important refinements to the R&R GDM model contents to enable it to be used in the production environment found on the web-server hosting the R session supporting the webtool.

In particular, paths to environmental covariates and ancestry proportion coefficient files (Q-files) must be set to access a file structure which differs from the model building or development environment. This function looks at the covariates listed for the fitted GDM model, confirms that the matching covariate files are present in the necessary location, and then sets the necessary elements of the model object to load raster stacks of covariates from the appropriate folders found on the server, namely:

An additional parameter was added to allow this function to change the threshold value stored in Jason's R&R GDM object. The default value for the parameter is NULL which leaves the current value of threshold unchanged. Calling the function with a value between 0 and 1 will set the threshold to this new value. The reason for permitting the change of threshold was the decision to shift the default threshold from 0.3 to 0.2 for GDMs except that for Acacia suaveolens.

The updated model object is saved over the original model object. For safety, it is recommended that a copy is archived in a folder not used by this function.

update_occFiles

The function updates the set of herbarium occurrence files for one or more taxa passed tot eh function in the parameter theseTaxa. This parameter has a default value of "all" which causes the full list of taxa found in the webtool taxon table to be processed.

The source path is interpreted as high-level folders within which a folder exists for each taxon listed in the current version of the R&R Webtool taxonTable stored at /rbgsyd-restore-and-renew-git-repos/rbgsyd-restore-and-renew-data/data.

Source files are expected to be as produced by the sequence of functions fetchALA() and filterALAdata in the package processALA. The function update_occFiles strips the fields in these files to the small set used for the webtool.

The destination folder has no sub-folders; output files are written directly to the folder and will be uploaded to the GitLab repository as a group of files.

makeOccPlot

This function produces a map of occurrence records for a selected taxon suitable for use in the R&R webtool. The map is output in PNG format.

The function has been designed to work with a fixed structure and filename convention for input data. It is assumed that within the folder defined by the parameter baseDataPath, there is a sub-folder named for each taxon listed in the parameter taxonNames. Then, within each taxon sub-folder there is a file with a name structured as "genus_specificEpithet_herbariumRecords_filtered.csv" which has been produced by the functions fetchALA() and filterALA() in the package processALA.

Output files are placed in the folder defined in parameter outputPath with a name like "genus_specificEpithet_distibution.png". For example, "Neolitsea_dealbata_distribution.png".

The spatial bounds of the output map are set by the parameter extentStr which may have one of three values:

trim_GDM_domain

This function is designed to excise one or more polygons from the GDM bounds polygon pointed to by the parameter thisDomain. These polygons represent areas for which it is inappropriate to produce an output from the webtool, and could include:

fetchPlantNET

Searches for a matching taxon on PlantNET and, if found, return a named list containing useful information extracted from the returned web page. No resolvable name returns "No_data" list elements.

mergeNewData

Merge new data into the locally cloned DATA repository.



peterbat1/RandR.webtool.dataPrep documentation built on Jan. 7, 2023, 3:05 p.m.