knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
Liam Lachs
SizeExtractR is a semi-automated workflow designed to improve the speed, reliability, and reproducibility of extracting size metrics from objects within images. On the spectrum of image analysis techniques that range from fully manual to fully automated, SizeExtractR lies in the middle. Note that a full User Guide is available at the data repository (it is currently only accessible at the private link, but will be transferred to the doi link upon publication). Broadly there are three steps to using SizeExtractR.
First, the set of images to be analysed must be organised within a nested directory in which folder names are consistent. For example, a field survey would have one root directory containing year folders, with site folders inside each year folder. The directory structure and folder names are used later to determine the categorical variables that relate to each image (e.g., which year and site is the image from?).
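As a rough illustration of the layout described above, the following base R sketch builds a toy year/site directory tree in a temporary folder and lists its deepest level. The folder names (Survey_Root, 2016, Site_A, and so on) are hypothetical and are not taken from the example dataset.

```r
# Build a toy image-set layout in a temporary folder (all names are hypothetical)
root <- file.path(tempdir(), "Survey_Root")
years <- c("2016", "2017")
sites <- c("Site_A", "Site_B")

for (yr in years) {
  for (st in sites) {
    # recursive = TRUE creates the year folder and the site folder in one call
    dir.create(file.path(root, yr, st), recursive = TRUE, showWarnings = FALSE)
  }
}

# List every folder below the root, relative to the root
all_dirs <- list.dirs(root, full.names = FALSE, recursive = TRUE)

# Keep only the deepest level (year/site), where the images would be stored
leaf_dirs <- all_dirs[grepl("/", all_dirs, fixed = TRUE)]
print(leaf_dirs)
```

Because every image sits at the same depth, each folder level can later be mapped to exactly one categorical variable.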
The next stage of the SizeExtractR workflow is completed in ImageJ software using SizeExtractR macros to speed up an otherwise labour-intensive process of outlining Regions of Interest (ROIs), annotating/naming each ROI, measuring calibration lengths (e.g., ruler in each image), and then saving the data. The SizeExtractR workflow exports one data file (.txt) per image that contains the uncalibrated size data (in pixels) and alphanumeric annotations (ROI label name, manually input by the user) for each ROI in that image. The following size metrics are computed:
Finally, the SizeExtractR R-package is used to perform quality control on ROI annotations, collate all the individual data files across an analysed image set, calibrate size metrics, and produce a single size dataset to be used for other scientific purposes. Overall, the R-package conducts the following interactive steps to build the size dataset:
To help users to learn the full SizeExtractR protocol (organising files, ImageJ, and R-package), a full worked example is provided in the official SizeExtractR User Guide. Here, we assume the user has already completed the steps in ImageJ and is ready to collate their data into a single dataset. Here is an example of an annotated image
This worked example is based on a subset of data from Lachs et al. (2021). This database contains images of the seabed, each including a measurement scale. The images were taken during and 6 months after a coral bleaching event in 2016 in the Solitary Islands Marine Park, Southeast Australia. The research goal here was to assess how the population size structure of the coral Pocillopora aliciae differs before and after the coral bleaching event.
As all ROIs in this study are P. aliciae corals, all have an ROI Type code of Pa. The ROI Label Codes are shown in Table 1.
Table 1. Table of the ROI Label Codes for this worked example, along with corresponding variable names which will be in the final dataset.
library(knitr)
library(kableExtra)
knitr::kable(
  data.frame(
    ROI_Label_code = c("b", "bb", "pm"),
    Corresponding_Variable_Name = c("Mod_Bleached", "Sev_Bleached", "Partial_Mortality")),
  align = "c") %>%
  kableExtra::kable_styling()
Examples of the study organism and table of the annotations used during image analysis are shown in Fig. 1, and an example analysed image (from ImageJ) is shown in Fig. 2.
Figure 1. Overview of annotations used in this study. Using SizeExtractR to assess population size structure for hard coral from scaled seafloor photographs of the benthos (Lachs et al., 2021; Sommer et al., 2014). A) The process of coral bleaching from a healthy state until mortality is shown for Pocillopora aliciae. The user-defined ROI Label Codes for recording the different categories are shown in brackets. Moderately and severely bleached colonies are recorded as ‘b’ and ‘bb’, respectively, and partial mortality is recorded as ‘pm’. The absence of an ROI Label Code denotes a healthy colony, and dead colonies were not analysed. B) Example ROI name labels are shown with descriptions, including the automatically produced codes for calibration lengths (M) and calibration points (Cali_Pts).
Figure 2. Example image output from ImageJ using the SizeExtractR ImageJ macros and workflow. Note that the measurement stick is divided into 10 cm lengths, which are annotated as Cali_Pts and M1-M4. Compare the alphanumeric annotations on the corals with Fig. 1 to understand the labelling system (Pa, b, bb). No corals in this image suffer from partial mortality (pm).
The aim of this vignette is to complete quality control of annotations and build a single calibrated size dataset from an entire set of images analysed using the SizeExtractR ImageJ macros and workflow. Follow the steps below to learn how! The ImageJ-analysed dataset is saved in two versions. The full folder with images and size data text files can be found in the data repository (currently accessible at the private link, but will be transferred to the doi link upon publication) as a folder named Data_post-ImageJ_all. A reduced dataset containing only the directory structure and text files (images are omitted to save space) is saved on the SizeExtractR GitHub page.
library(devtools)
devtools::install_github("liamlachs/SizeExtractR")
library(SizeExtractR)
Save a variable with the path string to the root directory folder containing the image analysis files.
mypath = "Coral_Example_b_processed_post-ImageJ"
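Before running the workflow it can help to confirm that the root directory exists and actually contains ImageJ output files. The helper below, check_root(), is a hypothetical convenience function written for this sketch and is not part of SizeExtractR; it is demonstrated on a temporary folder so that it runs anywhere.

```r
# Hypothetical helper, not part of SizeExtractR: summarise a root directory
check_root <- function(path) {
  if (!dir.exists(path)) {
    stop("Directory not found: ", path)
  }
  # Search all nested folders for the per-image .txt data files
  txt_files <- list.files(path, pattern = "\\.txt$", recursive = TRUE)
  message(length(txt_files), " text file(s) found under ", path)
  invisible(txt_files)
}

# Demonstration on a temporary folder (folder and file names are invented)
demo_root <- file.path(tempdir(), "demo_root")
demo_leaf <- file.path(demo_root, "2016", "Site_A")
dir.create(demo_leaf, recursive = TRUE, showWarnings = FALSE)
file.create(file.path(demo_leaf, "image_01.txt"))

found <- check_root(demo_root)
```

With real data, the same call would be check_root(mypath).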
Full_SizeExtractR_Workflow()
Note that the known.calibration.length parameter is entered as a value of 10 (cm), as that was the length of each calibration segment on the measurement stick. Therefore all computed size metrics will be in centimetres. The include.calibrations parameter is set to FALSE to avoid measurement stick lengths (e.g., Cali_Pts and M1-M4) being included in the final dataset.
data = SizeExtractR::Full_SizeExtractR_Workflow(
  mypath,
  known.calibration.length = 10,
  include.calibrations = FALSE)
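The arithmetic behind calibration can be sketched in a few lines. This is only an illustration under stated assumptions: the pixel values are invented, and averaging the calibration segments of an image is an assumption made for this sketch, since how the package combines them is not described here.

```r
# Hypothetical pixel lengths of four 10 cm calibration segments (M1-M4)
cal_px <- c(412, 398, 405, 409)
known_length_cm <- 10

# Scale factor in cm per pixel, ASSUMING the mean of the segments is used
cm_per_px <- known_length_cm / mean(cal_px)

# Hypothetical uncalibrated measurements for one ROI
diameter_px <- 203
area_px2 <- 32000

# Lengths scale linearly with the factor; areas scale with its square
diameter_cm <- diameter_px * cm_per_px
area_cm2 <- area_px2 * cm_per_px^2

print(c(diameter_cm = diameter_cm, area_cm2 = area_cm2))
```

The key point is that linear metrics and area metrics need different powers of the same scale factor, which is why the calibration length must be entered in the units wanted for the final dataset.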
Full_SizeExtractR_Workflow()
Everything this function does is described in the following steps, which culminate in a full calibrated size dataset. However, if mistakes are found during quality control, you will need to correct them manually outside R and then rerun Full_SizeExtractR_Workflow().
Note: For each of the following steps there is a screenshot of the R console after running Full_SizeExtractR_Workflow(). Please read the red text, which is a guide for navigating the interactive quality control checks and variable setting.
Quality control - Directory Variables
Ensures the folder names are all correct. If any names are incorrect, rename the folders manually outside of the R environment, and rerun Full_SizeExtractR_Workflow().
Set Directory Variable names
The second step is to fill in the Variable names for each directory level. Here we have entered Timepoint and Site, which will end up as two categorical variables (columns) in the final dataset.
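To make the link between folder levels and dataset columns concrete, here is a minimal base R sketch that splits file paths into directory variables. The folder and file names are hypothetical, and this is not the package's internal code.

```r
# Hypothetical relative paths to ImageJ output files (Timepoint/Site/file)
paths <- c("2016_Bleaching/North_Solitary/img_01.txt",
           "2016_Bleaching/Northwest_Solitary/img_02.txt",
           "2016_Recovery/North_Solitary/img_03.txt")

# Split each path on "/" and bind the pieces into a character matrix
parts <- do.call(rbind, strsplit(paths, "/", fixed = TRUE))

# Name the directory levels with the user-supplied variable names
dir_vars <- data.frame(Timepoint = parts[, 1],
                       Site = parts[, 2],
                       File = parts[, 3],
                       stringsAsFactors = FALSE)
print(dir_vars)
```

Each directory level becomes one column, which is why consistent folder naming matters: a misspelt folder would appear as an extra category.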
Quality control - ROI Type codes
Ensures there are no human errors (e.g., typos) in the annotated ROI Type codes (cf. Fig. 1). If errors are present, choose no; you will then be given the option to locate the specific images that contain errors. After correcting them outside R, rerun the workflow function. If there are no errors, proceed.
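The idea behind this check can be sketched with base R: tabulate the codes, then trace any unexpected code back to its image. The codes and file names below are invented for illustration, and the snippet is not the package's own implementation.

```r
# Hypothetical ROI Type codes as typed during annotation; "Pz" is a typo
roi_types <- c("Pa", "Pa", "Pa", "Pz", "Pa")
image_files <- c("img_01.txt", "img_01.txt", "img_02.txt",
                 "img_02.txt", "img_03.txt")

# Tabulate the codes: rare, unexpected values are the likely errors
print(table(roi_types))

# Codes expected in this study, then the images containing anything else
expected <- "Pa"
bad <- !(roi_types %in% expected)
images_to_fix <- unique(image_files[bad])
print(images_to_fix)
```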
Set ROI Variable names
Now we must link the user-defined ROI Label Codes to corresponding names of categorical ROI Variables for the final dataset. This information must be added manually to the ROI_Labels.csv template file outside the R environment. This template file will have been automatically created. For this worked example, we will fill the data from Table 1 into the .csv file manually, save it, then continue in R.
Please do this now.
Table 1. Repeated from above
library(knitr)
library(kableExtra)
knitr::kable(
  data.frame(
    ROI_Label_code = c("b", "bb", "pm"),
    Corresponding_Variable_Name = c("Mod_Bleached", "Sev_Bleached", "Partial_Mortality")),
  align = "c") %>%
  kableExtra::kable_styling()
Quality control - ROI Label Codes 1
Ensures there are no human errors (e.g., typos) in the annotated ROI Label Codes (c.f. Fig. 1). Check that the data entered into the .csv file is correct.
Quality control - ROI Label Codes 2
Finally, check that the translation matrix from ROI Label Codes to ROI Variables is correct.
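What such a translation amounts to can be shown with a small sketch: each ROI Label Code from Table 1 becomes one logical column in the dataset. The per-ROI codes below are hypothetical, and the way they are stored here (a list of character vectors) is an assumption made for illustration rather than the package's internal format.

```r
# Lookup from ROI Label Code to variable name (values from Table 1)
lookup <- c(b = "Mod_Bleached", bb = "Sev_Bleached", pm = "Partial_Mortality")

# Hypothetical label codes recorded for four ROIs; character(0) means
# no code was given, i.e. a healthy colony
roi_codes <- list(character(0), "b", c("bb", "pm"), "pm")

# One logical column per variable: TRUE where the ROI carries that code
flags <- sapply(names(lookup), function(code) {
  vapply(roi_codes, function(x) code %in% x, logical(1))
})
colnames(flags) <- unname(lookup)
print(flags)
```

Checking the translation means confirming that each code maps to the intended column and that no code is left unmapped.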
Now the database is calibrated. You have a single, quality-controlled, calibrated dataset of object sizes from the example image dataset. View the dataset using View() or head(). Notice that the variable names we specified have been included in the dataset.
The dataset can be saved using:
write.csv(data, "Calibrated_Dataset.csv", row.names = FALSE)
Table 2. Full dataset with all size metrics and user-defined categorical variables.
data = read.csv("Database.Vignette.csv")
library(knitr)
library(kableExtra)
knitr::kable(data, align = "c") %>%
  kableExtra::kable_styling() %>%
  kableExtra::scroll_box(width = "100%", height = "400px")
To plot size-frequency distributions and look for patterns in the size data, try the following example code.
SizeExtractR::Plot_Size_Frequency(
  data,
  size.metric = "Area",
  log_size = TRUE,
  nbins = 15)
Notice that the corals are more abundant at North Solitary Island than at Northwest Solitary Island. Also note that there are fewer corals after the bleaching event at both islands.
SizeExtractR::Plot_Size_Frequency(
  data,
  size.metric = "Area",
  log_size = TRUE,
  nbins = 10,
  group_by = c("Site", "Timepoint"),
  facetRow_by = "Timepoint",
  facetCol_by = "Site",
  scales_gg = "fixed")
# Compute bleaching as either moderate or severe
data$Bleached = ifelse(data$Mod_Bleached == TRUE | data$Sev_Bleached == TRUE,
                       "moderate/severe", "none")

Plot = SizeExtractR::Plot_Size_Frequency(
  data,
  size.metric = "Area",
  log_size = TRUE,
  nbins = 10,
  group_by = c("Site", "Timepoint", "Bleached"),
  fill_by = "Bleached",
  facetRow_by = "Timepoint",
  facetCol_by = "Site",
  scales_gg = "fixed")

# See Plot is a ggplot2 class object
Plot + ggplot2::theme(legend.position = "top")
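Patterns seen in the plots, such as differences in abundance between sites and timepoints, can also be checked numerically. The sketch below uses a small invented data frame as a stand-in for the calibrated dataset; only the column names Site, Timepoint, and Area follow the worked example, and all values are hypothetical.

```r
# Hypothetical stand-in for the calibrated dataset (all values are invented)
toy <- data.frame(
  Site = c("North", "North", "North", "Northwest", "North", "Northwest"),
  Timepoint = c("During", "During", "During", "During", "After", "After"),
  Area = c(12.5, 40.2, 8.1, 22.0, 15.3, 30.7)
)

# Number of colonies per site and timepoint
counts <- table(toy$Site, toy$Timepoint)
print(counts)

# Median colony area per group, as a simple companion to the histograms
med_area <- aggregate(Area ~ Site + Timepoint, data = toy, FUN = median)
print(med_area)
```

With the real dataset, replacing toy with data should give the corresponding counts and medians, provided the columns carry the names set during the workflow.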