grainsizeR is a package developed by David Tavernini for a term project in GEOB 503. It contains all the tools necessary to extract grain sizes from images, digitally sieve grain size measurements, and gather descriptive statistics on sieved data. For questions, comments, or issues, please refer to the discussion tab or email me at taverninid@gmail.com.
This app has two dependencies:

- Shiny: can be installed from CRAN
- videoplayR: available for Mac and Linux only

System requirements:

- Mac or Linux operating system for full functionality. The Shiny app cannot be run on Windows because videoplayR is only available for Mac and Linux.
- R (v3.3.2); earlier versions may work.
The features within this package allow for a self-contained workflow of grain size extraction, data sieving, descriptive statistics, and probability distribution fitting. An example workflow would be:

1. Import preprocessed images and extract grain sizes in the Shiny app (run.grainsizeR).
2. Sieve the measurements into size classes with sieve.mm() or sieve.phi().
3. Generate a cumulative distribution with cumudist().
4. Calculate descriptive statistics with stat.dx(), stat.mean(), stat.sorting(), stat.skew(), and stat.kurt().
This package also contains two datasets to test its functionality:

- elwha_img: a list of grain sizes extracted from an image of a gravel bar adjacent to the Elwha River
- elwha_wol: the corresponding Wolman pebble count for the gravel bar in elwha_img

Each step in the above workflow is detailed further below.
Once images are preprocessed to correct for lens distortion, cropping, etc., they are ready for import. I have previously used GIMP for these corrections, especially the perspective transform.
Open the Shiny app using run.grainsizeR and follow the step-by-step protocol on the left-hand side. The app allows scaling measurements to be included in the grain size collection and outputs a .csv of the collected grain sizes. Random points are plotted until the user is finished collecting data. To collect a point measurement, identify the b-axis of the grain indicated by the randomly plotted yellow point, click on one end of the b-axis, and double-click on the other end. Once the clicks are finalized (coordinates are shown on the left), click "Record Measurement" and the app will refresh with a new point.
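A minimal launch sequence, assuming run.grainsizeR is exported as a zero-argument function:

```r
library(grainsizeR)

# Launch the interactive grain size measurement app
run.grainsizeR()
```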
If stones are too small to measure precisely, make the measurement to the best of your ability and flag it using the "Too Small" button. The final dataframe includes a separate binary column (0 = No, 1 = Yes) marking flagged measurements for manual editing of the data.
If a point falls outside the measurement boundary, or if it lands on the same grain as a previously plotted point, click "Out of Interest" and a new point will be plotted without recording an observation.
Behind the Scenes
Scale calculations are derived by plotting two points in the image that span a distance known a priori or a measurement tool present in the image (e.g., a ruler).
The length $L_{img}$ between the two points plotted during the scaling process is calculated using the Pythagorean theorem. Given an entered real-world length $L_{real}$, a scaling coefficient is calculated as $C = L_{img} / L_{real}$.
Each b-axis length is measured between two points on the image and converted to real-world units by dividing the X and Y distances between the two points by the horizontal and vertical scaling coefficients, respectively. The resulting X and Y distances are then combined into a Euclidean distance using the Pythagorean theorem.
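A minimal sketch of this arithmetic (these helper functions are illustrative only, not part of grainsizeR's exported API):

```r
# Illustrative helpers for the scaling arithmetic described above;
# hypothetical functions, not grainsizeR internals.

# Scaling coefficient: image-space length per unit of real-world length
scale_coef <- function(x1, y1, x2, y2, L_real) {
  L_img <- sqrt((x2 - x1)^2 + (y2 - y1)^2)  # Pythagorean theorem
  L_img / L_real
}

# Convert a measured b-axis (two clicked points) to real-world units
b_axis_real <- function(x1, y1, x2, y2, coef_x, coef_y) {
  dx <- (x2 - x1) / coef_x  # horizontal distance in real-world units
  dy <- (y2 - y1) / coef_y  # vertical distance in real-world units
  sqrt(dx^2 + dy^2)         # Euclidean length of the b-axis
}
```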
Following grain size collection (via image or pebble count), a vector of grain sizes can be passed to sieve.mm() or sieve.phi() (depending on units) to sieve the data into counts. These functions place individual measurements into the appropriate sieve classes based on the input size classes and return a dataframe with each size class and the number of grains sieved into that class.
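For example, sieving the bundled image dataset (mirroring the comparison code later in this document, where sizes are recorded in cm and converted to mm):

```r
library(grainsizeR)

# Sieve the image-derived b-axis measurements (cm -> mm)
img <- sieve.mm(elwha_img$gsize_cm * 10)
head(img)  # a dataframe with size class (size.mm) and count columns
```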
The statistical functions developed in this package rely on graphical, geometric, or algebraic methods and thus require a cumulative distribution to be generated. Binning measurements into sieve classes and calculating a cumulative distribution is therefore essential to accurately measure the percentile grain sizes used in statistical analysis.
Size Distributions
With sieved data, size distributions can be determined with cumudist(), which takes a vector of counts ordered by ascending size and calculates the proportion of counts less than each entry (i.e., it sums the counts at all smaller indices). This allows grain sizes to be calculated at specific percentiles, a critical component of the algebraic statistical equations.
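Continuing the sieving example above:

```r
# Cumulative proportion of grains for each ascending size class
img$prop <- cumudist(img$count)
```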
Percentile Values
stat.dx() is a quick function to extract the grain size at a desired percentile, in either phi or mm units.
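For example, extracting the D84 and D50 from the phi-unit distribution and backtransforming to mm, as in the comparison code later in this document:

```r
# Convert size classes to phi units, then extract percentile grain
# sizes and backtransform from phi to mm
img$phi <- -log2(img$size.mm)

d84_mm <- 2^(-stat.dx(img$phi, img$prop, d = 0.84, phi = TRUE))
d50_mm <- 2^(-stat.dx(img$phi, img$prop, d = 0.50, phi = TRUE))  # the median
```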
Central Tendency
Median: stat.dx(d = 0.5) returns the grain size of the 50th percentile (i.e., the median grain size).
Mean: stat.mean() provides numerous approaches to calculating algebraic or geometric means from a cumulative distribution.
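For example, using the "mccam" method from the comparison code later in this document and backtransforming from phi to mm:

```r
# Algebraic mean in phi units (method "mccam"), backtransformed to mm
img_mean_mm <- 2^(-stat.mean(img$phi, img$prop, units = "phi", method = "mccam"))
```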
Distribution Shape
Sorting: stat.sorting() provides algebraic methods for calculating the sorting coefficient of a grain size distribution.
Skewness: stat.skew() provides algebraic and geometric methods for calculating various skewness indices of a grain size distribution.
Kurtosis: stat.kurt() provides algebraic and geometric methods for calculating various kurtosis indices of a grain size distribution.
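All three follow the same calling pattern (methods drawn from the comparison code later in this document):

```r
# Sorting (Folk & Ward), skewness (Gordon), and kurtosis (Inman)
img_sorting <- stat.sorting(img$phi, img$prop, units = "phi", method = "fw")
img_skew    <- stat.skew(img$phi, img$prop, units = "phi", method = "gor")
img_kurt    <- stat.kurt(img$phi, img$prop, units = "phi", method = "inm")
```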
Probability distribution fitting is still under development.
The following example compares image-derived grain sizes (elwha_img) with the corresponding Wolman pebble count (elwha_wol). First, load the packages and define the plotting themes:

```r
library(grainsizeR)
library(tidyverse)

# Load in themes
theme_dtavern <- theme(panel.background = element_rect(fill = "white"),
                       axis.line.y = element_line(colour = "black", size = 0.25),
                       axis.line.x = element_line(colour = "black", size = 0.25),
                       axis.title.x = element_text(size = 24),
                       axis.title.y = element_text(size = 24, margin = margin(10, 10, 5, 10)),
                       axis.text.x = element_text(face = "bold", size = 18),
                       axis.text.y = element_text(face = "bold", size = 18),
                       legend.key = element_rect(fill = "white"),
                       legend.text = element_text(size = 18),
                       legend.position = c(0.8, 0.2))

theme_dtavern1 <- theme(panel.background = element_rect(fill = "white"),
                        axis.line.y = element_line(colour = "black", size = 0.25),
                        axis.line.x = element_line(colour = "black", size = 0.25),
                        axis.title.x = element_text(size = 24),
                        axis.title.y = element_text(size = 24, margin = margin(10, 10, 5, 10)),
                        axis.text.x = element_text(face = "bold", size = 18),
                        axis.text.y = element_text(face = "bold", size = 18),
                        legend.key = element_rect(fill = "white"),
                        legend.text = element_text(size = 18),
                        legend.position = c(0.8, 0.8))

dtavern_facet9 <- theme(panel.background = element_rect(fill = "white"),
                        axis.line.y = element_line(colour = "black", size = 0.25),
                        axis.line.x = element_line(colour = "black", size = 0.25),
                        axis.title.x = element_text(size = 16),
                        axis.title.y = element_text(size = 16, margin = margin(10, 10, 5, 10)),
                        axis.text.x = element_text(face = "bold", size = 12),
                        axis.text.y = element_text(face = "bold", size = 12),
                        legend.key = element_rect(fill = "white"),
                        legend.text = element_text(size = 18),
                        legend.position = "none")

dtavern_facet9_leg <- theme(panel.background = element_rect(fill = "white"),
                            axis.line.y = element_line(colour = "black", size = 0.25),
                            axis.line.x = element_line(colour = "black", size = 0.25),
                            axis.title.x = element_text(size = 16),
                            axis.title.y = element_text(size = 16, margin = margin(10, 10, 5, 10)),
                            axis.text.x = element_text(face = "bold", size = 12),
                            axis.text.y = element_text(face = "bold", size = 12),
                            legend.key = element_rect(fill = "white"),
                            legend.text = element_text(size = 18),
                            legend.position = c(0.2, 0.8))
```
Central Tendency and percentile comparison
```r
# Load example datasets
img <- elwha_img
wol <- elwha_wol

# Sieve img data
img <- sieve.mm(img$gsize_cm * 10)
img <- img[-1, ]  # remove the 0 mm category to match the Wolman data
# wol is already sieved

# Calculate cumulative distributions
img$prop <- cumudist(img$count)
wol$prop <- cumudist(wol$Count)

######################
# Central tendencies + percentiles
######################

# Calculate statistics
# Convert to phi
img$phi <- -log2(img$size.mm)
wol$phi <- -log2(wol$Size.class)

## Calculate the mean on the algebraic phi scale and backtransform
img_mean <- 2^-stat.mean(img$phi, img$prop, units = "phi", method = "mccam")
wol_mean <- 2^-stat.mean(wol$phi, wol$prop, units = "phi", method = "mccam")

## Calculate sorting with two methods
img_sort_fw  <- stat.sorting(img$phi, img$prop, units = "phi", method = "fw")
img_sort_inm <- stat.sorting(img$phi, img$prop, units = "phi", method = "inm")
wol_sort_fw  <- stat.sorting(wol$phi, wol$prop, units = "phi", method = "fw")
wol_sort_inm <- stat.sorting(wol$phi, wol$prop, units = "phi", method = "inm")

desc_comp <- data.frame(
  "Statistic" = c("Mean", "Sorting: Inman", "Sorting: Folk-Ward",
                  "D16", "D25", "D50", "D75", "D84"),
  "Image" = c(img_mean, img_sort_inm, img_sort_fw,
              ## Backtransform percentile values to present them in mm
              2^(-stat.dx(img$phi, img$prop, d = 0.16, phi = TRUE)),
              2^(-stat.dx(img$phi, img$prop, d = 0.25, phi = TRUE)),
              2^(-stat.dx(img$phi, img$prop, d = 0.50, phi = TRUE)),
              2^(-stat.dx(img$phi, img$prop, d = 0.75, phi = TRUE)),
              2^(-stat.dx(img$phi, img$prop, d = 0.84, phi = TRUE))),
  "Wolman" = c(wol_mean, wol_sort_inm, wol_sort_fw,
               2^(-stat.dx(wol$phi, wol$prop, d = 0.16, phi = TRUE)),
               2^(-stat.dx(wol$phi, wol$prop, d = 0.25, phi = TRUE)),
               2^(-stat.dx(wol$phi, wol$prop, d = 0.50, phi = TRUE)),
               2^(-stat.dx(wol$phi, wol$prop, d = 0.75, phi = TRUE)),
               2^(-stat.dx(wol$phi, wol$prop, d = 0.84, phi = TRUE)))
)

knitr::kable(desc_comp)
```
Visualization of Distributions
```r
## Generate distribution curves
comp_dist_prop <- data.frame("size_mm"  = img$size.mm,
                             "wol_prop" = wol$prop,
                             "img_prop" = img$prop)
comp_dist_prop_long <- gather(comp_dist_prop, type, "value", 2:3)

# normal
nor_dist_comp <- ggplot(data = comp_dist_prop_long) +
  # geom_hline(yintercept = seq(0, 1, by = 0.1), color = "grey90") +
  geom_point(aes(x = size_mm, y = value, color = type)) +
  geom_line(aes(x = size_mm, y = value, color = type)) +
  xlab("Grain Size (mm)") +
  ylab("Cumulative Frequency") +
  scale_color_discrete(name = "",
                       breaks = c("img_prop", "wol_prop"),
                       labels = c("Image", "Wolman")) +
  theme_dtavern

# In phi units
comp_dist_prop_long$phi <- -log2(comp_dist_prop_long$size_mm)

nor_dist_comp_phi <- ggplot(data = comp_dist_prop_long) +
  # geom_hline(yintercept = seq(0, 1, by = 0.1), color = "grey90") +
  geom_point(aes(x = phi, y = value, color = type)) +
  geom_line(aes(x = phi, y = value, color = type)) +
  xlab(expression(phi)) +
  ylab("Cumulative Frequency") +
  scale_x_reverse() +
  scale_color_discrete(name = "",
                       breaks = c("img_prop", "wol_prop"),
                       labels = c("Image", "Wolman")) +
  theme_dtavern

nor_dist_comp
nor_dist_comp_phi
```
Distribution shape: Skewness & Kurtosis
```r
## Measure distribution shape with stat.skew and stat.kurt, using two methods each

#####################
## Skewness

## According to Warren 1974:
sk_img_war <- -1 * stat.skew(img$phi, img$prop, units = "phi", method = "war")
sk_wol_war <- -1 * stat.skew(wol$phi, wol$prop, units = "phi", method = "war")

## According to Gordon:
sk_img_gor <- -1 * stat.skew(img$phi, img$prop, units = "phi", method = "gor")
sk_wol_gor <- -1 * stat.skew(wol$phi, wol$prop, units = "phi", method = "gor")

####################
## Kurtosis

# According to Folk and Ward 1957
ku_img_fw <- stat.kurt(img$phi, img$prop, units = "phi", method = "fw")
ku_wol_fw <- stat.kurt(wol$phi, wol$prop, units = "phi", method = "fw")

# According to Inman 1952
ku_img_in <- stat.kurt(img$phi, img$prop, units = "phi", method = "inm")
ku_wol_in <- stat.kurt(wol$phi, wol$prop, units = "phi", method = "inm")

#############
## Distribution shape
dist_comp <- data.frame("Method" = c("Image", "Wolman"),
                        "Skewness (Warren 1974)" = c(sk_img_war, sk_wol_war),
                        "Skewness (Gordon 1996)" = c(sk_img_gor, sk_wol_gor),
                        "Kurtosis (Folk/Ward 1957)" = c(ku_img_fw, ku_wol_fw),
                        "Kurtosis (Inman 1952)" = c(ku_img_in, ku_wol_in))

knitr::kable(dist_comp)
```