fingerPro-package: Sediment Source Fingerprinting

Soil erosion is one of the biggest challenges for food production and reservoir siltation around the world. Information on sediment, nutrient and pollutant transport is required for effective control strategies. Source estimates are difficult to obtain using traditional monitoring techniques, but sediment source fingerprinting has proved to be a valuable tool. It offers the potential to assess sediment provenance as a basis for developing management plans and preventing erosion. The procedure focuses on methods that allow the apportionment of sediment sources to be identified from a composite sample of sediment mixture material. We developed an R package as a tool to quantify the provenance of the sediments in a catchment. A mixing model algorithm is applied to the sediment mixture samples to estimate the relative contribution of each potential source. The package consists of a set of functions used to: i) characterise and pre-process the data and select the optimum subset of tracers; ii) unmix sediment samples and quantify the apportionment of each source; iii) assess the effect of the source variability; and iv) visualize and export the results.
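
As a rough illustration of what "unmixing" means, the sketch below builds a synthetic mixture from made-up tracer means for three sources and recovers the source proportions by brute-force search over a grid of candidate weights. The tracer values, the known proportions and the helper name `unmix_sketch` are all hypothetical, and the grid search is only a minimal instance of a linear mixing model; it is not the algorithm implemented in fingerPro's `unmix` function.

```r
# Hypothetical mean tracer values (rows = sources, columns = tracers).
sources <- rbind(
  AG = c(t1 = 10, t2 = 2.0, t3 = 300),
  PI = c(t1 = 40, t2 = 1.0, t3 = 150),
  SS = c(t1 = 5,  t2 = 4.0, t3 = 500)
)

# Build a synthetic mixture from known proportions so the result can be checked.
true_w <- c(AG = 0.2, PI = 0.5, SS = 0.3)
mixture <- as.vector(true_w %*% sources)

# Illustrative helper (not part of fingerPro): grid search for three sources.
unmix_sketch <- function(sources, mixture, step = 0.01) {
  stopifnot(nrow(sources) == 3, ncol(sources) == length(mixture))
  # Candidate weights; the third weight follows from the sum-to-one constraint.
  grid <- expand.grid(w1 = seq(0, 1, by = step), w2 = seq(0, 1, by = step))
  grid <- grid[grid$w1 + grid$w2 <= 1 + 1e-9, ]
  w <- cbind(grid$w1, grid$w2, 1 - grid$w1 - grid$w2)
  w[w < 0] <- 0  # guard against tiny negative values from rounding
  # Tracer values predicted by each candidate set of weights.
  predicted <- w %*% sources
  # Scale the misfit of each tracer by its range across the sources.
  tracer_range <- apply(sources, 2, function(x) diff(range(x)))
  err <- sweep(sweep(predicted, 2, mixture, "-"), 2, tracer_range, "/")
  best <- w[which.min(rowSums(err^2)), ]
  names(best) <- rownames(sources)
  best
}

estimate <- unmix_sketch(sources, mixture)
print(round(estimate, 2))
```

Because the mixture was generated from the same source means, the search should land on the known proportions; with real samples, source variability makes the answer a distribution rather than a single point.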

Ivan Lizaga, Borja Latorre, Leticia Gaspar, Ana Navas

Maintainer: Ivan Lizaga <ilizaga@eead.csic.es // lizaga.ivan10@gmail.com>

https://github.com/eead-csic-eesa

#Created on 22/08/2018

#If you want to use your own data
#setwd("the directory that contains your dataset")
#data <- read.table('your dataset', header = TRUE, sep = '\t')
#install.packages("fingerPro")
#library(fingerPro)
#Example using the data included in the fingerPro package
#Load the dataset called "catchment"

#"catchment": this dataset was selected from a Mediterranean catchment for
#this purpose and contains high-quality radionuclide and geochemistry data.
#AG (cropland)
#PI and PI1 (pine forest; at first they look different, but the LDA plot
#shows that the wisest decision is to treat both pine classes as the same source)
#SS (subsoil)
data <- catchment
#boxPlot(data, columns = 1:6, ncol = 3)
#correlationPlot(data, columns = 1:5, mixtures = TRUE)
LDAPlot(data, P3D=FALSE)
#LDAPlot warns that the variables are collinear
#select the optimum set of tracers by implementing the statistical tests 
data <- rangeTest(data)
data <- KWTest(data)
data <- DFATest(data)
#Check how the selected tracers discriminate between sources
LDAPlot(data, P3D=FALSE)
#change P3D=FALSE to P3D=TRUE to visualize the 3D LDAPlot
#2D and 3D LDAPlots suggest that two of the sources have to be combined
#reload the original dataset "catchment"
data <- catchment
# Combine sources PI1 and PI based on the previous LDAPlot
data$Land_Use[data$Land_Use == 'PI1'] <- 'PI'
#select the optimum set of tracers by implementing the statistical tests 
data <- rangeTest(data)
data <- KWTest(data)
data <- DFATest(data)
LDAPlot(data, P3D=FALSE)
PCAPlot(data)
#The selected tracers now discriminate well between sources, so proceed with the unmix function
result <- unmix(data, samples = 100L, iter = 100L)
#Display the results
plotResults(result, y_high = 5, n = 1)
writeResults(result)
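
The idea behind the tracer screening steps above can be sketched with base R alone. The example assumes a made-up data frame with a source label and two tracers: a tracer is kept only if the mixture value lies within the range of the source values and a Kruskal-Wallis test finds a difference between sources. The data, the object names and the 0.05 threshold are hypothetical, and this is not the implementation behind `rangeTest` or `KWTest`.

```r
# Hypothetical data: three sources with four samples each.
toy <- data.frame(
  source = rep(c("AG", "PI", "SS"), each = 4),
  trA = c(1, 2, 3, 4,  11, 12, 13, 14,  21, 22, 23, 24),  # separates the sources
  trB = c(5, 6, 7, 8,  6, 5, 8, 7,      7, 8, 5, 6)       # same values in every source
)

# Hypothetical mixture sample; its trB value lies outside the source range.
mix <- c(trA = 12, trB = 9.5)

tracers <- c("trA", "trB")

# Range check: is the mixture value bracketed by the source values?
in_range <- sapply(tracers, function(tr) {
  mix[[tr]] >= min(toy[[tr]]) && mix[[tr]] <= max(toy[[tr]])
})

# Kruskal-Wallis test: does the tracer differ between sources?
kw_p <- sapply(tracers, function(tr) {
  kruskal.test(toy[[tr]], factor(toy$source))$p.value
})

# Keep tracers that pass both checks (0.05 is an assumed threshold).
selected <- tracers[in_range & kw_p < 0.05]
print(selected)
```

Here only `trA` survives: `trB` fails the range check and shows no difference between sources.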

Too few points to calculate an ellipse
Too few points to calculate an ellipse
Warning messages:
1: In lda.default(x, grouping, ...) : variables are collinear
2: Removed 1 row(s) containing missing values (geom_path). 
Attention-> 3 variables from a total of 21 were removed: Cr Pb Zn . The variable/variables that remains in your dataset is/are: Pbex K40 Bi214 Ra226 Th232 U238 Nb Sr Rb Fe Mn V Ti Ca K Al Si Mg .
Attention-> 15 variables from a total of 18 were removed: K40 Th232 U238 Nb Sr Rb Fe Mn V Ti Ca K Al Si Mg . The variable/variables that remains in your dataset is/are: Pbex Bi214 Ra226 .
Attention-> variables were removed from a total of 3 : . The variable/variables that remains in your dataset is/are: Pbex Bi214 Ra226 .
Too few points to calculate an ellipse
Too few points to calculate an ellipse
Warning message:
Removed 1 row(s) containing missing values (geom_path). 
Attention-> 3 variables from a total of 21 were removed: Cr Pb Zn . The variable/variables that remains in your dataset is/are: Pbex K40 Bi214 Ra226 Th232 U238 Nb Sr Rb Fe Mn V Ti Ca K Al Si Mg .
Attention-> 13 variables from a total of 18 were removed: K40 Th232 U238 Nb Sr Rb Fe V Ti Ca K Al Mg . The variable/variables that remains in your dataset is/are: Pbex Bi214 Ra226 Mn Si .
Attention-> 2 variables were removed from a total of 5 : Ra226 Mn . The variable/variables that remains in your dataset is/are: Pbex Bi214 Si .
Too few points to calculate an ellipse
Too few points to calculate an ellipse
Warning message:
Removed 1 row(s) containing missing values (geom_path). 
Too few points to calculate an ellipse
Too few points to calculate an ellipse
Too few points to calculate an ellipse
Too few points to calculate an ellipse
Warning message:
Removed 2 row(s) containing missing values (geom_path). 
Summary of the model imputs:
         3 variables from 3 sources ( AG PI SS )
Press [enter] to unmix your data
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Summary of the model outputs: 
 See below the result/s of the unmixing process using the source variability of the best 100 results, notice that the first row of the results is the central value or the average with no correction 
 
     id   GOF.mean     GOF.SD    AG.mean      AG.SD    PI.mean      PI.SD    SS.mean      SS.SD
1 42744 0.94430071 0.03681212 0.18148918 0.06138831 0.47266473 0.07858758 0.34584610 0.06549228
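
A quick consistency check on the summary row above: the mean source contributions are proportions, so they should sum to about 1. The sketch below copies the means and standard deviations from the output and reports them as percentages; the object names are only illustrative.

```r
# Mean source contributions and their standard deviations,
# copied from the summary row above.
contrib    <- c(AG = 0.18148918, PI = 0.47266473, SS = 0.34584610)
contrib_sd <- c(AG = 0.06138831, PI = 0.07858758, SS = 0.06549228)

# The proportions should sum to 1 within rounding error.
total <- sum(contrib)
stopifnot(abs(total - 1) < 1e-6)

# Source with the largest mean contribution.
dominant <- names(contrib)[which.max(contrib)]

# Express each contribution as a percentage with its standard deviation.
report <- sprintf("%s: %.1f%% (SD %.1f%%)",
                  names(contrib), 100 * contrib, 100 * contrib_sd)
print(report)
```

In this run the pine forest source (PI) has the largest mean contribution, followed by subsoil (SS) and cropland (AG).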