knitr::opts_chunk$set(comment = "#>", collapse = TRUE)

Summary

The archdata package, available on CRAN, provides several datasets of the kinds typically used in archaeological research.

This R script uses the RBPottery dataset.

Dataset Description

RBPottery is a dataset containing the results of chemical analyses of 48 specimens of Romano-British pottery from 5 sites in 3 regions.

It is a data frame with 48 observations: three leading columns of categorical data (including Kiln and Region), followed by nine columns of metal oxide percentages.

Details:

Results of chemical analyses of 48 specimens of Romano-British pottery published by Tubb et al. (1980). The numbers are percentages of metal oxide. "Kiln" indicates the kiln site at which the pottery was found. The five kiln sites come from three regions: 1 = Gloucester; 2 = Llanedeyrn and 3 = Caldicot; 4 = Islands Thorns and 5 = Ashley Rails. The data were scanned from Table 2.2 in Baxter (2003, p. 21) and preserve three probable typographical errors in the original publication: the values for TiO~2~ in line 4 (sample GA4), for MnO in line 35 (sample C13), and for K~2~O in line 36 (sample C14). Versions of these data are also available as Pottery in package car, pottery in package HSAUR, and Pottery2 in package heplots.

Source:

David L. Carlson and Georg Roth (2016). archdata: Example Datasets from Archaeological Research. R package version 1.1. https://CRAN.R-project.org/package=archdata

References

Baxter, M. J. 2003. Statistics in Archaeology. Arnold.

Tubb, A., A. J. Parker, and G. Nickless. 1980. The Analysis of Romano-British Pottery by Atomic Absorption Spectrophotometry. Archaeometry 22: 153-71.

Install ArchFlow package

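if (!requireNamespace("devtools", quietly = TRUE)) install.packages("devtools", repos = "http://cran.us.r-project.org")

# call install_github through the devtools namespace so it works even when devtools was only just installed
devtools::install_github("esteful/ArchFlow")
library("ArchFlow")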

Load RBPottery from the archdata package (example data). Alternatively, import a CSV file.

if(!require(archdata)) install.packages("archdata",repos = "http://cran.us.r-project.org")
library("archdata")
  data(RBPottery)
  knitr::kable(RBPottery)
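The CSV alternative mentioned above could look roughly like the following sketch. The file and its contents are invented for illustration; only the layout (three categorical columns followed by numeric oxide percentages) mirrors what the rest of the workflow expects.

```r
# Hypothetical example file: write a tiny CSV to a temporary location
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "ID,Kiln,Region,Al2O3,Fe2O3,CaO",
  "S1,1,A,18.8,9.52,0.20",
  "S2,1,A,16.9,7.33,0.30",
  "S3,2,B,12.4,5.69,0.22"
), csv_path)

# Read it back, storing text columns as factors
my_data <- read.csv(csv_path, stringsAsFactors = TRUE)
my_data$Kiln <- factor(my_data$Kiln)  # kiln numbers are labels, not quantities

# Use the sample identifiers as row names, as is done for RBPottery
row.names(my_data) <- as.character(my_data$ID)
str(my_data)
```

With a real file, only the path passed to read.csv() would change.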

Declare df_raw and df_chem, and set the row.names

  row.names(RBPottery) <- RBPottery[,1]
  df_raw <- RBPottery  # the whole dataset, including the categorical columns
  df_chem <- RBPottery[,-c(1:3)] # drop the three categorical columns, keeping the 9 chemical variables

Check the structure of the new dataset (skip if desired)

  str(df_raw)

df_chem should contain only integer (int) or numeric (num) columns; the num class holds decimal values.

  str(df_chem)
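Selecting columns by position only works when the categorical columns come first. As an alternative, the numeric columns can be picked by type. This is a sketch on an invented data frame (toy_raw and toy_chem are illustrative names, not part of ArchFlow):

```r
# Toy data frame standing in for df_raw (values are invented)
toy_raw <- data.frame(
  ID     = c("S1", "S2", "S3"),
  Kiln   = factor(c(1, 1, 2)),
  Region = factor(c("A", "A", "B")),
  Al2O3  = c(18.8, 16.9, 12.4),
  CaO    = c(0.20, 0.30, 0.22)
)

# Keep only the columns that hold numbers
is_num   <- vapply(toy_raw, is.numeric, logical(1))
toy_chem <- toy_raw[, is_num, drop = FALSE]
names(toy_chem)
```

Note that a categorical column stored as integers (for example a kiln number) would also be kept by this test, so such columns should be converted to factors first.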

Boxplots with the distribution of the chemical elements

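  for (i in seq_len(ncol(df_chem))) {
    # one data frame per element: region, kiln and the element's concentrations
    .datt <- data.frame(REGION = df_raw$Region, value = df_chem[, i], KILN = factor(df_raw$Kiln))
    # print() is needed to draw ggplot objects inside a loop
    print(ggplot2::ggplot(.datt, ggplot2::aes(x = KILN, y = value, fill = REGION)) +
            ggplot2::geom_boxplot() +
            ggplot2::ylab(colnames(df_chem)[i]))
  }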

Create categories based on the calcium content or any other condition (optional)

  # Classify each sample by its CaO content. cut() returns a factor directly.
  # right = FALSE makes the intervals (-Inf, 6), [6, 20) and [20, Inf), so the
  # boundary values 6 and 20 fall into the higher class (a choice; adjust if needed).
  df_raw$Calcium <- cut(df_raw$CaO,
                        breaks = c(-Inf, 6, 20, Inf),
                        labels = c("Low-Calcareous (CaO<6%)", "Calcareous (6%<CaO<20%)", "High Calcareous (CaO>20%)"),
                        right = FALSE)

To inspect the calcium content of the samples, use either of these two options:

df_raw[,c("CaO", "Calcium")]
#dplyr::select(df_raw, CaO, Calcium)
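The calcium classes are defined by two thresholds, 6% and 20% CaO. One compact way to check such a classification on a few values is base R's cut(). In this sketch the values are invented, the short labels are placeholders, and assigning the boundary values to the upper class is an assumption that the text does not settle.

```r
# Invented CaO percentages, including the two boundary values
cao <- c(0.5, 6, 12, 20, 25)

calcium_class <- cut(
  cao,
  breaks = c(-Inf, 6, 20, Inf),
  labels = c("Low", "Calcareous", "High"),
  right  = FALSE   # intervals are [a, b): 6 falls in "Calcareous", 20 in "High"
)
data.frame(CaO = cao, class = calcium_class)
```

Setting right = TRUE instead would move the values 6 and 20 into the lower class.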

Applying statistics to the dataset

Variation Matrix

arch_varmat(df_chem)
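In Aitchison's sense, the variation matrix of a composition holds the variance of log(x_i / x_j) for every pair of parts. The base-R sketch below, on invented data, illustrates that definition only; variation_matrix() is a hypothetical helper, and whether arch_varmat returns its result in this layout is not confirmed by the text.

```r
# Invented composition: 4 samples, 3 parts (all values must be positive)
comp <- data.frame(
  A = c(10, 12, 11, 30),
  B = c(5, 6, 5.5, 15),
  C = c(1, 3, 2, 8)
)

variation_matrix <- function(x) {
  logx <- log(as.matrix(x))
  p <- ncol(logx)
  out <- matrix(0, p, p, dimnames = list(colnames(logx), colnames(logx)))
  for (i in seq_len(p)) {
    for (j in seq_len(p)) {
      # variance of the log-ratio between part i and part j
      out[i, j] <- var(logx[, i] - logx[, j])
    }
  }
  out
}

vm <- variation_matrix(comp)
vm
# Column sums: the smaller the sum, the less a part varies relative to the others
colSums(vm)
```

In the toy data, A is always twice B, so their log-ratio is constant and the corresponding entry is zero.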

Graphically: variation matrix (MVC) plot, compositional evenness graph, MVC dendrogram and CoDa-dendrogram

arch_evenness(df_chem)

Scatter plot Matrix

  #select the variables to use
  vars <- c("MnO","CaO","Na2O", "TiO2","BaO", "Al2O3") 

  #alr conversion
  compositions::alr(df_chem,ivar = 2)-> df_alr
  cbind(df_raw[,1:3],df_alr) -> df_alr

  arch_scatter_matrix(df_raw=df_alr, vars, color = "Region", shape = "Kiln", title= "RBPottery")
  #a pdf file is saved in the working directory
  ##The GGally package offers similar kinds of visualizations
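The additive log-ratio (alr) transformation divides every part by one chosen part and takes logarithms. A base-R sketch with the second column as divisor, matching ivar = 2 above, is shown here on invented data (alr_manual is an illustrative name):

```r
# Invented composition with three parts
comp <- data.frame(A = c(10, 20), B = c(5, 4), C = c(1, 2))

# alr with the second column as divisor: log(x_j / x_divisor) for every other part
divisor <- 2
alr_manual <- log(comp[, -divisor] / comp[, divisor])
alr_manual
```

The divisor column disappears from the result, which is why an alr-transformed dataset has one variable fewer than the original.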

Hierarchical Cluster Analysis (HCA)

Create the dendrograms based on the clr-transformed data. Required: df_chem, the numerical data, and df_raw, the whole dataset (including the categorical data). To save the emf and pdf files directly to the local folder, set printDendro = TRUE; the default is FALSE.

nplot: the vector of plots to be displayed. Choose the indices of the categorical columns to be displayed.

arch_dendro(df_chem = df_chem, df_raw = df_raw, printDendro = FALSE, nplot=c(2,3))
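To make the idea of clustering on clr data concrete, here is a base-R sketch on invented values. It is not the code behind arch_dendro; in particular, the distance and the linkage method chosen here are assumptions.

```r
# Invented composition: 5 samples, 3 parts
comp <- data.frame(
  A = c(10, 11, 10.5, 30, 31),
  B = c(5, 5.2, 5.1, 4, 4.2),
  C = c(1, 1.1, 0.9, 6, 6.5),
  row.names = c("S1", "S2", "S3", "S4", "S5")
)

# clr: log of each part minus the row mean of the logs
logx <- log(as.matrix(comp))
clr  <- logx - rowMeans(logx)

# Euclidean distances between clr rows, then agglomerative clustering
# (average linkage is an assumption, not necessarily what arch_dendro uses)
hc <- hclust(dist(clr), method = "average")
plot(hc, main = "Toy dendrogram on clr data")

# Cut the tree into two groups
groups <- cutree(hc, k = 2)
groups
```

Each clr row sums to zero by construction, which is a quick sanity check on the transformation.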

Principal Component Analysis (PCA)

For the alr transformation, the least variable element is determined in arch_evenness and saved as .lvar; here .lvar is read automatically. nplot: the vector of plots to be displayed. Choose the indices of the categorical columns to be displayed. Set printPCA = TRUE to generate pdf and emf files.

Principal Component Analysis (alr)

arch_PCA(df_chem, df_raw =df_raw, printPCA= FALSE, labels = FALSE, nplot = c(2,3), shape_cat_number = 2)
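The combination of an alr transformation and PCA can be sketched in base R as follows. The data are invented and the divisor is picked by hand; this illustrates the general technique, not the internals of arch_PCA.

```r
# Invented composition: 6 samples, 4 parts; D is nearly constant
comp <- data.frame(
  A = c(10, 11, 10.5, 30, 31, 29),
  B = c(5, 5.2, 5.1, 4, 4.2, 4.1),
  C = c(1, 1.1, 0.9, 6, 6.5, 5.8),
  D = c(2, 2.1, 2.0, 2.2, 2.1, 2.0)
)

# alr with a chosen divisor (here column "D", picked by hand for the toy data)
alr <- log(comp[, names(comp) != "D"] / comp[["D"]])

# PCA on the centred, unscaled log-ratios
pca <- prcomp(alr, center = TRUE, scale. = FALSE)
summary(pca)$importance
head(pca$x[, 1:2])
```

The scores in pca$x are what a PCA scatter plot displays; the first two columns correspond to the first two principal components.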

Heatmap (table with the concentrations)

#arch_heatmap(df_chem)

Ternary Diagram

Estimate SiO2 if not available in the dataset.

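SiO2 <- 0    # flag: becomes 1 if SiO2 is estimated below
total <- 98  # assumed analytical total (%) from which SiO2 is estimated by difference

if (!("SiO2" %in% colnames(df_chem))) {

  SiO2 <- 1

  # SiO2 = total minus the sum of the measured oxides in each row
  df_chem$SiO2 <- total - rowSums(df_chem)

  # rebuild df_raw from its first three (categorical) columns plus df_chem;
  # note that any other column added to df_raw earlier (e.g. Calcium) is dropped
  cbind(df_raw[,c(1:3)], df_chem) -> df_raw

}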

ArchFlow includes a ternary diagram that shows the values of CaO, Al2O3 and SiO2 in a single plot. However, because the dataset of Tubb et al. does not report SiO2, this kind of plot cannot be drawn from measured values; the SiO2 used here is only the estimate computed above. See the ternary diagram of the case study instead.

arch_triangles(df_raw, plot.category = 3, rounded_circle = ) #grup = indicates the column from which the factors will be for the legend
#Remove the column of estimated SiO2
if (SiO2 == 1){
  df_chem[,-c(ncol(df_chem))] -> df_chem
  df_raw[,-c(ncol(df_raw))] -> df_raw
  SiO2 <- 0
}
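A ternary diagram places each sample according to the relative proportions of three parts, so the three values are usually rescaled to sum to 100. The sketch below shows that rescaling on invented values; whether arch_triangles performs this step internally is not stated in the text.

```r
# Invented values for the three parts of the diagram
toy <- data.frame(
  CaO   = c(0.2, 10),
  Al2O3 = c(18.8, 15),
  SiO2  = c(69.5, 55)
)

# Rescale each row so that the three parts sum to 100
ternary <- 100 * toy / rowSums(toy)
round(ternary, 2)
```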

Outlier detection

Outliers can be identified, and removed, with the boxplot functions. In this case TiO~2~ is removed as a variable in order to assess whether the grouping remains similar and, therefore, to be able to assign a group to sample GA4, which is an outlier because of its TiO~2~ value. This value has already been reported as a probable typographical error by the authors of the archdata package.
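As an illustration of boxplot-based outlier flagging, base R's boxplot.stats() returns the values lying more than 1.5 times the interquartile range beyond the hinges. The values and sample names below are invented and do not come from RBPottery.

```r
# Invented TiO2-like values with one suspicious entry
tio2 <- c(S1 = 0.90, S2 = 0.95, S3 = 0.88, S4 = 0.92, S5 = 9.10, S6 = 0.91)

# Values beyond 1.5 * IQR from the hinges, the default rule of boxplot()
flagged <- boxplot.stats(tio2)$out

# Names of the samples holding the flagged values
outlier_ids <- names(tio2)[tio2 %in% flagged]
outlier_ids
```

A flagged value is only a candidate: whether it reflects a transcription error or a genuinely different sample has to be decided from the context.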

Final Remarks

message("This is the first step of the ArchFlow workflow")




esteful/ArchFlow documentation built on Dec. 20, 2021, 6:40 a.m.