require(knitr)
require(printr) # devtools::install_github("yihui/printr")
require(rmarkdown)
require(ggplot2)
require(reshape2)
require(plyr)
require(dplyr)
require(vcd)
require(rjags)
require(runjags)
require(coda)
require(BEST)
require(lattice)
require(xtable)
require(data.table)
require(git2r)
require(dependencies) # devtools::install_github("ropensci/dependencies")
require(Bchron)
require(JerimalaiStoneArtefacts)
opts_chunk$set(tidy=FALSE, warning=FALSE, message=FALSE, 
               comment = NA, verbose = TRUE, echo=TRUE,
               cache = TRUE, quiet = TRUE)
theme_set(theme_bw(base_size=14))
runjags.options( inits.warning=FALSE , rng.warning=FALSE )
# custom functions to save figures at JAS size & resolution
# cf http://www.elsevier.com/author-schemas/artwork-and-media-instructions/artwork-sizing
jhe_190mm_ggsave <- function(filename = default_name(plot), 
                            width= 190, 
                            units = "mm", 
                            dpi= 1000, ...) {
  ggsave(filename=filename, width=width, dpi=dpi, units=units, ...)
}

jhe_90mm_ggsave <- function(filename = default_name(plot), 
                            width= 90, 
                            units = "mm", 
                            dpi= 1000, ...) {
  ggsave(filename=filename, width=width, dpi=dpi, units=units, ...)
}

Introduction

This document is part of a research compendium that accompanies the analysis of stone artefacts from Jerimalai rockshelter reported in Marwick et al. "Pleistocene-aged stone artefacts from Jerimalai, East Timor: Long term conservatism in early modern human technology in island Southeast Asia".

This supplement has several parts. This file explains explains how the results of the paper can be reproduced with the included code and data. It also includes the R code to reproduce the results found in the published paper. The other R markdown files are additional description and analyses (each square by itself) that are not reported in the paper, but are included here to respond to comments from peer reviewers. These R markdown files for both squares are organised with the same structure as the manuscript and will produce all of the quantitative results, data visualisations and tables found in the manuscript.

Reproducing the results in the paper

The source file for this document is an R markdown document, which means that it is a plain text file that contains marked-up text (such as this paragraph), and chunks of the R programming language in between paragraphs of plain text. This file may be viewed in any text editor, and executed to run the code using R version 3.2.0 or higher.

This section of the document is organised with the same structure as the manuscript and will produce all of the quantitative results, data visualisations and tables found in the manuscript. This document is an R markdown document, which means that it is a plain text file that contains marked-up text (such as this paragraph), and chunks of the R programming language in between paragraphs of plain text. This file may be viewed in RStudio or any text editor, and executed to run the code using R. When the document is executed, a HTML file is produced that contains the tables, plots and other statistical output generated by the code (if you see plots below, you are reading this HTML file). All of the analysis code is present in both the R markdown and HTML files, but some lines of setup code have been hidden from the HTML file to improve readability.

The purpose of providing this supplement is to enable the reader to more thoroughly evaluate the reliability of our work through close inspection of the quantitative methods we used in the paper, and to enable computational reproduciblity of our methods to facilitate their application to other research projects. Note that the code presented here has been developed and tested specifically for this analysis, and is not yet intended as a general purpose tool.

Code and Dependencies

The exact version of the code and data that produced the results and figures in the published paper is archived at http://dx.doi.org/10.6084/m9.figshare.985406. Readers are recommended to use the files on the figshare repository because they are more complete than the supplementary files hosted by the journal publisher (requiring fewer additional downloads to run the code), and they have not been altered by the journal publisher during the process of preparing the files for publication.

The development version of this code can be found at https://github.com/benmarwick/Pleistocene-aged-stone-artefacts-from-Jerimalai--East-Timor. Note that the code in the development version on github may have changed since publication and may not produce exactly the same output as found in the published paper.

The code should run on a typical personal computer (Windows/OSX/Linux) that can run R and can install R packages from the internet. The specific software dependencies for the code included here are listed in the code chunk above this text and the DESCRIPTION file of the R package that this supplement file is contained in (the R package name is JerimalaiStoneArtefacts).

Managing the dependencies can be tedious, and we have no control over how they will change in the future. In an effort to capture the entire computational environment that this analysis was developed and conducted in, and protect against changes in the dependent software packages, we provide a Docker image as a lightweight virtual machine that is the actual environment in which we developed, tested and ran the code. The Docker image contains all the necessary software, code and data already installed, so no further configuration is required. We also include a Dockerfile as a record of the instructions used to build the Docker image, this can be found in the docker/ directory in the research compendium. To launch the Docker image for this project, first, install Docker, then run Docker and at the Docker prompt, enter:

docker run -dp 8787:8787 benmarwick/jerimalaistoneartefacts

Then open your web browser at localhost:8787 or or run docker-machine ip default in the shell to find the correct IP address. This will run RStudio in your web browser (username and password are "rstudio").

Once logged in, the Files pane (bottom right) will show the manuscript/ directory where you can find this document and execute it. More information about using RStudio in Docker is available at the Rocker wiki pages.

Data availability

All of the data needed to reproduce the results presented in the paper are included in this compendium as CSV files (ie. plain text comma separated variables, readable in any text editor). They can be found in the data/ directory and are made available with a CC-0 licence.

Results for square A and square B combined

Here we combine stone artefacts from both squares.

Chronology of the excavated deposit

The code chunk below reproduces Figure 2, the depth-age distribution for radiocarbon dates from Jerimalai square B.

# if interactive, setwd to this source file location
sqB_dates <- read.csv("data/Jerimalai_dates_Square_B.csv", as.is = TRUE)
sqA_dates <- read.csv("data/Jerimalai_dates_Square_A.csv", as.is = TRUE)
sqA_dates$sq <- "A"
sqB_dates$sq <- "B"
both_sqs_dates <- rbind(sqA_dates, sqB_dates)

# we have to put in order to let the calibration work...
both_sqs_dates <- both_sqs_dates %>%
                   arrange(depth_bs) 

ages <- BchronCalibrate(ages = both_sqs_dates$age,
                    ageSds = both_sqs_dates$error,
                    positions = both_sqs_dates$depth_bs, 
                    calCurves = rep("intcal13",  
                                length(both_sqs_dates$age)))
# show tables of calibrated age ranges for each date
# summary(ages)

## credible intervals for each date
# # create age samples for each date
age_samples = lapply(ages, function(x) sample(x$ageGrid,size=2000,replace=TRUE,prob=x$densities))
# Now summaries them with quantiles - this gives a 95% credible interval
credible_intervals <- ldply(age_samples,
                            quantile,
                            prob=c(0.025,0.975))
credible_intervals[, 2:3] <- apply(credible_intervals[, 2:3], 2, round, 0)

# for the table
tbl_one_dates <- 
  cbind(both_sqs_dates, 
        credible_intervals)

tbl_one_dates <- tbl_one_dates %>% 
                  select(sq, 
                         spit,
                         depth_bs,
                         material,
                         age,
                         error,
                         `2.5%`,
                         `97.5%`,
                         lab_code)



names(tbl_one_dates) <- c("Square",
                          "Spit",
                          "Depth below surface (m)",
                          "Material",
                          "Radiocarbon Age",
                          "Radiocarbon Error",
                          "Calibrated 2.5% Credible interval",
                          "Calibrated 97.5% Credible interval",
                          "Lab code")

tbl_one_dates <- tbl_one_dates %>% 
                  arrange(Square, Spit)

write.csv(tbl_one_dates, "figures/table_1_radiocarbon_dates.csv")
kable(tbl_one_dates, caption = "Radiocarbon dates for Jerimalai square A and B. Dates calibrated using the IntCal13 curve (Reimer et al. 2013) by BChron 4.1.2.")
ages_predict = Bchronology(ages = both_sqs_dates$age,
                     ageSds = both_sqs_dates$error,
                     positions = both_sqs_dates$depth_bs, 
                     positionThicknesses = 0.01,
                     ids = both_sqs_dates$lab_code,
                     calCurves = rep("intcal13", length(both_sqs_dates$age)))

# save plot
png("figures/fig_2-Jeremalai-dates.png", 
    width = 200, 
    height = 120, 
    units = "mm", res = 100)

bchron_plot <- function(){
plot(ages_predict,
     main="",
     xlab='Age (cal years BP)',
     ylab='Depth (m)',
     las=1, 
     dateHeight = 0.1) #, 
     #asp=0.6)

# add phases

phases <- data.frame(phase = 1:4,
                     start = c(42,  17, 6.9, 5.3),
                     end =   c(35,  9,  5.5, 0 ))

phase_lines <- (phases$start * 1000)
line_height <- c(1.7, 0.3, -0.1, 0.4)

lines(rep(phase_lines[1],2), line_height[1:2], col = "blue")
lines(rep(phase_lines[2],2), line_height[c(1,3)], col = "blue")
lines(rep(phase_lines[3],2), line_height[c(1,3)], col = "blue")
lines(rep(phase_lines[4],2), line_height[c(1,3)], col = "blue")

text(phase_lines[1] - 2500, line_height[4], labels = "Phase I")
text(phase_lines[2] - 2500, line_height[4], labels = "Phase II")
text(phase_lines[3] - 600, line_height[4]+1, labels = "Phase III", srt = 90)
text(phase_lines[4] - 2500, line_height[4] , labels = "Phase IV")
}
bchron_plot()
# end saving plot
dev.off()

# combine all the calibrated dates into a single plot (using results = 'hide' to hide the progress bar)
# ages_densities <- BchronDensity(ages = dates$age,
#                     ageSds = dates$error,
#                     
#                     calCurves = rep("intcal13", length(dates$age)))

# plot(ages_densities, xlab='Age (cal years BP)', withPositions = TRUE)
# and show the plot when this document is knited
bchron_plot()
# various dadta
core_types <- read.csv("data/Jerimalai_cores_techno_metrics.csv")
# retouch indices
retouch_indices <- read.csv("data/Jerimalai_retouch_indices.csv")
# sediment volumes
vols <- read.csv("data/Artefact densities with soil volumes Sq B.csv", skip = 1)


# read in sq B data
sqB_all <- read.csv("data/Jerimalai_All_Artefacts_Square_B.csv")
sqB_depths <- read.csv("data/Jerimalai_spit_depths_Square_B.csv")


# read in sq A data
sqA_all <- read.csv("data/Jerimalai_All_Artefacts_Square_A.csv") %>% 
  filter(Square == "A")
sqA_depths <- read.csv("data/Jerimalai_spit_depths_Square_A.csv") 

# merge A and B
names_both <- intersect(names(sqA_all), names(sqB_all))
# select only columns in both data sets
sqA_all_s <- sqA_all %>% select_(.dots = names_both)
sqB_all_s <- sqB_all %>% select_(.dots = names_both)

# combine
both_sqs_all <- rbind(sqA_all_s, sqB_all_s)

# subset flakes
both_sqs_flakes <- both_sqs_all %>%
                      filter(Artclas == "Flake")

Results: Raw materials at both squares

The code chunk below computes the percentages of the two major cortex types for chert flakes, and displays the result in a table.

cortex_type <- as.data.frame(table(both_sqs_flakes$Cortype))[-1,]
cortex_type$prop <- round(prop.table(cortex_type$Freq),2)
names(cortex_type)  <- c("Cortex_type", "Frequency", "Proportion")
kable(cortex_type, caption = "Cortex type among chert flakes")

The chunk above also generates the values that are found in this sentence in the manuscript: "The chert artefacts have a combination of rounded cortex (r filter(cortex_type, Cortex_type == 'Round')$Proportion * 100%) and angular cortex (r filter(cortex_type, Cortex_type == 'Angul')$Proportion * 100%)."

The code chunk below subsets the stone artefact data so that we only include flakes without any missing mass data. We then assign each flake to a depositional phase based on the spit it was recovered from. There is no visual output from this chunk but the values are stored and used in the following chunks.

# we have to do this separately for sq A and sq B.
## sq A

# omit rows with blanks or NAs
sqA_all <- both_sqs_all %>%
               filter(Square == "A") %>%
               filter(Weight != "") %>%
               filter(!is.na(Weight))

# put depths on lithic data
sqA_all$depth <- sqA_depths$Depth.bs..m[match(sqA_all$Spit,sqA_depths$Spit.no)]

# omit rows with blanks or NAs... again
sqA_all <- sqA_all[!(sqA_all$depth == "" | is.na(sqA_all$depth)), ]

# make phases for sq A
sqA_makephases <- function(x) {

                            ifelse(x >= 1 & x <= 5, 4,
                            ifelse(x >= 6 & x <= 25, 3,
                            ifelse(x >= 26 & x <= 38, 2,
                            ifelse(x >= 39 & x <= 46, 1, NA))))
}

sqA_all$phase <- sqA_makephases(sqA_all$Spit)

# check if any NA
check <- unique(sqA_all$phase)
# omit NA
sqA_all <- sqA_all %>% filter(!is.na(phase))

## sq B

# omit rows with blanks or NAs
sqB_all<- both_sqs_all %>%
               filter(Square == "B") %>%
               filter(Weight != "") %>%
               filter(!is.na(Weight))

# put depths on lithic data
sqB_all$depth <- sqB_depths$Depth.bs..m[match(sqB_all$Spit,sqB_depths$Spit.no)]

# omit rows with blanks or NAs... again
sqB_all <- sqB_all[!(sqB_all$depth == "" | is.na(sqB_all$depth)), ]

# make phases for sq B
sqB_makephases <- function(x) {
                       ifelse(x > 3 & x <= 20, 4,
                       ifelse(x >= 21 & x <= 39, 3,
                       ifelse(x >= 40 & x <= 48, 2,
                       ifelse(x >= 49 & x <= 69, 1, NA))))

}

sqB_all$phase <- sqB_makephases(sqB_all$Spit)

# check if any NA
check <- unique(sqB_all$phase)
# omit NA
sqB_all <- sqB_all %>% filter(!is.na(phase))

######

# combine both squares again
both_sqs_all_with_phases <- rbind(sqA_all, sqB_all)

The code chunk below generates a table that summarizes the frequencies of the main raw materials in each depositional phase. A raw material is considered dominant here if there are more than ten pieces in a phase.

# get all flakes
flakes <- both_sqs_all_with_phases %>% filter(Artclas == "Flake")
# raw material
raw <- dcast(flakes, Material ~ phase) 
rownames(raw) <- raw[,1]
# get rid of rows with no value
raw <- raw[rownames(raw) != "",]
# remove col of NA
raw <- raw[,colnames(raw) != "NA"]
raw <- raw[,-1]
# subset dominant raw materials
dom <- raw[rowSums(raw) > 10,]
colnames(dom) <- c("phase 1", "phase 2", "phase 3", "phase 4")
kable(dom, caption = "Frequencies of dominant raw materials by depositional phase in both squares")

The code chunk below uses the Bayesian Poisson exponential ANOVA to compute the probabilities of any credible interactions between raw material frequencies and depositional phase. The goal is to determine if there are any significant changes in the use of raw materials for stone artefacts over time.

data <- melt(as.matrix(dom),  varnames=c("raw_material", "phase"), value.name="Freq")
myDataFrame = data
yName=names(myDataFrame)[3]   # Freq
x1Name=names(myDataFrame)[2]  # phase 
x2Name=names(myDataFrame)[1]  # raw material
x1contrasts = list( 
  list( c("phase 1")  , c("phase 2") , compVal=0.0 , ROPE=c(-0.1,0.1) ) ,
  list( c("phase 2")  , c("phase 3") , compVal=0.0 , ROPE=c(-0.1,0.1)) ,
  list( c("phase 3")  , c("phase 4") , compVal=0.0 , ROPE=c(-0.1,0.1)) 
  )
numSavedSteps = 12000 # MCMC parameters
thinSteps = 10        # MCMC parameters

mcmcCoda = genMCMC_cont_table( datFrm=myDataFrame , 
                    yName=yName , x1Name=x1Name , x2Name=x2Name ,
                    numSavedSteps=numSavedSteps , thinSteps=thinSteps , 
                    )

# Get summary statistics of chain:
summaryInfo = smryMCMC_cont_table( mcmcCoda , 
                        datFrm=myDataFrame , x1Name=x1Name , x2Name=x2Name ,
                        x1contrasts=x1contrasts 
                         )

# #  Display posterior information (not easy to read unless using interactively):
#     plotMCMC_cont_table( mcmcCoda , 
#               datFrm=myDataFrame , yName=yName , x1Name=x1Name , 
#                x2Name=x2Name ,
#               x1contrasts=x1contrasts  )

# subset summaryInfo to show HDI interval for interactions
HDI_intervals_for_interactions <- summaryInfo[c((nrow(summaryInfo)-length(x1contrasts)+1):nrow(summaryInfo)), c("HDIlow", "HDIhigh")]
kable(round(HDI_intervals_for_interactions,3), caption = "Highest density intervals (95%) for the posterior distributions of the interactions between phases and raw material frequencies in both squares.")

The code chunk above provides posterior distributions to evaluate the credibility of differences in raw material frequencies over time. The posterior distributions for the phase interactions are close to zero, indicating a small effect. Among the contrast of the phases, all HDIs exclude zero except for phase 2 v. phase 3. This indicates that there are credibly different frequencies of raw material between each phase except for 2 and 3. This is likely are result of small changes in the low frequencies of quartz, quartzite and silcrete in the earlier phases, as shown in the phase by raw material table above.

The code chunk below computes a NHST equivalent to the above Bayesian test, in this case a chi-square test. We also compute Cramer's V, a measure of association for nominal variables that ranges from 0 (no association between the variables) to 1 (strong association between the variables).

# here is the frequentist equivalent 
raw_material_by_phase_nhst <- chisq.test(dom)
cramers_V <- assocstats(as.matrix(dom)) 

The code chunk above returns a chi-squared value of r round(raw_material_by_phase_nhst$statistic, 2) and a p-value of r round(raw_material_by_phase_nhst$p.value, 2), indicating a significant difference in raw material frequencies by phase. However, the Cramer's V value of r round(cramers_V$cramer, 3) indicates that the effect size is extremely small. We interpret this result to mean that there is no substantial significance in the differences in raw materials frequencies by phase.

Results: Discard rates

The code chunk below generates the figure that shows discard rates of chert artefacts over time at Jerimalai in both squares. Each point is an excavation unit. The blue line is a locally weighted regression line (span = 0.4) to aid in visualizing the trend of increased discard in the upper part of the deposit.

# discard rates
discard <-  aggregate(Weight ~ depth + Spit + Square, flakes, length)

# get sediment volumes from CSV file
sqA_vols <- vols %>%
              plyr::summarize(spit =  X.3, vol = Soil.1)

sqB_vols <- vols %>%
              plyr::summarize(spit =  X, vol = Soil)

sqA_discard <- discard %>% filter(Square == "A")
sqA_discard$sedvol <- sqA_vols$vol[match(sqA_discard$Spit, sqA_vols$spit)]
sqA_discard$thick <- c(0.018, diff(sqA_discard$depth))
sqA_discard$kgsed <- with(sqA_discard, Weight / sedvol)
sqA_discard$cubmet <- with(sqA_discard, Weight / thick)

sqB_discard <- discard %>% filter(Square == "B")
sqB_discard$sedvol <- sqB_vols$vol[match(sqB_discard$Spit, sqB_vols$spit)]
sqB_discard$thick <- c(0.018, diff(sqB_discard$depth))
sqB_discard$kgsed <- with(sqB_discard, Weight / sedvol)
sqB_discard$cubmet <- with(sqB_discard, Weight / thick)

# combine again
both_sqs_discard <- rbind(sqA_discard, sqB_discard)

# remove spit that is far too thin...
both_sqs_discard <- both_sqs_discard %>% filter(thick > 0.01)


# Plot
ggplot(both_sqs_discard, (aes(depth, cubmet, colour = Square))) +
  geom_point() +
  stat_smooth(span = 0.5, se = FALSE) +
  xlab("Depth below surface (m)") +
  ylab("Number of chert flakes per \ncubic metre of deposit") +
  theme(axis.text=element_text(size=10)) +
  coord_flip() +
  scale_x_reverse()
# save plot
jhe_90mm_ggsave("figures/fig_3_Jeremalai-flake-discard.png")

The code chunk below the generates the figure showing discard rates of chert artefacts per depositional phase at Jerimalai square B. This aggregates the individual excavation units.

# plot artefacts/cubic meter/1000 years by phase, get phase number for each spit
# this is the most sensible option
both_sqs_discard$phase <- ifelse(both_sqs_discard$Square == "A",           
                                 sqA_makephases(both_sqs_discard$Spit), 
                                 sqB_makephases(both_sqs_discard$Spit))
# this is ok for both squares
phases <- data.frame(phase = 1:4,
                     start = c(42,  17, 6.9, 5.3),
                     end =   c(35,  9,  5.5, 0 ))
phases$duration <- with(phases, start - end)

discard_agg <- aggregate(cubmet ~ phase, both_sqs_discard, mean)
discard_agg$cubmetperkyr <- discard_agg$cubmet / phases$duration
ggplot(discard_agg, (aes(phase, cubmetperkyr))) +
  geom_bar(stat="identity", colour = "black", fill = "white") +
  theme_minimal() +
  xlab("Depositional phase") +
  ylab("Mean number of chert flakes per \ncubic meter of deposit per thousand years in bothe squares") 
# save plot
jhe_90mm_ggsave("figures/fig_4_Jeremalai-flake-discard-phase-m3.png", height = 90)

Results: Artefact taphonomy

The code chunk below computes a Bayesian Poisson exponential ANOVA to investigate differences in flake breakage classes over time.

allchert <- both_sqs_all[both_sqs_all$Material == 'Chert', ]
# assign phases
allchert$phase <- ifelse(allchert$Square == "A",           
                                 sqA_makephases(allchert$Spit), 
                                 sqB_makephases(allchert$Spit))


# make Artclass that is long and transv breaks
allchert$Artclas <- ifelse(allchert$Breaks == "", 
       as.character(allchert$Artclas), 
       paste(allchert$Artclas, allchert$Breaks, sep = "-"))
taph <- data.frame(table(allchert$Artclas))
# use regex to get broken flakes -b- 
broken <- allchert[grep("-b", allchert$Artclas), ]
# get counts of broken to complete per phase
# flake to -b-
breaks <- dcast(allchert, Artclas ~ phase)[-1,]
breaks <- breaks[breaks$Artclas =="Flake" | grepl("-b-", breaks$Artclas), ]
allchert$Artclas <- tolower(allchert$Artclas)
allchert$breakt <- "" # create variable to fill
allchert$breakt[grep("trans", allchert$Artclas)] <- "trans"
allchert$breakt[grep("long", allchert$Artclas)] <- "long"
# per depositional phase
breakt <- dcast(allchert, breakt ~ phase)[-1,]
# add complete flake counts
breakt <- rbind( breakt , setNames( breaks[1, ] , names( breakt ) ) )
# shift rownames out and delete them
rownames(breakt) <- breakt[,1]
breakt <- breakt[,-1]
# exclude artefacts not assigned to a phase
breakt <- breakt[, -which(names(breakt) == "NA")]
# do bayesian contingency table test
colnames(breakt) <- c("phase 1", "phase 2", "phase 3", "phase 4")
data <- melt(as.matrix(breakt),  varnames=c("breakt", "phase"), value.name="Freq")

myDataFrame = data
yName=names(myDataFrame)[3]   # Freq 
x1Name=names(myDataFrame)[2]  # phase
x2Name=names(myDataFrame)[1]  # break type
x1contrasts = list( 
  list( c("phase 1") , c("phase 2") , compVal=0.0 , ROPE=c(-0.1,0.1) ) ,
  list( c("phase 2")  , c("phase 3") , compVal=0.0 , ROPE=c(-0.1,0.1)) ,
  list( c("phase 3")  , c("phase 4") , compVal=0.0 , ROPE=c(-0.1,0.1)) 
  #list( c("phase 4")  , c("phase 5") , compVal=0.0 , ROPE=c(-0.1,0.1))
  )
numSavedSteps = 12000 # MCMC parameters
thinSteps = 10        # MCMC parameters

mcmcCoda = genMCMC_cont_table( datFrm=myDataFrame , 
                    yName=yName , x1Name=x1Name , x2Name=x2Name ,
                    numSavedSteps=numSavedSteps , thinSteps=thinSteps , 
                    )

# Get summary statistics of chain:
summaryInfo = smryMCMC_cont_table( mcmcCoda , 
                        datFrm=myDataFrame , x1Name=x1Name , x2Name=x2Name ,
                        x1contrasts=x1contrasts 
                         )

# #  Display posterior information (not easy to read unless using interactively):
#     plotMCMC_cont_table( mcmcCoda , 
#               datFrm=myDataFrame , yName=yName , x1Name=x1Name , 
#                x2Name=x2Name ,
#               x1contrasts=x1contrasts  )

# table of raw counts
kable(breakt, caption = "Table of frequencies of each class of breakage by phase, in both squares")

# subset summaryInfo to show HDI interval for interactions
HDI_intervals_for_interactions <- summaryInfo[c((nrow(summaryInfo)-length(x1contrasts)+1):nrow(summaryInfo)), c("HDIlow", "HDIhigh")]
kable(round(HDI_intervals_for_interactions,3), caption = "Highest density intervals (95%) for the posterior distributions of the interactions between phases and flake breakage classes, in both squares.")

The above code chunk returns results that there are credible difference but small differences in the frequencies of flake breakage types by phase.

The code chunk below computes the frequentist equivalent, the chi-square test and the Cramer's V for effect size.

artefact_taphonomy_nhst <- assocstats(as.matrix(breakt))

The code chunk above returns a chi-squared value of r round(artefact_taphonomy_nhst$chisq_tests[2,1], 2) and a p-value of r round(artefact_taphonomy_nhst$chisq_tests[2,3], 2), indicating a significant difference in frequencies of breakage classes by phase. However, the Cramer's V value of r round(artefact_taphonomy_nhst$cramer, 3) indicates that the effect size is extremely small. We interpret this result to mean that although the test result is statistically significant, there is no substantial significance in the differences in frequencies of breakage types by phase.

The code chunk below draws a plot of frequencies of complete flakes, transversely broken flakes and longitudinally broken flakes made from chert in each depositional unit at Jerimalai square B.

data$phase <- gsub("[[:alpha:]]*", "", data$phase)
# recode break type for pretty legend
data$breakt_ <- with(data, ifelse(breakt == 'long', 'long. broken flakes',
        ifelse(breakt == "trans", "trans. broken flakes",
               ifelse(breakt == "Flake", "complete flakes", NA))))
ggplot(data, aes(phase, Freq, fill = breakt_)) +
  ylab("Frequency") +
  geom_bar(stat = "identity") +
  scale_fill_discrete(name="flake type") +
  xlab("depositional phase") +
  theme_minimal((base_size = 6))
jhe_90mm_ggsave("figures/fig_5_Jeremalai-flake-broken-phase.png", height = 90/2)

The code chunk below summarizes the frequencies of heat-treated flakes in both squares at Jerimalai

check <- sum(flakes$Heat, na.rm = TRUE) / nrow(flakes)
heat <- aggregate(Heat ~ phase, flakes, length)
total <- aggregate(Spit ~ phase, flakes, length)
heat$Not_heat <- total$Spit - heat$Heat
# show proportions that are heat-treated
max_heat <- max(heat$Heat / total$Spit)
min_heat <- min(heat$Heat / total$Spit)
heat <- t(heat)
# do bayesian contingency table test
colnames(heat) <- c("phase 1", "phase 2", "phase 3", "phase 4")
heat <- heat[-1,]

Between r round(min_heat, 2)*100% and r round(max_heat, 2)*100% of chert artefacts in each depositional phase show signs of having been heated, such as crenation, potlid scars or surface crazing.

The code chunk below computes a Bayesian Poisson exponential ANOVA to investigate differences in heat treatment by phase.

data <- melt(as.matrix(heat),  varnames=c("heat", "phase"), value.name="Freq")
myDataFrame = data
yName=names(myDataFrame)[3]   # Freq 
x1Name=names(myDataFrame)[2]  # phase
x2Name=names(myDataFrame)[1]  # heat
x1contrasts = list( 
  list( c("phase 1") , c("phase 2") , compVal=0.0 , ROPE=c(-0.1,0.1) ) ,
  list( c("phase 2")  , c("phase 3") , compVal=0.0 , ROPE=c(-0.1,0.1)) ,
  list( c("phase 3")  , c("phase 4") , compVal=0.0 , ROPE=c(-0.1,0.1)) 

  )
numSavedSteps = 12000 # MCMC parameters
thinSteps = 10        # MCMC parameters

mcmcCoda = genMCMC_cont_table( datFrm=myDataFrame , 
                    yName=yName , x1Name=x1Name , x2Name=x2Name ,
                    numSavedSteps=numSavedSteps , thinSteps=thinSteps , 
                    )

# Get summary statistics of chain:
summaryInfo = smryMCMC_cont_table( mcmcCoda , 
                        datFrm=myDataFrame , x1Name=x1Name , x2Name=x2Name ,
                        x1contrasts=x1contrasts 
                         )

# # #  Display posterior information (not easy to read unless using interactively):
#     plotMCMC_cont_table( mcmcCoda , 
#               datFrm=myDataFrame , yName=yName , x1Name=x1Name , 
#                x2Name=x2Name ,
#               x1contrasts=x1contrasts  )

# table of heated artefact counts
kable(heat, caption = "Table of frequencies of heat treatment by phase at both squares")

# subset summaryInfo to show HDI interval for interactions
HDI_intervals_for_interactions <- summaryInfo[c((nrow(summaryInfo)-length(x1contrasts)+1):nrow(summaryInfo)), c("HDIlow", "HDIhigh")]
kable(round(HDI_intervals_for_interactions,3), caption = "Highest density intervals (95%) for the posterior distributions of the interactions between phases and flake heat treatment at both squares.")

The code chunk above returned results that indicate credible differences in the frequency of heat treatment between each phase except for phase one and two. The phase three to four transition is particularly different from the others, but this corresponds to only a r round(abs(heat[1,3]/sum(heat[1:2,3]) - heat[1,4]/sum(heat[1:2,4]) * 100),1)% change increase in frequency of heated pieces. all the HDIs are close to zero, indicating a small effect size.

The code chunk below computes a frequentist chi-square and Cramer's V test for the heat alteration data.

chert_artefacts_heat_nhst <- assocstats(as.matrix(heat))

The code chunk above returns a chi-squared value of r round(chert_artefacts_heat_nhst$chisq_tests[2,1], 2) and a p-value of r round(chert_artefacts_heat_nhst$chisq_tests[2,3], 2), indicating a significant difference in frequencies of breakage classes by phase. However, the Cramer's V value of r round(chert_artefacts_heat_nhst$cramer, 2) indicates that the effect size is extremely small. We interpret this result to mean that although the test result is statistically significant, there is no substantial significance in the differences in frequencies of heat alteration by phase.

Results: Metric and technological characteristics of cores and unretouched flakes

The code chunk below produces the table "summarizing the attributes of chert complete flakes from Jerimalai. Each cell contains mean ± standard deviation unless otherwise indicated. The table produced here has been rearranged by hand for the paper.

# the table has been rearranged by hand for the paper
metrics <- flakes %>% 

  group_by(phase) %>%

  summarise(median(Length, na.rm = TRUE),  
            median(Width, na.rm = TRUE), 
            median(Thick, na.rm = TRUE),
            median(Weight, na.rm = TRUE),
            median(Length, na.rm = TRUE),
            median(Platwid, na.rm = TRUE),
            median(Platthic, na.rm = TRUE),
            median(NoDS, na.rm = TRUE),
            median(Cortex, na.rm = TRUE),
            IQR(Length, na.rm = TRUE),  
            IQR(Width, na.rm = TRUE), 
            IQR(Thick, na.rm = TRUE),
            IQR(Weight, na.rm = TRUE),
            IQR(Length, na.rm = TRUE),
            IQR(Platwid, na.rm = TRUE),
            IQR(Platthic, na.rm = TRUE),
            IQR(NoDS, na.rm = TRUE),
            IQR(Cortex, na.rm = TRUE),
            n = length(Weight))

# get overhang removal data also
ohr <- filter(flakes, Overhang == "Yes") %>%
       group_by(phase) %>%
       summarise(OHR_n = length(Overhang))
# get percentages of OHR per phase
ohr$OHR_perc <- ohr$OHR_n/metrics$n * 100

# combine
metrics <- cbind(metrics, ohr[,c("OHR_n", "OHR_perc")])
metrics <- t(round(metrics,2))
# save as CSV to tidy up, there is no simple way to make the table
# that appears in the paper
write.csv(metrics, "figures/table_6_flake_metrics.csv")
# show a slightly untidy version
metrics <- data.frame(metrics)
names(metrics) <- c("phase 1", "phase 2", "phase 3", "phase 4")
kable(metrics[-1,], caption = "Summary of attributes of chert complete flakes from Jerimalai both squares. Each cell contains median ± interquartile range unless otherwise indicated. ")

The code chunk below uses a Bayesian one-way ANOVA to investigate differences in chert flake mass by depositional phase.

myDataFrame <- data.frame(phase = flakes$phase, mass = flakes$Weight)
# remove flakes with no phase
myDataFrame <- myDataFrame[!is.na(myDataFrame$phase),]
yName = names(myDataFrame)[2] # mass
xName = names(myDataFrame)[1] # phase
contrasts = list(
  list( c("1") , c("2") , compVal=0.0 , ROPE=c(-0.1,0.1) ) ,
  list( c("2")  , c("3") , compVal=0.0 , ROPE=c(-0.1,0.1)) ,
  list( c("3")  , c("4") , compVal=0.0 , ROPE=c(-0.1,0.1)) )
# Generate the MCMC chain:
  mcmcCoda_ANOVA = genMCMC_ANOVA(datFrm=myDataFrame , yName=yName , xName=xName ,
                      numSavedSteps=11000 , thinSteps=10 
                      )
  # Get summary statistics of chain:
  summaryInfo_ANOVA = smryMCMC_ANOVA( mcmcCoda_ANOVA , 
                          datFrm=myDataFrame , xName=xName ,
                          contrasts=contrasts 
                          )

#  Display posterior information (not easy to read unless using interactively):
    plotMCMC_ANOVA( mcmcCoda_ANOVA , 
              datFrm=myDataFrame , yName=yName , xName=xName , 
              contrasts=contrasts  )

# subset summaryInfo to show HDI interval for interactions
    HDI_intervals_for_interactions <- summaryInfo_ANOVA[c((nrow(summaryInfo_ANOVA)-length(contrasts)+1):nrow(summaryInfo_ANOVA)), c("HDIlow", "HDIhigh")]

kable(round(HDI_intervals_for_interactions,3), caption = "Highest density intervals (95%) for the posterior distributions of the interactions between phases and flake mass from both squares.")

The code chunk above produced results that indicate that only phase four and five have a credibly different distributions of flake mass. In the code chunk below we repeat the same investigation using a frequentist ANOVA.

# ANOVA with Tukey's HSD
fit <- aov(Weight ~ as.factor(phase), flakes)
fit_summary <-  summary(fit)
tuk <- TukeyHSD(fit)
# plot(tuk, las = 2, cex = 0.1)
kable(round(tuk$`as.factor(phase)`,3), caption = "Tukey's Honest Significant Difference for phase by phase comparisons of flake mass both squares.")

In the code chunk above we obtain the results of a frequentist ANOVA which returns a statistically significant result: F = r round(fit_summary[[1]]$'F value'[1], 2), df = r round(fit_summary[[1]]$Df[1], 2), p = r round(fit_summary[[1]]$'Pr(>F)'[1], 3). The Tukey's Honest Significant Difference test shows confidence intervals on the differences between the means artefact masses of a phase. When the interval includes zero, the difference is considered not significant. In this case we see that phase five differs from phase four, equivalent to what we see in the Bayesian HDI intervals.

The code chunk below produces a summary table of metric attributes of chert complete cores from Jerimalai square B.

cores_mass <- allchert[allchert$Artclas == "core", ]
core_metrics <- cores_mass %>% 
  group_by(phase) %>% 
  filter(phase %in% 1:4) %>%
  summarise(median(Weight, na.rm = TRUE), 
            median(Length, na.rm = TRUE), 
            median(Width, na.rm = TRUE), 
            median(Thick, na.rm = TRUE),
            median(as.numeric(cores_mass$NoDS), na.rm = TRUE),
            median(as.numeric(cores_mass$Cortex), na.rm = TRUE),
            median(as.numeric(cores_mass$Corerot), na.rm = TRUE),
            IQR(Weight, na.rm = TRUE), 
            IQR(Length, na.rm = TRUE), 
            IQR(Width, na.rm = TRUE), 
            IQR(Thick, na.rm = TRUE),
            IQR(as.numeric(cores_mass$NoDS), na.rm = TRUE),
            IQR(as.numeric(cores_mass$Cortex), na.rm = TRUE),
            IQR(as.numeric(cores_mass$Corerot), na.rm = TRUE),
            n = length(Weight))
core_metrics_t <- t(round(core_metrics,2))
colnames(core_metrics_t) <- c("phase 1", "phase 2", "phase 3", "phase 4" )
kable(core_metrics_t, caption = "Summary of chert core metrics for both squares")

The code chunk below creates a figure that shows the distribution of core and complete flake mass by depositional phases

# plot flake and core mass by phase in one box plot
flakes_cores_weight <- allchert %>%
                         filter(Artclas %in% c("flake", "core")) %>%
                         filter(phase %in% 1:4) %>%
                         select(Artclas, Weight, phase)


ggplot(flakes_cores_weight, aes(fill = Artclas, as.factor(phase), Weight)) +
  geom_point(aes(colour = Artclas), size = 0.5, alpha = 0.9, shape = 1, 
             position=position_jitterdodge(dodge.width=0.9)) +
  geom_boxplot(alpha = 0.5) +
  scale_y_log10() +
  #theme_minimal(base_size = 4) +
  xlab('depositional phase') +
  ylab("mass (g)") +
  scale_fill_manual(name="artefact type",
                      labels=c("core", "complete flake"), 
                     values = c("grey10", "grey80")) +
  scale_colour_manual(name="artefact type",
                      labels=c("core", "complete flake"), 
                      values = c("grey10", "grey15")) 

# save plot
jhe_190mm_ggsave("figures/fig_6_Jeremalai-flake-core-mass-phase.png", 
                height = (190/2))

The code chunk below uses a Bayesian one-way ANOVA to investigate differences in chert core mass by depositional phase.

myDataFrame <- data.frame(phase = cores_mass$phase, mass = cores_mass$Weight)
# remove flakes with no phase
myDataFrame <- myDataFrame[!is.na(myDataFrame$phase),]
yName = names(myDataFrame)[2] # mass
xName = names(myDataFrame)[1] # phase
contrasts = list(
  list( c("1") ,  c("2") , compVal=0.0 , ROPE=c(-0.1,0.1) ) ,
  list( c("2")  , c("3") , compVal=0.0 , ROPE=c(-0.1,0.1)) ,
  list( c("3")  , c("4") , compVal=0.0 , ROPE=c(-0.1,0.1)) )
# Generate the MCMC chain:
  mcmcCoda_ANOVA = genMCMC_ANOVA(datFrm=myDataFrame , yName=yName , xName=xName ,
                      numSavedSteps=11000 , thinSteps=10 
                      )
  # Get summary statistics of chain:
  summaryInfo_ANOVA = smryMCMC_ANOVA( mcmcCoda_ANOVA , 
                          datFrm=myDataFrame , xName=xName ,
                          contrasts=contrasts 
                          )

# #  Display posterior information (not easy to read unless using interactively):
#     plotMCMC_ANOVA( mcmcCoda_ANOVA , 
#               datFrm=myDataFrame , yName=yName , xName=xName , 
#               contrasts=contrasts  )
#   
# subset summaryInfo to show HDI interval for interactions
 HDI_intervals_for_interactions <- summaryInfo_ANOVA[c((nrow(summaryInfo_ANOVA)-length(contrasts)+1):nrow(summaryInfo_ANOVA)), c("HDIlow", "HDIhigh")]
kable(round(HDI_intervals_for_interactions,3), caption = "Highest density intervals (95%) for the posterior distributions of the interactions between phases and core mass, both squares.")

The code chunk above produced results that indicate that the HDIs for all phases include zero, which we interpret as no credible difference. In the code chunk below we repeat the same investigation using a frequentist ANOVA.

# ANOVA with Tukey's HSD
fit <- aov(mass ~ as.factor(phase), myDataFrame)
fit_summary <-  summary(fit)
tuk <- TukeyHSD(fit)
# plot(tuk, las = 2, cex = 0.1)
kable(round(tuk$`as.factor(phase)`,3), caption = "Tukey's Honest Significant Difference for phase by phase comparisons of core mass, both squares.")

In the code chunk above we obtain the results of a frequentist ANOVA which returns a statistically significant result: F = r round(fit_summary[[1]]$'F value'[1], 2), df = r round(fit_summary[[1]]$Df[1], 2), p = r round(fit_summary[[1]]$'Pr(>F)'[1], 3). The Tukey's Honest Significant Difference test shows confidence intervals on the differences between the means artefact masses of a phase. In this case we see that only the comparison of phase four to phase two has a significant difference in core mass. Consecutive phases show no significance difference, which we interpret as evidence of overall no substantial change in core mass.

The code chunk below produces the figures that illustrate the frequency of flake platform categories for chert complete flakes at Jerimalai square B.

# flake platform
plat <- dcast(flakes, Plat ~ depth) 
rownames(plat) <- plat[,1]
# get rid of rows with no plat
plat <- plat[rownames(plat) != "",]
plat <- plat[,-1]
plat_freqs <- data.frame(plat_types = rownames(plat), Freq = rowSums(plat))
plat_freqs <- plat_freqs[plat_freqs$Freq > 90,]
# what is 'coll'? Dropping it
plat_freqs <- plat_freqs[plat_freqs$plat_types != 'Coll', ]
plat_freqs$plat_types <- c('2-scars', '3-scars', 'cortical', 'focalised', 'single')
# plot freq of platform types
main <- ggplot(plat_freqs, aes(reorder(plat_types, Freq), Freq)) + 
  geom_bar(stat="identity", fill = "white", colour = "black") +
  # theme_minimal() +
  xlab("platform type") +
  ylab("frequency") +
  # remove grid lines for subplot
  theme_update(panel.background = element_blank(),
               panel.grid.major = element_blank(),
               panel.grid.minor = element_blank())
main + theme_minimal()
# raw materials by spit
# compute proportions per layer (col props)
all_tab <- data.frame()
for(i in seq(ncol(plat))){
  for(j in seq(nrow(plat))){
    all_tab[j,i] <- plat[j,i]/colSums(plat)[i]
  }
}
# check 
check <- colSums(all_tab) # should == 1
colnames(all_tab) <- colnames(plat) 
check <- rowSums(all_tab) # should be various
all_tab$plat_type <- rownames(plat) 
# get rid of raw materials that are not very abundant
all_tab <- all_tab[which(rowSums(all_tab[,1:ncol(all_tab)-1]) > 1.5) , ]
# get rid of NA column
all_tab <- all_tab[,names(all_tab) != 'NA']
# plot
all_tab_m <- melt(all_tab, id.var = 'plat_type')
# by phase
plat <- dcast(flakes, Plat ~ phase)
rownames(plat) <- plat[,1]
# get rid of rows with no plat
plat <- plat[rownames(plat) != "",]
plat <- plat[,-1]
# raw materials by site
# compute proportions per phase (col props)
all_tab <- data.frame()
for(i in seq(ncol(plat))){
  for(j in seq(nrow(plat))){
    all_tab[j,i] <- plat[j,i]/colSums(plat)[i]
  }
}
all_tab$plat_type <- rownames(plat) 
# get rid of raw materials that are not very abundant
all_tab <- all_tab[which(rowSums(all_tab[,1:ncol(all_tab)-1]) > 0.15) , ]
# get rid of NA column
all_tab <- all_tab[,names(all_tab) != 'NA']
# plot distibution of platform sizes for each type
flakes$Platarea <- with(flakes, (Platthic * Platwid))
plat_area_type <- flakes[flakes$Plat %in% c("Single", "Focal", "2-scars", "Cort", '3-scars'),]

# make names a bit more readable
plat_area_type$Plat <- ifelse(plat_area_type$Plat == 'Cort', 'cortical',
                      ifelse(plat_area_type$Plat == 'Focal',  'focalised', 
                    ifelse(plat_area_type$Plat == 'Single',  'single', as.character(plat_area_type$Plat))))  

# put types in same order as frequency plot
plat_area_type$Plat <- factor(plat_area_type$Plat, 
                              levels = c('cortical', '3-scars','2-scars', 'focalised', 'single'), ordered = TRUE)

sub <- ggplot(plat_area_type, aes(Plat, Platarea)) +
  geom_boxplot() +
  geom_jitter(size = 0.5, alpha = 0.3, shape = 1) +
  scale_y_log10() +
  ylab(as.expression(bquote('platform area (' * mm^{2} * ")" ))) +
  xlab("") +
  theme_minimal()

# plot freq of plat type and platform area together in one plot

vp <- viewport(width = 0.45, height = 0.54, 
               x = 0.57, y = 0.4, 
               just = c("right", "bottom"))

# combine plots, print and save (wont show in console)
  png("figures/fig_7_Jeremalai-platform-area-by-plat-type.png", 
      units = "mm", w = 190, h = 190/2, res = 600)
     print(main)
     print(sub + theme_bw(base_size = 10), vp = vp)
  dev.off()

The code chunk below creates a plot showing the distributions of core and flake cortex by depositional phase at both squares of Jerimalai.

flakes_cores_cortex <- allchert %>%
                         filter(Artclas %in% c("flake", "core")) %>%
                         filter(phase %in% 1:4) %>%
                         select(Artclas, Cortex, phase)

ggplot(flakes_cores_cortex, aes( fill = Artclas, as.factor(phase), Cortex)) +
  geom_point(aes(colour = Artclas), size = 0.5, alpha = 0.9, shape = 1, 
             position=position_jitterdodge(dodge.width=0.9)) +
  geom_boxplot(alpha = 0.4) +
  scale_y_log10() +
  xlab('depositional phase') +
  ylab("percent cortex") +
  scale_fill_manual(name="artefact type",
                      labels=c("core", "complete flake"), 
                     values = c("grey10", "grey80")) +
  scale_colour_manual(name="artefact type",
                      labels=c("core", "complete flake"), 
                      values = c("grey10", "grey20")) 
# save plot
jhe_190mm_ggsave("figures/fig_8_Jeremalai-flake-core-cortex-phase.png", height = 190/2)

The code chunk below uses a Bayesian one-way ANOVA to investigate differences in chert flake cortex by depositional phase.

myDataFrame <- data.frame(phase = flakes$phase, cortex = flakes$Cortex)
# remove flakes with no phase
myDataFrame <- myDataFrame[!is.na(myDataFrame$phase),]
myDataFrame <- myDataFrame[!is.na(myDataFrame$cortex),]
# replace zeros with ones, since mode can't be zero here
# myDataFrame$cortex <- ifelse(myDataFrame$cortex == 0, 1, myDataFrame$cortex)
yName = names(myDataFrame)[2] # cortex
xName = names(myDataFrame)[1] # phase
contrasts = list(
  list( c("1") ,  c("2") , compVal=0.0 , ROPE=c(-0.1,0.1) ) ,
  list( c("2")  , c("3") , compVal=0.0 , ROPE=c(-0.1,0.1)) ,
  list( c("3")  , c("4") , compVal=0.0 , ROPE=c(-0.1,0.1))  )
# Generate the MCMC chain:
  mcmcCoda_ANOVA = genMCMC_ANOVA(datFrm=myDataFrame , yName=yName , xName=xName , numSavedSteps=11000 , thinSteps=10 
                      )
  summaryInfo_ANOVA = smryMCMC_ANOVA( mcmcCoda_ANOVA , 
                          datFrm=myDataFrame , xName=xName ,
                          contrasts=contrasts 
                          )
  # #  Display posterior information (not easy to read unless using interactively):
#     plotMCMC_ANOVA( mcmcCoda_ANOVA , 
#               datFrm=myDataFrame , yName=yName , xName=xName , 
#               contrasts=contrasts  )
#   
# subset summaryInfo to show HDI interval for interactions
HDI_intervals_for_interactions_f <- summaryInfo_ANOVA[c((nrow(summaryInfo_ANOVA)-length(contrasts)+1):nrow(summaryInfo_ANOVA)), c("HDIlow", "HDIhigh")]
kable(round(HDI_intervals_for_interactions_f,3), caption = "Highest density intervals (95%) for the posterior distributions of the interactions between phases and flake cortex at both squares")

The code chunk above produced results that indicate that the HDIs for phases 3 to 4 exclude zero, which we interpret as credible difference for this transition. In the code chunk below we repeat the same investigation using a frequentist ANOVA.

# ANOVA with Tukey's HSD
fit <- aov(cortex ~ as.factor(phase), myDataFrame)
fit_summary <-  summary(fit)
tuk <- TukeyHSD(fit)
# plot(tuk, las = 2, cex = 0.1)
kable(round(tuk$`as.factor(phase)`,3), caption = "Tukey's Honest Significant Difference for phase by phase comparisons of flake cortex, both squares")

In the code chunk above we obtain the results of a frequentist ANOVA which returns a statistically non-significant result: F = r round(fit_summary[[1]]$'F value'[1], 2), df = r round(fit_summary[[1]]$Df[1], 2), p = r round(fit_summary[[1]]$'Pr(>F)'[1], 3). The Tukey's Honest Significant Difference test shows confidence intervals on the differences between the means artefact masses of a phase. We see the phase 3-4 interaction showing a significant difference.

In the two code chunks below we repeat the statistical tests above for core cortex.

myDataFrame <- data.frame(phase = cores_mass$phase, cortex = cores_mass$Cortex)
myDataFrame <- myDataFrame[complete.cases(myDataFrame),] # omit NAs
yName = names(myDataFrame)[2] # cortex
xName = names(myDataFrame)[1] # phase
contrasts = list(
  list( c("1") ,  c("2") , compVal=0.0 , ROPE=c(-0.1,0.1) ) ,
  list( c("2")  , c("3") , compVal=0.0 , ROPE=c(-0.1,0.1)) ,
  list( c("3")  , c("4") , compVal=0.0 , ROPE=c(-0.1,0.1)) 
  )
# Generate the MCMC chain:
  mcmcCoda_ANOVA = genMCMC_ANOVA(datFrm=myDataFrame , yName=yName , xName=xName ,
                      numSavedSteps=11000 , thinSteps=10 
                      )
  # Get summary statistics of chain:
  summaryInfo_ANOVA = smryMCMC_ANOVA( mcmcCoda_ANOVA , 
                          datFrm=myDataFrame , xName=xName ,
                          contrasts=contrasts 
                          )
  # #  Display posterior information (not easy to read unless using interactively):
#     plotMCMC_ANOVA( mcmcCoda_ANOVA , 
#               datFrm=myDataFrame , yName=yName , xName=xName , 
#               contrasts=contrasts  )
#   
# subset summaryInfo to show HDI interval for interactions
 HDI_intervals_for_interactions_c <- summaryInfo_ANOVA[c((nrow(summaryInfo_ANOVA)-length(contrasts)+1):nrow(summaryInfo_ANOVA)), c("HDIlow", "HDIhigh")]
kable(round(HDI_intervals_for_interactions_c,3), caption = "Highest density intervals (95%) for the posterior distributions of the interactions between phases and core cortex at both squares")
# combine core and flake HDIs into one table
HDIs_for_cores_and_flakes <- round(cbind(HDI_intervals_for_interactions_c, HDI_intervals_for_interactions_f),3)
kable(HDIs_for_cores_and_flakes, caption = "Highest density intervals (95%) for the posterior distributions of the interactions between phases and core and flake cortex, both squares")

The code chunk above produced results that indicate that the HDIs for all phases include zero, which we interpret as no credible difference. In the code chunk below we repeat the same investigation using a frequentist ANOVA.

# ANOVA with Tukey's HSD
fit <- aov(cortex ~ as.factor(phase), myDataFrame)
fit_summary <-  summary(fit)
tuk <- TukeyHSD(fit)
# plot(tuk, las = 2, cex = 0.1)
kable(round(tuk$`as.factor(phase)`,3), caption = "Tukey's Honest Significant Difference for phase by phase comparisons of flake cortex, both squares")

In the code chunk above we obtain the results of a frequentist ANOVA which returns a statistically non-significant result: F = r round(fit_summary[[1]]$'F value'[1], 2), df = r round(fit_summary[[1]]$Df[1], 2), p = r round(fit_summary[[1]]$'Pr(>F)'[1], 3). The Tukey's Honest Significant Difference test shows confidence intervals on the differences between the means artefact masses of a phase. None of the interactions show a significant difference. We interpret this as evidence of overall no change in core cortex.

Results: Core technology

This code draws a plot of differences in the mass of cores recovered from Jerimalai by core type. There are r nrow(core_types) cores in the assemblage.

both_sqs_core_types <- core_types 
both_sqs_core_types$Type_long <- with(both_sqs_core_types, 
                              ifelse(Type == "SPC", "Single plaform",
                              ifelse(Type == "RC", "Radial",
                              ifelse(Type == "BDC", "Bidirectional",
                              ifelse(Type == "BiC", "Bipolar",
                              ifelse(Type == "MPC", "Multi-platform",
                              ifelse(Type == "LLC", "Levallois-like",
                              ifelse(Type == "FFC", "Faceted flake", NA))))))) )
# plot
ggplot(both_sqs_core_types, aes(reorder(Type_long, -Mass,  FUN=median), Mass)) +
   geom_jitter(alpha = 0.9, shape = 1) + 
  geom_boxplot(alpha = 0.1, fill = "white", colour = "black") +
  ylim(0,20) +
  xlab("Core type") +
  ylab("Core mass (g)") +
  theme_minimal(base_size = 10)
# save
jhe_190mm_ggsave("figures/fig_9_Jeremalai-core-by-type.png", height = 190/1.6)

The code chunk below computes the amount of cortex for each core type.

core_cortex <- aggregate( X..Cortex ~ Type, data = core_types, mean)
# what is the average amount of cortex for each core type?
core_cortex_means <- arrange(core_cortex, -X..Cortex)
names(core_cortex_means) <- c("Type", "Cortex percentage")
core_cortex_means[,2] <- round(core_cortex_means[,2],1)

Consistent with overall size, single platform cores retain the most cortex on average (r filter(core_cortex_means, Type == 'SPC')$'Cortex percentage'%), followed by radial cores (r filter(core_cortex_means, Type == 'RC')$'Cortex percentage'%), multiplatform cores (r filter(core_cortex_means, Type == 'MPC')$'Cortex percentag'%) and faceted flake cores (r filter(core_cortex_means, Type == 'FFC')$'Cortex percentage'%). Levallois-like cores and bipolar cores exhibit the least cortex (<4%).

The code chunk below computes the numbers of flake scars by each core type.

core_scars <- core_types %>%
  group_by(Type) %>%
  summarize(means = round(mean(Number.of.Scars, na.rm = TRUE),0),
            sds = round(sd(Number.of.Scars,  na.rm = TRUE)),0) %>%
  arrange(means)

Levallois-like cores exhibit almost twice the number of flake scars on average as other cores in the assemblage (mean = r filter(core_scars, Type == "LLC")$means±r filter(core_scars, Type == "LLC")$sds scars versus r round(mean(core_scars$means),0)±r round(mean(core_scars$sds, na.rm = TRUE),0) on average). Single platform cores have the least scars on average (r filter(core_scars, Type == "SPC")$means±r filter(core_scars, Type == "SPC")$sds).

Results: Retouched artefacts

The code chunk below draws a plot showing locations of retouch on chert flakes by depositional unit in both squares of Jerimalai.

# get sq B retouch
rt_flakes <- read.csv("data/JB_Chert_Flakes_and_Retouch.csv")
rt_flakes$phase <- ifelse(rt_flakes$Square == "A",           
                                 sqA_makephases(rt_flakes$Spit), 
                                 sqB_makephases(rt_flakes$Spit))

rt_flakes <- rt_flakes[!is.na(rt_flakes$phase), ]
# frequency of flakes with retouch per phase
rt <- flakes[flakes$Rtch == "Yes" , ] # sq A retouch also here
rt_ <- rt_flakes[rt_flakes$Rtch == "Yes" , ]
common_names <- intersect(names(rt_), names(rt)) 
rt_ <- rt_ %>% select_(.dots = common_names)
rt <- rt %>% select_(.dots = common_names)
rt <- rbind(rt , rt_)
# what is this for?
rt_tab <- t(data.frame(rt = aggregate(Retype ~ phase, rt, 
                                      function(x) rt_count = length(x)), 
                       flakes = aggregate(Retype ~ phase, flakes, 
                                      function(x) fk_count = length(x))[,2] ))

# retouch locations
rt_loc <- data.frame(phase = rt$phase, rt_loc = rt$Retloc)
rt_loc <- rt_loc[!(is.na(rt_loc$phase)), ]
rt_sum <- as.data.frame.matrix(table(rt_loc))[,-1]
rt_sum$phase <- paste0("phase ", row.names(rt_sum))
rt_sum <- rt_sum[,-which(names(rt_sum) == "Bothlats")] # not in both squares
colnames(rt_sum) <- c("Both margins", "Distal", "Perimeter", "Left lateral", "Medial", "Proximal", "Right lateral", "phase")
rt_sum_m <- melt(rt_sum)
# plot
ggplot(rt_sum_m, aes(variable, value)) +
  geom_bar(stat="identity") +
  theme(axis.text.x  = element_text(angle=90, vjust=0.5, size=12)) +
  ylab("frequency") +
  xlab("retouch location") +
  facet_wrap(~phase, ncol = 5)
# save plot
jhe_190mm_ggsave("figures/fig_15_Jeremalai-flake-retouched-flake-location-phase.png", height = 190/1.6)

The code chunk below draws a plot of the distribution of flake lengths for retouched and unretouched flakes by depositional unit for both squares at Jerimalai.

flakes_retouch_size <- allchert %>%
                         filter(Artclas %in% c("flake", "retf")) %>%
                         filter(phase %in% 1:4) %>%
                         select(Artclas, Weight, Length, phase)

ggplot(flakes_retouch_size, aes( fill = Artclas, as.factor(phase), Weight)) +
  geom_point(aes(colour = Artclas), size = 0.5, alpha = 0.9, shape = 1, 
             position=position_jitterdodge(dodge.width=0.9)) + 
  geom_boxplot(alpha = 0.4) +
  scale_y_log10() +
  xlab('depositional phase') +
  ylab("mass (g)") +
  scale_fill_manual(name="artefact type",
                    labels=c("unretouched flake", "retouched flake"), 
                    values = c("grey10", "grey80")) +
  scale_colour_manual(name="artefact type",
                    labels=c("unretouched flake", "retouched flake"), 
                    values = c("grey10", "grey15")) 
# save plot
jhe_190mm_ggsave("figures/fig_16_Jeremalai-flake-retouchedflake-mass-phase.png", height = 190/1.6)

The code chunk below computes the tests for credible difference in retouch flake frequency by phase.

rt_tab <- data.frame(rt_tab)
colnames(rt_tab) <- c("phase 1", "phase 2", "phase 3", "phase 4")
rt_tab <- rt_tab[-1,]
myDataFrame <- melt(as.matrix(rt_tab),  varnames=c("retouch", "phase"), value.name="Freq")
myDataFrame <- myDataFrame[complete.cases(myDataFrame),] # omit NAs
yName=names(myDataFrame)[3]   # Freq 
x1Name=names(myDataFrame)[2]  # phase
x2Name=names(myDataFrame)[1]  # retouch
x1contrasts = list( 
  list( c("phase 1") , c("phase 2") , compVal=0.0 , ROPE=c(-0.1,0.1) ) ,
  list( c("phase 2")  , c("phase 3") , compVal=0.0 , ROPE=c(-0.1,0.1)) ,
  list( c("phase 3")  , c("phase 4") , compVal=0.0 , ROPE=c(-0.1,0.1)) 
  )
numSavedSteps = 12000 # MCMC parameters
thinSteps = 10        # MCMC parameters

mcmcCoda = genMCMC_cont_table( datFrm=myDataFrame , 
                    yName=yName , x1Name=x1Name , x2Name=x2Name ,
                    numSavedSteps=numSavedSteps , thinSteps=thinSteps , 
                    )

# Get summary statistics of chain:
summaryInfo = smryMCMC_cont_table( mcmcCoda , 
                        datFrm=myDataFrame , x1Name=x1Name , x2Name=x2Name ,
                        x1contrasts=x1contrasts
                         )

# # #  Display posterior information (not easy to read unless using interactively):
#     plotMCMC_cont_table( mcmcCoda , 
#               datFrm=myDataFrame , yName=yName , x1Name=x1Name , 
#                x2Name=x2Name ,
#               x1contrasts=x1contrasts  )

# table of retouch artefact counts
kable(rt_tab, caption = "Table of frequencies of retouched flakes by phase, both squares")

# subset summaryInfo to show HDI interval for interactions
 HDI_intervals_for_interactions_fr <- summaryInfo_ANOVA[c((nrow(summaryInfo_ANOVA)-length(contrasts)+1):nrow(summaryInfo_ANOVA)), c("HDIlow", "HDIhigh")]
kable(round(HDI_intervals_for_interactions_fr,3), caption = "Highest density intervals (95%) for the posterior distributions of the interactions between phases and retouched flake frequency, both squares.")

The results of the code chunk above indicates that there are no credible differences in the frequencies of retouched artefacts. A look at the raw counts shows that from phase three to four there is a decrease in the proportion of retouched pieces, but that the overall count is still low.

chert_artefacts_retouch_nhst <- assocstats(as.matrix(rt_tab))

The code chunk above returns a chi-squared value of r round(chert_artefacts_retouch_nhst$chisq_tests[2,1], 2) and a p-value of r round(chert_artefacts_retouch_nhst$chisq_tests[2,3], 2), indicating a nonsignficant difference in frequencies of breakage classes by phase. The Cramer's V value of r round(chert_artefacts_retouch_nhst$cramer, 3) indicates that the effect size is extremely small. We interpret this result to mean that there is no substantial significance in the differences in frequencies of retouched flakes by phase.

The code chunk below investigates changes in the length of the retouched margin of artefacts over time.

rt_len <- data.frame(phase = rt$phase, rt_len = rt$Retlen)
# omit NAs
rt_len <- rt_len[!(is.na(rt_len$rt_len)), ]
# get mean lengths
mean_lengths <- aggregate(rt_len ~ phase, rt_len, mean)
# do bayesian ANOVA
myDataFrame <- rt_len
myDataFrame <- myDataFrame[complete.cases(myDataFrame),] # omit NAs
yName = names(myDataFrame)[2] # retouch length
xName = names(myDataFrame)[1] # phase
contrasts = list(
  list( c("1") ,  c("2") , compVal=0.0 , ROPE=c(-0.1,0.1) ) ,
  list( c("2")  , c("3") , compVal=0.0 , ROPE=c(-0.1,0.1)) ,
  list( c("3")  , c("4") , compVal=0.0 , ROPE=c(-0.1,0.1))  )
# Generate the MCMC chain:
  mcmcCoda_ANOVA = genMCMC_ANOVA(datFrm=myDataFrame , yName=yName , xName=xName ,
                      numSavedSteps=11000 , thinSteps=10 
                      )
  # Get summary statistics of chain:
  summaryInfo_ANOVA = smryMCMC_ANOVA( mcmcCoda_ANOVA , 
                          datFrm=myDataFrame , xName=xName ,
                          contrasts=contrasts 
                          )
  # #  Display posterior information (not easy to read unless using interactively):
#     plotMCMC_ANOVA( mcmcCoda_ANOVA , 
#               datFrm=myDataFrame , yName=yName , xName=xName , 
#               contrasts=contrasts  )
#   
# subset summaryInfo to show HDI interval for interactions
 HDI_intervals_for_interactions_l <- summaryInfo_ANOVA[c((nrow(summaryInfo_ANOVA)-length(contrasts)+1):nrow(summaryInfo_ANOVA)), c("HDIlow", "HDIhigh")]
kable(round(HDI_intervals_for_interactions_l ,3), caption = "Highest density intervals (95%) for the posterior distributions of the interactions between phases and retouch length, both squares")
# combine retouch freq and length into one table
HDIs_flake_retouch_freq_and_length <- round(cbind(HDI_intervals_for_interactions_fr, HDI_intervals_for_interactions_l),3)
kable(HDIs_flake_retouch_freq_and_length, caption = "Highest density intervals (95%) for the posterior distributions of the interactions between phases and retouch frequency and length, both squares")

The results of the code chunk above show that the HDIs for all interactions include zero. This indicates no credible difference in retouch length by phase.

The code chunk below repeats this analysis using a frequentist test.

# ANOVA with Tukey's HSD
fit <- aov(rt_len ~ as.factor(phase), myDataFrame)
fit_summary <-  summary(fit)
tuk <- TukeyHSD(fit)
# plot(tuk, las = 2, cex = 0.1)
kable(round(tuk$`as.factor(phase)`,3), caption = "Tukey's Honest Significant Difference for phase by phase comparisons of flake retouch length, both squares")

In the code chunk above we obtain the results of a frequentist ANOVA which returns a statistically non-significant result: F = r round(fit_summary[[1]]$'F value'[1], 2), df = r round(fit_summary[[1]]$Df[1], 2), p = r round(fit_summary[[1]]$'Pr(>F)'[1], 3). The Tukey's Honest Significant Difference test shows confidence intervals on the differences between the means artefact retouch length of a phase. None of the interactions show a significant difference. We interpret this as evidence of overall no change in flake retouch length.

In the code chunk below we compute a Bayesian t-test to investigate differences in the mass of retouched and non-retouched flakes.

# Bayesian t-test of the mass of all retouched vs all unretouched flakes
# flakes_retouch_size_t_test <- BESTmcmc(flakes_retouch_size[flakes_retouch_size$Artclas == "flake", ]$Weight, flakes_retouch_size[flakes_retouch_size$Artclas == "retf", ]$Weight)

In the code chunk below we perform a frequentist t-test on flake mass of retouched and non-retouched flakes.

retouch_nonretouch_flake_mass_nhst <- t.test(flakes_retouch_size[flakes_retouch_size$Artclas == "flake", ]$Weight, flakes_retouch_size[flakes_retouch_size$Artclas == "retf", ]$Weight)

The code chunk above returns the result of the frequentist t-test as follows: t = r round(retouch_nonretouch_flake_mass_nhst$statistic, 2), df = r round(unname(retouch_nonretouch_flake_mass_nhst$parameter), 2) and p = r round(retouch_nonretouch_flake_mass_nhst$p.value, 3).

The code chunk below makes a table the summarizes retouch indices for retouched pieces recovered from Jerimalai. GIUR = Geometric Index of Unifacial Retouch, II = Index of Invasiveness, % = percent of perimeter with retouch

retouch_indices[is.na(retouch_indices)] <- 0
retouch_indices$GIUR <- with(retouch_indices, t1/T1 + t2/T2 + t3/T3)/3
retouch_indices$perimeter_perc <- with(retouch_indices, length/perimeter * 100)
retouch_indices$II <- with(retouch_indices, ((X0.5 * 0.5) + (X1 * 1))/16)
# get mean and standard deviation for each index
retouch_indices_subset <- retouch_indices %>% select(GIUR, perimeter_perc, II) 
# sweep over the columns to compute mean and standard deviation
retouch_indices_means <- data.frame(t(round(apply(retouch_indices_subset, 2, mean, na.rm = TRUE),2)))
retouch_indices_sds <- data.frame(t(round(apply(retouch_indices_subset, 2, sd, na.rm = TRUE),2)))
# make table
retouch_table <- retouch_indices %>%
  select(Square, Spit, Type, GIUR, II, perimeter_perc)
# do some rounding
retouch_table[,4:6] <- apply(retouch_table[,4:6], 2, round, 2)
# have a look
retouch_table <- arrange(retouch_table, Spit)
# assign phases
retouch_table$Phase <- ifelse(retouch_table$Square == "A",           
                                 sqA_makephases(retouch_table$Spit), 
                                 sqB_makephases(retouch_table$Spit))

# remove NA
retouch_table <- retouch_table[!is.na(retouch_table$Phase), ]
# reorder cols
retouch_table  <- retouch_table[, c(1,2,7,3:6)]
# write the table to a csv file so we can put it in the word doc
write.csv(retouch_table, file = 'retouch_table.csv', row.names = FALSE)
kable(retouch_table, caption = "Summary of retouch indices for retouched pieces recovered from both squares of Jerimalai. GIUR = Geometric Index of Unifacial Retouch, II = Index of Invasiveness, % = percent of perimeter with retouch")

The retouch intensity can be summarized with the following metrics: GIUR = r retouch_indices_means$GIUR +/- r retouch_indices_sds$GIUR perimeter = r retouch_indices_means$perimeter_perc +/- r retouch_indices_sds$ perimeter_perc% II = r retouch_indices_means$II +/- r retouch_indices_sds$II

Results: Technological types

This code makes the table "Summary of counts and classes. Counts refers to the count of spits in each depositional phase containing a given class."

## read in the techno types for both squares
summaryt_props <- read.csv("data/Jerimalai_technological_counts_both_squares_percentages.csv", skip = 1)

names(summaryt_props) <- c("category", "type", "phase I", "phase II", "phase III", "phase IV")

kable(summaryt_props, caption = "Summary of proportions and classes. Proportions refers to the proportion of spits in each depositional phase containing a given class")

# figure
technological_types_table <- read.csv("data/Jerimalai_technological_counts_both_squares.csv")

# spits per phase, both squares
spits_per_phase <- both_sqs_all_with_phases %>% 
                      group_by(phase) %>% 
                      summarize(n_spits=n_distinct(Spit))

perc_spits_techno_class <- technological_types_table %>%
                              group_by(class) %>%
                              filter(class %in% c("Retouch", "Bipolar", "Flake cores", "Levallois", "multiplatform")) %>%
                              summarize(I = sum(Phase.I.total.count, na.rm = TRUE) ,
                                        II = sum(Phase.II.total.count, na.rm = TRUE),
                                        III = sum(Phase.III.total.count, na.rm = TRUE),
                                        IV = sum(Phase.IV.total.count, na.rm = TRUE),
                                        counts = n())

# divide counts of spits for each class by number of spits per phase

tmp <- perc_spits_techno_class[,-1]
for(i in seq_along(perc_spits_techno_class$counts)){

  tmp[,i] <- perc_spits_techno_class[,-1][ ,i] /   perc_spits_techno_class$counts[i]

}

apply(tmp[,1:4], 1,  function(i) i * spits_per_phase$n)
# compare Liang Bua and Mata Menge using the data in Brumm et al. 2006 Nature
# http://www.nature.com/nature/journal/v441/n7093/fig_tab/nature04618_T1.html
lb_mm <- read.csv("data/Liang_Bua_and_Mata_Menge_from_Brumm_2006.csv")

row.names(lb_mm) <- lb_mm$type
lb_mm$type <- NULL

# chi-square test on counts of artefact types at Liang Bua and Mata Menge
lb_mm_chi_sq <-  assocstats(as.matrix(lb_mm))

lb_mm_ <- lb_mm[(lb_mm$Liang.Bua >0 & lb_mm$Mata.Menge > 0), ]

# Bayesian Poisson exponential ANOVA
data <- melt(as.matrix(lb_mm_),  varnames=c("artefact_type", "site"), value.name="Freq")
# remove types with zero counts
# data <- data[data$Freq > 0, ]
myDataFrame = data
yName=names(myDataFrame)[3]   # Freq 
x1Name=names(myDataFrame)[2]  # site
x2Name=names(myDataFrame)[1]  # artefact_type
x1contrasts = list( 
    list( c("Liang.Bua") , c("Mata.Menge") , compVal=0.0 , ROPE=c(-0.1,0.1) )
    )
numSavedSteps = 12000 # MCMC parameters
thinSteps = 10        # MCMC parameters

mcmcCoda = genMCMC_cont_table( datFrm=myDataFrame , 
                    yName=yName , x1Name=x1Name , x2Name=x2Name ,
                    numSavedSteps=numSavedSteps , thinSteps=thinSteps 
                    )

# Get summary statistics of chain:
summaryInfo = smryMCMC_cont_table( mcmcCoda , 
                        datFrm=myDataFrame , x1Name=x1Name , x2Name=x2Name
                        ,   x1contrasts=x1contrasts 
                         )

# # #  Display posterior information (not easy to read unless using interactively):
    # plotMCMC_cont_table( mcmcCoda ,
    #           datFrm=myDataFrame , yName=yName , x1Name=x1Name ,
    #            x2Name=x2Name ,
    #           x1contrasts=x1contrasts  
    #           )

# table of artefacts from the two sites
kable(lb_mm, caption = "Mata Menge and Liang Bua stone artefact types")

# subset summaryInfo to show HDI interval for interactions
HDI_intervals_for_interactions <- summaryInfo[c((nrow(summaryInfo) - length(x1contrasts) + 1):nrow(summaryInfo)), c("HDIlow", "HDIhigh")]
HDI_intervals_for_interactions_df <- setNames(data.frame(t(data.frame(site_type = unname(HDI_intervals_for_interactions)))), names(HDI_intervals_for_interactions))

kable(round(HDI_intervals_for_interactions_df,3), caption = "Mata Menge and Liang Bua stone artefact types: Highest density intervals (95%) for the posterior distributions of the interactions between sites and artefact types.")
# sudo chmod 777 -R Pleistocene-aged-stone-artefacts-from-Jerimalai--East-Timor
lb <- read.csv("data/Liang_Bua_continuities_JHE_2009.csv")


row.names(lb) <- lb$ Artifact.Type
lb$Artifact.Type <- NULL
lb[is.na(lb)] <-  0

# chi-square test on counts of artefact types at Liang Bua and Mata Menge
lb_chi_sq <-  assocstats(as.matrix(lb))

lb_ <- lb[(lb$X9 >0 & lb$X7 > 0 & lb$X4 > 0 & lb$X3 > 0 & lb$X2 > 0), ]

# Bayesian Poisson exponential ANOVA
data <- melt(as.matrix(lb_),  varnames=c("artefact_type", "layer"), value.name="Freq")
# remove types with zero counts
# data <- data[data$Freq > 0, ]
myDataFrame = data
yName=names(myDataFrame)[3]   # Freq 
x1Name=names(myDataFrame)[2]  # site
x2Name=names(myDataFrame)[1]  # artefact_type
x1contrasts = list( 
    list( c("X2") , c("X3") , compVal=0.0 , ROPE=c(-0.1,0.1) ),
    list( c("X3") , c("X4") , compVal=0.0 , ROPE=c(-0.1,0.1) ),
    list( c("X4") , c("X7") , compVal=0.0 , ROPE=c(-0.1,0.1) ),
    list( c("X7") , c("X9") , compVal=0.0 , ROPE=c(-0.1,0.1) ),
    list( c("X2") , c("X7") , compVal=0.0 , ROPE=c(-0.1,0.1) ),
    list( c("X3") , c("X9") , compVal=0.0 , ROPE=c(-0.1,0.1) ),
    list( c("X2") , c("X4") , compVal=0.0 , ROPE=c(-0.1,0.1) )
    )
numSavedSteps = 12000 # MCMC parameters
thinSteps = 10        # MCMC parameters

mcmcCoda = genMCMC_cont_table( datFrm=myDataFrame , 
                    yName=yName , x1Name=x1Name , x2Name=x2Name ,
                    numSavedSteps=numSavedSteps , thinSteps=thinSteps 
                    )

# Get summary statistics of chain:
summaryInfo = smryMCMC_cont_table( mcmcCoda , 
                        datFrm=myDataFrame , x1Name=x1Name , x2Name=x2Name
                        ,   x1contrasts=x1contrasts 
                         )

# # #  Display posterior information (not easy to read unless using interactively):
    # plotMCMC_cont_table( mcmcCoda ,
    #           datFrm=myDataFrame , yName=yName , x1Name=x1Name ,
    #            x2Name=x2Name ,
    #           x1contrasts=x1contrasts  
    #           )

# table of artefacts from Liang Bua 
kable(lb, caption = "Liang Bua Stone Artifacts by Stratigraphic Unit")

# subset summaryInfo to show HDI interval for interactions
HDI_intervals_for_interactions <- summaryInfo[c((nrow(summaryInfo) - length(x1contrasts) + 1):nrow(summaryInfo)), c("HDIlow", "HDIhigh")]

kable(round(HDI_intervals_for_interactions,3), caption = "Liang Bua Stone Artifacts by Stratigraphic Unit: Highest density intervals (95%) for the posterior distributions of the interactions between sites and artefact types.")

Colophon

This report was generated on r Sys.time() using the following computational environment and dependencies:

# which R packages and versions?
sessionInfo()

# what other pieces of software?
needs <- needs()
c(needs$depends$SystemRequirements[needs$depends$SystemRequirements != "NULL"], needs$imports$SystemRequirements[needs$imports$SystemRequirements != "NULL"])

# what commit is this file at?
repo <- repository(path = "../")
last_commit <- commits(repo)[[1]]

The current git commit of this file is r last_commit@sha, which is on the r branches(repo)[[1]]@name branch and was made by r last_commit@committer@name on r when(last_commit). The current commit message is "r last_commit@summary".



benmarwick/Pleistocene-aged-stone-artefacts-from-Jerimalai--East-Timor documentation built on May 12, 2019, 1:01 p.m.