Zooarchaeology is concerned with the analysis and inference of faunal remains recovered from archaeological sites. The zooaRch package provides analytical tools for zooarchaeological data. Functions in this package allow users to:

1. Survivorship and Mortality^[The surv.func function is only intended for traditional survivorship analyses, while mort.func is the analog for analyses of mortality profiles.That is, the data must be assignable to discrete age classes. In zooarchaeology, this is traditionally dental eruption/wear data. Epiphyseal fusion data are treated in section 2.]

1.1 Introduction

Functions surv.func and mort.func create confidence intervals around survivorship and mortality data respectively. surv.func takes bootstrap samples of the original dataset, calculates the Kaplan-Meier Estimator (KME) for each, and then calculates the confidence interval. The function mort.func bootstraps the mortality profiles. Defaults are 1000 iterations for the bootstrap and 95% for the confidence interval.

1.2 Setting up the Data for mortality and survivorship

The data must be coded in the long way (i.e., each specimen must have its own entry). The dataframe should include two columns: one called "Genus" and one called "Ageclass." For example, the first six entries of the Marj Rabba sheep/goat data from Price et al.^[Price, M.D., Buckley, M., Rowan, Y.M., Kersel, M., 2013. Animal Management Strategies during the Chalcolithic in the Lower Galilee: New Data from Marj Rabba, Paleorient 39, 183-200.] is:

|Genus |Ageclass| |:--------:|:----:| |Ovis/Capra| 1| |Ovis/Capra| 2| |Ovis/Capra| 3| |Ovis/Capra| 3| |Ovis/Capra| 3| |Ovis/Capra| 3|

Note: the Ageclass must be numeric (i.e., you cannot use "A", "B", "C", etc. for age classes).

1.3 Arguments.

Function arguments are very similar for both surv.func and mort.func. Here we describe the surv.func arguments. See help(mort.func) and help(surv.func) for more details on the arguments for mortality profile and survivorship bootstrapping.

surv.func(SurviveData, labels = NULL, models = NULL, usermod = NULL, ci = 95, plot = TRUE, iter = 1000)

1.4 What the Function surv.func Does.

The function calculates the KME values for the original dataset.

> survivorcurve.Eq4 <- function(data)
{
  vector <- rep(1, N.ages+1) 
  for(i in 1:N.ages+1) 
  {
    vector[i] <- vector[i-1]*(sum(data >= (i-1)) - sum(data == > (i-1)))/sum(data >= (i-1)) 
 }
 vector[is.na(vector)] <- 0
 round(vector,4) 
}

It then takes x samples of the dataset, with x equal to the number of iterations (Default = 1000). The samples are initially structured in the same way as the original dataset, with each sample size equal to the size of the original (e.g., in the Marj Rabba example, there are 42 entries in each sample). The function calculates the KME for each sample

survive.matrix <- matrix(NA, ncol = N.ages+1, nrow = 1000)
survive.matrix[1,] <- survivorcurve.Eq4(SurviveData$Ageclass)
for(i in 2:1000)
{
  bootstrap <- sample(1:nrow(SurviveData), nrow(SurviveData),   replace = TRUE)
  survive.matrix[i,] <-
survivorcurve.Eq4(SurviveData$Ageclass[bootstrap])
}

Plotting and the calculation of the confidence intervals is straightforward:

plot(x = (1:N.ages), y = survive.matrix[1,-1], type = "l", lwd = 2, xlab = "Age Class", 
     ylab = "Proportion Survived", axes = FALSE, ylim = 
c(0,1))
axis(side = 1, at = 1:N.ages, 
     labels = Labels.ageclass)
axis(side = 2)

lines(x = 1:N.ages, y = apply(survive.matrix[,-1], MARGIN = 2, FUN = quantile, probs = 0.025), lty = "dashed", ylim = c(0,1))

lines(x = 1:N.ages, y = apply(survive.matrix[,-1], MARGIN = 2, FUN = quantile, probs = 0.975), lty = "dashed", ylim = c(0,1))

Plotting models and a legend is also straightforward:

lines(x = 1:N.ages, y = model.catastrophe, col = "blue", lwd = 2, lty = "dotted")
lines(x = 1:N.ages, y = model.attritional, col = "red", lwd = 2, lty = "dotdash")

legend(x = "topright", cex = .75, lwd = c(2,1,2,2,2),lty = 
c("solid", "dashed", "dotted", "dotdash"), 
       col = c("black", "black", "blue", "red"),
       legend = c("Survivorship", "95% Confidence Interval", 
    "Catastrophe", "Attritional"))

Finally, the function makes a table of the results:

data.LowerCI <- apply(survive.matrix[,-1], MARGIN = 2, FUN = 
quantile, prob = 0.025)
data.PointValue <- survive.matrix[1,-1]
data.UpperCI <- apply(survive.matrix[,-1], MARGIN = 2, FUN =  quantile, prob = 0.975)

Taxon <- rep(unique(SurviveData$Genus), times = N.ages)

Output.Matrix <- data.frame(Taxon = Taxon, AgeClassLabs =  
    Labels.ageclass, LowerCI = data.LowerCI, PointValue = 
    data.PointValue, UpperCI = data.UpperCI)

Output.Matrix

1.5 What the Function mort.func Does.

The function calculates mortality profile values using the age distribution as a proportion from the original dataset.

> mortprof <- function(data){
    vector <- rep(NA, N.ages)   
    for(i in 1:N.ages) {
      vector[i] <- sum(data==i)/length(data)
    }
    vector[is.na(vector)] <- 0
    round(vector,4) 
  }  

It then takes x samples of the dataset, with x equal to the number of iterations (Default iter = 1000). The samples are initially structured in the same way as the original dataset, with each sample size equal to the size of the original (e.g., in the Marj Rabba example, there are 42 entries in each sample). The function calculates the mortality profile using the proportional age distribution for each sample.

mortality.matrix <- matrix(NA, ncol = N.ages, nrow = iter)
mortality.matrix[1,] <- mortprof(data)
for(i in 2:iter){
  bootstrap <- sample(1:length(data), length(data), replace = TRUE)
  mortality.matrix[i,] <- unlist(mortprof(data[bootstrap]))
} 

Plotting and the calculation of the confidence intervals is straightforward:

bar<-barplot(mortality.matrix[1,],ylim=c(0,(max(upCI)+.1)),
            names=Labels.ageclass,ylab=ylab,
            xlab=xlab,beside=T)
      g<-(max(bar)-min(bar))/110
      for (i in 1:length(bar))         {
        lines(c(bar[i],bar[i]),c(upCI[i],loCI[i]))
        lines(c(bar[i]-g,bar[i]+g),c(upCI[i],upCI[i]))
        lines(c(bar[i]-g,bar[i]+g),c(loCI[i],loCI[i]))
      }

Plotting models and a legend and making a table of the results is also a sstraight forward as shown for surv.func.

2. Survivorship using the %Fused Method

2.1 Introduction

The fuse.func function creates confidence intervals for the survivorship based on epiphyseal fusion, using the %Fused method. This method takes the ratio of fused to fused-plus-unfused bones for each fusion group (a fusion group is a group of elements that fuse around the same time). It is important to note that this is not equivalent to survivorship based on age classes. Fusion groups are not age classes, since the specimens pertinent to each fusion group are from animals of different ages. In effect, each fusion group is really just a survivorship curve with two age classes - fused and unfused.

The fuse.func function requires the user to load a .csv file with ("Data", below). The user is then prompted to enter the number of fusion groups, the number of elements in each fusion group, and finally the names of the elements in each fusion group. The function then takes bootstrap samples of the original dataset for the %Fused in each fusion group. It then calculates the confidence interval. Defaults are 1000 iterations for the bootstrap and 95% for the confidence interval.

Note: the graphical output requires the package ggplot2^[Wickman, H., 2009. ggplot2: elegant graphics for data analysis, Springer, New York.]. The code automatically installs and loads ggplot2 if not already loaded.

2.2 Setting up the Data

The data must be coded in the long way (i.e., each specimen must have its own entry) in a .csv file. The dataframe should include three columns: "Identification" (a unique identification number, this can be left blank); "Element" (these should be differentiated according to proximal and distal, when necessary); "Fusion" ("Fused" or "Unfused"). An example:

|Identification |Element |Fusion| |:-------------:|:---------:|:----:| |Specimen1 |Calcaneus |Fused | |Specimen2 |Px.Femur |Fused | |Specimen3 |Px.Femur |Unfused| |Specimen4 |Ds.Femur |Unfused| |Specimen5 |Px.Radius |Fused | |Specimen6 |Innominate |Unfused|

2.3 Arguments

The fuse.func function is in the following format by default:

fuse.func(data, iter=100, ci = 95, plotci=TRUE, plot.title=NULL)

2.4 What the Function Does

The function first requires the user to define the data. Then the function initiates a series of command prompts. First, for example,

> Enter number of fusion groups
> 5

Then

> Enter number of skeletal elements for fusion group A
> 2

.
.
.

> Enter number of skeletal elements for fusion group E 
> 1

Then

> Enter the 2 names of skeletal elements for fusion group A then > press enter 
> Px.Radius
> Ds.Humerus

.
.
.

> Enter the 1 names of skeletal elements for fusion group E then > press enter 
> Phalanx1

The R code for this is as follows:

fuse.func<-function(data,iter=1000,plotci=TRUE,plot.title=NULL){
  require(ggplot2)
  cat(paste("Enter number of fusion groups"), "\n")
  ans<-readLines(n = 1)
  ans <- as.numeric(ans)  
  fu.grps<-LETTERS[1:ans]
  ske.n<-numeric(length(fu.grps))  
  for(i in 1:length(ske.n)){
    cat(paste("Enter number of skeletal elements for fusion group"), fu.grps[i],"\n")
    ans<-readLines(n = 1)
    ske.n[i] <- as.numeric(ans)
  }  
  ele.list<-as.list(rep(NA,length(ske.n)))
  names(ele.list)<-fu.grps
  for(i in 1: length(ske.n)){
    cat(paste("Enter the", ske.n[i],"names of skeletal elements for fusion group"), fu.grps[i],"then press enter","\n")
    ele.list[[i]]<-readLines(n = ske.n[i])
  }

Then the function calculates the %Fused for each fusion group:

pctfuse<-function(dat){
    pct.ufu<-n<-numeric(length(ele.list))
    names(pct.ufu)<-fu.grps
    wh<-function(it){which(dat$Element==ele.list[[i]][it])} 
    for (i in 1:length(pct.ufu)){
      tab<-table(dat$Fusion[unlist(lapply(1:ske.n[i],wh))])
      fu<-tab[which(names(table(dat$Fusion[unlist(lapply(1:ske.n[i],wh))] ))=="Fused")]
      ufu<-tab[which(names(table(dat$Fusion[unlist(lapply(1:ske.n[i],wh))] ))=="Unfused")]
      if (is.nan(fu/(fu+ufu))){
        pct.ufu[i]<-0} else {pct.ufu[i]<-fu/(fu+ufu)}
      n[i]<-(fu+ufu)    
    }
    return(list(pct.ufu,n))
  }  

It then bootstraps the %Fused values for each fusion group (an analogous process to part 1.4). It provides a table of the results, with confidence intervals (default = 95%).

  boot <- matrix(NA, ncol = length(ele.list), nrow = iter) 
  boot[1,] <- pctfuse(data)[[1]]
  for(i in 2:iter){
    data.boot<-data[sample(1:dim(data)[1],dim(data)[1],replace=T),]
    boot[i,] <- pctfuse(data.boot)[[1]]
  }

  ### Provide a Table of the Bootstrap Results
  quantilematrix <- matrix(NA, ncol = 2, nrow = length(fu.grps))
  for(i in 1:ncol(boot)){
    quantilematrix[i,] <- quantile(boot[,i], probs = c(0.025,0.975), na.rm = T)
  }
  outputtable <- data.frame(Fusion.groups = fu.grps, 
                            Data = round(boot[1,],2), 
                            LowerCI = round(quantilematrix[,1],2), UpperCI = round(quantilematrix[,2],2), 
                            Count = pctfuse(data)[[2]])

Plotting the results is the final step. The fuse.func function uses ggplot2 to graph the results. The resulting graph shows the %Fused on the y-axis, with lines representing the confidence intervals. The x-axis shows the fusion groups (which are labeled "A" , "B" , "C", etc.). Directly above the graphed %Fused values and confidence intervals are the numbers of specimens per fusion group.

  ### Plotting the %Fusion data
   ciplot<-ggplot(outputtable, aes(x = Fusion.groups, y = Data))+
    #now add the points
    geom_point(size = 3)+
    #add in the 95% confidence interval bars
    geom_errorbar(aes(ymax = UpperCI, ymin = LowerCI), width = 0.2)+
    #add in the sample size label for each fusion group
    #this uses the previously-made function
    geom_text(aes(x = Fusion.groups, y = rep(1.05, length(Fusion.groups)), label = Count))+
    #add in the theme (all of the background plotting details)
    #the size for element_text is font size, for element_line it is the thickness
    #element_blank() makes it so there is no background color to the plot
    theme(panel.background = element_blank(),
          panel.grid.minor = element_blank(),
          panel.border = element_rect(fill=NA, color = "black"),
          axis.title = element_text(color = "black", size = 20),
          axis.text = element_text(color = "black", size = 15),
          axis.ticks = element_line(color = "black", size = 0.75),
          plot.title = element_text(color = "black", size = 24))+
    #add in the titles/labels
    ylab("%Fused")+xlab("Fusion Group")+
    ggtitle(plot.title)
  if(plotci==TRUE){print(ciplot)}
  list(Output = outputtable, Bootstrap.Data = boot)
}


Try the zooaRch package in your browser

Any scripts or data that you put into this service are public.

zooaRch documentation built on May 1, 2019, 7:10 p.m.