thinDates: Sampling function to select a maximum number of dates per...

View source: R/aggregation.R

thinDates    R Documentation

Sampling function to select a maximum number of dates per site, bin or phase.

Description

Function to select a subset of uncalibrated radiocarbon dates up to a maximum sample size per site, bin or phase.

Usage

thinDates(ages, errors, bins, size, thresh = 0.5, method = "random", seed = NA)

Arguments

ages

A vector of uncalibrated radiocarbon ages

errors

A vector of uncalibrated radiocarbon errors (same length as ages)

bins

A vector of labels corresponding to site names, ids, bins or phases (same length as ages)

size

A single integer specifying the maximum number of dates to keep for each label in bins.

thresh

A single numeric value between 0 and 1 specifying the approximate proportion (after rounding) of the resulting sample that will be chosen according to lowest date errors. At the extremes, 0 produces a simple random sample whereas 1 selects the dates with the lowest errors. Ignored if method="random".
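As a rough illustration of how size and thresh interact, the arithmetic below assumes the split is computed as round(size * thresh); this matches the "60%, i.e. 6 dates" case in the Examples but is an assumption about the internals, not a statement of the package's code. The variable names are illustrative only.

```r
# Assumed split rule: number of dates taken from the lowest errors
size   <- 10
thresh <- 0.6
n_low  <- round(size * thresh)   # dates chosen by lowest error
n_rand <- size - n_low           # dates filled in at random

# Extremes described in the argument text
n_low_all_random <- round(size * 0)   # 0: every date is sampled randomly
n_low_all_lowest <- round(size * 1)   # 10: every date is chosen by lowest error
```

Because the proportion is rounded to a whole number of dates, the realised share of lowest-error dates can differ slightly from thresh when size * thresh is not an integer.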

method

The method to be applied: "random" simply selects a random sample, whereas "splitsample" picks some proportion (see thresh) of the sample to minimise errors, and randomly samples the rest. At present, these are the only two options.
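The two methods can be sketched for a single bin as follows. This is a minimal base R illustration, not the rcarbon implementation: thin_one_bin is a hypothetical helper, the rounding rule round(size * thresh) is assumed, and ties in the errors are broken by position here, which may differ from what the package does.

```r
# Hypothetical helper (not part of rcarbon): thin the dates of one bin.
# idx    : row indices of the dates belonging to the bin
# errors : radiocarbon errors for those rows (same length as idx)
thin_one_bin <- function(idx, errors, size, thresh = 0.5, method = "random") {
  if (length(idx) <= size) return(idx)            # bin is already small enough
  if (method == "random") {
    return(idx[sample.int(length(idx), size)])    # simple random sample
  }
  n_low <- round(size * thresh)                   # assumed rounding rule
  keep  <- order(errors)[seq_len(n_low)]          # positions of the lowest errors
  rest  <- setdiff(seq_along(idx), keep)          # positions still available
  extra <- rest[sample.int(length(rest), size - n_low)]  # random fill
  idx[c(keep, extra)]
}

set.seed(123)
errs   <- c(80, 30, 45, 30, 120, 60, 25, 90)
picked <- thin_one_bin(idx = 11:18, errors = errs, size = 4,
                       thresh = 0.5, method = "splitsample")
```

In this toy bin, rows 17 and 12 carry the two lowest errors, so they are always kept; the remaining two rows depend on the random draw. Sampling positions with sample.int, then mapping back to idx, avoids the surprising behaviour of sample() when given a single number.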

seed

An optional random seed that makes the sampling reproducible (default NA).

Value

A numeric vector of row indices identifying the selected dates in the input data.

See Also

binPrep

Examples

data(euroevol)
foursites <- euroevol[euroevol$SiteID %in% c("S2072","S4380","S6139","S9222"),]
table(as.character(foursites$SiteID))
## Thin so that each site has at most 10 dates, selected at random
thinInds <- thinDates(ages=foursites$C14Age, errors=foursites$C14SD,
                      bins=foursites$SiteID, size=10, method="random", seed=123)
tdates <- foursites[thinInds,]
tdates
## Same, but choose the first 60% (i.e. 6 dates) from the lowest errors
## and then fill in the rest randomly
thinInds <- thinDates(ages=foursites$C14Age, errors=foursites$C14SD,
                      bins=foursites$SiteID, size=10, method="splitsample", thresh=0.6, seed=123)
tdates1 <- foursites[thinInds,]
tdates1

rcarbon documentation built on Aug. 24, 2023, 5:11 p.m.