| beast | R Documentation |
beast is a high-level interface to the BEAST algorithm, a Bayesian model averaging approach for decomposing regular time series or other 1D sequential data into individual components, including abrupt changes, trends, and periodic/seasonal variations. BEAST is useful for changepoint detection (e.g., breakpoints, joinpoints, or structural breaks), nonlinear trend analysis, time series decomposition, and time series segmentation.
beast(
y,
start = 1,
deltat = 1,
season = c("harmonic", "svd", "dummy", "none"),
period = NULL,
scp.minmax = c(0, 10), sorder.minmax = c(0, 5),
tcp.minmax = c(0, 10), torder.minmax = c(0, 1),
sseg.min = NULL, sseg.leftmargin = NULL, sseg.rightmargin = NULL,
tseg.min = NULL, tseg.leftmargin = NULL, tseg.rightmargin = NULL,
s.complexfct = 0.0,
t.complexfct = 0.0,
method = c("bayes", "bic", "aic", "aicc", "hic",
"bic0.25", "bic0.5", "bic1.5", "bic2"),
detrend = FALSE,
deseasonalize = FALSE,
mcmc.seed = 0,
mcmc.burnin = 200,
mcmc.chains = 3,
mcmc.thin = 5,
mcmc.samples = 8000,
precValue = 1.5,
precPriorType = c("componentwise", "uniform", "constant", "orderwise"),
hasOutlier = FALSE,
ocp.minmax = c(0, 10),
print.param = TRUE,
print.progress = TRUE,
print.warning = TRUE,
quiet = FALSE,
dump.ci = FALSE,
dump.mcmc = FALSE,
gui = FALSE,
...
)
y |
If |
start |
|
deltat |
|
season |
Character (default
|
period |
Numeric or character string. The period of the seasonal/periodic component in
|
scp.minmax |
A numeric vector of length 2 (integers
|
sorder.minmax |
A numeric vector of length 2 (integers
|
tcp.minmax |
A numeric vector of length 2 (integers
|
torder.minmax |
A numeric vector of length 2 (integers
|
sseg.min |
An integer ( |
sseg.leftmargin |
Integer ( |
sseg.rightmargin |
Integer ( |
tseg.min |
Integer ( |
tseg.leftmargin |
Integer ( |
tseg.rightmargin |
Integer ( |
s.complexfct |
Numeric (default |
t.complexfct |
Numeric (default |
method |
Character (default
|
detrend |
Logical; if |
deseasonalize |
Logical; if |
mcmc.seed |
Integer ( |
mcmc.chains |
Integer ( |
mcmc.thin |
Integer ( |
mcmc.burnin |
Integer ( |
mcmc.samples |
Integer ( |
precValue |
Numeric ( |
precPriorType |
Character; one of
|
hasOutlier |
Logical; if
|
ocp.minmax |
Numeric vector of length 2 (integers |
print.param |
Logical; if |
print.progress |
Logical; if |
print.warning |
Logical; if |
quiet |
Logical; if |
dump.ci |
Logical; if |
dump.mcmc |
Logical; if |
gui |
Logical; if |
... |
Additional arguments reserved for future extensions and backward compatibility. Certain low-level settings are only available via |
The result is an object of class "beast", i.e., a list with structure identical to the outputs of beast.irreg and beast123. The exact array dimensions depend on the input y. Below we assume a single time series input of length N.
time |
Numeric vector of length |
data |
A vector, matrix, or 3D array; a copy of the input |
marg_lik |
Numeric; the average marginal likelihood of the sampled models. Larger values indicate better fit for a given time series (e.g., |
R2 |
Numeric; coefficient of determination (R-squared) of the fitted model. |
RMSE |
Numeric; root mean squared error of the fitted model. |
sig2 |
Numeric; estimated variance of the residual error. |
trend |
|
season |
|
outlier |
|
The three functions beast(), beast.irreg(), and beast123() implement the same BEAST algorithm but with different APIs. There is a one-to-one correspondence between (1) the arguments of beast() / beast.irreg() and (2) the metadata, prior, mcmc, and extra lists used by beast123(). Examples include:
start | <-> | metadata$startTime |
deltat | <-> | metadata$deltaTime |
deseasonalize | <-> | metadata$deseasonalize |
hasOutlier | <-> | metadata$hasOutlier |
scp.minmax[1:2] | <-> | prior$seasonMinKnotNum, prior$seasonMaxKnotNum |
sorder.minmax[1:2] | <-> | prior$seasonMinOrder, prior$seasonMaxOrder |
sseg.min | <-> | prior$seasonMinSepDist |
tcp.minmax[1:2] | <-> | prior$trendMinKnotNum, prior$trendMaxKnotNum |
torder.minmax[1:2] | <-> | prior$trendMinOrder, prior$trendMaxOrder |
tseg.leftmargin | <-> | prior$trendLeftMargin |
tseg.rightmargin | <-> | prior$trendRightMargin |
s.complexfct | <-> | prior$seasonComplexityFactor |
t.complexfct | <-> | prior$trendComplexityFactor |
mcmc.seed | <-> | mcmc$seed |
dump.ci | <-> | extra$computeCredible. |
For large data sets, irregular time series, or stacked images, beast123() is generally the recommended interface.
Zhao, K., Wulder, M.A., Hu, T., Bright, R., Wu, Q., Qin, H., Li, Y., Toman, E., Mallick, B., Zhang, X. and Brown, M., 2019. Detecting change-point, trend, and seasonality in satellite time series data to track abrupt changes and nonlinear dynamics: A Bayesian ensemble algorithm. Remote Sensing of Environment, 232, p.111181 (the beast algorithm paper).
Zhao, K., Valle, D., Popescu, S., Zhang, X. and Mallick, B., 2013. Hyperspectral remote sensing of plant biochemistry using Bayesian model averaging with variable and band selection. Remote Sensing of Environment, 132, pp.102-119 (the Bayesian MCMC scheme used in beast).
Hu, T., Toman, E.M., Chen, G., Shao, G., Zhou, Y., Li, Y., Zhao, K. and Feng, Y., 2021. Mapping fine-scale human disturbances in a working landscape with Landsat time series on Google Earth Engine. ISPRS Journal of Photogrammetry and Remote Sensing, 176, pp.250-261(a beast application paper).
beast.irreg,
beast123,
minesweeper,
tetris,
geeLandsat
library(Rbeast)
#------------------------------------Example 1----------------------------------------#
# 'googletrend_beach' is the monthly Google Trends popularity of searching for 'beach'
# in the US from 2004 to 2022. Sudden changes coincide with known extreme weather
# events (e.g., 2006 North American Blizzard, 2011 record US summer heat, record warm
# January in 2016) or the COVID-19 outbreak.
out <- beast(googletrend_beach)
plot(out)
plot(out, vars = c("t", "slpsgn")) # plot trend and slope-sign probabilities only.
# In the slpsgn panel, the upper red band is the
# probability that the trend slope is positive,
# the middle green band the probability that the
# slope is effectively zero, and the lower blue
# band the probability that it is negative.
# See ?plot.beast for details.
#------------------------------------Example 2----------------------------------------#
# Yellowstone is a half-monthly satellite NDVI time series of length 774 starting
# from July 1-15, 1981 (start ~ c(1981, 7, 7)) at a forest site in Yellowstone.
# There are 24 data points per year. The 1988 Yellowstone Fire caused a sudden drop
# in greenness. Note that the beast function hanldes only evenly-spaced regular time
# series. Irregular data need to be first aggegrated at a regular time interval of
# your choice--the aggregation functionality is implemented in beast.irreg() and beast123().
data(Yellowstone)
plot(1981.5 + (0:773) / 24, Yellowstone, type = "l") # A sudden drop in greenness in 1988
# due to the 1988 Yellowstone Fire
# Yellowstone is a plain numeric vector (no time attributes). Below, no extra
# arguments are supplied, so default values (start = 1, deltat = 1) are used and
# time is 1:774. The period is missing and is guessed via autocorrelation.
# Autocorrelation-based period estimates can be unreliable; examples below show
# how to specify time/period explicitly.
o <- beast(Yellowstone) # By default, time = 1:length(Yellowstone), season = "harmonic"
plot(o)
# o <- beast(Yellowstone, quiet = TRUE) # suppress warnings
# o <- beast(Yellowstone, quiet = TRUE, print.progress = FALSE) # suppress all output
#------------------------------------Example 3----------------------------------------#
# Time information (start, deltat, period) specified explicitly of Yellowstone.
# (1) Arbitrary unit: 1981.5137 can be interpreted flexibly in any units not neccessarily
# referting to times
o <- beast(Yellowstone, start = 1981.5137, deltat = 1/24, period = 1.0)
# Strings can be used to explicitly specify time unitsas dates or years:
# o <- beast(Yellowstone, start = 1981.5137, deltat = "1/24 year", period = 1.0)
# o <- beast(Yellowstone, start = 1981.5137, deltat = "0.5 mon", period = 1.0)
# o <- beast(Yellowstone, start = 1981.5137, deltat = 1/24, period = "1 year")
# o <- beast(Yellowstone, start = 1981.5137, deltat = 1/24, period = "365 days")
# (2) start as year-month-day (unit is year: deltat = 1/24 year = 0.5 month)
# o <- beast(Yellowstone, start = c(1981, 7, 7), deltat = 1/24, period = 1.0)
# (3) start as Date (unit is year: deltat = 1/24 year = 0.5 month)
# o <- beast(Yellowstone, start = as.Date("1981-07-07"), deltat = 1/24, period = 1.0)
print(o) # 'o' is a list with many fields
str(o) # See a list of fields in the BEAST output o
plot(o) # plot many default variables
plot(o, vars = c("y","s","t")) # plot the data, seasonal, and trend components only
plot(o, vars = c("s","scp","samp","t","tcp","tslp")) # selected variables (see ?plot.beast)
plot(o, vars = c("s","t"), col = c("red","blue")) # Specify colors of selected subplots
plot(o$time, o$season$Y, type = "l") # fitted seasonal component
plot(o$time, o$season$cpOccPr) # seasonal changepoint probabilities
plot(o$time, o$trend$Y, type = "l") # fitted trend component
plot(o$time, o$trend$cpOccPr) # trend changepoint probabilities
plot(o$time, o$season$order) # average harmonic order over time
plot(o, interactive = TRUE) # interactively choose variables to plot
#------------------------------------Example 4----------------------------------------#
# Specify other arguments explicitly. Default values are used for missing parameters.
# Note that beast(), beast.irreg(), and beast123() call the same internal C/C++ library,
# so in beast(), the input parameters will be converted to metadata, prior, mcmc, and
# extra parameters as explained for the beast123() function. Or type 'View(beast)' to
# check the parameter assignment in the code.
# In R's terminology, the number of datapoints per period is also called 'freq'. In this
# version, the 'freq' argument is obsolete and replaced by 'period'.
# period=deltat*number_of_datapoints_per_period = 1.0*24=24 because deltat is set to 1.0 by
# default and this signal has 24 samples per period.
o = beast(Yellowstone, period=24.0, mcmc.samples=5000, tseg.min=20)
# period=deltat*number_of_datapoints_per_period = 1/24*24=1.0.
# o = beast(Yellowstone, deltat=1/24 period=1.0, mcmc.samples=5000, tseg.min=20)
o = beast(
Yellowstone, # Yellowstone: a pure numeric vector wo time info
start = 1981.51,
deltat = 1/24,
period = 1.0, # Period=delta*number_of_datapoints_per_period
season = 'harmonic', # Periodic compnt exisits,fitted as a harmonic curve
scp.minmax = c(0,3), # Min and max numbers of seasonal changpts allowed
sorder.minmax = c(1,5), # Min and max harmonic orders allowed
sseg.min = 24, # The min length of segments btw neighboring chnpts
# '24' means 24 datapoints; the unit is datapoint.
sseg.leftmargin= 40, # No seasonal chgpts allowed in the starting 40 datapoints
tcp.minmax = c(0,10),# Min and max numbers of changpts allowed in the trend
torder.minmax = c(0,1), # Min and maxx polynomial orders to fit trend
tseg.min = 24, # The min length of segments btw neighboring trend chnpts
tseg.leftmargin= 10, # No trend chgpts allowed in the starting 10 datapoints
s.complexfct = 0.26, # Manually tune it to fit a more complex seasonal curve
# than the non-informative prior on number of changepts
t.complexfct = -5.2, # Manually tune it to fit a more complex trend curve
# than the non-informative prior on number of changepts
deseasonalize = TRUE, # Remove the global seasonality before fitting the beast model
detrend = TRUE, # Remove the global trend before fitting the beast model
mcmc.seed = 0, # A seed for mcmc's random nummber generator; use a
# non-zero integer to reproduce results across runs
mcmc.burnin = 500, # Number of initial iterations discarded
mcmc.chains = 2, # Number of chains
mcmc.thin = 3, # Include samples every 3 iterations
mcmc.samples = 6000, # Number of samples taken per chain
# total iteration: (500+3*6000)*2
print.param = FALSE # Do not print the parameters
)
plot(o)
plot(o,vars=c('t','slpsgn') ) # plot only trend and slope sign
plot(o,vars=c('t','slpsgn'), relative.heights =c(.8,.2) ) # run "?plot.beast" for more info
plot(o, interactive=TRUE)
#------------------------------------Example 5----------------------------------------#
# Run an interactive GUI to visualize how BEAST is samplinig from the possible model
# spaces in terms of the numbers and timings of seasonal and trend changepoints.
# The GUI inferface allows changing the option parameters interactively. This GUI is
# only available on Win x64 machines, not Mac or Linux.
## Not run:
beast(Yellowstone, period=24, gui=TRUE)
## End(Not run)
#------------------------------------Example 6----------------------------------------#
# Apply beast to trend-only data. 'Nile' is the ANNUAL river flow of the river
# Nile at Aswan since 1871. It is a 'ts' object; its time attributes (start=1871,
# end=1970,frequency=1) are used to replace the user-supplied start,deltat, and freq,
# if any.
data(Nile)
plot(Nile)
attributes(Nile) # a ts object with time attributes (i.e., tsp=(start,end,freq)
o = beast(Nile) # start=1871, delta=1, and freq=1 taken from Nile itself
plot(o)
o = beast(Nile, # the same as above. The user-supplied values (i.e., 2023,
start=2023, # 9999) are ignored bcz Nile carries its own time attributes.
period=9999, # Its frequency tag is 1 (i.e., trend-only), so season='none'
season='harmonic' # is used instead of the supplied 'harmonic'
)
#------------------------------------Example 7----------------------------------------#
# NileVec is a pure data vector. The first run below is WRONG bcz NileVec was assumed
# to have a perodic component by default and beast gets a best estimate of freq=6 while
# the true value is freq=1. To fit a trend-only model, season='none' has to be explicitly
# specified, as in the 2nd & 3rd funs.
NileVec = as.vector(Nile) # NileVec is not a ts obj but a pure numeric data vector
o = beast(NileVec) # WRONG WAY to call: No time attributes available to interpret
# NileVec. By default, beast assumes season='harmonic', start=1,
# & deltat=1. 'freq' is missing and guessed to be 6 (WRONG).
plot(o) # WRONG Results: The result has a suprious seasonal component
o=beast(NileVec,season='none') # The correct way to call: Use season='none' for trend-only
# analysis; the default time is the integer indices
# "1:length(NileVec)'.
print(o$time)
o=beast(NileVec, # Recommended way to call: The true time attributes are
start = 1871, # given explicitly through start and deltat (or freq if
deltat = 1, # there is a cyclic/seasonal cmponent).
season = 'none')
print(o$time)
plot(o)
#------------------------------------Example 8----------------------------------------#
# beast can handle missing data. co2 is a monthly time series (i.e.,freq=12) starting
# from Jan 1959. We generate some missing values at random indices
## Not run:
data(co2)
attributes(co2) # A ts object with time attributes (i.e., tsp)
badIdx = sample( 1:length(co2), 50) # Get a set of random indices
co2[badIdx] = NA # Insert some data gaps
out=beast(co2) # co2 is a ts object and its 'tsp' time attributes are used to get the
# true time info. No need to specify 'start','deltat', & freq explicity.
out=beast(co2, # The supplied time/period values will be ignored bcz
start = c(1959,1,15),# co2 is a ts object; the correct period = 1 will be
deltat = 1/12, # used.
period = 365)
print(out)
plot(out)
## End(Not run)
#------------------------------------Example 9----------------------------------------#
# Apply beast to time seris-like sequence data: the unit of sequences is not
# necessarily time.
data(CNAchrom11) # DNA copy number alterations in Chromesome 11 for cell line GM05296
# The data is orderd by genomic position (not time), and the values
# are the log2-based intensity ratio of copy numbers between the sample
# the reference. A value of zero means no gain or loss in copy number.
o = beast(CNAchrom11,season='none') # season is a misnomer here bcz the data has nothing
# to do with time. Regardless, we fit only a trend.
plot(o)
#------------------------------------Example 10---------------------------------------#
# Apply beast to time seris-like data: the unit of sequences is not necessarily time.
# Age of Death of Successive Kings of England
# If the data link is deprecated, install the time series data library instead,
# which is available at https://pkg.yangzhuoranyang.com/tsdl/
# install.packages("devtools")
# devtools::install_github("FinYang/tsdl")
# kings = tsdl::tsdl[[293]]
kings = scan("http://robjhyndman.com/tsdldata/misc/kings.dat",skip=3)
out = beast(kings,season='none')
plot(out)
#------------------------------------Example 11---------------------------------------#
# Another example from the tsdl data library
# Number of monthly births in New York from Jan 1946 to Dec 1959
# If the data link becomes invalid, install the time series data package instead
# install.packages("devtools")
# devtools::install_github("FinYang/tsdl")
# kings = tsdl::tsdl[[534]]
births = scan("http://robjhyndman.com/tsdldata/data/nybirths.dat")
out = beast(births,start=c(1946,1,15), deltat=1/12 )
plot(out) # the result is wrong bcz the guessed freq via auto-correlation by beast
# is 2 rather than 12, so we recommend always specifying 'freq' explicitly
# for those time series with a periodic component, as shown below.
out = beast(births,start=c(1946,1,15), deltat=1/12, freq =12 )
out = beast(births,start=c(1946,1,15), deltat=1/12, period=1.0 )
plot(out)
# Fit the seasonal component using a singular-value-decompostion-based basis functions
# rathat than the default harmonic sin/cos basis functions.
out = beast(births,start=c(1946,1,15), deltat=1/12, period=1.0, season="svd" )
plot(out)
#------------------------------------Example 12---------------------------------------#
# Daily confirmed COVID-19 new cases and deaths across the globe
## Not run:
data(covid19)
plot(covid19$date, covid19$newcases, type='l')
newcases = sqrt( covid19$newcases ) # Apply a square root-transformation
# This ts varies periodically every 7 days. 7 days can't be precisely represented
# in the unit of year bcz some years has 365 days and others has 366. BEAST can hanlde
# this in two ways.
#(1) Use the date number as the time unit--the num of days lapsed since 1970-01-01.
datenum = as.numeric(covid19$date)
o = beast(newcases, start=min(datenum), deltat=1, period=7)
o$time = as.Date(o$time, origin='1970-01-01') # Convert from integers to Date.
plot(o)
#(2) Use strings to explicitly specify deltat and period with a unit.
startdate = covid19$date[1]
o = beast(newcases, start=startdate, deltat='1day', period='7days')
plot(o)
## End(Not run)
#------------------------------------Example 13---------------------------------------#
# The old API interface of beast is still made available but NOT recommended. It is
# kept mainly to ensure the working of the sample code on Page 475 in the text
# Ecological Metods by Drs. Southwood and Henderson.
## Not run:
# The interface as shown here will be deprecated and NOT recommended.
beast(Yellowstone, 24) #24 is the freq: number of datapoints per period
# Specify the model or MCMC parameters through opt as in Rbeast v0.2
opt=list() #Create an empty list to append individual model parameters
opt$period=24 #Period of the cyclic component (i.e.,freq in the new version)
opt$minSeasonOrder=2 #Min harmonic order allowed in fitting season component
opt$maxSeasonOrder=8 #Max harmonic order allowed in fititing season component
opt$minTrendOrder=0 #Min polynomial order allowed to fit trend (0 for constant)
opt$maxTrendOrder=1 #Max polynomial order allowed to fit trend (1 for linear term)
opt$minSepDist_Season=20#Min separation time btw neighboring season changepoints
opt$minSepDist_Trend=20 #Min separation time btw neighboring trend changepoints
opt$maxKnotNum_Season=4 #Max number of season changepoints allowed
opt$maxKnotNum_Trend=10 #Max number of trend changepoints allowed
opt$omittedValue=NA #A customized value to indicate bad/missing values in the time
#series, in additon to those NA or NaN values.
# The following parameters used to configure the reverisible-jump MCMC (RJMCC) sampler
opt$chainNumber=2 #Number of parallel MCMC chains
opt$sample=1000 #Number of samples to be collected per chain
opt$thinningFactor=3 #A factor to thin chains
opt$burnin=500 #Number of burn-in samples discarded at the start
opt$maxMoveStepSize=30 #For the move proposal, the max window allowed in jumping from
#the current changepoint
opt$resamplingSeasonOrderProb=0.2 #The probability of selecting a re-sampling proposal
#(e.g., resample seasonal harmonic order)
opt$resamplingTrendOrderProb=0.2 #The probability of selecting a re-sampling proposal
#(e.g., resample trend polynomial order)
opt$seed=65654 #A seed for the random generator: If seed=0,random numbers differ
#for different BEAST runs. Setting seed to a chosen non-zero integer
#will allow reproducing the same result for different BEAST runs.
beast(Yellowstone, opt)
## End(Not run)
#------------------------------------Example 14---------------------------------------#
# Fit a model with an outlier component: Y = trend + outlier + error.
# Outliers here refer to spikes or dips at isolated points that can't be capatured by the
# trend
## Not run:
NileVec = as.vector(Nile)
NileVec[50] = NileVec[50] + 1500 # Add an large artificial spike at t=50
o = beast(NileVec, season='none', hasOutlier=TRUE)
plot(o)
o = beast(Nile, season='none', hasOutlier=TRUE) # Fit a model with outliers
plot(o)
## End(Not run)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.