| beast123 | R Documentation |
BEAST is a Bayesian model averaging algorithm for decomposing time series or other 1D sequential data into individual components, including abrupt changes, trends, and periodic/seasonal variations. It is useful for changepoint detection (e.g., breakpoints or structural breaks), nonlinear trend analysis, time series decomposition, and time series segmentation. beast123 is a low-level interface to BEAST.
beast123(
Y,
metadata = list(),
prior = list(),
mcmc = list(),
extra = list(),
season = c("harmonic","svd","dummy","none"),
method = c("bayes","bic","aic","aicc","hic",
"bic0.25","bic0.5","bic1.5","bic2"),
...
)
Y |
|
metadata |
If missing, default values are used. However, when When
|
prior |
|
mcmc |
|
extra |
|
season |
Character (default
|
method |
Character (default
|
... |
Additional parameters reserved for future extensions; currently unused. |
The output is a list object of class "beast" containing the following fields. Exact dimensions depend on the input Y and extra$whichOutputDimIsTime. The descriptions below assume a single time series input of length N; for 2D/3D inputs, interpret the results according to the 2D/3D dimensions.
time |
A numeric vector of length |
data |
A vector, matrix, or 3D array; a copy of the input |
marg_lik |
Numeric; the average marginal likelihood of the sampled models. Larger values indicate better fit for a given time series. |
R2 |
Numeric; coefficient of determination (R-squared) of the fitted model. |
RMSE |
Numeric; root mean squared error of the fitted model. |
sig2 |
Numeric; estimated variance of the residual error. |
trend |
|
season |
|
outlier |
|
Zhao, K., Wulder, M.A., Hu, T., Bright, R., Wu, Q., Qin, H., Li, Y., Toman, E., Mallick, B., Zhang, X. and Brown, M., 2019. Detecting change-point, trend, and seasonality in satellite time series data to track abrupt changes and nonlinear dynamics: A Bayesian ensemble algorithm. Remote Sensing of Environment, 232, p.111181 (the beast algorithm paper).
Zhao, K., Valle, D., Popescu, S., Zhang, X. and Mallick, B., 2013. Hyperspectral remote sensing of plant biochemistry using Bayesian model averaging with variable and band selection. Remote Sensing of Environment, 132, pp.102-119 (the Bayesian MCMC scheme used in beast).
Hu, T., Toman, E.M., Chen, G., Shao, G., Zhou, Y., Li, Y., Zhao, K. and Feng, Y., 2021. Mapping fine-scale human disturbances in a working landscape with Landsat time series on Google Earth Engine. ISPRS Journal of Photogrammetry and Remote Sensing, 176, pp.250-261(a beast application paper).
beast,
beast.irreg,
minesweeper,
tetris,
geeLandsat
#-------------------------------- NOTE ----------------------------------------------#
# beast123() is an all-inclusive function that unifies the functionality of beast()
# and beast.irreg(). It can handle single, multiple, or 3D stacked time series, either
# regular or irregular. It allows customization via four list arguments:
# metadata -- additional information about the input Y
# prior -- prior parameters for the BEAST model
# mcmc -- MCMC settings
# extra -- miscellaneous options controlling output and parallel computation
#
# Although functionally similar to beast() and beast.irreg(), beast123() is designed
# to efficiently handle multiple time series concurrently (e.g., stacked satellite
# images) via internal multithreading. When processing stacks of raster layers,
# DO NOT iterate pixel-by-pixel using beast() or beast.irreg() wrapped by external
# parallel tools (e.g., doParallel, foreach). Instead, use beast123(), which
# implements parallelism internally.
#------------------------------ Example 1: seasonal time series ----------------------#
# Yellowstone is a half-monthly NDVI time series of length 774 for a site in Yellowstone
# starting from July 1-15, 1981 (start ~ c(1981, 7, 7); 24 observations per year).
library(Rbeast)
data(Yellowstone)
plot(Yellowstone)
# Below, the four optional arguments are omitted, so default values are used.
# By default, Y is assumed to be regular with a seasonal component. The actual
# default values are printed at the start of the run and can be used as templates
# for customization.
o <- beast123(Yellowstone)
plot(o)
#------------------------------ Example 2: trend-only series -------------------------#
# Nile is an annual river flow time series (no periodic variation). Set season = "none"
# to indicate trend-only analysis. Default values are used for other options.
# Unlike beast(), beast123() does NOT use the time attributes of a 'ts' object.
# For example, Nile is treated as numeric data only with the default times
# 1:length(Nile); its (start=1871, end=1970, freq=1) attributes are ignored
# unless specified through 'metadata'.
o <- beast123(Nile, season = "none")
o <- beast123(Nile, extra = list(dumpInputData = TRUE, quiet = TRUE),
season = "none")
o <- beast123(Nile, metadata = list(startTime = 1871, hasOutlier = TRUE),
season = "none")
plot(o)
#------------------------------ Example 3: full API usage ---------------------------#
# Explicitly specify metadata, prior, mcmc, and extra. Among these, 'prior' contains
# the true statistical model parameters; the others configure input, output, and
# computation.
## Not run:
# metadata: extra information describing the input time series Y.
metadata <- list()
# metadata$isRegularOrdered <- TRUE # no longer used in this version
metadata$whichDimIsTime <- 1 # which dimension is time for 2D/3D inputs
metadata$startTime <- c(1981, 7, 7)
# alternatively: metadata$startTime <- 1981.5137
# metadata$startTime <- as.Date("1981-07-07")
metadata$deltaTime <- 1/24 # half-monthly: 24 obs/year
metadata$period <- 1.0 # 1-year period: 24 * (1/24) = 1
metadata$omissionValue <- NaN # NaNs are omitted by default
metadata$maxMissingRateAllowed <- 0.75 # skip series with > 75
metadata$deseasonalize <- FALSE # do not pre-remove global seasonality
metadata$detrend <- FALSE # do not pre-remove global trend
# prior: only true model parameters specifying the Bayesian formulation
prior <- list()
prior$seasonMinOrder <- 1 # min harmonic order allowed to fit seasonal cmpnt
prior$seasonMaxOrder <- 5 # max harmonic order allowed to fit seasonal cmpnt
prior$seasonMinKnotNum <- 0 # min number of changepnts in seasonal cmpnt
prior$seasonMaxKnotNum <- 3 # max number of changepnts in seasonal cmpnt
prior$seasonMinSepDist <- 10 # min inter-chngpts separation for seasonal cmpnt
prior$trendMinOrder <- 0 # min polynomial order allowed to fit trend cmpnt
prior$trendMaxOrder <- 1 # max polynomial order allowed to fit trend cmpnt
prior$trendMinKnotNum <- 0 # min number of changepnts in trend cmpnt
prior$trendMaxKnotNum <- 15 # max number of changepnts in trend cmpnt
prior$trendMinSepDist <- 5 # min inter-chngpts separation for trend cmpnt
prior$precValue <- 10.0 # Initial value of precision parameter(no need
# to change it unless for precPrioType='const')
prior$precPriorType <- "uniform" # "constant", "uniform", "componentwise", "orderwise"
# mcmc: settings for the MCMC sampler
mcmc <- list()
mcmc$seed <- 9543434 # arbitray seed for random number generator
mcmc$samples <- 3000 # samples collected per chain
mcmc$thinningFactor <- 3 # take every 3rd sample and discard others
mcmc$burnin <- 150 # discard the initial 150 samples per chain
mcmc$chainNumber <- 3 # number of chains
mcmc$maxMoveStepSize <- 4 # max random jump step when proposing new chngpts
mcmc$trendResamplingOrderProb <- 0.10 # prob of choosing to resample polynomial order
mcmc$seasonResamplingOrderProb <- 0.10 # prob of choosing to resample harmonic order
mcmc$credIntervalAlphaLevel <- 0.95 # the significance level for credible interval
# extra: output and computation options
extra <- list()
extra$dumpInputData <- FALSE # If true, a copy of input time series is outputted
extra$whichOutputDimIsTime <- 1 # For 2D or 3D inputs, which dim of the output refers
# to time? Ignored if the input is a single time series
extra$computeCredible <- FALSE # If true, compute CI: computing CI is time-intensive.
extra$fastCIComputation <- TRUE # If true, a faster way is used to get CI, but it is
# still time-intensive. That is why the function beast()
# is slow because it always compute CI.
extra$computeSeasonOrder <- FALSE # If true, dump the estimated harmonic order over time
extra$computeTrendOrder <- FALSE # If true, dump the estimated polynomial order over time
extra$computeSeasonChngpt <- TRUE # If true, get the most likely locations of s chgnpts
extra$computeTrendChngpt <- TRUE # If true, get the most likely locations of t chgnpts
extra$computeSeasonAmp <- FALSE # If true, get time-varying amplitude of seasonality
extra$computeTrendSlope <- FALSE # If true, get time-varying slope of trend
extra$tallyPosNegSeasonJump <- FALSE # If true, get those changpts with +/- jumps in season
extra$tallyPosNegTrendJump <- FALSE # If true, get those changpts with +/- jumps in trend
extra$tallyIncDecTrendJump <- FALSE # If true, get those changpts with increasing/
# decreasing trend slopes
extra$printProgress <- TRUE
extra$printParameter <- TRUE
extra$quiet <- FALSE # if TRUE, print warning messages, if any
extra$consoleWidth <- 0 # If 0, the console width is from the current console
extra$numThreadsPerCPU <- 2 # 'numThreadsPerCPU' and 'numParThreads' are used to
extra$numParThreads <- 0 # configure multithreading runs; they're used only
# if Y has multiple time series (e.g.,stacked images)
o <- beast123(Yellowstone, metadata, prior, mcmc, extra, season = "harmonic")
plot(o)
## End(Not run)
#------------------------------ Example 4: irregular time series ---------------------#
# ohio is a data frame of Landsat NDVI observations at unevenly spaced times.
## Not run:
data(ohio)
str(ohio)
###################################################################################
# Individual times are provided as fractional years via ohio$time
metadata <- list()
metadata$time <- ohio$time # Must supply individual times for irregular inputs
metadata$deltaTime <- 1/12 # Must supply the desired time interval for aggregation
metadata$period <- 1.0
o <- beast123(ohio$ndvi, metadata) # defaults used for missing parameters
###################################################################################
# Another accepted time format: Ohio dates as Date objects
class(ohio$rdate)
metadata <- list()
metadata$time <- ohio$rdate # Must supply individual times for irregular inputs
metadata$deltaTime <- 1/12 # Must supply the desired time interval for aggregation
o <- beast123(ohio$ndvi, metadata) # Default values used for those missing parameters
###################################################################################
# Another accepted time format: separate Y, M, D columns
ohio$Y # Year
ohio$M # Month
ohio$D # Day
metadata <- list()
metadata$time$year <- ohio$Y # Must supply individual times for irregular inputs
metadata$time$month <- ohio$M
metadata$time$day <- ohio$D
metadata$deltaTime <- 1/12 # Must supply the desired time interval for aggregation
o <- beast123(ohio$ndvi, metadata) # Default values used for those missing parameters
###################################################################################
# Another accepted time format: year and DOY
ohio$Y # Year
ohio$doy # Day of year
metadata <- list()
metadata$time$year <- ohio$Y # Must supply individual times for irregular inputs
metadata$time$doy <- ohio$doy
metadata$deltaTime <- 1/12 # Must supply the desired time interval for aggregation
o <- beast123(ohio$ndvi, metadata) # Default values used for those missing parameters
###################################################################################
# Another accepted time format: date strings with custom strfmt
ohio$datestr1
metadata <- list()
metadata$time$datestr <- ohio$datestr1
metadata$time$strfmt <- "????yyyy?mm?dd"
metadata$deltaTime <- 1/12
o <- beast123(ohio$ndvi, metadata)
###################################################################################
# Another accepted time format: date strings with custom strfmt
ohio$datestr2
metadata <- list()
metadata$deltaTime <- 1/12
metadata$time$datestr <- ohio$datestr2
metadata$time$strfmt <- "????yyyydoy????"
o <- beast123(ohio$ndvi, metadata)
###################################################################################
# Another accepted time format: date strings with custom strfmt
ohio$datestr3
metadata <- list()
metadata$deltaTime <- 1/12
metadata$time$datestr <- ohio$datestr3
metadata$time$strfmt <- "Y,,M/D"
o <- beast123(ohio$ndvi, metadata)
## End(Not run)
#------------------ Example 5: multiple time series (matrix input) -------------------#
# simdata is a 2D matrix of dimension 300 x 3, representing 3 time series of length 300.
## Not run:
data(simdata) # dim of simdata: 300 x 3 (time x num_of_time_series)
dim(simdata) # the first dimenion refer to time (i.e, 300)
metadata <- list()
metadata$whichDimIsTime <- 1 # first dimension is time
metadata$period <- 24 # assume startTime = 1, deltaTime = 1 by default
extra <- list()
extra$whichOutputDimIsTime <- 2 # Which dim of the output arrays refers to time?
o <- beast123(simdata, metadata, extra = extra)
# Lists of inputs args can also be provided inline:
o <- beast123(simdata,
metadata = list(whichDimIsTime = 1, period = 24),
extra = list(whichOutputDimIsTime = 2))
# Field names can be shortened, provided no ambiguity arises:
o <- beast123(simdata,
metadata = list(whichDim = 1, per = 24),
extra = list(whichOut = 2))
#------------------ Example 5b: transposed simdata ---------------------------------#
simdata1 <- t(simdata) # 3 (series) x 300 (time)
# dim of simdata1: 3 x 300 (num of ts x time )
metadata <- list()
metadata$whichDimIsTime <- 2 # time is second dimension
metadata$period <- 24 # assume startTime = 1, deltaTime = 1 by default
o <- beast123(simdata1, metadata)
o <- beast123(simdata1, metadata = list(whichDim = 2, per = 24))
## End(Not run)
#------------------ Example 6: 3D stacked image time series -------------------------#
# imagestack$ndvi is a 3D array of size 12 x 9 x 1066 (row x col x time). Each pixel
# corresponds to a time series of length 1066.
## Not run:
data(imagestack)
dim(imagestack$ndvi) # 12 x 9 x 1066
imagestack$datestr # A character vector of 1066 date strings
metadata <- list()
metadata$whichDimIsTime <- 3 # Which dim of the input refer to time for 3D inputs?
# 1066 is the ts length, so dim is set to '3' here.
# In this example, this arg is not needed because
# the time$datestr can also help to match and pick up
# the right time dimesion of imagestack$ndvi.
metadata$time$datestr <- imagestack$datestr
metadata$time$strfmt <- "LT05_018032_20080311.yyyy-mm-dd"
metadata$deltaTime <- 1/12 # aggregate to monthly (i.e., 1/12 yr)
metadata$period <- 1.0 # 1-year period
extra <- list()
extra$dumpInputData <- TRUE # Dump a copy of aggregated input ts
extra$numThreadsPerCPU <- 2 # Each cpu core will be assigned 2 threads
extra$numParThreads <- 0 # If 0, total_num_threads=numThreadsPerCPU*num_of_cpu_core
# if >0, used to specify the total number of threads
# Default values for missing parameters
o <- beast123(imagestack$ndvi, metadata = metadata, extra = extra)
print(o, c(5, 3)) # result for pixel at row 5, col 3
plot(o, c(5, 3)) # plot result for pixel at row 5, col 3
image(o$trend$ncp) # map number of trend changepoints over space
## End(Not run)
#------------------ Example 7: GeoTIFF stacks via raster package --------------------#
# Handle 3D stacked images from a series of NDVI GeoTIFF files (e.g., ndvi.zip with
# 437 NDVI TIFF files of dimension 12 x 9). Example code is available at:
# https://github.com/zhaokg/Rbeast/blob/master/R/beast123_raster_example.txt
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.