beast123 | R Documentation |
A Bayesian model averaging algorithm called BEAST to decompose time series or 1D sequential data into individual components, such as abrupt changes, trends, and periodic/seasonal variations. BEAST is useful for changepoint detection (e.g., breakpoints or structural breaks), nonlinear trend analysis, time series decomposition, and time series segmentation.
beast123( Y,
metadata = list(),
prior = list(),
mcmc = list(),
extra = list(),
season = c('harmonic','svd','dummy','none'),
method = c( 'bayes', 'bic', 'aic', 'aicc', 'hic',
'bic0.25', 'bic0.5', 'bic1.5', 'bic2' ),
...
)
Y |
a 1D vector, 2D matrix, or 3D array of numeric data. Missing values are allowed and can be indicated by
|
metadata |
(optional). If present,
|
prior |
(optional). a list object consisting of the hyperprior parameters in the Bayesian formulation of the BEAST model. Because they are part of the model, the fitting result may be sensitive to the choices of these hyperparameters. If
|
mcmc |
(optional). a list object consisting of parameters to configure the MCMC inference. These parameter are not part of the Bayesian formulation of the BEAST model but are the settings for the reversible-jump MCMC to generate MCMC chains. Due to the MCMC nature, the longer the simulation chain is, the better the fitting result. Below are possible parameters:
|
extra |
(optional). a list object consisting of flags to control the outputs from the BEAST runs or configure other program setting. Below are possible parameters:
|
season |
characters (default to 'harmonic'); specify if
|
method |
a string (default to 'bayes'); specify the method for formulating model posterior probability.
|
... |
additional parameters, not used currently but reserved for future extension |
The output is an object of class "beast". It is a list, consisting of the following variables. Exact sizes of the variables depend on the types of the input Y
as well as the specified output time dimension extra$whichOutputDimIsTime
. In the explanations below, we assume the input Y
is a single time series of length N
; the dimensions for 2D or 2D inputs may be interpreted accordingly:
time |
a vector of size |
data |
a vector, matrix, or 3D array; this is a copy of the input |
marg_lik |
numeric; the average of the model marginal likelihood; the larger marg_lik, the better the fitting for a given time series. |
R2 |
numeric; the R-square of the model fitting. |
RMSE |
numeric; the RMSE of the model fitting. |
sig2 |
numeric; the estimated variance of the model error. |
trend |
a list object numeric consisting of various outputs related to the estimated trend component:
|
season |
a list object numeric consisting of various outputs related to the estimated seasonal/periodic component:
|
Zhao, K., Wulder, M.A., Hu, T., Bright, R., Wu, Q., Qin, H., Li, Y., Toman, E., Mallick, B., Zhang, X. and Brown, M., 2019. Detecting change-point, trend, and seasonality in satellite time series data to track abrupt changes and nonlinear dynamics: A Bayesian ensemble algorithm. Remote Sensing of Environment, 232, p.111181 (the beast algorithm paper).
Zhao, K., Valle, D., Popescu, S., Zhang, X. and Mallick, B., 2013. Hyperspectral remote sensing of plant biochemistry using Bayesian model averaging with variable and band selection. Remote Sensing of Environment, 132, pp.102-119 (the Bayesian MCMC scheme used in beast).
Hu, T., Toman, E.M., Chen, G., Shao, G., Zhou, Y., Li, Y., Zhao, K. and Feng, Y., 2021. Mapping fine-scale human disturbances in a working landscape with Landsat time series on Google Earth Engine. ISPRS Journal of Photogrammetry and Remote Sensing, 176, pp.250-261(a beast application paper).
beast
, beast.irreg
, minesweeper
, tetris
, geeLandsat
#--------------------------------NOTE-------------------------------------------------#
# beast123() is an all-inclusive function that duplicates the functionalities of beast
# and beast.irreg. It can handle a single, multiple, or 3D of stacked time series, being
# either regular or irregular. It allows for customization through four LIST arguments:
# metadata -- additional info about the input Y
# prior -- prior parameters for the beast model
# mcmc -- MCMC simulation setting
# extra -- misc parameters turning on/off outputs and setting up parallel computations
#
# Despite being essentially the same as beast and beast.irreg, beast123 is provided mainly
# to support concurrent handling of multiple time series (e.g., stacked satellite images)
# via parallel computing: When processing stacked raster layers, DO NOT iterate pixel by pixel
# using beast() or beast.irreg() via an external parallel caller (e.g., doParallel or foreach).
# Instread, please use beast123(), which supports mulithreading internally.
#------------------------------Example 1: one time series with seasonalty-------------#
# Yellowstone is a half-monthly time series of 774 NDVI measurmments at a Yellowstone
# site starting from July 1-15,1981(i.e., start=c(1981,7,7). It has 24 data points per
# year (freq=24).
library(Rbeast)
data(Yellowstone)
plot(Yellowstone)
# Below, the four option args are missing, so defalut values will be used, with some
# warning messages given to altert this. By default, the input Y is assumed to be regular
# with a seasonal component. The default arg values used will be printed out and they can
# serve as a template to customize the parameters.
o = beast123(Yellowstone)
plot(o)
#------------------------------Example 2: a trend-only time series-------------------#
# Nile is an annual river flow time series (i.e., no periodic variation). So, season
# is set to 'none' to indicate trend-only analysis. Default values are used for other
# missing options. Unlike the beast() function, beast123 does NOT use the time attributes
# of a 'ts' object. For example, Nile is treated as a pure data number; its (start=1871,
# end=1970, freq=1) attributes are ignored. The default times 1:length(Nile) are used
# instead. The true time info need to be specified by the 'metadata' parameter, as shown
# in the next example.
o = beast123(Nile,season='none')
o = beast123(Nile, extra=list(dumpInputData=TRUE, quiet=TRUE),season='none')
o = beast123(Nile, metadata=list(star=1871, hasOutlier=TRUE), season='none')
plot(o)
#------------------------------Example 3: call via the full API interface-----------#
# Specify metadata, prior, mcmc, and extra explicitly. Only 'prior' is the true statistical
# model parameters of BEAST; the other three are just options to configure the input/ouput
# or the computation process.
## Not run:
# metadata is NOT part of BEAST itself, but some extra info to describe the input
# time series Y. Below, the input Y is the 'Yellowstone' ts.
metadata = list()
#metadata$isRegularOrdered = TRUE # This arg not used any longer in this version
metadata$whichDimIsTime = 1 # Which dim of the input refer to time for
# 2D/3D inputs? Ignored for a single time
# series input.
metadata$startTime = c(1981,7,7) # Or startTime=1981.5137
# startTime=as.Date('1981-7-7')
metadata$deltaTime = 1/24 # Half-monthly regular ts: 0.5/12=1/24
metadata$period = 1.0 # The period is 1 year:
# freq x deltaTime = period
# 24 x 1/24 = 1.0
metadata$omissionValue = NaN # By default, NaNs are ignored
metadata$maxMissingRateAllowed = 0.7500 # If missingness is higher than .75, the ts
# is skipped and not fitted
metadata$deseasonalize = FALSE # Do not remove the global seasonal pattern
# before fitting the beast model
metadata$detrend = FALSE # Do not remove the global trend before
# the fitting
# prior is the ONLY true parameters of the beast model,used to specify the priors
# in the Bayesian formulation
prior = list()
prior$seasonMinOrder = 1 #min harmonic order allowed to fit seasonal cmpnt
prior$seasonMaxOrder = 5 #max harmonic order allowed to fit seasonal cmpnt
prior$seasonMinKnotNum = 0 #min number of changepnts in seasonal cmpnt
prior$seasonMaxKnotNum = 3 #max number of changepnts in seasonal cmpnt
prior$seasonMinSepDist = 10 #min inter-chngpts separation for seasonal cmpnt
prior$trendMinOrder = 0 #min polynomial order allowed to fit trend cmpnt
prior$trendMaxOrder = 1 #max polynomial order allowed to fit trend cmpnt
prior$trendMinKnotNum = 0 #min number of changepnts in trend cmpnt
prior$trendMaxKnotNum = 15 #max number of changepnts in trend cmpnt
prior$trendMinSepDist = 5 #min inter-chngpts separation for trend cmpnt
prior$precValue = 10.0 #Initial value of the precision parameter (no
# need to change it unless for precPrioType='const')
prior$precPriorType = 'uniform' # Possible values: const, uniform, and componentwise
# mcmc is NOT part of the beast model itself, but some parameters to configure the
# MCMC inference.
mcmc = list()
mcmc$seed = 9543434# an arbitray seed for random number generator
mcmc$samples = 3000 # samples collected per chain
mcmc$thinningFactor = 3 # take every 3rd sample and discard others
mcmc$burnin = 150 # discard the initial 150 samples per chain
mcmc$chainNumber = 3 # number of chains
mcmc$maxMoveStepSize = 4 # max random jump step when proposing new chngpts
mcmc$trendResamplingOrderProb = 0.100 # prob of choosing to resample polynomial order
mcmc$seasonResamplingOrderProb = 0.100 # prob of choosing to resample harmonic order
mcmc$credIntervalAlphaLevel = 0.950 # the significance level for credible interval
# extra is NOT part of the beast model itself, but some parameters to configure the
# output and computation process
extra = list()
extra$dumpInputData = FALSE #If true, a copy of input time series is outputted
extra$whichOutputDimIsTime = 1 #For 2D or 3D inputs, which dim of the output refers to
# time? Ignored if the input is a single time series
extra$computeCredible = FALSE #If true, compute CI: computing CI is time-intensive.
extra$fastCIComputation = TRUE #If true, a faster way is used to get CI, but it is
# still time-intensive. That is why the function beast()
# is slow because it always compute CI.
extra$computeSeasonOrder = FALSE #If true, dump the estimated harmonic order over time
extra$computeTrendOrder = FALSE #If true, dump the estimated polynomial order over time
extra$computeSeasonChngpt = TRUE #If true, get the most likely locations of s chgnpts
extra$computeTrendChngpt = TRUE #If true, get the most likely locations of t chgnpts
extra$computeSeasonAmp = FALSE #If true, get time-varying amplitude of seasonality
extra$computeTrendSlope = FALSE #If true, get time-varying slope of trend
extra$tallyPosNegSeasonJump= FALSE #If true, get those changpts with +/- jumps in season
extra$tallyPosNegTrendJump = FALSE #If true, get those changpts with +/- jumps in trend
extra$tallyIncDecTrendJump = FALSE #If true, get those changpts with increasing/
# decreasing trend slopes
extra$printProgress = TRUE
extra$printParameter = TRUE
extra$quiet = FALSE # print warning messages, if any
extra$consoleWidth = 0 # If 0, the console width is from the current console
extra$numThreadsPerCPU = 2 # 'numThreadsPerCPU' and 'numParThreads' are used to
extra$numParThreads = 0 # configure multithreading runs; they're used only if
# Y has multiple time series (e.g.,stacked images)
o = beast123(Yellowstone,metadata,prior,mcmc,extra, season='harmonic')
plot(o)
## End(Not run)
#------------------------------Example 4: Handle irregular time series-----------------#
# Handle irregular time series: ohio is a data frame of a Landsat NDVI series observed
# at unevely-spaced times
## Not run:
data(ohio)
str(ohio)
metadata = list()
metadata$time = ohio$time # Must supply individual times for irregular inputs
metadata$deltaTime = 1/12 # Must supply the desired time interval for aggregation
metadata$period = 1.0
o=beast123(ohio$ndvi, metadata) # Default values used for those missing parameters
#####################################################################################
class(ohio$rdate) # Another accepted time format for beast123
metadata = list()
metadata$deltaTime = 1/12 # Must supply the desired time interval for aggregation
metadata$time = ohio$rdate # Must supply individual times for irregular inputs
o=beast123(ohio$ndvi, metadata) # Default values used for those missing parameters
#####################################################################################
ohio$Y # Another accepted time format for beast123
ohio$M
ohio$M
metadata = list()
metadata$deltaTime = 1/12 # Must supply the desired time interval for aggregation
metadata$time$year = ohio$Y
metadata$time$month = ohio$M
metadata$time$day = ohio$D
o=beast123(ohio$ndvi, metadata) # Default values used for those missing parameters
#####################################################################################
ohio$Y # Another accepted time format for beast123
ohio$doy
metadata = list()
metadata$deltaTime = 1/12 # Must supply the desired time interval for aggregation
metadata$time$year = ohio$Y
metadata$time$doy = ohio$doy
o=beast123(ohio$ndvi, metadata) # Default values used for those missing parameters
#####################################################################################
ohio$time # Another accepted time format for beast123
metadata = list()
metadata$deltaTime = 1/12 # Must supply the desired time interval for aggregation
metadata$time = ohio$time # Fractional year
o=beast123(ohio$ndvi, metadata) # Default values used for those missing parameters
#####################################################################################
ohio$datestr1 # Another accepted time format for beast123
metadata = list()
metadata$deltaTime = 1/12 # Must supply the time interval for aggregation
metadata$time$datestr = ohio$datestr1
metadata$time$strfmt = '????yyyy?mm?dd'
o=beast123(ohio$ndvi, metadata) # Default values used for those missing parameters
#####################################################################################
ohio$datestr2 # Another accepted time format for beast123
metadata = list()
metadata$deltaTime = 1/12 # Must supply a desired time interval for aggregation
metadata$time$datestr = ohio$datestr2
metadata$time$strfmt = '????yyyydoy????'
o=beast123(ohio$ndvi, metadata) # Default values used for those missing parameters
#####################################################################################
ohio$datestr3 # Another accepted time format for beast123
metadata = list()
metadata$deltaTime = 1/12 # Must supply the desired time interval for aggregation
metadata$time$datestr = ohio$datestr3
metadata$time$strfmt = 'Y,,M/D'
o=beast123(ohio$ndvi, metadata) # Default values used for those missing parameters
## End(Not run)
#------------------Example 4: Handle multiple time series (i.e., matrix input)-----------#
# Handle multiple time series: 'simdata' is a 2D matrix of dim 300x3; it consits of 3
# time series of length 300 each. For this toy example, I decide to be lazy and use the same
# time series for the three columns.
## Not run:
data(simdata) # dim of simdata: 300 x 3 (time x num_of_time_series)
dim(simdata) # the first dimenion refer to time (i.e, 300)
metadata = list()
metadata$whichDimIsTime = 1 # Which dim of the input refer to time for 2D inputs?
# 300 is the ts length, so dim is set to '1' here.
metadata$period = 24 # By default, we assume startTime=1 and deltaTime=1
extra=list()
extra$whichOutputDimIsTime = 2 # Which dim of the output arrays refers to time?
o=beast123(simdata, metadata,extra=extra) # Default values used for those missing parameters
# The lists of arg parameters can also be directly provided inline within the command
o=beast123( simdata, metadata=list(whichDimIsTime=1,period=24), extra=list(whichOutput=2) )
# The field names of the lists can be shortened as long as no ambiguitity is caused.
o=beast123( simdata, metadata=list(whichDim=1,per=24), extra=list(whichOut=2) )
#------------------Example 4: Another run by transposing simdata--------------------------#
simdata1=t(simdata) # dim of simdata1: 3 x 300 (num of ts x time )
metadata = list()
metadata$whichDimIsTime = 2 # Which dim of the input refer to time for 2D inputs?
# 300 is the ts length, so dim is set to '2' here.
metadata$period = 24 # By default, we assume startTime=1 and deltaTime=1
o=beast123(simdata1, metadata) # Default values used for those missing parameters
o=beast123( simdata1, metadata=list(whichDim=2, per=24) )
## End(Not run)
#------------------Example 5: Handle stacked time series images (e.g., 3d input)--------#
# Handle 3D stacked images of irregular and unordered time-series: imagestack is a 3D
# array of size 12x9x1066, each pixel being a time series of length 1066
## Not run:
data(imagestack)
dim(imagestack$ndvi) # Dim: 12 x 9 X 1066 (row x col x time)
imagestack$datestr # A character vector of 1066 date strings
metadata = list()
metadata$whichDimIsTime = 3 # Which dim of the input refer to time for 3D inputs?
# 1066 is the ts length, so dim is set to '3' here.
# In this example, this arg is not needed because
# the time$datestr can also help to match and pick up
# the right time dimesion of imagestack$ndvi.
metadata$time$datestr = imagestack$datestr
metadata$time$strfmt = 'LT05_018032_20080311.yyyy-mm-dd'
metadata$deltaTime = 1/12 # Aggregate the irregular ts at a monthly interval:1/12 Yr
metadata$period = 1.0 # The period is 1 year: deltaTime*freq=1/12*12=1.0
extra = list()
extra$dumpInputData = TRUE # Get a copy of aggregated input ts
extra$numThreadsPerCPU = 2 # Each cpu core will be assigned 2 threads
extra$numParThreads = 0 # If 0, total_num_threads=numThreadsPerCPU*num_of_cpu_core
# if >0, used to specify the total number of threads
# Default values for missing parameters
o=beast123(imagestack$ndvi, metadata=metadata,extra=extra)
print(o,c(5,3)) # print the result for the pixel at Row 5 and Col 3
plot(o,c(5,3)) # plot the result for the pixel at Row 5 and Col 3
image(o$trend$ncp) # number of trend changepoints over space
## End(Not run)
#---------Example 6: Handle stacked GeoTiff image files imported with the raster package------#
# Handle 3D stacked images of irregular time-series : 'ndvi.zip' is a zip file of
# 437 NDIV tiff image files, each having a dim of 12 x 9.
# Code availlable at https://github.com/zhaokg/Rbeast/blob/master/R/beast123_raster_example.txt
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.