Flexible and user-friendly probabilistic segmentation of time series with smooth and/or abrupt regime changes by a mixture model-based regression approach with a hidden logistic process, fitted by the EM algorithm.
You can install the RHLP package from GitHub with:
# install.packages("devtools")
devtools::install_github("fchamroukhi/RHLP")
To build vignettes for examples of usage, type the command below instead:
# install.packages("devtools")
devtools::install_github("fchamroukhi/RHLP",
build_opts = c("--no-resave-data", "--no-manual"),
build_vignettes = TRUE)
Use the following command to display vignettes:
browseVignettes("RHLP")
library(RHLP)
# Application to a toy data set
data("toydataset")
x <- toydataset$x
y <- toydataset$y
K <- 5 # Number of regimes (mixture components)
p <- 3 # Dimension of beta (order of the polynomial regressors)
q <- 1 # Dimension of w (order of the logistic regression: to be set to 1 for segmentation)
variance_type <- "heteroskedastic" # "heteroskedastic" or "homoskedastic" model
n_tries <- 1
max_iter = 1500
threshold <- 1e-6
verbose <- TRUE
verbose_IRLS <- FALSE
rhlp <- emRHLP(X = x, Y = y, K, p, q, variance_type, n_tries,
max_iter, threshold, verbose, verbose_IRLS)
#> EM - RHLP: Iteration: 1 | log-likelihood: -2119.27308534609
#> EM - RHLP: Iteration: 2 | log-likelihood: -1149.01040321999
#> EM - RHLP: Iteration: 3 | log-likelihood: -1118.20384281234
#> EM - RHLP: Iteration: 4 | log-likelihood: -1096.88260636121
#> EM - RHLP: Iteration: 5 | log-likelihood: -1067.55719357295
#> EM - RHLP: Iteration: 6 | log-likelihood: -1037.26620122646
#> EM - RHLP: Iteration: 7 | log-likelihood: -1022.71743069484
#> EM - RHLP: Iteration: 8 | log-likelihood: -1006.11825447077
#> EM - RHLP: Iteration: 9 | log-likelihood: -1001.18491883952
#> EM - RHLP: Iteration: 10 | log-likelihood: -1000.91250763556
#> EM - RHLP: Iteration: 11 | log-likelihood: -1000.62280600209
#> EM - RHLP: Iteration: 12 | log-likelihood: -1000.3030988811
#> EM - RHLP: Iteration: 13 | log-likelihood: -999.932334880131
#> EM - RHLP: Iteration: 14 | log-likelihood: -999.484219706691
#> EM - RHLP: Iteration: 15 | log-likelihood: -998.928118038989
#> EM - RHLP: Iteration: 16 | log-likelihood: -998.234244664472
#> EM - RHLP: Iteration: 17 | log-likelihood: -997.359536276056
#> EM - RHLP: Iteration: 18 | log-likelihood: -996.152654857298
#> EM - RHLP: Iteration: 19 | log-likelihood: -994.697863447307
#> EM - RHLP: Iteration: 20 | log-likelihood: -993.186583974542
#> EM - RHLP: Iteration: 21 | log-likelihood: -991.81352379631
#> EM - RHLP: Iteration: 22 | log-likelihood: -990.611295217008
#> EM - RHLP: Iteration: 23 | log-likelihood: -989.539226273251
#> EM - RHLP: Iteration: 24 | log-likelihood: -988.55311887915
#> EM - RHLP: Iteration: 25 | log-likelihood: -987.539963690533
#> EM - RHLP: Iteration: 26 | log-likelihood: -986.073920116541
#> EM - RHLP: Iteration: 27 | log-likelihood: -983.263549878169
#> EM - RHLP: Iteration: 28 | log-likelihood: -979.340492188909
#> EM - RHLP: Iteration: 29 | log-likelihood: -977.468559852711
#> EM - RHLP: Iteration: 30 | log-likelihood: -976.653534236095
#> EM - RHLP: Iteration: 31 | log-likelihood: -976.5893387433
#> EM - RHLP: Iteration: 32 | log-likelihood: -976.589338067237
rhlp$summary()
#> ---------------------
#> Fitted RHLP model
#> ---------------------
#>
#> RHLP model with K = 5 components:
#>
#> log-likelihood nu AIC BIC ICL
#> -976.5893 33 -1009.589 -1083.959 -1083.176
#>
#> Clustering table (Number of observations in each regimes):
#>
#> 1 2 3 4 5
#> 100 120 200 100 150
#>
#> Regression coefficients:
#>
#> Beta(k = 1) Beta(k = 2) Beta(k = 3) Beta(k = 4) Beta(k = 5)
#> 1 6.031875e-02 -5.434903 -2.770416 120.7699 4.027542
#> X^1 -7.424718e+00 158.705091 43.879453 -474.5888 13.194261
#> X^2 2.931652e+02 -650.592347 -94.194780 597.7948 -33.760603
#> X^3 -1.823560e+03 865.329795 67.197059 -244.2386 20.402153
#>
#> Variances:
#>
#> Sigma2(k = 1) Sigma2(k = 2) Sigma2(k = 3) Sigma2(k = 4) Sigma2(k = 5)
#> 1.220624 1.110243 1.079394 0.9779734 1.028332
rhlp$plot()
# Application to a real data set
data("realdataset")
x <- realdataset$x
y <- realdataset$y2
K <- 5 # Number of regimes (mixture components)
p <- 3 # Dimension of beta (order of the polynomial regressors)
q <- 1 # Dimension of w (order of the logistic regression: to be set to 1 for segmentation)
variance_type <- "heteroskedastic" # "heteroskedastic" or "homoskedastic" model
n_tries <- 1
max_iter = 1500
threshold <- 1e-6
verbose <- TRUE
verbose_IRLS <- FALSE
rhlp <- emRHLP(X = x, Y = y, K, p, q, variance_type, n_tries,
max_iter, threshold, verbose, verbose_IRLS)
#> EM - RHLP: Iteration: 1 | log-likelihood: -3321.6485760125
#> EM - RHLP: Iteration: 2 | log-likelihood: -2286.48632282875
#> EM - RHLP: Iteration: 3 | log-likelihood: -2257.60498391374
#> EM - RHLP: Iteration: 4 | log-likelihood: -2243.74506764308
#> EM - RHLP: Iteration: 5 | log-likelihood: -2233.3426635247
#> EM - RHLP: Iteration: 6 | log-likelihood: -2226.89953345319
#> EM - RHLP: Iteration: 7 | log-likelihood: -2221.77999023589
#> EM - RHLP: Iteration: 8 | log-likelihood: -2215.81305295291
#> EM - RHLP: Iteration: 9 | log-likelihood: -2208.25998029539
#> EM - RHLP: Iteration: 10 | log-likelihood: -2196.27872403055
#> EM - RHLP: Iteration: 11 | log-likelihood: -2185.40049009242
#> EM - RHLP: Iteration: 12 | log-likelihood: -2180.13934245387
#> EM - RHLP: Iteration: 13 | log-likelihood: -2175.4276274402
#> EM - RHLP: Iteration: 14 | log-likelihood: -2170.86113669353
#> EM - RHLP: Iteration: 15 | log-likelihood: -2165.34927170608
#> EM - RHLP: Iteration: 16 | log-likelihood: -2161.12419211511
#> EM - RHLP: Iteration: 17 | log-likelihood: -2158.63709280617
#> EM - RHLP: Iteration: 18 | log-likelihood: -2156.19846850913
#> EM - RHLP: Iteration: 19 | log-likelihood: -2154.04107470071
#> EM - RHLP: Iteration: 20 | log-likelihood: -2153.24544245686
#> EM - RHLP: Iteration: 21 | log-likelihood: -2151.74944795242
#> EM - RHLP: Iteration: 22 | log-likelihood: -2149.90781423151
#> EM - RHLP: Iteration: 23 | log-likelihood: -2146.40042232588
#> EM - RHLP: Iteration: 24 | log-likelihood: -2142.37530025533
#> EM - RHLP: Iteration: 25 | log-likelihood: -2134.85493291884
#> EM - RHLP: Iteration: 26 | log-likelihood: -2129.67399002071
#> EM - RHLP: Iteration: 27 | log-likelihood: -2126.44739300481
#> EM - RHLP: Iteration: 28 | log-likelihood: -2124.94603052064
#> EM - RHLP: Iteration: 29 | log-likelihood: -2122.51637426267
#> EM - RHLP: Iteration: 30 | log-likelihood: -2121.01493646146
#> EM - RHLP: Iteration: 31 | log-likelihood: -2118.45402063643
#> EM - RHLP: Iteration: 32 | log-likelihood: -2116.9336204919
#> EM - RHLP: Iteration: 33 | log-likelihood: -2114.34424563452
#> EM - RHLP: Iteration: 34 | log-likelihood: -2112.84844186712
#> EM - RHLP: Iteration: 35 | log-likelihood: -2110.34494568025
#> EM - RHLP: Iteration: 36 | log-likelihood: -2108.81734757025
#> EM - RHLP: Iteration: 37 | log-likelihood: -2106.26527191053
#> EM - RHLP: Iteration: 38 | log-likelihood: -2104.96591147986
#> EM - RHLP: Iteration: 39 | log-likelihood: -2102.43927829964
#> EM - RHLP: Iteration: 40 | log-likelihood: -2101.27820194404
#> EM - RHLP: Iteration: 41 | log-likelihood: -2098.81151697567
#> EM - RHLP: Iteration: 42 | log-likelihood: -2097.48008514591
#> EM - RHLP: Iteration: 43 | log-likelihood: -2094.98259556552
#> EM - RHLP: Iteration: 44 | log-likelihood: -2093.66517040802
#> EM - RHLP: Iteration: 45 | log-likelihood: -2091.23625905564
#> EM - RHLP: Iteration: 46 | log-likelihood: -2089.91118603989
#> EM - RHLP: Iteration: 47 | log-likelihood: -2087.67388435026
#> EM - RHLP: Iteration: 48 | log-likelihood: -2086.11373786756
#> EM - RHLP: Iteration: 49 | log-likelihood: -2083.84931461869
#> EM - RHLP: Iteration: 50 | log-likelihood: -2082.16175664198
#> EM - RHLP: Iteration: 51 | log-likelihood: -2080.45137011098
#> EM - RHLP: Iteration: 52 | log-likelihood: -2078.37066132008
#> EM - RHLP: Iteration: 53 | log-likelihood: -2077.06827662071
#> EM - RHLP: Iteration: 54 | log-likelihood: -2074.66718553694
#> EM - RHLP: Iteration: 55 | log-likelihood: -2073.68137124781
#> EM - RHLP: Iteration: 56 | log-likelihood: -2071.20390017789
#> EM - RHLP: Iteration: 57 | log-likelihood: -2069.88260759288
#> EM - RHLP: Iteration: 58 | log-likelihood: -2067.30246728287
#> EM - RHLP: Iteration: 59 | log-likelihood: -2066.08897944236
#> EM - RHLP: Iteration: 60 | log-likelihood: -2064.14482062792
#> EM - RHLP: Iteration: 61 | log-likelihood: -2062.39859624374
#> EM - RHLP: Iteration: 62 | log-likelihood: -2060.73756242314
#> EM - RHLP: Iteration: 63 | log-likelihood: -2058.4448132974
#> EM - RHLP: Iteration: 64 | log-likelihood: -2057.23564743141
#> EM - RHLP: Iteration: 65 | log-likelihood: -2054.73129678764
#> EM - RHLP: Iteration: 66 | log-likelihood: -2053.66525147972
#> EM - RHLP: Iteration: 67 | log-likelihood: -2051.05262427909
#> EM - RHLP: Iteration: 68 | log-likelihood: -2049.89030367995
#> EM - RHLP: Iteration: 69 | log-likelihood: -2047.68843285481
#> EM - RHLP: Iteration: 70 | log-likelihood: -2046.16052536146
#> EM - RHLP: Iteration: 71 | log-likelihood: -2044.92677581091
#> EM - RHLP: Iteration: 72 | log-likelihood: -2042.67687818721
#> EM - RHLP: Iteration: 73 | log-likelihood: -2041.77608506749
#> EM - RHLP: Iteration: 74 | log-likelihood: -2039.40345316134
#> EM - RHLP: Iteration: 75 | log-likelihood: -2038.20062153928
#> EM - RHLP: Iteration: 76 | log-likelihood: -2036.05846372404
#> EM - RHLP: Iteration: 77 | log-likelihood: -2034.52492449426
#> EM - RHLP: Iteration: 78 | log-likelihood: -2033.44774900177
#> EM - RHLP: Iteration: 79 | log-likelihood: -2031.15837908019
#> EM - RHLP: Iteration: 80 | log-likelihood: -2030.29908045026
#> EM - RHLP: Iteration: 81 | log-likelihood: -2028.08193331457
#> EM - RHLP: Iteration: 82 | log-likelihood: -2026.82779637097
#> EM - RHLP: Iteration: 83 | log-likelihood: -2025.51219569808
#> EM - RHLP: Iteration: 84 | log-likelihood: -2023.47136697978
#> EM - RHLP: Iteration: 85 | log-likelihood: -2022.86702240332
#> EM - RHLP: Iteration: 86 | log-likelihood: -2021.05803372565
#> EM - RHLP: Iteration: 87 | log-likelihood: -2019.68013062929
#> EM - RHLP: Iteration: 88 | log-likelihood: -2018.57796815284
#> EM - RHLP: Iteration: 89 | log-likelihood: -2016.51065270015
#> EM - RHLP: Iteration: 90 | log-likelihood: -2015.84957111014
#> EM - RHLP: Iteration: 91 | log-likelihood: -2014.25626618564
#> EM - RHLP: Iteration: 92 | log-likelihood: -2012.83069679254
#> EM - RHLP: Iteration: 93 | log-likelihood: -2012.36700738444
#> EM - RHLP: Iteration: 94 | log-likelihood: -2010.80319327333
#> EM - RHLP: Iteration: 95 | log-likelihood: -2009.62231094925
#> EM - RHLP: Iteration: 96 | log-likelihood: -2009.18020396728
#> EM - RHLP: Iteration: 97 | log-likelihood: -2007.70135886708
#> EM - RHLP: Iteration: 98 | log-likelihood: -2006.56703696874
#> EM - RHLP: Iteration: 99 | log-likelihood: -2006.01673291469
#> EM - RHLP: Iteration: 100 | log-likelihood: -2004.41194242792
#> EM - RHLP: Iteration: 101 | log-likelihood: -2003.4625414477
#> EM - RHLP: Iteration: 102 | log-likelihood: -2002.88040058763
#> EM - RHLP: Iteration: 103 | log-likelihood: -2001.35926477816
#> EM - RHLP: Iteration: 104 | log-likelihood: -2000.57003100128
#> EM - RHLP: Iteration: 105 | log-likelihood: -2000.13742634303
#> EM - RHLP: Iteration: 106 | log-likelihood: -1998.8742667185
#> EM - RHLP: Iteration: 107 | log-likelihood: -1997.9672441114
#> EM - RHLP: Iteration: 108 | log-likelihood: -1997.53617878001
#> EM - RHLP: Iteration: 109 | log-likelihood: -1996.26856906479
#> EM - RHLP: Iteration: 110 | log-likelihood: -1995.29073069489
#> EM - RHLP: Iteration: 111 | log-likelihood: -1994.96901833912
#> EM - RHLP: Iteration: 112 | log-likelihood: -1994.04338389315
#> EM - RHLP: Iteration: 113 | log-likelihood: -1992.93228304533
#> EM - RHLP: Iteration: 114 | log-likelihood: -1992.58825334521
#> EM - RHLP: Iteration: 115 | log-likelihood: -1992.08820485443
#> EM - RHLP: Iteration: 116 | log-likelihood: -1990.99459284997
#> EM - RHLP: Iteration: 117 | log-likelihood: -1990.39820233453
#> EM - RHLP: Iteration: 118 | log-likelihood: -1990.25156085256
#> EM - RHLP: Iteration: 119 | log-likelihood: -1990.02689844513
#> EM - RHLP: Iteration: 120 | log-likelihood: -1989.4524459209
#> EM - RHLP: Iteration: 121 | log-likelihood: -1988.77939887023
#> EM - RHLP: Iteration: 122 | log-likelihood: -1988.43670301286
#> EM - RHLP: Iteration: 123 | log-likelihood: -1988.05097380424
#> EM - RHLP: Iteration: 124 | log-likelihood: -1987.13583867675
#> EM - RHLP: Iteration: 125 | log-likelihood: -1986.24508709354
#> EM - RHLP: Iteration: 126 | log-likelihood: -1985.66862327892
#> EM - RHLP: Iteration: 127 | log-likelihood: -1984.91555844651
#> EM - RHLP: Iteration: 128 | log-likelihood: -1984.02840365821
#> EM - RHLP: Iteration: 129 | log-likelihood: -1983.69130067161
#> EM - RHLP: Iteration: 130 | log-likelihood: -1983.59891631866
#> EM - RHLP: Iteration: 131 | log-likelihood: -1983.46950685882
#> EM - RHLP: Iteration: 132 | log-likelihood: -1983.16677154063
#> EM - RHLP: Iteration: 133 | log-likelihood: -1982.7130488681
#> EM - RHLP: Iteration: 134 | log-likelihood: -1982.36482921383
#> EM - RHLP: Iteration: 135 | log-likelihood: -1982.09501016661
#> EM - RHLP: Iteration: 136 | log-likelihood: -1981.45901315766
#> EM - RHLP: Iteration: 137 | log-likelihood: -1980.56116931257
#> EM - RHLP: Iteration: 138 | log-likelihood: -1979.78682525118
#> EM - RHLP: Iteration: 139 | log-likelihood: -1978.57039689029
#> EM - RHLP: Iteration: 140 | log-likelihood: -1977.62583903156
#> EM - RHLP: Iteration: 141 | log-likelihood: -1976.44993964017
#> EM - RHLP: Iteration: 142 | log-likelihood: -1975.34352117182
#> EM - RHLP: Iteration: 143 | log-likelihood: -1973.94511304916
#> EM - RHLP: Iteration: 144 | log-likelihood: -1972.69707782729
#> EM - RHLP: Iteration: 145 | log-likelihood: -1971.24412635765
#> EM - RHLP: Iteration: 146 | log-likelihood: -1970.06230181165
#> EM - RHLP: Iteration: 147 | log-likelihood: -1968.63106242841
#> EM - RHLP: Iteration: 148 | log-likelihood: -1967.54773416029
#> EM - RHLP: Iteration: 149 | log-likelihood: -1966.19481640747
#> EM - RHLP: Iteration: 150 | log-likelihood: -1965.07487280506
#> EM - RHLP: Iteration: 151 | log-likelihood: -1963.69466194804
#> EM - RHLP: Iteration: 152 | log-likelihood: -1962.43103040224
#> EM - RHLP: Iteration: 153 | log-likelihood: -1961.13942311651
#> EM - RHLP: Iteration: 154 | log-likelihood: -1959.76348415393
#> EM - RHLP: Iteration: 155 | log-likelihood: -1958.66111557445
#> EM - RHLP: Iteration: 156 | log-likelihood: -1957.08412155615
#> EM - RHLP: Iteration: 157 | log-likelihood: -1956.38405033098
#> EM - RHLP: Iteration: 158 | log-likelihood: -1955.13976323662
#> EM - RHLP: Iteration: 159 | log-likelihood: -1954.0307602366
#> EM - RHLP: Iteration: 160 | log-likelihood: -1953.28771131999
#> EM - RHLP: Iteration: 161 | log-likelihood: -1951.68947232015
#> EM - RHLP: Iteration: 162 | log-likelihood: -1950.97779043109
#> EM - RHLP: Iteration: 163 | log-likelihood: -1950.82786273359
#> EM - RHLP: Iteration: 164 | log-likelihood: -1950.39568293481
#> EM - RHLP: Iteration: 165 | log-likelihood: -1949.51404624208
#> EM - RHLP: Iteration: 166 | log-likelihood: -1948.906374824
#> EM - RHLP: Iteration: 167 | log-likelihood: -1948.43487893552
#> EM - RHLP: Iteration: 168 | log-likelihood: -1947.2118394595
#> EM - RHLP: Iteration: 169 | log-likelihood: -1946.34871715855
#> EM - RHLP: Iteration: 170 | log-likelihood: -1946.22041468711
#> EM - RHLP: Iteration: 171 | log-likelihood: -1946.2132265072
#> EM - RHLP: Iteration: 172 | log-likelihood: -1946.21315057723
rhlp$summary()
#> ---------------------
#> Fitted RHLP model
#> ---------------------
#>
#> RHLP model with K = 5 components:
#>
#> log-likelihood nu AIC BIC ICL
#> -1946.213 33 -1979.213 -2050.683 -2050.449
#>
#> Clustering table (Number of observations in each regimes):
#>
#> 1 2 3 4 5
#> 16 129 180 111 126
#>
#> Regression coefficients:
#>
#> Beta(k = 1) Beta(k = 2) Beta(k = 3) Beta(k = 4) Beta(k = 5)
#> 1 2187.539 330.05723 1508.2809 -13446.7332 6417.62830
#> X^1 -15032.659 -107.79782 -1648.9562 11321.4509 -3571.94090
#> X^2 -56433.432 14.40154 786.5723 -3062.2825 699.55894
#> X^3 494014.670 56.88016 -118.0693 272.7844 -45.42922
#>
#> Variances:
#>
#> Sigma2(k = 1) Sigma2(k = 2) Sigma2(k = 3) Sigma2(k = 4) Sigma2(k = 5)
#> 8924.363 49.22616 78.2758 105.6606 15.66317
rhlp$plot()
In this package, it is possible to select models based on information criteria such as BIC, AIC and ICL.
The selection can be done for the two following parameters:
Let’s select a RHLP model for the following time series Y:
data("toydataset")
x <- toydataset$x
y <- toydataset$y
plot(x, y, type = "l", xlab = "x", ylab = "Y")
selectedrhlp <- selectRHLP(X = x, Y = y, Kmin = 2, Kmax = 6, pmin = 0, pmax = 3)
#> The RHLP model selected via the "BIC" has K = 5 regimes
#> and the order of the polynomial regression is p = 0.
#> BIC = -1041.40789532438
#> AIC = -1000.84239591291
selectedrhlp$plot(what = "estimatedsignal")
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.