| parametric_dsmm | R Documentation |
Creates a parametric model specification for a drifting
semi-Markov model. Returns an object of class
(dsmm_parametric, dsmm).
parametric_dsmm(
model_size,
states,
initial_dist,
degree,
f_is_drifting,
p_is_drifting,
p_dist,
f_dist,
f_dist_pars
)
model_size |
Positive integer that represents the size of
the drifting semi-Markov model |
states |
Character vector that represents the state space |
initial_dist |
Numerical vector of |
degree |
Positive integer that represents the polynomial degree |
f_is_drifting |
Logical. Specifies if |
p_is_drifting |
Logical. Specifies if |
p_dist |
Numerical array, that represents the probabilities of the
transition matrix
|
f_dist |
Character array, that represents the discrete sojourn time
distribution
|
f_dist_pars |
Numerical array, that represents the parameters of the
sojourn time distributions given in
|
Defined Arguments
For the parametric case, we explicitly define:
The transition matrix of the embedded Markov chain
(J_{t})_{t\in \{0,\dots,n\}}, given in
the attribute p_dist:
If p is not drifting, it contains the values:
p(u,v), \forall u, v \in E,
given in an array with dimensions of s \times s,
where the first dimension corresponds to the previous state u and
the second dimension corresponds to the current state v.
If p is drifting, for i \in \{ 0,\dots,d \},
it contains the values:
p_{\frac{i}{d}}(u,v), \forall u, v \in E,
given in an array with dimensions of s \times s \times (d + 1),
where the first and second dimensions are defined as in the non-drifting
case, and the third dimension corresponds to the d+1 different
matrices p_{\frac{i}{d}}.
The conditional sojourn time distribution, given in the
attribute f_dist:
If f is not drifting, it contains the discrete
distribution names (as characters or NA), given in an
array with dimensions of s \times s,
where the first dimension corresponds to the previous state u,
the second dimension corresponds to the current state v.
If f is drifting, it contains the discrete
distribution names (as characters or NA) given in an
array with dimensions of s \times s \times (d + 1),
where the first and second dimensions are defined as in the
non-drifting case, and the third dimension corresponds to the
d+1 different arrays f_{\frac{i}{d}}.
The conditional sojourn time distribution parameters,
given in the attribute f_dist_pars:
If f is not drifting, it contains the
numerical values (or NA) of the corresponding
distributions defined in f_dist, given in
an array with dimensions of s \times s,
where the first dimension corresponds to the previous state u,
the second dimension corresponds to the current state v.
If f is drifting, it contains the
numerical values (or NA) of the corresponding
distributions defined in f_dist, given in an array
with dimensions of s \times s \times (d + 1),
where the first and second dimensions are defined as in the
non-drifting case, and the third dimension corresponds to the
d+1 different arrays f_{\frac{i}{d}}.
Sojourn time distributions
In this package, the available distributions for the modeling of the
conditional sojourn times, of the drifting semi-Markov model, used through
the argument f_dist, are the following:
Uniform (n):
f(x) = 1/n, for x = 1, 2, \dots, n, where n is a
positive integer.
This can be specified through the following:
f_dist = "unif"
f_dist_pars = (n, NA)
(n as defined here).
Geometric (p):
f(x) = p (1-p)^{x-1}, for
x = 1, 2, \dots, where p \in (0, 1)
is the probability of success.
This can be specified through the following:
f_dist = "geom"
f_dist_pars = (p, NA)
(p as defined here).
Poisson (\lambda):
f(x) = \frac{\lambda^{x-1} exp(-\lambda)}{(x-1)!}, for
x = 1, 2, \dots, where \lambda > 0.
This can be specified through the following:
f_dist = "pois"
f_dist_pars = (\lambda, NA)
Negative binomial (\alpha, p):
f(x)=\frac{\Gamma(x+\alpha-1)}{\Gamma(\alpha)(x-1)!}
p^{\alpha}(1-p)^{x-1}, for x = 1, 2,\dots, where
\Gamma is the Gamma function,
\alpha \in (0, +\infty) is the parameter describing the
target for number of successful trials, or the dispersion parameter
(the shape parameter of the gamma mixing distribution).
p is the probability of success, 0 < p < 1.
f_dist = "nbinom"
f_dist_pars = (\alpha, p)
(p as defined here)
Discrete Weibull of type 1 (q, \beta):
f(x)=q^{(x-1)^{\beta}}-q^{x^{\beta}}, for x=1,2,\dots, with
q \in (0, 1) is the first parameter (probability)
and \beta \in (0, +\infty) is the second parameter.
This can be specified through the following:
f_dist = "dweibull"
f_dist_pars = (q, \beta)
(q as defined here)
From these discrete distributions, by using "dweibull", "nbinom"
we require two parameters. It's for this reason that the attribute
f_dist_pars is an array of dimensions
s \times s \times 2 if f
is not drifting or s \times s \times 2 \times (d+1)
if f is drifting.
Returns an object of the S3 class dsmm_parametric, dsmm.
It has the following attributes:
dist : List. Contains 3 arrays, passing down from the arguments:
p_drift or p_notdrift, corresponding to whether the
defined p transition matrix is drifting or not.
f_drift_parametric or f_notdrift_parametric,
corresponding to whether the
defined f sojourn time distribution is drifting or not.
f_drift_parameters or f_notdrift_parameters,
which are the defined f sojourn time distribution parameters,
depending on whether f is drifting or not.
initial_dist : Numerical vector. Passing down from the arguments.
It contains the initial distribution of the drifting semi-Markov model.
states : Character vector. Passing down from the arguments.
It contains the state space E.
s : Positive integer. It contains the number of states in the
state space, s = |E|, which is given in the attribute states.
degree : Positive integer. Passing down from the arguments.
It contains the polynomial degree d considered for the drifting of
the model.
model_size : Positive integer. Passing down from the arguments.
It contains the size of the drifting semi-Markov model n, which
represents the length of the embedded Markov chain
(J_{t})_{t\in \{0,\dots,n\}}, without the last state.
f_is_drifting : Logical. Passing down from the arguments.
Specifies if f is drifting or not.
p_is_drifting : Logical. Passing down from the arguments.
Specifies if p is drifting or not.
Model : Character. Possible values:
"Model_1" : Both p and f are drifting.
"Model_2" : p is drifting and f
is not drifting.
"Model_3" : f is drifting and p
is not drifting.
A_i : Numerical matrix. Represents the polynomials
A_i(t) with degree d that are used for solving
the system MJ = P. Used for the methods defined for the
object. Not printed when viewing the object.
V. S. Barbu, N. Limnios. (2008). semi-Markov Chains and Hidden semi-Markov Models Toward Applications - Their Use in Reliability and DNA Analysis. New York: Lecture Notes in Statistics, vol. 191, Springer.
Vergne, N. (2008). Drifting Markov models with Polynomial Drift and Applications to DNA Sequences. Statistical Applications in Genetics Molecular Biology 7 (1).
Barbu V. S., Vergne, N. (2019). Reliability and survival analysis for drifting Markov models: modeling and estimation. Methodology and Computing in Applied Probability, 21(4), 1407-1429.
T. Nakagawa and S. Osaki. (1975). The discrete Weibull distribution. IEEE Transactions on Reliability, R-24, 300-301.
Methods applied to this object: simulate.dsmm, get_kernel.
For the non-parametric drifting semi-Markov model specification: nonparametric_dsmm.
For the theoretical background of drifting semi-Markov models: dsmmR.
# We can also define states in a flexible way, including spaces.
states <- c("Dollar $", " /1'2'3/ ", " Z E T A ", "O_M_E_G_A")
s <- length(states)
d <- 1
# ===========================================================================
# Defining parametric drifting semi-Markov models.
# ===========================================================================
# ---------------------------------------------------------------------------
# Defining the drifting distributions for Model 1.
# ---------------------------------------------------------------------------
# `p_dist` has dimensions of: (s, s, d + 1).
# Sums over v must be 1 for all u and i = 0, ..., d.
# First matrix.
p_dist_1 <- matrix(c(0, 0.1, 0.4, 0.5,
0.5, 0, 0.3, 0.2,
0.3, 0.4, 0, 0.3,
0.8, 0.1, 0.1, 0),
ncol = s, byrow = TRUE)
# Second matrix.
p_dist_2 <- matrix(c(0, 0.3, 0.6, 0.1,
0.3, 0, 0.4, 0.3,
0.5, 0.3, 0, 0.2,
0.2, 0.3, 0.5, 0),
ncol = s, byrow = TRUE)
# get `p_dist` as an array of p_dist_1 and p_dist_2.
p_dist_model_1 <- array(c(p_dist_1, p_dist_2), dim = c(s, s, d + 1))
# `f_dist` has dimensions of: (s, s, d + 1).
# First matrix.
f_dist_1 <- matrix(c(NA, "unif", "dweibull", "nbinom",
"geom", NA, "pois", "dweibull",
"dweibull", "pois", NA, "geom",
"pois", NA, "geom", NA),
nrow = s, ncol = s, byrow = TRUE)
# Second matrix.
f_dist_2 <- matrix(c(NA, "pois", "geom", "nbinom",
"geom", NA, "pois", "dweibull",
"unif", "geom", NA, "geom",
"pois", "pois", "geom", NA),
nrow = s, ncol = s, byrow = TRUE)
# get `f_dist` as an array of `f_dist_1` and `f_dist_2`
f_dist_model_1 <- array(c(f_dist_1, f_dist_2), dim = c(s, s, d + 1))
# `f_dist_pars` has dimensions of: (s, s, 2, d + 1).
# First array of coefficients, corresponding to `f_dist_1`.
# First matrix.
f_dist_1_pars_1 <- matrix(c(NA, 5, 0.4, 4,
0.7, NA, 5, 0.6,
0.2, 3, NA, 0.6,
4, NA, 0.4, NA),
nrow = s, ncol = s, byrow = TRUE)
# Second matrix.
f_dist_1_pars_2 <- matrix(c(NA, NA, 0.2, 0.6,
NA, NA, NA, 0.8,
0.6, NA, NA, NA,
NA, NA, NA, NA),
nrow = s, ncol = s, byrow = TRUE)
# Second array of coefficients, corresponding to `f_dist_2`.
# First matrix.
f_dist_2_pars_1 <- matrix(c(NA, 6, 0.4, 3,
0.7, NA, 2, 0.5,
3, 0.6, NA, 0.7,
6, 0.2, 0.7, NA),
nrow = s, ncol = s, byrow = TRUE)
# Second matrix.
f_dist_2_pars_2 <- matrix(c(NA, NA, NA, 0.6,
NA, NA, NA, 0.8,
NA, NA, NA, NA,
NA, NA, NA, NA),
nrow = s, ncol = s, byrow = TRUE)
# Get `f_dist_pars`.
f_dist_pars_model_1 <- array(c(f_dist_1_pars_1, f_dist_1_pars_2,
f_dist_2_pars_1, f_dist_2_pars_2),
dim = c(s, s, 2, d + 1))
# ---------------------------------------------------------------------------
# Parametric object for Model 1.
# ---------------------------------------------------------------------------
obj_par_model_1 <- parametric_dsmm(
model_size = 10000,
states = states,
initial_dist = c(0.8, 0.1, 0.1, 0),
degree = d,
p_dist = p_dist_model_1,
f_dist = f_dist_model_1,
f_dist_pars = f_dist_pars_model_1,
p_is_drifting = TRUE,
f_is_drifting = TRUE
)
# p drifting array.
p_drift <- obj_par_model_1$dist$p_drift
p_drift
# f distribution.
f_dist_drift <- obj_par_model_1$dist$f_drift_parametric
f_dist_drift
# parameters for the f distribution.
f_dist_pars_drift <- obj_par_model_1$dist$f_drift_parameters
f_dist_pars_drift
# ---------------------------------------------------------------------------
# Defining Model 2 - p is drifting, f is not drifting.
# ---------------------------------------------------------------------------
# `p_dist` has the same dimensions as in Model 1: (s, s, d + 1).
p_dist_model_2 <- array(c(p_dist_1, p_dist_2), dim = c(s, s, d + 1))
# `f_dist` has dimensions of: (s, s).
f_dist_model_2 <- matrix(c( NA, "pois", NA, "nbinom",
"geom", NA, "geom", "dweibull",
"unif", "geom", NA, "geom",
"nbinom", "unif", "dweibull", NA),
nrow = s, ncol = s, byrow = TRUE)
# `f_dist_pars` has dimensions of: (s, s, 2),
# corresponding to `f_dist_model_2`.
# First matrix.
f_dist_pars_1_model_2 <- matrix(c(NA, 0.2, NA, 3,
0.2, NA, 0.2, 0.5,
3, 0.4, NA, 0.7,
2, 3, 0.7, NA),
nrow = s, ncol = s, byrow = TRUE)
# Second matrix.
f_dist_pars_2_model_2 <- matrix(c(NA, NA, NA, 0.6,
NA, NA, NA, 0.8,
NA, NA, NA, NA,
0.2, NA, 0.3, NA),
nrow = s, ncol = s, byrow = TRUE)
# Get `f_dist_pars`.
f_dist_pars_model_2 <- array(c(f_dist_pars_1_model_2,
f_dist_pars_2_model_2),
dim = c(s, s, 2))
# ---------------------------------------------------------------------------
# Parametric object for Model 2.
# ---------------------------------------------------------------------------
obj_par_model_2 <- parametric_dsmm(
model_size = 10000,
states = states,
initial_dist = c(0.8, 0.1, 0.1, 0),
degree = d,
p_dist = p_dist_model_2,
f_dist = f_dist_model_2,
f_dist_pars = f_dist_pars_model_2,
p_is_drifting = TRUE,
f_is_drifting = FALSE
)
# p drifting array.
p_drift <- obj_par_model_2$dist$p_drift
p_drift
# f distribution.
f_dist_notdrift <- obj_par_model_2$dist$f_notdrift_parametric
f_dist_notdrift
# parameters for the f distribution.
f_dist_pars_notdrift <- obj_par_model_2$dist$f_notdrift_parameters
f_dist_pars_notdrift
# ---------------------------------------------------------------------------
# Defining Model 3 - f is drifting, p is not drifting.
# ---------------------------------------------------------------------------
# `p_dist` has dimensions of: (s, s).
p_dist_model_3 <- matrix(c(0, 0.1, 0.3, 0.6,
0.4, 0, 0.1, 0.5,
0.4, 0.3, 0, 0.3,
0.9, 0.01, 0.09, 0),
ncol = s, byrow = TRUE)
# `f_dist` has the same dimensions as in Model 1: (s, s, d + 1).
f_dist_model_3 <- array(c(f_dist_1, f_dist_2), dim = c(s, s, d + 1))
# `f_dist_pars` has the same dimensions as in Model 1: (s, s, 2, d + 1).
f_dist_pars_model_3 <- array(c(f_dist_1_pars_1, f_dist_1_pars_2,
f_dist_2_pars_1, f_dist_2_pars_2),
dim = c(s, s, 2, d + 1))
# ---------------------------------------------------------------------------
# Parametric object for Model 3.
# ---------------------------------------------------------------------------
obj_par_model_3 <- parametric_dsmm(
model_size = 10000,
states = states,
initial_dist = c(0.3, 0.2, 0.2, 0.3),
degree = d,
p_dist = p_dist_model_3,
f_dist = f_dist_model_3,
f_dist_pars = f_dist_pars_model_3,
p_is_drifting = FALSE,
f_is_drifting = TRUE
)
# p drifting array.
p_notdrift <- obj_par_model_3$dist$p_notdrift
p_notdrift
# f distribution.
f_dist_drift <- obj_par_model_3$dist$f_drift_parametric
f_dist_drift
# parameters for the f distribution.
f_dist_pars_drift <- obj_par_model_3$dist$f_drift_parameters
f_dist_pars_drift
# ===========================================================================
# Parametric estimation using methods corresponding to an object
# which inherits from the class `dsmm_parametric`.
# ===========================================================================
### Comments
### 1. Using a larger `klim` and a larger `model_size` will increase the
### accuracy of the model, with the need of larger memory requirements
### and computational cost.
### 2. For the parametric estimation it is recommended to use a common set
### of distributions while only the parameters are drifting. This results
### in higher accuracy.
# ---------------------------------------------------------------------------
# Defining the distributions for Model 1 - both p and f are drifting.
# ---------------------------------------------------------------------------
# `p_dist` has dimensions of: (s, s, d + 1).
# First matrix.
p_dist_1 <- matrix(c(0, 0.2, 0.4, 0.4,
0.5, 0, 0.3, 0.2,
0.3, 0.4, 0, 0.3,
0.5, 0.3, 0.2, 0),
ncol = s, byrow = TRUE)
# Second matrix.
p_dist_2 <- matrix(c(0, 0.3, 0.5, 0.2,
0.3, 0, 0.4, 0.3,
0.5, 0.3, 0, 0.2,
0.2, 0.4, 0.4, 0),
ncol = s, byrow = TRUE)
# get `p_dist` as an array of p_dist_1 and p_dist_2.
p_dist_model_1 <- array(c(p_dist_1, p_dist_2), dim = c(s, s, d + 1))
# `f_dist` has dimensions of: (s, s, d + 1).
# We will use the same sojourn time distributions.
f_dist_1 <- matrix(c( NA, "unif", "dweibull", "nbinom",
"geom", NA, "pois", "dweibull",
"dweibull", "pois", NA, "geom",
"pois", 'nbinom', "geom", NA),
nrow = s, ncol = s, byrow = TRUE)
# get `f_dist`
f_dist_model_1 <- array(f_dist_1, dim = c(s, s, d + 1))
# `f_dist_pars` has dimensions of: (s, s, 2, d + 1).
# First array of coefficients, corresponding to `f_dist_1`.
# First matrix.
f_dist_1_pars_1 <- matrix(c(NA, 7, 0.4, 4,
0.7, NA, 5, 0.6,
0.2, 3, NA, 0.6,
4, 4, 0.4, NA),
nrow = s, ncol = s, byrow = TRUE)
# Second matrix.
f_dist_1_pars_2 <- matrix(c(NA, NA, 0.2, 0.6,
NA, NA, NA, 0.8,
0.6, NA, NA, NA,
NA, 0.3, NA, NA),
nrow = s, ncol = s, byrow = TRUE)
# Second array of coefficients, corresponding to `f_dist_2`.
# First matrix.
f_dist_2_pars_1 <- matrix(c(NA, 6, 0.5, 3,
0.5, NA, 4, 0.5,
0.4, 5, NA, 0.7,
6, 5, 0.7, NA),
nrow = s, ncol = s, byrow = TRUE)
# Second matrix.
f_dist_2_pars_2 <- matrix(c(NA, NA, 0.4, 0.5,
NA, NA, NA, 0.6,
0.5, NA, NA, NA,
NA, 0.4, NA, NA),
nrow = s, ncol = s, byrow = TRUE)
# Get `f_dist_pars`.
f_dist_pars_model_1 <- array(c(f_dist_1_pars_1, f_dist_1_pars_2,
f_dist_2_pars_1, f_dist_2_pars_2),
dim = c(s, s, 2, d + 1))
# ---------------------------------------------------------------------------
# Defining the parametric object for Model 1.
# ---------------------------------------------------------------------------
obj_par_model_1 <- parametric_dsmm(
model_size = 4000,
states = states,
initial_dist = c(0.8, 0.1, 0.1, 0),
degree = d,
p_dist = p_dist_model_1,
f_dist = f_dist_model_1,
f_dist_pars = f_dist_pars_model_1,
p_is_drifting = TRUE,
f_is_drifting = TRUE
)
cat("The object has class of (",
paste0(class(obj_par_model_1),
collapse = ', '), ").")
# ---------------------------------------------------------------------------
# Generating a sequence from the parametric object.
# ---------------------------------------------------------------------------
# A larger klim will lead to an increase in accuracy.
klim <- 20
sim_seq <- simulate(obj_par_model_1, klim = klim, seed = 1)
# ---------------------------------------------------------------------------
# Fitting the generated sequence under the same distributions.
# ---------------------------------------------------------------------------
fit_par_model1 <- fit_dsmm(sequence = sim_seq,
states = states,
degree = d,
f_is_drifting = TRUE,
p_is_drifting = TRUE,
estimation = 'parametric',
f_dist = f_dist_model_1)
cat("The object has class of (",
paste0(class(fit_par_model1),
collapse = ', '), ").")
cat("\nThe estimated parameters are:\n")
fit_par_model1$dist$f_drift_parameters
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.