
The main method of the package is `mphcrm`. It has an interface
somewhat similar to `lm`. There is an example of use in `datagen`, with
a generated dataset similar to the ones in Gaure et al. (2007). For those who have
used the program from that paper, a mixture of R, Fortran, C, and Python,
this is an entirely new, self-contained package, written from scratch with the benefit of 12 years of experience.
Not all functionality from that behemoth has been implemented yet, but most of it has.

A short description of the model follows.

There are some individuals with some observed covariates *X_i*. The individuals are
observed for some time, so there is typically more than one observation of each individual.
At any point they experience one or more hazards. The hazards are assumed to be of the form
*h_i^j = exp(X_i β_j)*, where *β_j* are coefficients for hazard *j*.
The hazards themselves are not observed, but an event associated with them is, i.e. a transition
of some kind. The time of the transition, either exactly recorded, or within an interval, must also
be in the data set. With enough observations it is then possible to estimate the coefficients *β_j*.
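
The hazard specification above can be illustrated with a small base R sketch. It does not use durmod, and all names and numbers (`X`, `beta`, the two hazards) are made up for illustration. It assumes hazards that are constant over time, so that latent event times are exponential.

```r
# Toy illustration only: all names and numbers here are made up.
set.seed(1)
n <- 5
X <- cbind(x1 = rnorm(n), x2 = rnorm(n))   # observed covariates X_i

# One coefficient vector beta_j per hazard j (two competing hazards).
beta <- cbind(h1 = c(0.5, -0.3), h2 = c(-0.2, 0.8))

# h_i^j = exp(X_i beta_j): an n x 2 matrix of strictly positive hazards.
hazards <- exp(X %*% beta)

# With constant hazards, the latent time to each event is exponential;
# the first event to occur determines the observed transition.
latent <- matrix(rexp(n * 2, rate = hazards), nrow = n)
duration <- apply(latent, 1, min)
transition <- apply(latent, 1, which.min)
data.frame(duration, transition)
```

Only `duration` and `transition` would be observed in practice; the matrix `hazards` is what the estimation tries to recover through *β_j*.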

However, in contrast to ordinary linear models, any unobserved heterogeneity
may bias the estimates, not just increase their uncertainty. To account for unobserved heterogeneity, a
random intercept is introduced, so that the hazards are of the form *h_i^j(μ_k) = exp(X_i β_j + μ_k)*
for *k* between 1 and some *n*. The intercept may of course be written multiplicatively as
*exp(X_i β_j) exp(μ_k)*, which is why the hazards are called *proportional*.
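
As a quick numerical check of the proportionality, the following base R sketch uses hypothetical values for *X_i β_j* and two hypothetical intercepts. The ratio of the hazards at the two intercepts does not depend on the covariates.

```r
# Hypothetical linear indices X_i beta_j for three individuals.
x_beta <- c(-1, 0, 2)

# Two hypothetical support points for the random intercept.
mu <- c(low = -0.5, high = 1)

h_low  <- exp(x_beta + mu["low"])
h_high <- exp(x_beta + mu["high"])

# The ratio is the same for every individual: exp(mu_high - mu_low).
ratio <- unname(h_high / h_low)
ratio
```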

The individual likelihood depends on the intercept, i.e. *L_i(μ_k)*, but we integrate it out
so that the individual likelihood becomes *∑ p_k L_i(μ_k)*. The resulting mixture
likelihood is maximized over all the *β*s, *n*, the *μ_k*s, and the probabilities *p_k*.
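
A minimal sketch of the mixture likelihood for one individual, in base R. It is not how durmod computes it, and it assumes a single hazard that is constant within each spell and exactly recorded durations. The helper names `lik_given_mu` and `mixture_lik` and all data values are hypothetical.

```r
# Likelihood contribution of one individual given intercept mu, for a
# single hazard that is constant within each observed spell:
#   prod over spells of h^d * exp(-h * t), with h = exp(x * beta + mu),
# where d is 1 if the spell ended in a transition and 0 if censored.
lik_given_mu <- function(mu, t, d, x, beta) {
  h <- exp(x * beta + mu)
  prod(h^d * exp(-h * t))
}

# Mixture likelihood: sum over k of p_k * L_i(mu_k).
mixture_lik <- function(t, d, x, beta, mu, p) {
  stopifnot(length(mu) == length(p), abs(sum(p) - 1) < 1e-12)
  L <- vapply(mu, function(m) lik_given_mu(m, t, d, x, beta), numeric(1))
  sum(p * L)
}

# Hypothetical data for one individual: three spells, the last censored.
t <- c(1.2, 0.7, 2.0)
d <- c(1, 1, 0)
x <- c(0.3, 0.3, -0.1)

mu <- c(-1, 0.5)    # n = 2 support points
p  <- c(0.6, 0.4)   # their probabilities
Li <- mixture_lik(t, d, x, beta = 0.4, mu = mu, p = p)
log(Li)
```

The full log likelihood would be the sum of `log(Li)` over individuals, maximized jointly over the coefficients, the support points, and the probabilities.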

Besides the function `mphcrm`, which does the actual estimation, there are functions for
extracting the estimated mixture: `mphdist`, `mphmoments`, and a few more.

There is a summary function for the fitted model, and a data set, loaded with `data(durdata)`, which
is used for demonstration purposes. An already fitted model is also available there, as `fit`.

The package may use more than one CPU. The default number of threads is taken from `getOption("durmod.threads")`,
which is initialized upon loading the package from the environment variable `DURMOD_THREADS`, `OMP_THREAD_LIMIT`,
`OMP_NUM_THREADS` or `NUMBER_OF_PROCESSORS`, or from `parallel::detectCores()`.
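
Since the default is an ordinary R option, it can be inspected and overridden with base R alone. The sketch below uses only `getOption` and `options`; the value 2 is an arbitrary example.

```r
# Inspect the current setting; this is NULL if durmod has not been
# loaded and the option has not been set by hand.
getOption("durmod.threads")

# Override it for this session, e.g. to leave cores free for other work.
old <- options(durmod.threads = 2L)
threads_now <- getOption("durmod.threads")
threads_now

# Restore the previous value.
options(old)
```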

For more demanding problems, a cluster of machines (from packages parallel or snow) can be used, in combination with the use of threads.

There is a vignette (`vignette("whatmph")`) with more details about durmod and data layout.

Gaure, S., K. Røed and T. Zhang (2007) Time and causality: A Monte-Carlo Assessment of the timing-of-events approach, Journal of Econometrics 141(2), 1159-1195. https://doi.org/10.1016/j.jeconom.2007.01.015
