
Generate a data table with example data

```
datagen(N, censor = 80)
```

`N`
integer. The number of individuals in the dataset.

`censor`
numeric. The total observation period. Individuals are removed
from the dataset if they do not exit to

The dataset simulates a labour market programme. People entering the dataset are without a job.

They experience two hazards, i.e. probabilities per time period. They can either get a job and exit from
the dataset, or they can enter a labour market programme, e.g. a subsidised job or similar, and remain
in the dataset and possibly get a job later.
In the terms of this package, there are two transitions, `"job"` and `"program"`.

The two hazards are influenced by covariates observed by the researcher, called `"x1"` and `"x2"`.
In addition there are unobserved characteristics influencing the hazards. Being
on a programme also influences the hazard to get a job. In the generated dataset, being on
a programme is the indicator variable `alpha`. While on a programme, the only transition that can
be made is `"job"`.

The dataset is organized as a series of rows for each individual. Each row is a time period with constant covariates.

The length of the time period is in the covariate `duration`.

The transition being made at the end of the period is coded in the covariate `d`. This
is an integer which is 0 if no transition occurs (e.g. if a covariate changes), 1 for
the first transition, and 2 for the second transition. It can also be a factor, in which case the
level marking no transition must be called `"none"`.
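The layout described above can be sketched with a small hand-made data frame. The values below are hypothetical and only illustrate the row structure; the column names follow the ones used in this page, and the mapping 1 = `"job"`, 2 = `"program"` is an assumption based on the order in which the transitions are named.

```r
# Hypothetical spell data for two individuals (values invented for illustration).
# Each row is one time period with constant covariates.
spells <- data.frame(
  id       = c(1, 1, 2, 2, 2),
  x1       = c(0.3, 0.3, -1.2, -1.2, -1.2),
  x2       = c(1.1, 0.4, 0.0, 0.0, 0.7),
  alpha    = c(0, 0, 0, 1, 1),           # 0 = unemployed, 1 = on a programme
  duration = c(2.5, 1.0, 4.0, 3.0, 0.5), # length of each period
  d        = c(0, 1, 2, 0, 1)            # 0 = no transition, 1 = job, 2 = program (assumed order)
)

# The same information as a factor; the no-transition level must be "none".
spells$dfac <- factor(c("none", "job", "program")[spells$d + 1],
                      levels = c("none", "job", "program"))

# Total observed time per individual
total_time <- tapply(spells$duration, spells$id, sum)
print(spells)
print(total_time)
```

Individual 1 has a covariate change after 2.5 time units and then gets a job. Individual 2 enters a programme after 4 time units, has a covariate change while on the programme, and then gets a job.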

The covariate `alpha` is zero when unemployed, and 1 if on a programme. It is used
for two purposes. It is used as an explanatory variable for the transition to job, which yields
a coefficient that can be interpreted as the effect of being on the programme. It is also
used as a "state variable", as an index into a "risk set". I.e. when estimating, the
`mphcrm` function must be told which risks/hazards are present.
When on a programme the `"program"` transition cannot be made. This is implemented
by specifying a list of risksets and using `alpha+1` as an index into this list.
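The indexing idea can be illustrated in plain R, without calling `mphcrm`. This is only a sketch of the lookup itself, not of how the package uses it internally; the `alpha` values are hypothetical.

```r
# Risk sets as in the example on this page: position 1 is used when
# alpha == 0 (unemployed), position 2 when alpha == 1 (on a programme).
risksets <- list(unemp = c("job", "program"), onprogram = "job")

# Hypothetical state indicator for four periods
alpha <- c(0, 0, 1, 1)

# alpha + 1 turns the 0/1 indicator into a 1-based list index
state <- alpha + 1
possible <- risksets[state]
print(possible)
```

For the periods with `alpha` equal to 1, only `"job"` is in the selected risk set, so the transition into a programme is excluded there.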

The two hazards are modeled as *exp(X β + μ)*, where *X* is a matrix of covariates,
*β* is a vector of coefficients to be estimated, and *μ* is an intercept. All of
these quantities are transition specific. This yields an individual likelihood which we call
*M_i(μ)*. The idea behind the mixed proportional hazard model is to model the
individual heterogeneity as a probability distribution of intercepts. We obtain the individual
likelihood *L_i = ∑_j p_j M_i(μ_j)*, and, thus, the log-likelihood *L = ∑_i log L_i*.

The likelihood is to be maximized over the parameter vectors *β* (one for each transition),
the masspoints *μ_j*, and probabilities *p_j*.
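The mixture *L_i = ∑_j p_j M_i(μ_j)* can be sketched in base R. The functions `spell_lik` and `mix_lik` below are hypothetical helpers, not part of the package. The sketch assumes hazards that are constant within each row and ignores risk sets (every transition is treated as possible in every period), so it is an illustration of the formula rather than the package's implementation.

```r
# M_i(mu): likelihood of one individual's rows given one intercept per transition.
# Assumption: hazard h_k = exp(X beta_k + mu_k) is constant within a row.
spell_lik <- function(X, duration, d, beta, mu) {
  # X: periods x covariates, beta: covariates x transitions, mu: one per transition
  h <- exp(sweep(X %*% beta, 2, mu, "+"))        # periods x transitions
  surv <- exp(-duration * rowSums(h))            # no transition during the period
  # hazard of the transition that ends the period, or 1 if d == 0
  event <- ifelse(d > 0, h[cbind(seq_along(d), pmax(d, 1))], 1)
  prod(surv * event)
}

# L_i = sum_j p_j M_i(mu_j), with one masspoint per row of mus
mix_lik <- function(X, duration, d, beta, mus, p) {
  sum(p * apply(mus, 1, function(mu) spell_lik(X, duration, d, beta, mu)))
}

# Hypothetical inputs: two periods, two covariates, two transitions
X <- cbind(x1 = c(0.3, 0.3), x2 = c(1.1, 0.4))
duration <- c(2.5, 1.0)
d <- c(0, 1)
beta <- matrix(c(0.5, -0.2, 0.1, 0.3), nrow = 2)  # columns: job, program
mus <- rbind(c(-2, -3), c(-1, -2.5))              # two masspoints
p <- c(0.7, 0.3)

Li <- mix_lik(X, duration, d, beta, mus, p)
print(Li)
```

With independent individuals, the quantity to maximize would be the sum of `log(Li)` over all individuals.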

The probability distribution is built up in steps. We start with a single masspoint, with probability 1. Then we search for another point with a small probability, and maximize the likelihood from there. We continue adding masspoints until we can no longer improve the likelihood.
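The bookkeeping of one such step can be sketched as follows. The helper `add_masspoint` and the value of `eps` are hypothetical; how the package searches for the location of the new point, and how it re-maximizes afterwards, is not shown here.

```r
# Add one masspoint with a small probability eps, scaling the existing
# probabilities so that they still sum to 1.
add_masspoint <- function(mus, p, new_mu, eps = 1e-4) {
  list(mus = unname(rbind(mus, new_mu)),
       p   = c(p * (1 - eps), eps))
}

# Start: a single masspoint (one intercept per transition) with probability 1
dist0 <- list(mus = rbind(c(-2, -3)), p = 1)

# One step: add a hypothetical candidate point
dist1 <- add_masspoint(dist0$mus, dist0$p, new_mu = c(-1, -2.5))
print(dist1)
```

After each such step the likelihood would be maximized again over all parameters, starting from the enlarged distribution.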

The example illustrates how `data(durdata)` was generated.

```
library(durmod) # assumes the durmod package, which provides datagen() and mphcrm()
data.table::setDTthreads(1) # avoid screams from cran-testing
dataset <- datagen(5000, 80)
print(dataset)
# both transitions are possible when unemployed, only "job" when on a programme
risksets <- list(unemp = c("job", "program"), onprogram = "job")
# just two iterations to save time
Fit <- mphcrm(d ~ x1 + x2 + ID(id) + D(duration) + S(alpha + 1) + C(job, alpha),
              data = dataset, risksets = risksets,
              control = mphcrm.control(threads = 1, iters = 2))
best <- Fit[[1]]
print(best)
summary(best)
```
