Nothing
#' Estimation of latent process joint models for multivariate longitudinal
#' outcomes and time-to-event data.
#'
#' This function fits extended joint models with shared random effects.
#' The longitudinal submodel handles multiple continuous longitudinal outcomes
#' (Gaussian or non-Gaussian, curvilinear) as well as
#' ordinal longitudinal outcomes in a mixed effects framework. The model assumes that all the outcomes
#' measure the same underlying latent process defined as their common
#' factor, and each outcome is related to this latent common factor by
#' outcome-specific measurement models whose nature depends on the type of the
#' associated outcome (linear model for Gaussian outcome, curvilinear model for
#' non-Gaussian outcome, cumulative probit model for ordinal outcome).
#' At the latent process level, the model estimates a standard linear mixed model.
#' The survival submodel handles right-censored (possibly left-truncated) time-to-events with competing risks.
#' The association between the longitudinal and the survival data is captured by including
#' the random effect from the mixed model or the predicted current level
#' of the underlying process as a linear predictor in the proportional hazard survival model.
#' Parameters of the measurement models, of the latent process mixed model and of the
#' survival model are estimated simultaneously using a maximum likelihood method,
#' through a Marquardt-Levenberg algorithm.
#'
#'
#'
#' A. THE MEASUREMENT MODELS
#'
#' \code{jointLPM} function estimates one measurement model per outcome to link
#' each outcome Y_k(t) with the underlying latent common factor L(t) they measure.
#' To fix the latent process dimension, we chose to constrain at the latent process
#' level the intercept of the mixed model at 0 and the
#' standard error of the first random effect at 1. The nature of each measurment
#' model adapts to the type of the outcome it models.
#'
#' 1. For continuous Gaussian outcomes, linear models are used and required 2 parameters for
#' the transformation (Y(t) - b1)/b2
#'
#' 2. For continuous non-Gaussian outcomes, curvilinear models use
#' parametrized link function to link outcomes to the latent process.
#' With the "beta" link function, 4 parameters are required for the
#' following transformation: [ h(Y(t)',b1,b2) - b3]/b4 where h is the Beta CDF
#' with canonical parameters c1 and c2 that can be derived from b1 and b2 as
#' c1=exp(b1)/[exp(b2)*(1+exp(b1))] and c2=1/[exp(b2)*(1+exp(b1))], and Y(t)'
#' is the rescaled outcome i.e. Y(t)'= [ Y(t) - min(Y(t)) + epsY ] / [
#' max(Y(t)) - min(Y(t)) +2*epsY ].
#' With the "splines" link function, n+2 parameters are required for the
#' following transformation b_1 + b_2*I_1(Y(t)) + ... + b_{n+2}*I_{n+1}(Y(t)),
#' where I_1,...,I_{n+1} is the basis of quadratic I-splines. To constraint the
#' parameters to be positive, except for b_1, the program estimates b_k^* (for
#' k=2,...,n+2) so that b_k=(b_k^*)^2.
#'
#' 3. For discrete ordinal outcomes, cumulative probit models are used. For a
#' (n+1)-level outcome, the model consist of determining n thresholds t_k in the
#' latent process scale which correspond to the outcome level changes. Then,
#' Y(t) = n' <=> t_n' < L(t) + e <= t_(n'+1) with e the standard error of the outcome.
#' To ensure that t_1 < t_2 < ... < t_n, the program estimates t'_1, t'_2, ..., t'_n
#' such that t_1=t'_1, t_2=t_1+(t'_2)^2, t_3=t_2+(t'_3)^2, ...
#'
#'
#' B. THE SURVIVAL MODEL
#'
#' a. BASELINE RISK FUNCTIONS
#'
#' For the baseline risk functions, the following parameterizations were considered.
#'
#' 1. With the "Weibull" function: 2 parameters are necessary w_1 and w_2 so that
#' the baseline risk function a_0(t) = w_1^2*w_2^2*(w_1^2*t)^(w_2^2-1) if logscale=FALSE
#' and \cr
#' a_0(t) = exp(w_1)*exp(w_2)(t)^(exp(w_2)-1) if logscale=TRUE.
#'
#' 2. with the "piecewise" step function and nz nodes (y_1,...y_nz), nz-1 parameters
#' are necesssary p_1,...p_nz-1 so that the baseline risk function a_0(t) = p_j^2
#' for y_j < t =< y_j+1 if logscale=FALSE and a_0(t) = exp(p_j) for y_j < t =< y_j+1 if logscale=TRUE.
#'
#' 3. with the "splines" function and nz nodes (y_1,...y_nz), nz+2 parameters
#' are necessary s_1,...s_nz+2 so that the baseline risk function a_0(t) = sum_j s_j^2 M_j(t)
#' if logscale=FALSE and a_0(t) = sum_j exp(s_j) M_j(t) if logscale=TRUE where M_j is the basis of cubic M-splines.
#' Two parametrizations of the baseline risk function are proposed (logscale=TRUE or FALSE)
#' because in some cases, especially when the instantaneous risks are very close to 0,
#' some convergence problems may appear with one parameterization or the other.
#' As a consequence, we recommend to try the alternative parameterization (changing logscale option)
#' when a model does not converge (maximum number of iterations reached) and
#' where convergence criteria based on the parameters and likelihood are small.
#'
#'
#' b. ASSOCIATION BETWEEN LONGITUDINAL AND SURVIVAL DATA
#'
#' The association between the longitudinal and the survival data is captured by including
#' a function of the elements from the latent process mixed model as a predictor in the survival model.
#' We implement two association structures,
#' that should be specified through \code{sharedtype} argument.
#'
#' 1. the random effect from the latent process linear mixed model (\code{sharedtype='RE'}) :
#' the q random effects modeling the individual deviation in the longitudinal model are also included
#' in the survival model, so that a q-vector of parameters measures the association
#' between the risk of event and the longitudinal outcome(s).
#'
#' 2. the predicted current level of the underlying process (\code{sharedtype='CL'}) :
#' the predicted latent process defined by the mixed model appears as
#' time-dependent covariate in the survival model.
#' The association between the longitudinal process and the risk of event
#' is then quantified by a unique parameter.
#'
#'
#' C. THE VECTOR OF PARAMETERS B
#'
#' The parameters in the vector of initial values \code{B} or in the vector of
#' maximum likelihood estimates \code{best} are included in the following
#' order:
#' (1) parameters for the baseline risk function: 2 parameters for each Weibull,
#' nz-1 for each piecewise constant risk and nz+2 for each splines risk. In the
#' presence of competing events, the number of parameters should be adapted to
#' the number of causes of event;
#' (2) for all covariates in survival, one parameter
#' is required. Covariates parameters should be included in the same order as in survival.
#' In the presence of cause-specific effects, the number of parameters should be
#' multiplied by the number of causes;
#' (3) parameter(s) of association between the longitudinal
#' and the survival process: for \code{sharedtype='RE'}, one parameter per random effect
#' and per cause of event is
#' required; for \code{sharedtype='CL'}, one parameter per cause of event is required;
#' (4) for all covariates in fixed, one parameter is required. Parameters should
#' be included in the same order as in fixed;
#' (5)for all covariates included with \code{contrast()} in \code{fixed}, one
#' supplementary parameter per outcome is required excepted for the last
#' outcome for which the parameter is not estimated but deduced from the others;
#' (6) the variance of each random-effect specified in random
#' (excepted the intercept which is constrained to 1)
#' if idiag=TRUE and the inferior triangular variance-covariance matrix of all
#' the random-effects if idiag=FALSE;
#' (7) if \code{cor} is specified, the standard error of the Brownian motion or
#' the standard error and the correlation parameter of the autoregressive process;
#' (8) parameters of each measurement model: 2 for "linear", 4 for "beta",
#' n+2 for "splines" with n nodes, n for "thresholds" for a (n+1)-level outcome;
#' (9) if \code{randomY=TRUE}, the standard
#' error of the outcome-specific random intercept (one per outcome);
#' (10) the outcome-specific standard errors (one per outcome)
#'
#'
#'
#' C. CAUTIONS REGARDING THE USE OF THE PROGRAM
#'
#' Some caution should be made when using the program. Convergence criteria are
#' very strict as they are based on the derivatives of the log-likelihood in
#' addition to the parameter and log-likelihood stability. In some cases, the
#' program may not converge and reach the maximum number of iterations fixed at
#' 100 by default. In this case, the user should check that parameter estimates at the
#' last iteration are not on the boundaries of the parameter space.
#'
#' If the parameters are on the boundaries of the parameter space, the
#' identifiability of the model is critical. This may happen especially with
#' splines parameters that may be too close to 0 (lower boundary). When
#' identifiability of some parameters is suspected, the program can be run
#' again from the former estimates by fixing the suspected parameters to their
#' value with option posfix. This usually solves the problem. An alternative is
#' to remove the parameters of the Beta of Splines link function from the
#' inverse of the Hessian with option partialH. If not, the program should be
#' run again with other initial values, with a higher maximum number of iterations
#' or less strict convergence tolerances.
#'
#' To reduce the computation time, this program can be carried out in parallel mode,
#' ie. using multiple cores which number can be specified with argument \code{nproc}.
#'
#'
#'
#' @param fixed a two-sided linear formula object for specifying the
#' fixed-effects in the linear mixed model at the latent process level. The
#' response outcomes are separated by \code{+} on the left of \code{~} and the
#' covariates are separated by \code{+} on the right of the \code{~}. For
#' identifiability purposes, the intercept specified by default should not be
#' removed by a \code{-1}. Variables on which a contrast above the different
#' outcomes should also be estimated are included with \code{contrast()}.
#' @param random a one-sided formula for the random-effects in the
#' latent process mixed model. Covariates with a random-effect are separated
#' by \code{+}. An intercept should always be included for identifiability.
#' @param subject name of the covariate representing the grouping structure.
#' @param idiag optional logical for the variance-covariance structure of the
#' random-effects. If \code{FALSE}, a non structured matrix of
#' variance-covariance is considered (by default). If \code{TRUE} a diagonal
#' matrix of variance-covariance is considered.
#' @param randomY optional logical for including an outcome-specific random
#' intercept. If \code{FALSE} no outcome-specific random intercept is added
#' (default). If \code{TRUE} independent outcome-specific random intercepts
#' with parameterized variance are included.
#' @param link optional vector of families of parameterized link functions defining
#' the measurement models (one by outcome). Option "linear" (by default) specifies a linear
#' link function. Other possibilities include "beta" for estimating a link
#' function from the family of Beta cumulative distribution functions, "Splines"
#' for approximating the link function by I-splines and "thresholds" for ordinal
#' outcomes modelled by cumulative probit models. For splines case, the number of
#' nodes and the nodes location should be also specified. The number of nodes is
#' first entered followed by \code{-}, then the location is specified with "equi",
#' "quant" or "manual" for respectively equidistant
#' nodes, nodes at quantiles of the marker distribution or interior nodes
#' entered manually in argument \code{intnodes}. It is followed by \code{-splines}.
#' For example, "7-equi-splines" means
#' I-splines with 7 equidistant nodes, "6-quant-splines" means I-splines with 6
#' nodes located at the quantiles of the marker distribution and
#' "9-manual-splines" means I-splines with 9 nodes, the vector of 7 interior
#' nodes being entered in the argument \code{intnodes}.
#' @param intnodes optional vector of interior nodes. This argument is only
#' required for a I-splines link function with nodes entered manually.
#' @param epsY optional positive real used to rescale the marker in
#' (0,1) when the beta link function is used. By default, epsY=0.5.
#' @param cor optional indicator for inclusion of an autocorrelated Gaussian
#' process in the linear mixed model at the latent process level. Option
#' \code{BM(time)} indicates a brownian motion with parameterized variance while
#' option \code{AR(time)} specifies an autoregressive process in continuous time
#' with parameterized variance and correlation intensity. In both cases, \code{time}
#' is the variable representing the measurement times. By default, no autocorrelated
#' Gaussian process is added.
#' @param survival two-sided formula object. The left side of the formula corresponds
#' to a Surv() object for right-censored (\code{Surv(EntryTime,Time,Indicator)})
#' and possibly left-truncated (\code{Surv(EntryTime,Time,Indicator)}).
#' Multiple causes of event can be considered
#' in the Indicator (0 for censored, k for event of cause k). The right side of the
#' formula specifies the covariates to include in the survival model.
#' Cause-specific covariate effect are specified with \code{cause()}.For example,
#' Surv(Time,Indicator) ~ X1 + cause(X2) indicates a common effect of X1 and a cause-specific effect of X2.
#' @param hazard optional family of hazard function assumed for the survival model.
#' By default, "Weibull" specifies a Weibull baseline risk function. Other possibilities
#' are "piecewise" for a piecewise constant risk function or "splines" for a cubic M-splines
#' baseline risk function. For these two latter families, the number of nodes and the
#' location of the nodes should be specified as well, separated by -. The number of
#' nodes is entered first followed by -, then the location is specified with "equi",
#' "quant" or "manual" for respectively equidistant nodes, nodes at quantiles of the
#' times of event distribution or interior nodes entered manually in argument hazardnodes.
#' It is followed by - and finally "piecewise" or "splines" indicates the family of
#' baseline risk function considered. Examples include "5-equi-splines" for M-splines
#' with 5 equidistant nodes, "6-quant-piecewise" for piecewise constant risk over 5
#' intervals and nodes defined at the quantiles of the times of events distribution
#' and "9-manual-splines" for M-splines risk function with 9 nodes, the vector of 7
#' interior nodes being entered in the argument hazardnodes. In the presence of competing
#' events, a vector of hazards should be provided such as hazard=c("Weibull","5-quant-splines")
#' with 2 causes of event, the first one modelled by a Weibull baseline cause-specific
#' risk function and the second one by splines.
#' @param hazardnodes optional vector containing interior nodes if splines or piecewise
#' is specified for the baseline hazard function in hazard.
#' @param TimeDepVar optional vector containing an intermediate time
#' corresponding to a change in the risk of event. This time-dependent
#' covariate can only take the form of a time variable with the assumption that
#' there is no effect on the risk before this time and a constant effect on the
#' risk of event after this time (example: initiation of a treatment to account
#' for).
#' @param logscale optional boolean indicating whether an exponential
#' (logscale=TRUE) or a square (logscale=FALSE -by default) transformation is
#' used to ensure positivity of parameters in the baseline risk functions. See
#' details section
#' @param startWeibull optional numeric with Weibull hazard functions only.
#' Indicates the shift in the Weibull distribution.
#' @param hazardrange optional vector indicating the range of the survival times
#' (that is the minimum and maximum). By default, the range is defined according
#' to the minimum and maximum observed values of the survival times. The option
#' should be used only for piecewise constant and Splines hazard functions.
#' @param data data frame containing all variables named in \code{fixed},
#' \code{random}, \code{cor}, \code{survival} and \code{subject}.
#' @param B optional specification for the initial values for the parameters.
#' Initial values should be entered in the order detailed in \code{details} section.
#' @param convB optional threshold for the convergence criterion based on the
#' parameter stability. By default, convB=0.0001.
#' @param convL optional threshold for the convergence criterion based on the
#' log-likelihood stability. By default, convL=0.0001.
#' @param convG optional threshold for the convergence criterion based on the
#' derivatives. By default, convG=0.0001.
#' @param maxiter optional maximum number of iterations for the Marquardt
#' iterative algorithm. By default, maxiter=100.
#' @param nsim number of points used to plot the estimated link functions. By
#' default, nsim=100.
#' @param range optional vector indicating the range of the outcomes (that is
#' the minimum and maximum). By default, the range is defined according to the
#' minimum and maximum observed values of the outcome. The option should be
#' used only for Beta and Splines transformations.
#' @param subset optional vector giving the subset of observations in
#' \code{data} to use. By default, all lines.
#' @param na.action Integer indicating how NAs are managed. The default is 1
#' for 'na.omit'. The alternative is 2 for 'na.fail'. Other options such as
#' 'na.pass' or 'na.exclude' are not implemented in the current version.
#' @param posfix Optional vector giving the indices in vector B of the
#' parameters that should not be estimated. Default to NULL, all parameters are
#' estimated.
#' @param partialH optional vector giving the indices in vector B of parameters that
#' can be dropped from the Hessian matrix to define convergence criteria.
#' @param verbose logical indicating if information about computation should be
#' reported. Default to TRUE.
#' @param returndata logical indicating if data used for computation should be
#' returned. Default to FALSE, data are not returned.
#' @param methInteg character indicating the type of integration to compute the
#' log-likelihood. 'MCO' for ordinary Monte Carlo, 'MCA' for antithetic Monte Carlo,
#' 'QMC' for quasi Monte Carlo. Default to "QMC".
#' @param nMC integer, number of Monte Carlo simulations. Default to 1000.
#' @param sharedtype indicator of shared random function type. \code{'RE'} indicates
#' an association through the random effects included in the linear mixed model.
#' \code{'CL'} defines a association through the predicted current level of the latent process.
#' @param var.time name of the variable representing the measurement times.
#' @param nproc number of cores for parallel computation.
#' @param clustertype the type of cluster that should internally be created.
#' See \code{parallel::makeCluster} for possible values.
#'
#' @return An object of class "jointLPM" is returned containing some internal information
#' used in related functions. Users may investigate the following elements :
#' \item{ns}{number of grouping units in the dataset}
#' \item{loglik}{log-likelihood of the model}
#' \item{best}{vector of parameter estimates in the same order as specified in
#' \code{B} and detailed in section \code{details}}
#' \item{V}{vector containing the upper triangle matrix of variance-covariance
#' estimates of \code{best} with exception for variance-covariance parameters
#' of the random-effects for which \code{V} contains the variance-covariance
#' estimates of the Cholesky transformed parameters displayed in \code{cholesky}}
#' \item{gconv}{vector of convergence criteria: 1. on the parameters, 2. on the
#' likelihood, 3. on the derivatives}
#' \item{conv}{status of convergence: =1 if the convergence criteria were satisfied,
#' =2 if the maximum number of iterations was reached, =4 or 5 if a problem occured
#' during optimisation}
#' \item{call}{the matched call}
#' \item{niter}{number of Marquardt iterations}
#' \item{nevent}{number of occured event}
#' \item{pred}{table of individual predictions and residuals in the underlying
#' latent process scale; it includes marginal predictions (pred_m), marginal
#' residuals (resid_m), subject-specific predictions (pred_ss) and
#' subject-specific residuals (resid_ss) and finally the transformed
#' observations in the latent process scale (obs).}
#' \item{predRE}{table containing individual predictions of the random-effects :
#' a column per random-effect, a line per subject.}
#' \item{predRE_Y}{table containing individual predictions of the outcome-specific
#' random intercept}
#' \item{predSurv}{table containing the predicted baseline risk function and
#' the predicted cumulative baseline risk function }
#' \item{cholesky}{vector containing the estimates of the Cholesky transformed
#' parameters of the variance-covariance matrix of the random-effects}
#' \item{estimlink}{table containing the simulated values of each outcome and
#' the corresponding estimated link function}
#' \item{epsY}{definite positive reals used to rescale the markers in (0,1) when
#' the beta link function is used. By default, epsY=0.5.}
#' \item{AIC}{the Akaike's information criterion}
#' \item{BIC}{the Bayesian information criterion}
#' \item{CPUtime}{the runtime in seconds}
#' \item{data}{the original data set (if returndata is TRUE)}
#' @author Viviane Philipps, Tiphaine Saulnier and Cecile Proust-Lima
#'
#' @references
#' Saulnier, Philipps, Meissner, Rascol, Pavy-Le-Traon, Foubert-Samier, Proust-Lima (2021).
#' Joint models for the longitudinal analysis of measurement scales in the presence
#' of informative dropout arXiv:2110.02612
#'
#' Philipps, Hejblum, Prague, Commenges, Proust-Lima (2021).
#' Robust and efficient optimization using a Marquardt-Levenberg algorithm with
#' R package marqLevAlg, The R Journal 13:2.
#'
#' @examples
#' #### Examples with paquid data from R-package lcmm
#' library(lcmm)
#' paq <- paquid[which(paquid$age_init<paquid$agedem),]
#' paq$age65 <- (paq$age-65)/10
#'
#' #### Example with one Gaussian marker :
#' ## We model the cognitive test IST according to age, sexe and eduction level. We assume
#' ## a Weibull distribution for the time to dementia and link the longitudinal and survival
#' ## data using the random effects.
#' ## We provide here the call to the jointLPM function without optimization (maxiter=0). The
#' ## results should therefore not be interpreted.
#' M0 <- jointLPM(fixed = IST~age65*(male+CEP),
#' random=~age65,
#' idiag=FALSE,
#' subject="ID",
#' link="linear",
#' survival=Surv(age_init,agedem,dem)~male,
#' sharedtype='RE',
#' hazard="Weibull",
#' data=paq,
#' var.time="age65",
#' maxiter=0)
#' M0$best ## these are the initial values of each of the 15 parameters
#'
#' \donttest{
#' ## Estimation with one Gaussian marker
#' ## We remove the maxiter=0 option to estimate the model. We specify initial values
#' ## to reduce the runtime, but this can take several minutes.
#' binit1 <- c(0.1039, 5.306, -0.1887, -1.0355, -4.3817, -1.0543, -0.1161, 0.8588,
#' 0.0538, -0.1722, -0.2224, 0.3296, 30.7768, 4.6169, 0.7396)
#' M1 <- jointLPM(fixed = IST~age65*(male+CEP),
#' random=~age65,
#' idiag=FALSE,
#' subject="ID",
#' link="linear",
#' survival=Surv(age_init,agedem,dem)~male,
#' sharedtype='RE',
#' hazard="Weibull",
#' data=paq,
#' var.time="age65",
#' B=binit1)
#' ## Optimized the parameters to be interpreted :
#' summary(M1)
#'
#'
#' #### Estimation with one ordinal marker :
#' ## We consider here the 4-level hierarchical scale of dependence HIER and use "thresholds"
#' ## to model it as an ordinal outcome. We assume an association between the current level
#' ## of dependency and the risk of dementia through the option sharedtype="CL".
#' ## We use a parallel optimization on 2 cores to reduce computation time.
#' binit2 <- c(0.0821, 2.4492, 0.1223, 1.7864, 0.0799, -0.2864, 0.0055, -0.0327, 0.0017,
#' 0.3313, 0.9763, 0.9918, -0.4402)
#' M2 <- jointLPM(fixed = HIER~I(age-65)*male,
#' random = ~I(age-65),
#' subject = "ID",
#' link = "thresholds",
#' survival = Surv(age_init,agedem,dem)~male,
#' sharedtype = 'CL',
#' var.time = "age",
#' data = paq,
#' methInteg = "QMC",
#' nMC = 1000,
#' B=binit2,
#' nproc=2)
#' summary(M2)
#'
#' }
#'
#' @export
#'
jointLPM <- function(fixed,random,subject,idiag=FALSE,cor=NULL,link="linear",intnodes=NULL,epsY=0.5,randomY=FALSE, var.time,
survival=NULL,hazard="Weibull",hazardrange=NULL,hazardnodes=NULL,TimeDepVar=NULL,logscale=FALSE,startWeibull=0, sharedtype='RE',
methInteg="QMC",nMC=1000,data,subset=NULL,na.action=1,
B,posfix=NULL,maxiter=100,convB=0.0001,convL=0.0001,convG=0.0001,partialH=NULL,
nsim=100,range=NULL,verbose=TRUE,returndata=FALSE,
nproc=1, clustertype=NULL)
{
ptm <- proc.time()
cl <- match.call()
nom.subject <- as.character(subject)
if(missing(random)) stop("At least one random effect is required")
if(random==~-1) stop("At least one random effect is required")
if(missing(fixed)) stop("The argument fixed must be specified in any model")
if(!inherits(fixed,"formula")) stop("The argument fixed must be a formula")
if(!inherits(random,"formula")) stop("The argument random must be a formula")
if(missing(data)){ stop("The argument data should be specified and defined as a data.frame")}
if(nrow(data)==0) stop("Data should not be empty")
if(missing(subject)){ stop("The argument subject must be specified")}
if(!is.numeric(data[,subject])) stop("The argument subject must be numeric")
if(all(link %in% c("linear","beta","thresholds")) & !is.null(intnodes)) stop("Intnodes should only be specified with splines links")
if(!(na.action%in%c(1,2)))stop("only 1 for 'na.omit' or 2 for 'na.fail' are required in na.action argument")
if(!(sharedtype%in%c('RE','CL')))stop("The value of argument sharedtype must be 'RE' (for shared random effects) or 'CL' (for shared latent process current level)")
if(missing(var.time) | length(var.time)!=1)stop("The argument var.time is missing or is not of length 1")
if(!(var.time %in% colnames(data))) stop("Unable to find variable 'var.time' in 'data'")
if(sharedtype == 'CL' & missing(cor)==FALSE) print("WARNING : wi not computed on current level prediction 'cause considered not shared")
if(sharedtype == 'CL' & missing(TimeDepVar)==FALSE) stop("model with sharedtype='CL' not yet programmed with time dependent effect on survival")
# if(length(posfix) & missing(B)) stop("A set of initial parameters must be specified if some parameters are not estimated")
## garder data tel quel pour le renvoyer
if(returndata==TRUE)
{
datareturn <- data
}
else
{
datareturn <- NULL
}
## test de l'argument cor
ncor <- 0
cor.type <- cl$cor[1]
cor.time <- cl$cor[2]
cor <- paste(cor.type,cor.time,sep="-")
if (!isTRUE(all.equal(cor,character(0))))
{
if (substr(cor,1,2)=="AR") { ncor <- 2 }
else if (substr(cor,1,2)=="BM") { ncor <- 1 }
else { stop("The argument cor must be of type AR or BM") }
if(!(strsplit(cor,"-")[[1]][2] %in% colnames(data))) stop("Unable to find time variable from argument 'cor' in 'data'")
else { cor.var.time <- strsplit(cor,"-")[[1]][2] }
}
##pour acces aux attributs des formules
afixed <- terms(fixed, specials=c("factor","contrast"))
if(attr(afixed,"intercept")==0) stop("An intercept should appear in fixed for identifiability purposes")
arandom <- terms(random, specials=c("factor"))
##fixed sans contrast
fixed2 <- gsub("contrast","",fixed)
fixed2 <- formula(paste(fixed2[2],fixed2[3],sep="~"))
afixed2 <- terms(fixed2)
##verifier si toutes les varialbes sont dans data
variables <- c(attr(afixed,"variables"),attr(arandom,"variables"))
variables <- unlist(lapply(variables,all.vars))
if(!all(variables %in% colnames(data))) stop(paste("Data should contain the variables",paste(unique(variables),collapse=" ")))
##contrast
contr <- ~-1
if(!is.null(attr(afixed,"specials")$contrast))
{
vcontr <- attr(afixed,"term.labels")[setdiff(attr(afixed,"specials")$contrast-1,untangle.specials(afixed,"contrast",2)$terms)]
vcontr <- gsub("contrast","",vcontr)
contr <- as.formula(paste("~-1+",paste(vcontr,collapse="+")))
}
acontr <- terms(contr)
##liste des outcomes
nomsY <- as.character(attr(afixed,"variables")[2])
nomsY <- strsplit(nomsY,split=" + ",fixed=TRUE)
nomsY <- as.vector(nomsY[[1]])
ny <- length(nomsY)
##pas de contrast ni randomY si un seul Y
if(ny<2 & length(attr(afixed,"specials")$contrast)) stop("No contrast can be included with less than two outcomes")
if(ny<2 & randomY==TRUE) stop("With less than 2 outcomes randomY should be FALSE")
##liste des variables utilisees (sans les interactions et sans les Y)
ttesLesVar <- colnames(get_all_vars(afixed,data=data[1,]))
ttesLesVar <- c(ttesLesVar, colnames(get_all_vars(arandom,data=data[1,])))
if (ncor>0) ttesLesVar <- unique(c(ttesLesVar,cor.var.time))
else ttesLesVar <- unique(ttesLesVar)
ttesLesVar <- setdiff(ttesLesVar, nomsY)
## argument subset
form1 <- paste(c(nom.subject,nomsY,ttesLesVar),collapse="+")
if(!isTRUE(all.equal(as.character(cl$subset),character(0))))
{
cc <- cl
cc <- cc[c(1,which(names(cl)=="subset"))]
cc[[1]] <- as.name("model.frame")
cc$formula <- formula(paste("~",form1))
cc$data <- data
cc$na.action <- na.pass
data <- eval(cc)
}
attributes(data)$terms <- NULL
## si subject est un factor
if(is.factor(data[,nom.subject]))
{
data[,nom.subject] <- as.numeric(data[,nom.subject])
}
## partie survie
if(is.null(survival))
{
nbevt <- 0
idtrunc <- 0
nprisq <- 0
nrisqtot <- 0
nvarxevt <- 0
nvarxevt2 <- 0
typrisq <- 0
noms.surv <- NULL
nom.timedepvar <- NULL
form.commun <- ~-1
form.cause <- ~-1
survival <- ~-1
nz <- 0
zi <- 0
minT <- 0
maxT <- 0
}
else
{
## objet Surv
surv <- cl$survival[[2]]
if(length(surv)==3) #censure droite sans troncature gauche
{
idtrunc <- 0
Tevent <- getElement(object=data,name=as.character(surv[2]))
Event <- getElement(object=data,name=as.character(surv[3]))
Tentry <- rep(0,length(Tevent)) #si pas de troncature, Tentry=0
noms.surv <- c(as.character(surv[2]),as.character(surv[3]))
surv <- do.call("Surv",list(time=Tevent,event=Event,type="mstate"))
}
if(length(surv)==4) #censure droite et troncature
{
idtrunc <- 1
Tentry <- getElement(object=data,name=as.character(surv[2]))
Tevent <- getElement(object=data,name=as.character(surv[3]))
Event <- getElement(object=data,name=as.character(surv[4]))
noms.surv <- c(as.character(surv[2]),as.character(surv[3]),as.character(surv[4]))
surv <- do.call("Surv",list(time=Tentry,time2=Tevent,event=Event,type="mstate"))
}
## nombre d'evenement concurrents
nbevt <- length(attr(surv,"states"))
if(nbevt<1) stop("No observed event in the data")
## pour la formule pour survivial, creer 3 formules :
## une pour les covariables en mixture, une pour les covariables avec effet specifique a la cause, et une pour les effets communs.
form.surv <- cl$survival[3]
noms.form.surv <- all.vars(attr(terms(formula(paste("~",form.surv))),"variables"))
if(length(noms.form.surv)==0)
{
form.cause <- ~-1
form.commun <- ~-1
asurv <- terms(~-1)
}
else
{
##creer la formula pour cause
form1 <- gsub("mixture","",form.surv) #TS: pas de classes latentes
form1 <- formula(paste("~",form1))
asurv1 <- terms(form1,specials="cause")
ind.cause <- attr(asurv1,"specials")$cause
if(length(ind.cause))
{
form.cause <- paste(labels(asurv1)[ind.cause],collapse="+")
form.cause <- gsub("cause","",form.cause)
form.cause <- formula(paste("~",form.cause))
}
else
{
form.cause <- ~-1
}
## creer la formule pour ni cause ni mixture
asurv <- terms(formula(paste("~",form.surv)),specials=c("cause"))
ind.commun <- setdiff(1:length(labels(asurv)),unlist(attr(asurv,"specials")))
if(length(ind.commun))
{
form.commun <- paste(labels(asurv)[ind.commun],collapse="+")
form.commun <- gsub("cause","",form.commun) # si X1:cause(X2)
form.commun <- formula(paste("~",form.commun))
}
else
{
form.commun <- ~-1
}
}
}
nom.timedepvar <- NULL
if(!missing(TimeDepVar))
{
if(!is.null(TimeDepVar))
{
nom.timedepvar <- as.character(TimeDepVar)
if(!isTRUE(nom.timedepvar %in% colnames(data))) stop(paste("Data should contain variable",nom.timedepvar))
}
}
##verifier si toutes les variables sont dans data
varSurv <- unique(all.vars(terms(survival)))
if(!is.null(nom.timedepvar)){if(!(nom.timedepvar %in% all.vars(terms(survival)))) stop("Variable in 'TimeDepVar' should also appear as a covariate in the 'survival' argument")}
if(!all(varSurv %in% colnames(data))) stop(paste("Data should contain the variables",paste(varSurv,collapse=" ")))
ttesLesVar <- unique(c(ttesLesVar,varSurv))
##subset de data avec les variables utilisees
newdata <- data[,c(nom.subject,nomsY,noms.surv,ttesLesVar),drop=FALSE]
## remplacer les NA de TimeDepVar par Tevent
Tint <- Tevent
nvdepsurv <- 0
if(!is.null(nom.timedepvar))
{
Tint <- newdata[,nom.timedepvar]
Tint[(is.na(Tint))] <- Tevent[(is.na(Tint))]
Tint[Tint>Tevent] <- Tevent[Tint>Tevent]
Tint[Tint<Tentry] <- Tentry[Tint<Tentry]
nvdepsurv <- 1
if (length(Tint[Tint<Tevent])==0)
{
stop("TimeDepVar is always greater than Time of Event. \n")
nvdepsurv <- 0
}
if (length(Tint[Tint>Tentry])==0)
{
Tint <- Tevent
stop("TimeDepVar is always lower than Time of Entry (0 by default). \n")
nvdepsurv <- 0
}
newdata[,nom.timedepvar] <- Tint
}
dataSurv <- NULL
if((nbevt>0))
{
dataSurv <- data.frame(getElement(object=data,name=nom.subject),Tentry,Tevent,Event,Tint)
}
##un data frame par outcome et creation Y0
dataY <- paste("data",nomsY,sep=".")
Y0 <- NULL
IND <- NULL
outcome <- NULL
data0 <- NULL
nayk <- vector("list",ny)
for (k in 1:ny)
{
dtemp <- newdata[,c(nom.subject,nomsY[k],noms.surv,ttesLesVar)]
##enlever les NA
linesNA <- apply(dtemp,2,function(v) which(is.na(v)))
linesNA <- unique(unlist(linesNA))
if(length(linesNA)) nayk[[k]] <- linesNA
if(na.action==1 & length(linesNA)>0) dtemp <- dtemp[-linesNA,]
if(na.action==2 & length(linesNA)>0) stop("Data contains missing values")
assign(dataY[k],dtemp)
Y0 <- c(Y0, dtemp[,nomsY[k]])
IND <- c(IND, dtemp[,nom.subject])
outcome <- c(outcome,rep(nomsY[k],nrow(dtemp)))
data0 <- rbind(data0, dtemp[,setdiff(colnames(dtemp),nomsY[k]),drop=FALSE]) #dataset sans NA avec les covariables utilisees; obs ordonnees par outcome
}
##creation de X0 (ttes les var + interactions)
Xfixed <- model.matrix(fixed2[-2], data=data0)
Xrandom <- model.matrix(random, data=data0)
Xcontr <- model.matrix(contr,data=data0)
Xsurv <- model.matrix(form.commun,data=data0)
Xsurvcause <- model.matrix(form.cause,data=data0)
z.fixed <- strsplit(colnames(Xfixed),split=":",fixed=TRUE)
z.fixed <- lapply(z.fixed,sort)
z.random <- strsplit(colnames(Xrandom),split=":",fixed=TRUE)
z.random <- lapply(z.random,sort)
if(contr != ~-1)
{
z.contr <- strsplit(colnames(Xcontr),split=":",fixed=TRUE)
z.contr <- lapply(z.contr,sort)
}
else
{
z.contr <- list()
}
if(form.commun != ~-1)
{
z.surv <- strsplit(colnames(Xsurv),split=":",fixed=TRUE)
z.surv <- lapply(z.surv,sort)
}
else
{
z.surv <- list()
}
if(form.cause != ~-1)
{
z.survcause <- strsplit(colnames(Xsurvcause),split=":",fixed=TRUE)
z.survcause <- lapply(z.survcause,sort)
}
else
{
z.survcause <- list()
}
if(!all(z.contr %in% z.fixed)) stop("The covariates in contrast should also appear in fixed")
X0 <- cbind(Xfixed, Xrandom, Xsurv, Xsurvcause)
nom.unique <- unique(colnames(X0))
X0 <- X0[,nom.unique,drop=FALSE]
form.cor <- ~-1
if (ncor>0)
{
if(!(cor.var.time %in% colnames(X0)))
{
X0 <- cbind(X0, data0[,cor.var.time])
colnames(X0) <- c(nom.unique, cor.var.time)
nom.unique <- c(nom.unique,cor.var.time)
form.cor <- formula(paste("~-1+",cor.var.time))
}
}
X0 <- as.matrix(X0)
##X0 fini
##test de link
if (length(link)!=1 & length(link)!=ny) stop("One link per outcome should be specified")
if(any(link %in% c("splines","Splines")))
{
link[which(link %in% c("splines","Splines"))] <- "5-quant-splines"
}
if(length(link)==1 & ny>1)
{
link <- rep(link, ny)
}
idlink <- rep(2,ny)
idlink[which(link=="linear")] <- 0
idlink[which(link=="beta")] <- 1
idlink[which(link=="thresholds")] <- 3
spl <- strsplit(link[which(idlink==2)],"-")
if(any(sapply(spl,length)!=3)) stop("Invalid argument 'link'")
nySPL <- length(spl)
nybeta <- sum(idlink==1)
nyORD <- sum(idlink==3)
##remplir range si pas specifie
if(!is.null(range) & length(range)!=2*(nySPL+nybeta)) stop("Length of vector range is not correct.")
if((length(range)==2*(nySPL+nybeta)) & (nySPL+nybeta>0))
{
ind12 <- which(idlink==1 | idlink==2)
for (i in 1:(nySPL+nybeta))
{
rg <- range(get(dataY[ind12[i]])[,nomsY[ind12[i]]])
if(rg[1]<range[2*(i-1)+1] | rg[2]>range[2*(i-1)+2]) stop("The range specified do not cover the entire range of the data")
}
}
if((is.null(range) & (nybeta+nySPL)>0) | length(range)!=2*(nySPL+nybeta))
{
range <- NULL
for(k in which(idlink!=0))
{
min1 <- min(get(dataY[k])[,nomsY[k]])
min2 <- round(min1,3)
if(min1<min2) min2 <- min2-0.001
max1 <- max(get(dataY[k])[,nomsY[k]])
max2 <- round(max1,3)
if(max1>max2) max2 <- max2+0.001
range <- c(range, min2, max2)
}
}
## epsY
if (any(idlink==1))
{
if (any(epsY<=0))
{
stop("Argument 'epsY' should be positive.")
}
if(length(epsY)==1) epsY <- rep(epsY,nybeta)
if(length(epsY)!=nybeta) stop(paste("Argument 'epsY' should be of length",nybeta))
if(nybeta!=ny)
{
epsY2 <- rep(0,ny)
epsY2[which(idlink==1)] <- epsY
epsY <- epsY2
}
}
nbzitr <- rep(2,ny) #nbzitr = nb de noeuds si splines, 2 sinon
nbnodes <- NULL #que pour les splines
spltype <- NULL
if(nySPL>0)
{
for (i in 1:nySPL)
{
nbnodes <- c(nbnodes, spl[[i]][1])
spltype <- c(spltype, spl[[i]][2])
if(spl[[i]][3] != "splines") stop("Invalid argument link")
}
}
nbnodes <- as.numeric(nbnodes)
nbzitr[which(idlink==2)] <- nbnodes
##test splines
if(!(all(spltype %in% c("equi","quant","manual")))) stop("The location of the nodes should be 'equi', 'quant' or 'manual'")
##tester longueur de intnodes
if(!is.null(intnodes))
{
if(length(intnodes) != sum(nbnodes[which(spltype=="manual")]-2)) stop(paste("Vector intnodes should be of length",sum(nbnodes[which(spltype=="manual")]-2)))
}
##intnodes2 : contient tous les noeuds interieurs (pas seulement ceux de manual)
intnodes2 <- rep(NA,sum(nbnodes-2))
nb <- 0
nbspl <- 0
for (k in 1:ny)
{
if (idlink[k]!=2) next
else
{
nbspl <- nbspl+1
if(spltype[nbspl]=="manual")
{
nodes <- intnodes[(nb+1):(nb+nbnodes[nbspl]-2)]
if(!length(nodes)) stop("The length of intnodes is not correct")
intnodes2[(sum(nbnodes[1:nbspl]-2)-(nbnodes[nbspl]-2)+1):sum(nbnodes[1:nbspl]-2)] <- nodes
nb <- nb+nbnodes[nbspl]-2
idrg <- length(which(idlink[1:k] != 0))
if(any(nodes <= range[2*(idrg-1)+1]) | any(nodes >= range[2*idrg])) stop("Interior nodes must be in the range of the outcome")
}
if(spltype[nbspl]=="equi")
{
nodes <- seq(range[2*(nbspl-1)+1], range[2*nbspl], length.out=nbnodes[nbspl])
nodes <- nodes[-nbnodes[nbspl]]
nodes <- nodes[-1]
intnodes2[(sum(nbnodes[1:nbspl]-2)-(nbnodes[nbspl]-2)+1):sum(nbnodes[1:nbspl]-2)] <- nodes
}
if(spltype[nbspl]=="quant")
{
nodes <- quantile(get(dataY[k])[,nomsY[k]], probs=seq(0,1,length.out=nbnodes[nbspl]))
if(length(unique(nodes)) != length(nodes)) stop(paste("Some nodes are equal for link number",k,"; Please try to reduce the number of nodes or use manual location."))
nodes <- nodes[-nbnodes[nbspl]]
nodes <- nodes[-1]
intnodes2[(sum(nbnodes[1:nbspl]-2)-(nbnodes[nbspl]-2)+1):sum(nbnodes[1:nbspl]-2)] <- as.vector(nodes)
}
}
}
if(nb != length(intnodes)) stop(paste("The vector intnodes should be of length",nb))
##remplir zitr
m <- 0
if(nySPL>0) m <- max(nbnodes)
zitr <- matrix(0,max(m,2),ny)
nb12 <- 0
nbspl <- 0
for (k in 1:ny)
{
if((idlink[k]==0) | (idlink[k]==3)) zitr[1:2,k] <- c(min(get(dataY[k])[,nomsY[k]]),max(get(dataY[k])[,nomsY[k]]))
if(idlink[k]==1)
{
nb12 <- nb12 + 1
zitr[1:2,k] <- range[2*(nb12-1)+1:2]
}
if(idlink[k]==2)
{
nb12 <- nb12+1
nbspl <- nbspl+1
zitr[2:(nbzitr[k]-1),k] <- intnodes2[ifelse(nbspl==1,0,1)*sum(nbnodes[1:(nbspl-1)]-2) + 1:(nbnodes[nbspl]-2)]
zitr[1,k] <- range[2*(nb12-1)+1]
zitr[nbnodes[nbspl],k] <- range[2*nb12]
##verifier s'il y a des obs entre les noeuds
hcounts <- hist(get(dataY[k])[,nomsY[k]],breaks=zitr[1:nbnodes[nbspl],k],plot=FALSE,include.lowest=TRUE,right=TRUE)$counts
if(any(hcounts==0)) stop(paste("Link function number",k,"can not be estimated. Please try other nodes such that there are observations in each interval."))
}
}
##uniqueY0 et indiceY0
uniqueY0 <- NULL
indiceY0 <- NULL
nvalSPLORD <- rep(0,ny)
nbmod <- rep(0,ny)
modalites <- vector("list",ny)
nb <- 0
for (k in 1:ny)
{
if((idlink[k]!=2) & (idlink[k]!=3))
{
indiceY0 <- c(indiceY0, rep(0,length(get(dataY[k])[,nomsY[k]])))
next
}
yk <- get(dataY[k])[,nomsY[k]]
uniqueTemp <- sort(unique(yk))
permut <- order(order(yk)) # sort(y)[order(order(y))] = y
if(length(as.vector(table(yk)))==length(uniqueTemp))
{
indice <- rep(1:length(uniqueTemp), as.vector(table(yk)))
if(idlink[k]==2)
{
indiceTemp <- nb + indice[permut]
}
else
{
indiceTemp <- indice[permut]
}
nb <- nb + length(uniqueTemp)
uniqueY0 <- c(uniqueY0, uniqueTemp)
indiceY0 <- c(indiceY0, indiceTemp)
nvalSPLORD[k] <- length(uniqueTemp)
}
else
{
uniqueY0 <- c(uniqueY0, yk)
indiceY0 <- c(indiceY0, ifelse(idlink[k]==2,nb,0)+c(1:length(yk)))
nb <- nb + length(yk)
nvalSPLORD[k] <- length(yk)
}
if(idlink[k]==3)
{
nbmod[k] <- length(na.omit(uniqueTemp))
modalites[[k]] <- uniqueTemp
}
}
##ordonner les mesures par individu
matYX <- cbind(IND,Y0,indiceY0,outcome,X0)
matYXord <- matYX[order(IND),]
Y0 <- as.numeric(matYXord[,2])
X0 <- apply(matYXord[,-c(1:4),drop=FALSE],2,as.numeric)
IND <- matYXord[,1]
outcome <- matYXord[,4]
indiceY0 <- as.numeric(matYXord[,3])
dataSurv <- dataSurv[which(dataSurv[,1] %in% IND),]
dataSurv <- dataSurv[order(dataSurv[,1]),]
nmes <- as.vector(table(dataSurv[,1]))
data.surv <- apply(dataSurv[cumsum(nmes),-1],2,as.numeric)
tsurv0 <- data.surv[,1]
tsurv <- data.surv[,2]
devt <- data.surv[,3]
tsurvint <- data.surv[,4]
ind_survint <- (tsurvint<tsurv) + 0
## test de hazard
arghaz <- hazard
hazard <- rep(hazard,length.out=nbevt)
if(any(hazard %in% c("splines","Splines")))
{
hazard[which(hazard %in% c("splines","Splines"))] <- "5-quant-splines"
}
if(any(hazard %in% c("piecewise","Piecewise")))
{
hazard[which(hazard %in% c("piecewise","Piecewise"))] <- "5-quant-piecewise"
}
haz13 <- strsplit(hazard[which(!(hazard=="Weibull"))],"-")
if(any(sapply(haz13,length)!=3)) stop("Invalid argument hazard")
nz <- rep(2,nbevt)
locnodes <- NULL
typrisq <- rep(2,nbevt)
nprisq <- rep(2,nbevt)
nznodes <- 0 #longueur de hazardnodes
ii <- 0
if(any(hazard!="Weibull"))
{
for (i in 1:nbevt)
{
if(hazard[i]=="Weibull") next;
ii <- ii+1
nz[i] <- as.numeric(haz13[[ii]][1])
if(nz[i]<3) stop("At least 3 nodes are required")
typrisq[i] <- ifelse(haz13[[ii]][3] %in% c("splines","Splines"),3,1)
nprisq[i] <- ifelse(haz13[[ii]][3] %in% c("splines","Splines"),nz[i]+2,nz[i]-1)
locnodes <- c(locnodes, haz13[[ii]][2])
if(!(haz13[[ii]][3] %in% c("splines","Splines","piecewise","Piecewise"))) stop("Invalid argument hazard")
if((haz13[[ii]][2]=="manual"))
{
nznodes <- nznodes + nz[i]-2
}
if(!all(locnodes %in% c("equi","quant","manual"))) stop("The location of the nodes should be 'equi', 'quant' or 'manual'")
}
if(!is.null(hazardnodes))
{
if(!any(locnodes == "manual")) stop("hazardnodes should be NULL if the nodes are not chosen manually")
if(length(hazardnodes) != nznodes) stop(paste("Vector hazardnodes should be of length",nznodes))
}
}
else
{
if(!is.null(hazardnodes)) stop("hazardnodes should be NULL if Weibull baseline risk functions are chosen")
}
if(nbevt>1 & length(arghaz)==1 & nznodes>0)
{
hazardnodes <- rep(hazardnodes,length.out=nznodes*nbevt)
}
nrisqtot <- sum(nprisq)
zi <- matrix(0,nrow=max(nz),ncol=nbevt)
nb <- 0
minT1 <- 0
maxT1 <- max(tsurv)
tsurvevt <- tsurv
if(idtrunc==1)
{
minT1 <- min(tsurv,tsurv0)
maxT1 <- max(tsurv,tsurv0)
}
## arrondir
minT2 <- round(minT1,3)
if(minT1<minT2) minT2 <- minT2-0.001
minT <- minT2
maxT2 <- round(maxT1,3)
if(maxT1>maxT2) maxT2 <- maxT2+0.001
maxT <- maxT2
if(length(hazardrange)){
if(hazardrange[1]>minT) stop(paste("hazardrange[1] should be <=",minT))
if(hazardrange[2]>maxT) stop(paste("hazardrange[2] should be >=",maxT))
minT <- hazardrange[1]
maxT <- hazardrange[2]
}
startWeib <- rep(0,nbevt)
startWeib[which(typrisq==2)] <- rep(startWeibull, length.out=length(which(typrisq==2)))
ii <- 0
for(i in 1:nbevt)
{
if(typrisq[i]==2)
{
if(minT < startWeib[i]) stop("Some entry or event times are bellow startWeibull")
zi[1:2,i] <- c(startWeib[i],maxT)
}
else
{
ii <- ii+1
if(locnodes[ii]=="manual")
{
zi[1:nz[i],i] <- c(minT,hazardnodes[nb+1:(nz[i]-2)],maxT)
nb <- nb + nz[i]-2
}
if(locnodes[ii]=="equi")
{
zi[1:nz[i],i] <- seq(minT,maxT,length.out=nz[i])
}
if(locnodes[ii]=="quant")
{
pi <- c(1:(nz[i]-2))/(nz[i]-1)
qi <- quantile(tsurvevt,prob=pi)
zi[1,i] <- minT
zi[2:(nz[i]-1),i] <- qi
zi[nz[i],i] <- maxT
}
}
}
###TS: cas ou gi(bi,t)=niv.courant -> construction des matrices design pr predire lambda a chq pnt de quadrature GK
if(sharedtype == 'RE'){ #gi(bi,t)=bi
Xpred <- 0
Xpred_Ti <- 0
nbXpred <- 0
}
if(sharedtype == 'CL'){ #gi(bi,t)=niv.courant(t)
# /!\ si varexp dependante du tps (autre que var.time), prediction impossible
nom.var <- c(attr(terms(fixed2[-2]), "term.labels"),attr(terms(random), "term.labels")) # noms des covariables EF et EA
nom.var <- nom.var[!str_detect(nom.var,var.time)] # nom.var[!(nom.var == var.time)] # noms des variables, autre que celle incluant var.time
if(length(nom.var)>0){
for(v in 1:length(nom.var)){ #verif: covariables (hors var.time) independantes du temps
tmp <- unique(na.omit(data[,c(subject,nom.var[v])])) #dataframe 2 colonnes : subject et var v, en supprimant les lignes doublons
if(nrow(tmp) != length(unique(IND))) #var v dependante du temps
stop(paste(nom.var[v]," variable seems to be time dependant, can't use sharedtype='CL' due to impossibility to predict"))
}
}
## matrices design
# database contenant les variables des EFs et des EAs, 1 ligne par sujet
data_tmp <- as.data.frame(unique(na.omit(data[,c(subject,nom.var)])))
colnames(data_tmp) <- c(subject,nom.var)
data_tmp$tmp_name <- NA
colnames(data_tmp)[which(colnames(data_tmp) == "tmp_name")] <- var.time
if(nrow(data_tmp)!=length(unique(IND))) stop("Make sure at least 1 visit per subject has complete data") # si un sujet a un valeur NA a 1 covar a chq visit, alors il n'est plus pris en compte
if(nrow(data_tmp)!=length(tsurv)) stop("pblm")
# Xpredcl_vectTi (Ti event) vecteur par patient
data_tmp[,var.time] <- tsurv # remplacer la variable de temps par T
#matrices design
mat_ef <- model.matrix(object = fixed2[-2], data = data_tmp) # EF sans contraste et sans outcome,
mat_ef <- as.data.frame(mat_ef[,-1]) # et sans intercept car non-estime
mat_ea <- model.matrix(object = random, data = data_tmp) # EA
# concatenation
Xpredcl_vectTi <- cbind(data_tmp[,var.time],mat_ef,mat_ea) # 1ere colonne = tps de quadrature
#nb_ind_Xpred <- c(ncol(mat_ef),ncol(mat_ea)) #nb de var des EF, EA
Xpredcl_vectTi <- as.matrix(Xpredcl_vectTi)
Xpred_Ti <- Xpredcl_vectTi
nbXpred <- ncol(Xpred_Ti)
# noeuds de quadrature Gauss-Kronrod
ptGK_1 <- 0.991455371120812639206854697526329
ptGK_2 <- 0.949107912342758524526189684047851
ptGK_3 <- 0.864864423359769072789712788640926
ptGK_4 <- 0.741531185599394439863864773280788
ptGK_5 <- 0.586087235467691130294144838258730
ptGK_6 <- 0.405845151377397166906606412076961
ptGK_7 <- 0.207784955007898467600689403773245
ptGK_8 <- 0.000000000000000000000000000000000
# Xpredcl (Ti event)
data_tmp[,var.time] <- tsurv # remplacer la variable de temps par T
data_tmp[,var.time] <- (data_tmp[,var.time] - 0) / 2 #centr = centre de l intervalle 0 -> Ti
data_tmp$mid_interv_surv <- data_tmp[,var.time] #hlgth = demi longueur de l intervalle 0 -> Ti
data_tmp <- data_tmp[rep(1:nrow(data_tmp),each=15),] #1ligne/point quadrature/patient
data_tmp$ptGK <- c(ptGK_1,-ptGK_1,ptGK_2,-ptGK_2,ptGK_3,-ptGK_3,ptGK_4,-ptGK_4,ptGK_5,-ptGK_5,ptGK_6,-ptGK_6,ptGK_7,-ptGK_7,ptGK_8) #pnts de quadrature GK
data_tmp[,var.time] <- data_tmp[,var.time] + data_tmp$mid_interv_surv * data_tmp$ptGK #centr+absc ou absc=hlgth*pnt
#matrices design
mat_ef <- model.matrix(object = fixed2[-2], data = data_tmp) # EF sans contraste et sans outcome,
mat_ef <- as.data.frame(mat_ef[,-1]) # et sans intercept car non-estime
mat_ea <- model.matrix(object = random, data = data_tmp) # EA
# concatenation
Xpredcl <- cbind(data_tmp[,var.time],mat_ef,mat_ea) # 1ere colonne = tps de quadrature
Xpredcl <- as.matrix(Xpredcl)
Xpred <- Xpredcl
# Xpredcl0 (Ti0, tronc gche)
if(idtrunc==1){
data_tmp[,var.time] <- rep(tsurv0,each=15) # remplacer la variable de temps par T0
data_tmp[,var.time] <- (data_tmp[,var.time] - 0) / 2 #centr = centre de l intervalle 0 -> T0i
data_tmp$mid_interv_surv <- data_tmp[,var.time] #hlgth = demi longueur de l intervalle 0 -> T0i
data_tmp[,var.time] <- data_tmp[,var.time] + data_tmp$mid_interv_surv * data_tmp$ptGK #centr+absc ou absc=hlgth*pnt
#matrices design
mat_ef <- model.matrix(object = fixed2[-2], data = data_tmp) # EF sans contraste et sans outcome,
mat_ef <- as.data.frame(mat_ef[,-1]) # et sans intercept car non-estime
mat_ea <- model.matrix(object = random, data = data_tmp) # EA
# concatenation
Xpredcl0 <- cbind(data_tmp[,var.time],mat_ef,mat_ea)
Xpredcl0 <- as.matrix(Xpredcl0)
Xpred <- cbind(Xpred,Xpredcl0)
}else{
Xpredcl0 <- NULL
}
nbXpred <- c(nbXpred, ncol(Xpred))
}
##parametres pour Fortran
ns <- length(unique(IND))
nv <- dim(X0)[2]
nobs <- length(Y0)
idiag0 <- ifelse(idiag==TRUE,1,0)
nalea <- ifelse(randomY==TRUE,ny,0)
logspecif <- as.numeric(logscale)
#loglik <- 0
ni <- 0
istop <- 0
gconv <- rep(0,3)
resid_m <- rep(0,nobs)
resid_ss <- rep(0,nobs)
Yobs <- rep(0,nobs)
time <- seq(minT,maxT,length.out=nsim)
risq_est <- matrix(0,nrow=nsim,ncol=nbevt)
risqcum_est <- matrix(0,nrow=nsim,ncol=nbevt)
predRE_Y <- rep(0,ns*nalea)
rlindiv <- rep(0,ns)
marker <- rep(0,nsim*ny)
transfY <- rep(0,nsim*ny)
##nmes
nmes <- matrix(0,ns,ny)
for (k in 1:ny)
{
INDpresents <- which(unique(IND) %in% get(dataY[k])[,nom.subject])
nmes[INDpresents,k] <- as.vector(table(get(dataY[k])[,nom.subject]))
}
maxmes <- max(apply(nmes,1,sum))
##remplir idprob, etc
z.X0 <- strsplit(nom.unique,split=":",fixed=TRUE)
z.X0 <- lapply(z.X0,sort)
idea <- (z.X0 %in% z.random) + 0
idg <- (z.X0 %in% z.fixed) + 0
idcontr <- (z.X0 %in% z.contr) + 0
if (ncor>0) idcor <- colnames(X0) %in% cor.var.time +0
else idcor <- rep(0,nv)
idsurv <- z.X0 %in% z.surv + 2*(z.X0 %in% z.survcause)
idsurv[1] <- 0 # 0 pour l'intercept
idtdv <- z.X0 %in% nom.timedepvar + 0
## Si pas TimeDepVar dans formule survival
if(length(nom.timedepvar) & all(idtdv==0))
{
stop("Variable in 'TimeDepVar' should also appear as a covariate in the 'survival' argument")
}
## nb coef ppour survie
nvarxevt <- sum(idsurv==1) + nbevt*sum(idsurv==2)
nea <- sum(idea)
predRE <- rep(0,ns*nea)
## nb parametres d'association #TS
if(sharedtype == 'RE')
nasso <- nea*nbevt
if(sharedtype == 'CL')
nasso <- nbevt
## prm partie long
nef <- sum(idg==1)-1
ncontr <- (ny-1)*sum(idcontr)
nvc <- ifelse(idiag0==1,nea,nea*(nea+1)/2)-1
ntr <- rep(0,ny)
ntr[which(idlink==0)] <- 2
ntr[which(idlink==1)] <- 4
ntr[which(idlink==2)] <- nbzitr[which(idlink==2)]+2
ntr[which(idlink==3)] <- nbmod[which(idlink==3)]-1
ntrtot <- sum(ntr)
##nombre total de parametres
NPM <- nrisqtot + nvarxevt + nasso +
nef + ncontr + nvc + ncor + ntrtot + nalea + ny
V <- rep(0, NPM*(NPM+1)/2) #pr variance des parametres
## parametres MC
methInteg <- switch(methInteg,"MCO"=1,"MCA"=2,"QMC"=3)
seqMC <- 0
dimMC <- 0
if(methInteg==3)
{
dimMC <- nea+nalea
if(ncor>0) dimMC <- dimMC+maxmes
if(dimMC>0) seqMC <- randtoolbox::sobol(n=nMC,dim=dimMC,normal=TRUE,scrambling=1)
}
###valeurs initiales
if(!(missing(B)))
{
if(!is.vector(B)) stop("B should be a vector")
if (length(B)==NPM) b <- B
else stop(paste("Vector B should be of length",NPM))
}
else ## B missing
{
b <- rep(0,NPM)
if(nbevt>0){ #TS : si Weibull et prms au carre pr positivite -> valeurs par defaut = 1
if(any(hazard!="Weibull")==FALSE & isFALSE(logscale)==TRUE){
for(i in 1:nbevt){
if(typrisq[i]==2){
b[sum(nprisq[1:i])-nprisq[i]+1:nprisq[i]] <- 1
if(idtrunc==1 & any(Tentry==0)) #TS : si entree retardee et au moins un temps vaut 0
b[sum(nprisq[1:i])-nprisq[i]+nprisq[i]] <- 1.25 #sinon pblm lors de recherche prms, risq instant tend vers infini qd w2 < 1 et t=0
}
}
}
if(any(hazard %in% c("splines","Splines")) & idtrunc==1 & any(Tentry==0)){
for(i in 1:nbevt){
if(typrisq[i]==3){
b[sum(nprisq[1:i])-nprisq[i]+1:nprisq[i]] <- 10^-7 #sinon pgrm crash
}
}
}
}
if (nvc>0)
{
if(idiag==1) b[nrisqtot+nvarxevt+nasso+nef+ncontr+1:nvc] <- rep(1,nvc)
if(idiag==0)
{
init.nvc <- diag(nea)
init.nvc <- init.nvc[upper.tri(init.nvc, diag=TRUE)]
b[nrisqtot+nvarxevt+nasso+nef+ncontr+1:nvc] <- init.nvc[-1]
}
}
if(ncor>0) b[nrisqtot+nvarxevt+nasso+nef+ncontr+nvc+ncor] <- 1
if(nalea>0) b[nrisqtot+nvarxevt+nasso+nef+ncontr+nvc+ncor+ntrtot+1:nalea] <- 1
b[nrisqtot+nvarxevt+nasso+nef+ncontr+nvc+ncor+ntrtot+nalea+1:ny] <- 1
for(k in 1:ny)
{
if(idlink[k]==0)
{
b[nrisqtot+nvarxevt+nasso+nef+ncontr+nvc+ncor+sum(ntr[1:k])-1] <- mean(get(dataY[k])[,nomsY[k]])
b[nrisqtot+nvarxevt+nasso+nef+ncontr+nvc+ncor+sum(ntr[1:k])] <- 1
}
if(idlink[k]==1)
{
b[nrisqtot+nvarxevt+nasso+nef+ncontr+nvc+ncor+sum(ntr[1:k])-3] <- 0
b[nrisqtot+nvarxevt+nasso+nef+ncontr+nvc+ncor+sum(ntr[1:k])-2] <- -log(2)
b[nrisqtot+nvarxevt+nasso+nef+ncontr+nvc+ncor+sum(ntr[1:k])-1] <- 0.7
b[nrisqtot+nvarxevt+nasso+nef+ncontr+nvc+ncor+sum(ntr[1:k])] <- 0.1
}
if(idlink[k]==2)
{
b[nrisqtot+nvarxevt+nasso+nef+ncontr+nvc+ncor+sum(ntr[1:k])-ntr[k]+1] <- -2
b[nrisqtot+nvarxevt+nasso+nef+ncontr+nvc+ncor+sum(ntr[1:k])-ntr[k]+2:ntr[k]] <- 0.1
}
if(idlink[k]==3)
{
b[nrisqtot+nvarxevt+nasso+nef+ncontr+nvc+ncor+sum(ntr[1:k])-ntr[k]+1] <- 0
if(ntr[k]>1) b[nrisqtot+nvarxevt+nasso+nef+ncontr+nvc+ncor+sum(ntr[1:k])-ntr[k]+2:ntr[k]] <- 0.1
}
}
}
## ## faire wRandom et b0Random
## nef2 <- sum(idg!=0)-1 + (ny-1)*sum(idcontr)
## NPM2 <- nef2+ nvc+ncor+nalea+ny+ntrtot
## wRandom <- rep(0,NPM)
## b0Random <- rep(0,ng-1)
## l <- 0
## t <- 0
## m <- 0
## for (i in 1:nv)
## {
## if(idg[i]==1)
## {
## if(i==1) next
## l <- l+1
## t <- t+1
## wRandom[nprob+t] <- l
## }
## if(idg[i]==2)
## {
## if(i==1)
## {
## t <- t+ng-1
## b0Random <- c(b0Random,rep(0,ng-1))
## next
## }
## l <- l+1
## for (g in 1:ng)
## {
## t <- t+1
## wRandom[nprob+t] <- l
## }
## }
## if(idcontr[i]==1)
## {
## wRandom[nef-ncontr+m+1:(ny-1)] <- nef2-ncontr+m+1:(ny-1)
## m <- m+ny-1
## }
## }
## if(nvc>0)
## {
## wRandom[nef+1:nvc] <- nef2+1:nvc
## }
## if(nw>0)
## {
## b0Random <- c(b0Random,rep(1,ng-1))
## }
## if(ncor>0) wRandom[nef+nvc+nw+1:ncor] <- nef2+nvc+1:ncor
## wRandom[nef+nvc+nw+ncor+1:ny] <- nef2+nvc+ncor+1:ny
## if(nalea>0)
## {
## wRandom[nef+nvc+nw+ncor+ny+1:nalea] <- nef2+nvc+ncor+ny+1:nalea
## }
## wRandom[nef+nvc+nw+ncor+ny+nalea+1:ntrtot] <- nef2+nvc+ncor+ny+nalea+1:ntrtot
## ## wRandom et b0Random ok.
##------------------------------------------
##------nom au vecteur best
##--------------------------------------------
nom.X0 <- colnames(X0)
nom.X0[nom.X0=="(Intercept)"] <- "intercept"
if(nbevt>0)
{
##prm fct de risque
if(isTRUE(logscale))
{
for(i in 1:nbevt)
{
nom1 <- rep(paste("event",i,sep=""),nprisq[i])
if(typrisq[i]==2)
{
names(b)[sum(nprisq[1:i])-nprisq[i]+1:nprisq[i]] <- paste(nom1[1:2]," log(Weibull",1:2,")",sep="")
}
if(typrisq[i]==1)
{
names(b)[sum(nprisq[1:i])-nprisq[i]+1:nprisq[i]] <- paste(nom1[1:(nz[i]-1)]," log(piecewise",1:(nz[i]-1),")",sep="")
}
if(typrisq[i]==3)
{
names(b)[sum(nprisq[1:i])-nprisq[i]+1:nprisq[i]] <- paste(nom1[1:(nz[i]-1)]," log(splines",1:(nz[i]+2),")",sep="")
}
}
}
else
{
for(i in 1:nbevt)
{
nom1 <- rep(paste("event",i,sep=""),nprisq[i])
if(typrisq[i]==2)
{
names(b)[sum(nprisq[1:i])-nprisq[i]+1:nprisq[i]] <- paste(nom1[1:2]," +/-sqrt(Weibull",1:2,")",sep="")
}
if(typrisq[i]==1)
{
names(b)[sum(nprisq[1:i])-nprisq[i]+1:nprisq[i]] <- paste(nom1[1:(nz[i]-1)]," +/-sqrt(piecewise",1:(nz[i]-1),")",sep="")
}
if(typrisq[i]==3)
{
names(b)[sum(nprisq[1:i])-nprisq[i]+1:nprisq[i]] <- paste(nom1[1:(nz[i]-1)]," +/-sqrt(splines",1:(nz[i]+2),")",sep="")
}
}
}
##prm covariables survival
nom1 <- NULL
for(j in 1:nv)
{
if(idsurv[j]==1) #X
{
if(idtdv[j]==1)
{
nom1 <- c(nom1,paste("I(t>",nom.timedepvar,")",sep=""))
}
else
{
nom1 <- c(nom1,nom.X0[j])
}
}
if(idsurv[j]==2) #cause(X)
{
if(idtdv[j]==1)
{
nom1 <- c(nom1,paste("I(t>",nom.timedepvar,") event",1:nbevt,sep=""))
}
else
{
nom1 <- c(nom1,paste(nom.X0[j],paste("event",1:nbevt,sep="")))
}
}
}
if(nvarxevt>0) names(b)[nrisqtot+1:nvarxevt] <- nom1
if(sharedtype == 'RE'){ #TS
for(i in 1:nbevt)
{
names(b)[nrisqtot+nvarxevt+(nbevt-1)*nea+1:nea] <- paste("event",i," asso",1:nea,sep="")
}
}
if(sharedtype == 'CL')
names(b)[nrisqtot+nvarxevt+1:nbevt] <- paste("event",1:nbevt," asso",sep="")
}
names(b)[nrisqtot+nvarxevt+nasso+1:nef] <- nom.X0[-1][idg[-1]!=0]
if(ncontr!=0) names(b)[nrisqtot+nvarxevt+nasso+nef+1:ncontr] <- paste("contrast",paste(rep(1:sum(idcontr),each=ny-1),rep(1:(ny-1),sum(idcontr)),sep=""),sep="")
if(idlink[1]==0) names(b)[nrisqtot+nvarxevt+nasso+nef+ncontr+nvc+ncor+1:ntr[1]]<- c("Linear 1","Linear 2")
if(idlink[1]==1) names(b)[nrisqtot+nvarxevt+nasso+nef+ncontr+nvc+ncor+1:ntr[1]]<- paste("Beta",c(1:ntr[1]),sep="")
if(idlink[1]==2) names(b)[nrisqtot+nvarxevt+nasso+nef+ncontr+nvc+ncor+1:ntr[1]]<- paste("I-splines",c(1:ntr[1]),sep="")
if(idlink[1]==3) names(b)[nrisqtot+nvarxevt+nasso+nef+ncontr+nvc+ncor+1:ntr[1]]<- paste("Thresh",c(1:ntr[1]),sep="")
if(ny>1)
{
for (yk in 2:ny)
{
if(idlink[yk]==0) names(b)[nrisqtot+nvarxevt+nasso+nef+ncontr+nvc+ncor+sum(ntr[1:(yk-1)])+1:ntr[yk]]<- c("Linear 1","Linear 2")
if(idlink[yk]==1) names(b)[nrisqtot+nvarxevt+nasso+nef+ncontr+nvc+ncor+sum(ntr[1:(yk-1)])+1:ntr[yk]]<- paste("Beta",c(1:ntr[yk]),sep="")
if(idlink[yk]==2) names(b)[nrisqtot+nvarxevt+nasso+nef+ncontr+nvc+ncor+sum(ntr[1:(yk-1)])+1:ntr[yk]]<- paste("I-splines",c(1:ntr[yk]),sep="")
if(idlink[yk]==3) names(b)[nrisqtot+nvarxevt+nasso+nef+ncontr+nvc+ncor+sum(ntr[1:(yk-1)])+1:ntr[yk]]<- paste("Thresh",c(1:ntr[yk]),sep="")
}
}
if(nvc!=0)names(b)[nrisqtot+nvarxevt+nasso+nef+ncontr+1:nvc] <- paste("varcov",c(1:nvc))
if(ncor>0) {names(b)[nrisqtot+nvarxevt+nasso+nef+ncontr+nvc+1:ncor] <- paste("cor",1:ncor,sep="")}
if(nalea!=0) names(b)[nrisqtot+nvarxevt+nasso+nef+ncontr+nvc+ncor+ntrtot+1:nalea] <- paste("std.randomY",1:ny,sep="")
names(b)[nrisqtot+nvarxevt+nasso+nef+ncontr+nvc+ncor+ntrtot+nalea+1:ny] <- paste("std.err",1:ny)
namesb <- names(b)
## prm fixes
fix <- rep(0,NPM)
if(length(posfix))
{
if(any(!(posfix %in% 1:NPM))) stop("Indexes in posfix are not correct")
fix[posfix] <- 1
}
if(length(posfix)==NPM) stop("No parameter to estimate")
if(!all(partialH %in% 1:NPM)) stop(paste("partialH should contain indices between 1 and",NPM))
# varcov RE
if(nvc!=0)
{
mvc <- matrix(0,nea,nea)
if(idiag0==0)
{
mvc[upper.tri(mvc, diag=TRUE)] <- c(1,b[nrisqtot+nvarxevt+nasso+nef+ncontr+1:nvc])
mvc <- t(mvc)
mvc[upper.tri(mvc, diag=TRUE)] <- c(1,b[nrisqtot+nvarxevt+nasso+nef+ncontr+1:nvc])
ch <- chol(mvc)
b[nrisqtot+nvarxevt+nasso+nef+ncontr+1:nvc] <- ch[upper.tri(ch, diag=TRUE)][-1]
}
else
{
b[nrisqtot+nvarxevt+nasso+nef+ncontr+1:nvc] <- sqrt(b[nrisqtot+nvarxevt+nasso+nef+ncontr+1:nvc])
}
}
# fixed prms
nfix <- sum(fix)
bfix <- 0
if(nfix>0)
{
bfix <- b[which(fix==1)]
b <- b[which(fix==0)]
NPM <- NPM-nfix
}
#browser()
###estimation
conv3 <- c(convB, convL, convG) #TS: pr reduire le nb d arguments ds appel fct Fortran
sharedtype <- ifelse(sharedtype == 'RE',1,2) #recodage 1 pr RE, 2 pr CL #TS
###############
### MLA ###
###############
if(maxiter==0)
{
vrais <- loglik(b,Y0,X0,tsurv0,tsurv,devt,ind_survint,idea,idg,idcor,idcontr,
idsurv,idtdv,typrisq,nz,zi,nbevt,idtrunc,logspecif,ny,ns,nv,
nobs,nmes,idiag0,ncor,nalea,NPM,nfix,bfix,epsY,
idlink,nbzitr,zitr,uniqueY0,indiceY0,nvalSPLORD,fix,
methInteg,nMC,dimMC,seqMC,sharedtype,nbXpred,Xpred_Ti,Xpred)
out <- list(conv=2, V=rep(NA, length(b)), best=b, predRE=NA, predRE_Y=NA,
Yobs=NA, resid_m=NA, resid_ss=NA, risqcum_est=NA, risq_est=NA,
marker=NA, transfY=NA, gconv=rep(NA,3), niter=0, loglik=vrais)
}
else
{
res <- mla(b=b, m=length(b), fn=loglik,
clustertype=clustertype,.packages=NULL,
epsa=convB,epsb=convL,epsd=convG,
digits=8,print.info=verbose,blinding=FALSE,
multipleTry=25,file="",partialH=partialH,
nproc=nproc,maxiter=maxiter,minimize=FALSE,
Y0=Y0,X0=X0,Tentr0=tsurv0,Tevt0=tsurv,Devt0=devt,
ind_survint0=ind_survint,idea0=idea,idg0=idg,idcor0=idcor,
idcontr0=idcontr,idsurv0=idsurv,idtdv0=idtdv,typrisq0=typrisq,
nz0=nz,zi0=zi,nbevt0=nbevt,idtrunc0=idtrunc,logspecif0=logspecif,
ny0=ny,ns0=ns,nv0=nv,nobs0=nobs,nmes0=nmes,idiag0=idiag0,
ncor0=ncor,nalea0=nalea,npm0=NPM,nfix0=nfix,bfix0=bfix,
epsY0=epsY,idlink0=idlink,nbzitr0=nbzitr,zitr0=zitr,
uniqueY0=uniqueY0,indiceY0=indiceY0,nvalSPLORD0=nvalSPLORD,
fix0=fix,methInteg0=methInteg,nMC0=nMC,dimMC0=dimMC,seqMC0=seqMC,
idst0=sharedtype,nXcl0=nbXpred,Xcl_Ti0=Xpred_Ti,Xcl_GK0=Xpred)
out <- list(conv=res$istop, V=res$v, best=res$b, predRE=NA, predRE_Y=NA, Yobs=NA,
resid_m=NA, resid_ss=NA, risqcum_est=NA, risq_est=NA, marker=NA,
transfY=NA, gconv=c(res$ca, res$cb, res$rdm), niter=res$ni,
loglik=res$fn.value)
}
## creer best a partir de b et bfix
best <- rep(NA,length(fix))
best[which(fix==0)] <- out$best
best[which(fix==1)] <- bfix
out$best <- best
NPM <- NPM+nfix
## mettre NA pour les variances et covariances non calculees et 0 pr les prm fixes
if(length(posfix))
{
mr <- NPM-length(posfix)
Vr <- matrix(0,mr,mr)
Vr[upper.tri(Vr,diag=TRUE)] <- out$V[1:(mr*(mr+1)/2)]
Vr <- t(Vr)
Vr[upper.tri(Vr,diag=TRUE)] <- out$V[1:(mr*(mr+1)/2)]
V <- matrix(0,NPM,NPM)
V[setdiff(1:NPM,posfix),setdiff(1:NPM,posfix)] <- Vr
V <- V[upper.tri(V,diag=TRUE)]
}
else
{
V <- out$V
}
## Creation du vecteur cholesky
Cholesky <- rep(0,(nea*(nea+1)/2))
if(idiag0==0 & nvc>0)
{
Cholesky[1:(nvc+1)] <- c(1,out$best[nrisqtot+nvarxevt+nasso+nef+ncontr+1:nvc])
## Construction de la matrice U
U <- matrix(0,nrow=nea,ncol=nea)
U[upper.tri(U,diag=TRUE)] <- Cholesky[1:(nvc+1)]
z <- t(U) %*% U
out$best[nrisqtot+nvarxevt+nasso+nef+ncontr+1:nvc] <- z[upper.tri(z,diag=TRUE)][-1]
}
if(idiag0==1 & nvc>0)
{
id <- 1:nea
indice <- rep(id+id*(id-1)/2)
Cholesky[indice] <- c(1,out$best[nrisqtot+nvarxevt+nasso+nef+ncontr+1:nvc])
out$best[nrisqtot+nvarxevt+nasso+nef+ncontr+1:nvc] <- out$best[nrisqtot+nvarxevt+nasso+nef+ncontr+1:nvc]**2
}
##predictions
predRE <- matrix(out$predRE,ncol=nea,byrow=T)
predRE <- data.frame(unique(IND),predRE)
colnames(predRE) <- c(nom.subject,nom.X0[idea!=0])
if (nalea!=0)
{
predRE_Y <- matrix(out$predRE_Y,ncol=ny,byrow=TRUE)
predRE_Y <- data.frame(unique(IND),predRE_Y)
colnames(predRE_Y) <- c(nom.subject,nomsY)
}
else
{
predRE_Y <- rep(NA,nalea*ns)
}
##pred
pred_m <- out$Yobs-out$resid_m
pred_ss <- out$Yobs - out$resid_ss
pred <- data.frame(IND,outcome,pred_m,out$resid_m,pred_ss,out$resid_ss,out$Yobs)
colnames(pred)<-c(nom.subject,"Yname","pred_m","resid_m","pred_ss","resid_ss","obs")
rownames(pred) <- NULL
## risques
if(nbevt>0)
{
risqcum_est <- matrix(out$risqcum_est,nrow=nsim,ncol=nbevt)
risq_est <- matrix(out$risq_est,nrow=nsim,ncol=nbevt)
predSurv <- cbind(time,risq_est,risqcum_est)
temp <- paste("event",1:nbevt,".RiskFct",sep="")
temp1 <- paste("event",1:nbevt,".CumRiskFct",sep="")
colnames(predSurv) <- c("time",temp,temp1)
rownames(predSurv) <- 1:nsim
}
else
{
predSurv <- NA
}
##estimlink
ysim <- matrix(out$marker,nsim,ny)
transfo <- matrix(out$transfY,nsim,ny)
estimlink <- as.vector(rbind(ysim,transfo))
estimlink <- matrix(estimlink,nsim,2*ny)
colnames(estimlink) <- paste(c("","transf"),rep(nomsY, each=2),sep="")
if(any(idlink==3))
{
if(any(2*nbmod[which(idlink==3)] > nsim))
{
nsim2 <- max(2*nbmod[which(idlink==3)])
estimlink2 <- matrix(NA, nrow=nsim2, ncol=2*ny)
estimlink2[1:nsim,] <- estimlink
colnames(estimlink2) <- colnames(estimlink)
estimlink <- estimlink2
}
seuils <- function(x)
{
n <- length(x)
if(n>1) res <- c(x[1],x[1]+cumsum(x[-1]^2))
if(n==1) res <- x[1]
return(res)
}
sumntr <- 0
for (k in 1:ny)
{
if(idlink[k]==3)
{
estimlink[,2*(k-1)+1] <- NA
estimlink[,2*(k-1)+2] <- NA
nb <- nbmod[k]-1
marker <- rep(modalites[[k]], each=2)
seuilsk <- seuils(out$best[nrisqtot+nvarxevt+nasso+nef+ncontr+nvc+ncor+sumntr+1:nb])
transfY <- rep(seuilsk, each=2)
transfY <- c(-Inf,transfY,Inf)
estimlink[1:length(marker), 2*(k-1)+1] <- marker
estimlink[1:length(transfY), 2*(k-1)+2] <- transfY
}
sumntr <- sumntr + ntr[k]
}
## remplacer -Inf et Inf
hy <- estimlink[,2*(1:ny)]
maxhy <- max(hy[is.finite(hy)])
minhy <- min(hy[is.finite(hy)])
for (k in 1:ny)
{
if(idlink[k]==3)
{
estimlink[1, 2*(k-1)+2] <- minhy - (maxhy - minhy)/10
estimlink[2*nbmod[k], 2*(k-1)+2] <- maxhy + (maxhy - minhy)/10
}
}
}
N <- rep(NA,12)
N[1] <- 0 #nprob
N[2] <- nrisqtot
N[3] <- nvarxevt
N[4] <- nasso
N[5] <- nef
N[6] <- ncontr
N[7] <- nvc
N[8] <- 0 #nw
N[9] <- ncor
N[10] <- ntrtot
N[11] <- nalea
N[12] <- ny
N[13] <- nobs
N[14] <- nbevt
nevent <- rep(0,nbevt)
for(ke in 1:nbevt)
{
nevent[ke] <- length(which(devt==ke))
}
Nprm <- c(nprisq,ntr)
nom.X0[nom.X0=="(Intercept)"] <- "Intercept"
## noms des variables
Names <- list(Xnames=nom.X0,Ynames=nomsY,
ID=nom.subject,Tnames=noms.surv,
TimeDepVar.name=nom.timedepvar,
Xvar=setdiff(ttesLesVar,noms.surv))
names(modalites) <- nomsY
form <- list(fixed=fixed2[-2], random=random, contr=contr,
form.commun=form.commun, form.cause=form.cause,
form.cor=form.cor)
cost <- proc.time()-ptm
res <-list(ns=ns,idg=idg,idcontr=idcontr,idea=idea,idcor=idcor,
idsurv=idsurv,idtdv=idtdv,
loglik=out$loglik,best=out$best,V=V,gconv=out$gconv,conv=out$conv,
call=cl,niter=out$niter,N=N,nevent=nevent,Nprm=Nprm,
idiag=idiag0,pred=pred,predRE=predRE,
predRE_Y=predRE_Y,Names=Names,form=form,cholesky=Cholesky,
logspecif=logspecif,predSurv=predSurv,typrisq=typrisq,hazardnodes=zi,nz=nz,
estimlink=estimlink,epsY=epsY,linktype=idlink,linknodes=zitr,nbnodes=nbnodes,nbmod=nbmod,mod=modalites,
na.action=nayk,AIC=2*(length(out$best)-length(posfix)-out$loglik),BIC=(length(out$best)-length(posfix))*log(ns)-2*out$loglik,data=datareturn,
#wRandom=wRandom,b0Random=b0Random,
posfix=posfix,CPUtime=cost[3])
names(res$best) <- namesb
class(res) <-c("jointLPM")
return(res)
}
loglik <- function(b0,Y0,X0,Tentr0,Tevt0,Devt0,ind_survint0,idea0,idg0,idcor0,idcontr0,
idsurv0,idtdv0,typrisq0,nz0,zi0,nbevt0,idtrunc0,logspecif0,ny0,ns0,
nv0,nobs0,nmes0,idiag0,ncor0,nalea0,npm0,nfix0,bfix0,
epsY0,idlink0,nbzitr0,zitr0,uniqueY0,indiceY0,nvalSPLORD0,fix0,
methInteg0,nMC0,dimMC0,seqMC0,idst0,nXcl0,Xcl_Ti0,Xcl_GK0)
{
res <- 0
.Fortran(C_loglik,as.double(Y0),as.double(X0),as.double(Tentr0),as.double(Tevt0),as.integer(Devt0),as.integer(ind_survint0),as.integer(idea0),as.integer(idg0),as.integer(idcor0),as.integer(idcontr0),as.integer(idsurv0),as.integer(idtdv0),as.integer(typrisq0),as.integer(nz0),as.double(zi0),as.integer(nbevt0),as.integer(idtrunc0),as.integer(logspecif0),as.integer(ny0),as.integer(ns0),as.integer(nv0),as.integer(nobs0),as.integer(nmes0),as.integer(idiag0),as.integer(ncor0),as.integer(nalea0),as.integer(npm0),as.double(b0),as.integer(nfix0),as.double(bfix0),as.double(epsY0),as.integer(idlink0),as.integer(nbzitr0),as.double(zitr0),as.double(uniqueY0),as.integer(indiceY0),as.integer(nvalSPLORD0),as.integer(fix0),as.integer(methInteg0),as.integer(nMC0),as.integer(dimMC0),as.double(seqMC0),as.integer(idst0),as.integer(nXcl0),as.double(Xcl_Ti0),as.double(Xcl_GK0),loglik_res=as.double(res))$loglik_res
}
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.