# haplo.surv: Fit Cox-Aalen survival model for haplotype effects In HaploSurvival: Modelling of haplo-effects for survival data, including competing risks. Effect of haplomatch for BMT studies.

## Description

Fits an Cox-Aalen survival model. Time dependent variables and counting process data (multiple events per subject) are possible.

λ_{i}(t) = Y_i(t) ( X_{i}(h)^T(t) α(t) ) \exp(Z_{i}(h)^T β )

where X_i(h) and Z_i(h) are the design vectors specified by the design functions given in the call.

The model thus contains the Cox's regression model as special case.

Resampling is used for computing p-values for tests of time-varying effects. Test for proportionality is considered by considering the score processes for the proportional effects of model.

The modelling formula uses the standard survival modelling given in the survival package.

The haplotype model assumes Hardy-Weinberg equilibrium

P(H=(h_k,h_j)) = π_k pi_j

with K different haplotypes, where we parametrize the frequencies as

π_k= \frac{ \exp(θ_k) }{1+θ_1+...+θ_{K-1}+0}

We allow a regression design on the haplotype parameters to reduce the dimensionality

θ = G α

## Usage

 1 2 3 4 5 6 7 8 haplo.surv(formula = formula(data),data=sys.parent(), designfuncX,designfuncZ, beta = 0, match=FALSE, Nit = 10, detail = 0, start.time = 0, max.time = NULL, id = NULL, n.sim = 500, geno.type = NULL, geno.setup=NULL, haplo.freq = NULL, fix.beta = 0, fix.haplofreq = 0, two.stage = 0, weighted.test = 0, step = 1, lev.marq=1, min.lev.marq=0, haplo.design=NULL,haplo.baseline=NULL, alpha=NULL,resample.iid=1, covnamesX=NULL,covnamesZ=NULL) 

## Arguments

 formula a formula object with the response on the left of a '~' operator, and the independent terms on the right as regressors. The response must be a survival object as returned by the ‘Surv’ function. Terms with a proportional effect are specified by the wrapper prop(). Haplotype effects specified using the two designfunctions. data a data.frame with the variables. designfuncX R-function(x,z,h) that gives the X design given x,z and h. See ?design for more details on this. designfuncZ R-function(x,z,h) that gives the Z design given x,z and h. start.time start of observation period where estimates are computed. max.time end of observation period where estimates are computed. Estimates thus computed from [start.time, max.time]. Default is max of data. id For timevarying covariates the variable must associate each record with the id of a subject. n.sim number of simulations in resampling. weighted.test to compute a variance weighted version of the test-processes used for testing time-varying effects. beta starting value for relative risk estimates match if match is true it the matched survival model, see haplo.match. Nit number of iterations for Newton-Raphson algorithm. detail if 0 no details is printed during iterations, if 1,2,3 more details are given. geno.type geno types associated with subjects. geno.setup genotype setup based on genotype function. haplo.freq haplo type frequencies for starting values or for fixed values. fix.beta option for keeping beta fixed. fix.haplofreq option for keeping haplotype frequencies fixed. two.stage option to specify that genotype likelihood is based only on genotype likelihood, if two.stage=0 score equations for haplotype parameters also uses information from survival data. step step size for iteration steps. lev.marq starting value for Levenberg-Marquart algorithm. min.lev.marq stopping value for Levenberg-Marquart algorithm after initial steps, min.lev.marq=0 is stopping value for Levenberg-Marquart algorithm after initial steps, min.lev.marq=0 is Newton-Raphson steps, that are needed to get correct standard errors. haplo.design design for haplo parameters, default is diagonal of size K-1 haplo.baseline gives the name of the geno.setup that should be used as baseline. alpha starting value of parameters for haplo.design resample.iid set to 1 get i.i.d. decomposition of baseline and beta covnamesX names for output related to additive part of model. covnamesZ names for output related to proportional part of model.

## Details

The data for a subject is presented as multiple rows or 'observations', each of which applies to an interval of observation (start, stop]. For counting process data with the )start,stop] notation is used the 'id' variable is needed to identify the records for each subject. The program assumes that there are no ties, and if such are present random noise is added to break the ties.

## Value

returns an object of type "cox.aalen". With the following arguments:

 cum cumulative timevarying regression coefficient estimates are computed within the estimation interval. var.cum the martingale based pointwise variance estimates. robvar.cum robust pointwise variances estimates. gamma estimate of parametric components of model. var.gamma variance for gamma. robvar.gamma robust variance for gamma. haplo.pars estimates for haplotype frequency parameters. haplo.freqs estimates for haplotype frequencies based on parameters. haplo.alpha estimates for haplotype frequencies based on haplo.design. freq.names names of haplotypes. var.haplo.alpha variance matrix for alpha robvar.haplo.alpha variance matrix for alpha D2linv.all inverse of derivative of score for all parameters. D2linv.gamma inverse of derivative of score for relative risk parameters. D2linv.haplo.alpha inverse of derivative of score for alpha (haplo-frequency) parameters. score score of last iteration. obs.testBeq0 observed absolute value of supremum of cumulative components scaled with the variance. pval.testBeq0 p-value for covariate effects based on supremum test. sim.testBeq0 resampled supremum values. obs.testBeqC observed absolute value of supremum of difference between observed cumulative process and estimate under null of constant effect. pval.testBeqC p-value based on resampling. sim.testBeqC resampled supremum values. obs.testBeqC.is observed integrated squared differences between observed cumulative and estimate under null of constant effect. pval.testBeqC.is p-value based on resampling. sim.testBeqC.is resampled supremum values. conf.band resampling based constant to construct robust 95% uniform confidence bands. test.procBeqC observed test-process of difference between observed cumulative process and estimate under null of constant effect over time. sim.test.procBeqC list of 50 random realizations of test-processes under null based on resampling. covariance covariances for nonparametric terms of model. B.iid Resample processes for nonparametric terms of model. gamma.iid Resample processes for parametric terms of model. D2linv inverse of the derivative of the score function. score value of score for final estimates. test.procProp observed score process for proportional part of model. var.score variance of score process (optional variation estimator for beta.fixed=1 and robust estimator otherwise). pval.Prop p-value based on resampling. sim.supProp re-sampled absolute supremum values. sim.test.procProp list of 50 random realizations of test-processes for proportionality under the model based on resampling. haplo.design design that relates haplotype parameters to alpha parameters.

Thomas Scheike

## References

Scheike, Martinussen and Silver, Estimating haplotype effects for survival data, Biometrics, to appear.

Scheike, Martinussen and Zhang, The additive risk model for estimation of haplotype effects, Submitted.

## Examples

  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 library(timereg) ### Simulating data n<-400; haplo.types<-rbind(c(0,0),c(1,0),c(0,1),c(1,1)) ### randomly selects haplotypes pairs<-matrix(sample(1:4,n*2,replace=TRUE),n,2) ### observed genotype, haplotype-pair 1 = g[c(1,3),] g<-matrix(haplo.types[pairs,],n,4) gs<-geno.setup(g); Z<-matrix(rnorm(n),n,1); Z<-cbind(Z,Z*0.5+1+rnorm(n)) X<-matrix(rbinom(n,1,0.5),n,1); X<-cbind(1,X); Zh<- apply((g[,c(1,3)]==c(0,0)),1,prod)+ apply((g[,c(2,4)]==c(0,0)),1,prod) ### counts 0,0 ### simulates model with baselines depending on X and ### proportional in Z and Zh ### Zh observed only through genotype time<-exp(Z %*% c(-0.3,0.3) + Zh * 0.5 ) * rexp(n)/ X %*% c(0.2,0.5) status<-(time<3); time[status==0]<-3 simdata=data.frame(time=time,status=status, x1=X[,1],x2=X[,2],z1=Z[,1],z2=Z[,2],zh=Zh) ### specifies Cox-Aalen model designX<-function(x,z,h) { ### design for baseline, does not depend on haplotype return(x) } designZ<-function(x,z,h) { ### design for proportional part of model h<-round(h); zh<-(h[1]==0)+(h[2]==0) ### counts number of haplotype 0 = "0,0" see gs y<-c(z,zh) return(y) } designZ(c(1,3),c(1,1),c(0,0)); designZ(c(1,3),c(1,1),c(1,0)) hapfit<-haplo.freqs(g,geno.setup=gs) out<-haplo.surv(Surv(time,status)~x2+prop(z1)+prop(z2),data=simdata, designfuncX=designX,designfuncZ=designZ,geno.setup=gs, two.stage=1,alpha=hapfit$haplo.alpha) ### MLE starting value alpha summary(out); plot(out,sim.ci=1) ### plots baseline with confidence bands plot(out,score=1) ### score process test for proportionality out$haplo.freq ### haplotype frequencies in model ### predicts survival with Z=0 and with 0 and 2 number of "0,0" pout<-predict(out,X=rbind(c(1,1),c(1,1)),Z=rbind(c(0,0,0),c(0,0,2))) plot(pout) ### the two survival curves plot(pout,uniform=1,se=1) ### with 95 % confidence bands and intervals ######################################################################## #### structured haplotype model ######################################## ######################################################################## X<-matrix(1,3,1); hapfit<-haplo.freqs(g,geno.setup=gs,haplo.design=X) out<-haplo.surv(Surv(time,status)~x2+prop(z1)+prop(z2),data=simdata, designfuncX=designX,designfuncZ=designZ,geno.setup=gs, two.stage=1,haplo.design=X,alpha=hapfit$haplo.alpha) summary(out); out$haplo.freq ### haplotype frequencies in model par(mfrow=c(1,2)) plot(out,sim.ci=1) ############################################################### ### Fits additive hazard model ################################ ############################################################### ### design for baseline, does not depend on haplotype designXa<-function(x,z,h) { h<-round(h); ### counts number of haplotype 0 = "0,0" see gs for coding zh<-(h[1]==0)+(h[2]==0) y<-c(x,zh) return(y) } ### proportional part, returns dummy with fixed effect 0 designZa<-function(x,z,h) { return(z) } ### Nit=1 because we start with the correct MLE haplotype estimates out<-haplo.surv(Surv(time,status)~x2,data=simdata, designfuncX=designXa,designfuncZ=designZa,geno.setup=gs, fix.beta=1,two.stage=1,haplo.design=X,alpha=hapfit\$haplo.alpha) summary(out); par(mfrow=c(1,3)); plot(out,sim.ci=1) ### plots baseline with confidence bands ### predicts survival with Z=0 and with 0 and 2 number of "0,0" pout<-predict(out,X=rbind(c(1,1,0),c(1,1,2)),Z=rbind(0,0)) par(mfrow=c(1,3)) plot(pout,multiple=1,se=0,uniform=0,col=2:3) plot(pout,uniform=1,se=1) par(mfrow=c(1,2)) plot(pout,multiple=1,uniform=1,col=2:3,transparency=1) plot(pout,multiple=1,uniform=1,col=2:3,transparency=2) 

HaploSurvival documentation built on May 2, 2019, 5:49 p.m.