haplo.surv: Fit Cox-Aalen survival model for haplotype effects

Description Usage Arguments Details Value Author(s) References Examples

Description

Fits an Cox-Aalen survival model. Time dependent variables and counting process data (multiple events per subject) are possible.

λ_{i}(t) = Y_i(t) ( X_{i}(h)^T(t) α(t) ) \exp(Z_{i}(h)^T β )

where X_i(h) and Z_i(h) are the design vectors specified by the design functions given in the call.

The model thus contains the Cox's regression model as special case.

Resampling is used for computing p-values for tests of time-varying effects. Test for proportionality is considered by considering the score processes for the proportional effects of model.

The modelling formula uses the standard survival modelling given in the survival package.

The haplotype model assumes Hardy-Weinberg equilibrium

P(H=(h_k,h_j)) = π_k pi_j

with K different haplotypes, where we parametrize the frequencies as

π_k= \frac{ \exp(θ_k) }{1+θ_1+...+θ_{K-1}+0}

We allow a regression design on the haplotype parameters to reduce the dimensionality

θ = G α

Usage

1
2
3
4
5
6
7
8
haplo.surv(formula = formula(data),data=sys.parent(),
    designfuncX,designfuncZ, beta = 0, match=FALSE,
    Nit = 10, detail = 0, start.time = 0, max.time = NULL, 
    id = NULL, n.sim = 500, geno.type = NULL, geno.setup=NULL, 
    haplo.freq = NULL, fix.beta = 0, fix.haplofreq = 0, 
    two.stage = 0, weighted.test = 0, step = 1, lev.marq=1, 
    min.lev.marq=0, haplo.design=NULL,haplo.baseline=NULL,
    alpha=NULL,resample.iid=1, covnamesX=NULL,covnamesZ=NULL)

Arguments

formula

a formula object with the response on the left of a '~' operator, and the independent terms on the right as regressors. The response must be a survival object as returned by the ‘Surv’ function. Terms with a proportional effect are specified by the wrapper prop(). Haplotype effects specified using the two designfunctions.

data

a data.frame with the variables.

designfuncX

R-function(x,z,h) that gives the X design given x,z and h. See ?design for more details on this.

designfuncZ

R-function(x,z,h) that gives the Z design given x,z and h.

start.time

start of observation period where estimates are computed.

max.time

end of observation period where estimates are computed. Estimates thus computed from [start.time, max.time]. Default is max of data.

id

For timevarying covariates the variable must associate each record with the id of a subject.

n.sim

number of simulations in resampling.

weighted.test

to compute a variance weighted version of the test-processes used for testing time-varying effects.

beta

starting value for relative risk estimates

match

if match is true it the matched survival model, see haplo.match.

Nit

number of iterations for Newton-Raphson algorithm.

detail

if 0 no details is printed during iterations, if 1,2,3 more details are given.

geno.type

geno types associated with subjects.

geno.setup

genotype setup based on genotype function.

haplo.freq

haplo type frequencies for starting values or for fixed values.

fix.beta

option for keeping beta fixed.

fix.haplofreq

option for keeping haplotype frequencies fixed.

two.stage

option to specify that genotype likelihood is based only on genotype likelihood, if two.stage=0 score equations for haplotype parameters also uses information from survival data.

step

step size for iteration steps.

lev.marq

starting value for Levenberg-Marquart algorithm.

min.lev.marq

stopping value for Levenberg-Marquart algorithm after initial steps, min.lev.marq=0 is stopping value for Levenberg-Marquart algorithm after initial steps, min.lev.marq=0 is Newton-Raphson steps, that are needed to get correct standard errors.

haplo.design

design for haplo parameters, default is diagonal of size K-1

haplo.baseline

gives the name of the geno.setup that should be used as baseline.

alpha

starting value of parameters for haplo.design

resample.iid

set to 1 get i.i.d. decomposition of baseline and beta

covnamesX

names for output related to additive part of model.

covnamesZ

names for output related to proportional part of model.

Details

The data for a subject is presented as multiple rows or 'observations', each of which applies to an interval of observation (start, stop]. For counting process data with the )start,stop] notation is used the 'id' variable is needed to identify the records for each subject. The program assumes that there are no ties, and if such are present random noise is added to break the ties.

Value

returns an object of type "cox.aalen". With the following arguments:

cum

cumulative timevarying regression coefficient estimates are computed within the estimation interval.

var.cum

the martingale based pointwise variance estimates.

robvar.cum

robust pointwise variances estimates.

gamma

estimate of parametric components of model.

var.gamma

variance for gamma.

robvar.gamma

robust variance for gamma.

haplo.pars

estimates for haplotype frequency parameters.

haplo.freqs

estimates for haplotype frequencies based on parameters.

haplo.alpha

estimates for haplotype frequencies based on haplo.design.

freq.names

names of haplotypes.

var.haplo.alpha

variance matrix for alpha

robvar.haplo.alpha

variance matrix for alpha

D2linv.all

inverse of derivative of score for all parameters.

D2linv.gamma

inverse of derivative of score for relative risk parameters.

D2linv.haplo.alpha

inverse of derivative of score for alpha (haplo-frequency) parameters.

score

score of last iteration.

obs.testBeq0

observed absolute value of supremum of cumulative components scaled with the variance.

pval.testBeq0

p-value for covariate effects based on supremum test.

sim.testBeq0

resampled supremum values.

obs.testBeqC

observed absolute value of supremum of difference between observed cumulative process and estimate under null of constant effect.

pval.testBeqC

p-value based on resampling.

sim.testBeqC

resampled supremum values.

obs.testBeqC.is

observed integrated squared differences between observed cumulative and estimate under null of constant effect.

pval.testBeqC.is

p-value based on resampling.

sim.testBeqC.is

resampled supremum values.

conf.band

resampling based constant to construct robust 95% uniform confidence bands.

test.procBeqC

observed test-process of difference between observed cumulative process and estimate under null of constant effect over time.

sim.test.procBeqC

list of 50 random realizations of test-processes under null based on resampling.

covariance

covariances for nonparametric terms of model.

B.iid

Resample processes for nonparametric terms of model.

gamma.iid

Resample processes for parametric terms of model.

D2linv

inverse of the derivative of the score function.

score

value of score for final estimates.

test.procProp

observed score process for proportional part of model.

var.score

variance of score process (optional variation estimator for beta.fixed=1 and robust estimator otherwise).

pval.Prop

p-value based on resampling.

sim.supProp

re-sampled absolute supremum values.

sim.test.procProp

list of 50 random realizations of test-processes for proportionality under the model based on resampling.

haplo.design

design that relates haplotype parameters to alpha parameters.

Author(s)

Thomas Scheike

References

Scheike, Martinussen and Silver, Estimating haplotype effects for survival data, Biometrics, to appear.

Scheike, Martinussen and Zhang, The additive risk model for estimation of haplotype effects, Submitted.

Examples

  1
  2
  3
  4
  5
  6
  7
  8
  9
 10
 11
 12
 13
 14
 15
 16
 17
 18
 19
 20
 21
 22
 23
 24
 25
 26
 27
 28
 29
 30
 31
 32
 33
 34
 35
 36
 37
 38
 39
 40
 41
 42
 43
 44
 45
 46
 47
 48
 49
 50
 51
 52
 53
 54
 55
 56
 57
 58
 59
 60
 61
 62
 63
 64
 65
 66
 67
 68
 69
 70
 71
 72
 73
 74
 75
 76
 77
 78
 79
 80
 81
 82
 83
 84
 85
 86
 87
 88
 89
 90
 91
 92
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
library(timereg)
### Simulating data 
n<-400; 
haplo.types<-rbind(c(0,0),c(1,0),c(0,1),c(1,1))
### randomly selects haplotypes
pairs<-matrix(sample(1:4,n*2,replace=TRUE),n,2) 

### observed genotype, haplotype-pair 1 = g[c(1,3),]
g<-matrix(haplo.types[pairs,],n,4) 

gs<-geno.setup(g); 

Z<-matrix(rnorm(n),n,1); Z<-cbind(Z,Z*0.5+1+rnorm(n))
X<-matrix(rbinom(n,1,0.5),n,1); X<-cbind(1,X);
Zh<- apply((g[,c(1,3)]==c(0,0)),1,prod)+
     apply((g[,c(2,4)]==c(0,0)),1,prod)  ### counts 0,0

### simulates model with baselines depending on X and 
### proportional in Z and Zh 
### Zh observed only through genotype

time<-exp(Z %*% c(-0.3,0.3) + Zh * 0.5 ) * rexp(n)/ X %*% c(0.2,0.5)
status<-(time<3); time[status==0]<-3
simdata=data.frame(time=time,status=status,
             x1=X[,1],x2=X[,2],z1=Z[,1],z2=Z[,2],zh=Zh)

### specifies Cox-Aalen model 
designX<-function(x,z,h) { ### design for baseline, does not depend on haplotype
return(x)
}
designZ<-function(x,z,h) { ### design for proportional part of model 
h<-round(h); 
zh<-(h[1]==0)+(h[2]==0) ### counts number of haplotype 0 = "0,0" see gs
y<-c(z,zh)
return(y)
}
designZ(c(1,3),c(1,1),c(0,0)); designZ(c(1,3),c(1,1),c(1,0))

hapfit<-haplo.freqs(g,geno.setup=gs)

out<-haplo.surv(Surv(time,status)~x2+prop(z1)+prop(z2),data=simdata,
designfuncX=designX,designfuncZ=designZ,geno.setup=gs,
two.stage=1,alpha=hapfit$haplo.alpha) 
### MLE starting value alpha

summary(out);
plot(out,sim.ci=1) ### plots baseline with confidence bands
plot(out,score=1)  ### score process test for proportionality
out$haplo.freq     ### haplotype frequencies in model 

### predicts survival with Z=0 and with 0 and 2 number of "0,0"
pout<-predict(out,X=rbind(c(1,1),c(1,1)),Z=rbind(c(0,0,0),c(0,0,2)))
plot(pout)                 ### the two survival curves
plot(pout,uniform=1,se=1)  ### with 95 % confidence bands and intervals

########################################################################
#### structured haplotype model ########################################
########################################################################

X<-matrix(1,3,1); 
hapfit<-haplo.freqs(g,geno.setup=gs,haplo.design=X)

out<-haplo.surv(Surv(time,status)~x2+prop(z1)+prop(z2),data=simdata,
designfuncX=designX,designfuncZ=designZ,geno.setup=gs,
two.stage=1,haplo.design=X,alpha=hapfit$haplo.alpha)
summary(out);
out$haplo.freq ### haplotype frequencies in model 
par(mfrow=c(1,2))
plot(out,sim.ci=1) 

###############################################################
### Fits additive hazard model ################################
###############################################################

### design for baseline, does not depend on haplotype
designXa<-function(x,z,h) 
{ 
h<-round(h); 
### counts number of haplotype 0 = "0,0" see gs for coding
zh<-(h[1]==0)+(h[2]==0) 
y<-c(x,zh) 
return(y)
}

### proportional part, returns dummy with fixed effect 0
designZa<-function(x,z,h) 
{ 
return(z)
}

### Nit=1 because we start with the correct MLE haplotype estimates 

out<-haplo.surv(Surv(time,status)~x2,data=simdata,
designfuncX=designXa,designfuncZ=designZa,geno.setup=gs,
fix.beta=1,two.stage=1,haplo.design=X,alpha=hapfit$haplo.alpha)
summary(out);
par(mfrow=c(1,3)); 
plot(out,sim.ci=1) ### plots baseline with confidence bands

### predicts survival with Z=0 and with 0 and 2 number of "0,0"
pout<-predict(out,X=rbind(c(1,1,0),c(1,1,2)),Z=rbind(0,0))
par(mfrow=c(1,3))
plot(pout,multiple=1,se=0,uniform=0,col=2:3) 
plot(pout,uniform=1,se=1)   
par(mfrow=c(1,2))
plot(pout,multiple=1,uniform=1,col=2:3,transparency=1) 
plot(pout,multiple=1,uniform=1,col=2:3,transparency=2)   

HaploSurvival documentation built on May 2, 2019, 5:49 p.m.