ah.2ph: Fit Additive Hazards Regression Models to Two-phase Sampling


View source: R/ah2phase.R

Description

The function fits a semiparametric additive hazards model

λ(t|Z=z) = λ_0(t) + β'z

to two-phase sampling data. The estimation procedure follows Hu (2014).
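As a rough illustration of the model form only (not of the estimation procedure), the sketch below evaluates the additive hazard for a single covariate vector. The baseline hazard `lambda0`, the coefficients in `beta`, and the covariate values in `z` are made-up numbers chosen for illustration; none of them come from the package or its data.

```r
# Hypothetical baseline hazard: constant 0.02 per unit time (an assumption)
lambda0 <- function(t) rep(0.02, length(t))

# Hypothetical regression coefficients and covariate values
beta <- c(age = 0.001, histol = 0.05)
z    <- c(age = 3, histol = 1)

# Additive hazards model: lambda(t | Z = z) = lambda0(t) + beta'z
additive_hazard <- function(t, z, beta) lambda0(t) + sum(beta * z)

# Hazard at three time points for a subject with covariates z
haz <- additive_hazard(c(1, 2, 5), z, beta)
```

Under this model the covariates shift the hazard by an additive amount, rather than multiplying it as in a proportional hazards model.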

Usage

ah.2ph(
  formula,
  data,
  R,
  Pi = NULL,
  weights = NULL,
  ties,
  robust = FALSE,
  calibration.variables = NULL,
  seed = 20,
  ...
)

Arguments

formula

a formula object for the regression model of the form response ~ predictors. The outcome is a survival object created by Surv.

data

a data frame. Input dataset.

R

a phase II membership indicator. A vector of values of 0 and 1. The subject is selected to phase II if R = 1.

Pi

the probability that a subject is selected into the phase II subsample.

weights

the weight assigned to each individual; the inverse of the selection probability.

ties

a string. If ties = 'break' and there are ties in the survival times, a small random number is added to the survival times to break the ties.

robust

a logical variable. Robust standard errors are provided if robust = TRUE.

calibration.variables

a vector of strings giving column names of the data. These variables must be available for every observation. They are used to calibrate the weight assigned to each subject.

seed

an integer. The seed used to generate the random increments when breaking ties. The default is 20.

...

additional arguments to be passed to the low level regression fitting functions.
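The relationship between Pi and weights, and the idea behind ties = 'break', can be sketched in base R as follows. This is an illustration under assumed values, not the package's internal code: the probabilities, the survival times, and the size of the random increment (`1e-6`) are all hypothetical.

```r
# Hypothetical phase II selection probabilities for five subjects
Pi <- c(1, 0.5, 0.5, 0.25, 1)

# Inverse-probability weights, matching the description of `weights`
wts <- 1 / Pi

# Illustration of tie breaking: add a tiny random increment to the
# survival times, with a fixed seed so the result is reproducible.
# The increment size is an assumption, not the value used by ah.2ph.
time <- c(2, 2, 3, 5, 5)
set.seed(20)
time_jittered <- time + runif(length(time), min = 0, max = 1e-6)
```

The increment is kept far smaller than the gaps between distinct times so that the ordering of the untied observations is unchanged.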

Value

An object of class 'ah.2ph' representing the fit.

Note

This function estimates both model-based and robust standard errors. It can be used to analyze case-cohort studies with subsampling among cases. It also allows weight calibration with auxiliary information from the full cohort (the phase I sample). Because calibration uses more of the available information, it can improve the precision of predictions and of the estimated regression coefficients.
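To give a feel for what weight calibration does, the sketch below applies a generic linear calibration to simulated data: the phase II weights are adjusted so that the weighted phase II totals of the auxiliary variables match the known phase I totals. This is a textbook technique used here for illustration; the calibration actually performed by ah.2ph follows Hu (2014) and may differ. All data and variable names are simulated and hypothetical.

```r
set.seed(1)
N   <- 200                    # phase I (full cohort) size
age <- runif(N, 0, 10)        # auxiliary variable known for everyone
Pi  <- rep(0.3, N)            # hypothetical selection probability
R   <- rbinom(N, 1, Pi)       # phase II membership indicator
w   <- 1 / Pi                 # design weights

X      <- cbind(1, age)       # calibration variables (with intercept)
totals <- colSums(X)          # phase I totals to be matched
Xs <- X[R == 1, , drop = FALSE]
ws <- w[R == 1]

# Linear calibration: w_cal = w * (1 + x'lambda), with lambda chosen so
# that the calibrated phase II totals equal the phase I totals
lambda <- solve(t(Xs) %*% (ws * Xs), totals - colSums(ws * Xs))
w_cal  <- ws * as.vector(1 + Xs %*% lambda)

# Weighted phase II totals after calibration
cal_totals <- colSums(w_cal * Xs)
```

After calibration, `cal_totals` agrees with `totals` up to numerical error, whereas the totals under the original weights `ws` generally do not.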

References

Jie Hu (2014) A Z-estimation System for Two-phase Sampling with Applications to Additive Hazards Models and Epidemiologic Studies. Dissertation, University of Washington.

See Also

predict.ah.2ph for prediction based on a fitted additive hazards model with two-phase sampling, and nwtsco for a description of the nwtsco dataset.

Examples

library(survival)
library(addhazard)  # provides ah.2ph and the nwts2ph example data
### fit an additive hazards model to two-phase sampling data without calibration
fit1 <- ah.2ph(Surv(trel,relaps) ~ age + histol, data = nwts2ph, R = in.ph2, Pi = Pi,
 robust = FALSE,  ties = 'break')
summary(fit1)

### use weights instead of the selection probabilities Pi in the input
fit1 <- ah.2ph(Surv(trel,relaps) ~ age + histol, data = nwts2ph, R = in.ph2, weights = wts,
 robust = FALSE,  ties = 'break')
summary(fit1)

### fit an additive hazards model with calibration on age
fit2 <- ah.2ph(Surv(trel,relaps) ~ age + histol, data = nwts2ph, R = in.ph2, 
            Pi = Pi, robust = FALSE, ties = 'break', calibration.variables = 'age')
summary(fit2)

### calibrate on age squared
### note: if users create a calibration variable, then
### the new variable should be added to the original data frame
nwts2ph$age2 <- nwts2ph$age^2
fit3 <- ah.2ph(Surv(trel,relaps) ~ age + histol,  data = nwts2ph, 
 R = in.ph2, Pi = Pi, robust = FALSE, ties = 'break', calibration.variables = 'age2')
summary(fit3)

#############################################################################
## When phase II samples are obtained by finite sampling
#############################################################################

### calculate the phase I and phase II sample sizes for each stratum
strt.size <- table(nwts2ph$strt)
ph2.strt.size <- table(subset(nwts2ph, in.ph2 == 1)$strt)
### fit an additive hazards model with finite stratified sampling
### calculate the sampling fractions
frac <- ph2.strt.size/strt.size
### treat the problem as Bernoulli sampling coupled with calibration on strata sizes,
### using frac as the sampling probabilities
nwts2ph_by_FPSS <- nwts2ph
nwts2ph_by_FPSS$Pi <- NULL
for (i in 1:length(strt.size)){
  nwts2ph_by_FPSS$Pi[nwts2ph_by_FPSS$strt ==i] <- frac[i]
}

### create stratum indicators strt1, strt2, ..., which become our calibration variables
for (i in seq_along(strt.size)){
   nwts2ph_by_FPSS[[paste0('strt', i)]] <- as.numeric(nwts2ph_by_FPSS$strt == i)
}
### fit an additive hazards model with finite sampling
fit4 <- ah.2ph(Surv(trel,relaps) ~ age + histol,
                                   data = nwts2ph_by_FPSS, 
                                   R = in.ph2, Pi = Pi,
                                   robust = FALSE,
                                   ties = 'break',
                                   calibration.variables = c('strt1','strt2','strt3'))
summary(fit4)

katehu/addhazard documentation built on July 20, 2020, 5:06 a.m.