Description
The sleete function uses a super learner to minimize the variance of an augmented estimator of a specified treatment effect measure in a randomized clinical trial. It returns a matrix of point estimates and standard errors for the super learner as well as individual algorithms in the super learner library, with or without sample splitting.
Usage

sleete(y, t, X, pi, bounds, method, ..., SL.library, cv, cf)

Arguments
y

Outcome data represented as a vector (for a univariate outcome) or a matrix (for a right-censored survival outcome or multiple outcomes to be analyzed together). For a right-censored survival outcome, y is a matrix with two columns: observed time followed by event type (1 = failure; 0 = censoring).

t

A vector of 1s and 0s representing treatment assignment. The values 1 and 0 represent the experimental and control treatments, respectively. The length of t should be equal to the number of subjects.

X

A matrix of baseline covariates that may be related to y in one or both treatment groups. The number of rows in X should be equal to the number of subjects; the number of columns is the number of covariates. There is no need to include a column of 1s in X.

pi

The probability of receiving the experimental treatment, usually known in a randomized clinical trial. If missing, it will be replaced by the proportion of study subjects who were assigned to the experimental treatment.

bounds

Known lower and upper bounds, if any, for the treatment effect measure to be estimated. For example, if the effect measure is a difference between two probabilities, the natural bounds are c(-1,1).

method

A list of two mandatory components and one optional component specifying the (unadjusted) method for estimating the treatment effect of interest. The two mandatory components are pt.est, a function that computes the point estimate, and inf.fct.avail, a logical value indicating whether an analytical influence function is available. The optional component is inf.fct, a function that computes the influence function (see the method definitions in the Examples section).

...

If specified, such optional arguments (e.g., tau for the survival methods in the Examples section) will be fed into the specified method.

SL.library

A character vector of SuperLearner wrapper functions for the prediction algorithms that comprise the super learner library. A full list of wrapper functions included in the SuperLearner package can be found with listWrappers().

cv

The number of folds in the cross-validation for the super learner.

cf

The number of folds in the sample splitting (cross-fitting) procedure.
Details

Currently, there are eight built-in methods available for method. Four of them are for fully observed univariate outcomes: mean.diff for the difference between two means or proportions, log.ratio for the log-ratio of two means or proportions, log.odds.ratio for the log-odds-ratio of two proportions, and wmw for the Wilcoxon-Mann-Whitney (WMW) effect (Zhang et al., 2019), the default version of which is also known as the win-lose probability difference. The other four methods are for right-censored survival outcomes: wmw.cens for the WMW effect for restricted survival times, surv.diff for the difference between two survival probabilities, mrst.diff (or rmst.diff) for the difference in mean restricted survival time, and log.haz.ratio for the log-hazard-ratio. The methods for right-censored survival outcomes are implemented without an analytical influence function (i.e., inf.fct.avail=FALSE). Users can define their own methods under the same guidelines. For illustration, the current definitions of the log.odds.ratio and wmw methods are provided below as examples.
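As an additional illustration (not taken from the package source), a method for the difference between two means could be sketched in the same format; the built-in mean.diff may differ in detail:

```r
# a sketch of a method for the difference between two means/proportions,
# structured like the method definitions shown in the Examples section;
# the built-in mean.diff in sleete may differ in detail
pt.est.md = function(y, t) mean(y[t>0.5]) - mean(y[t<0.5])
inf.fct.md = function(y, t, I=1:length(t), J=I, pi=NULL) {
  if (is.null(pi)) pi = mean(t[I])
  m1 = mean(y[I][t[I]>0.5]); m0 = mean(y[I][t[I]<0.5])
  (t[J]*(y[J]-m1)/pi) - ((1-t[J])*(y[J]-m0)/(1-pi))
}
md.sketch = list(pt.est=pt.est.md, inf.fct.avail=TRUE, inf.fct=inf.fct.md)
```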
Value

A matrix with two columns: point estimates of the treatment effect of interest and their standard errors. The number of rows is 2K+3, where K is the length of SL.library. The first row is for the unadjusted estimate as specified in the method argument. The next K+1 rows are for augmented estimates based on the individual algorithms in the super learner library (in the original order) followed by the super learner itself, all without sample splitting. The last K+1 rows are for augmented estimates based on the same set of algorithms (in the same order) with sample splitting. The standard error for the unadjusted estimate is based on the (analytical or empirical) influence function. The standard errors for the augmented estimates are cross-validated in the sample splitting procedure. Thus, the two subsets of augmented estimates (with and without sample splitting) share the same set of cross-validated standard errors.
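For instance, with a hypothetical two-algorithm library (K = 2), the 2K+3 = 7 rows would be ordered as sketched below; the labels are illustrative, not the exact row names returned by sleete:

```r
# illustrative row layout of the returned matrix for a hypothetical
# library of K = 2 algorithms (labels are not sleete's actual dimnames)
SL.library = c("SL.glm", "SL.ranger")              # hypothetical library
K = length(SL.library)
rows = c("unadjusted",                             # row 1: unadjusted estimate
         paste0(c(SL.library, "SL"), ".nosplit"),  # rows 2..K+2: no sample splitting
         paste0(c(SL.library, "SL"), ".split"))    # rows K+3..2K+3: with sample splitting
length(rows)                                       # 2*K + 3 = 7
```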
References

Zhang Z, Ma S (2019). Machine learning methods for leveraging baseline covariate information to improve the efficiency of clinical trials. Statistics in Medicine, 38(10), 1703-1714.
Zhang Z, Ma S, Shen C, Liu C (2019). Estimating Mann-Whitney-type causal effects. International Statistical Review, 87(3), 514-530.
Zhang Z, Li W, Zhang H (2020). Efficient estimation of Mann-Whitney-type effect measures for right-censored survival outcomes in randomized clinical trials. Statistics in Biosciences, 12(2), 246-262.
See Also

See SuperLearner for details on SL.library and family.
Examples

# analysis of colon cancer data in the survival package
library(survival)
library(sleete)
data(colon)
dim(colon); names(colon)
colon.data <- na.omit(subset(colon, subset=((etype==2)&(rx!="Lev")),
select = c(rx, time, status, sex, age, obstruct, perfor,
adhere, nodes, node4, surg, differ, extent)))
dim(colon.data)
attach(colon.data)
t = as.numeric(rx=="Lev+5FU")
y = cbind(time, status)
X = cbind(sex, age, obstruct, perfor, adhere, nodes, node4, surg, differ, extent)
detach()
pi = 0.5; tau = 5*365
sleete(y, t, X, pi=pi, method=surv.diff, bounds=c(-1,1), tau=tau)
sleete(y, t, X, pi=pi, method=mrst.diff, tau=tau)
sleete(y, t, X, pi=pi, method=wmw.cens, bounds=c(-1,1), tau=tau)
# the log-odds-ratio method
# logit = log-odds
logit = function(p) log(p/(1-p))
# point estimate
pt.est.log.or = function(y, t) logit(mean(y[t>0.5]))-logit(mean(y[t<0.5]))
# influence function estimated from subjects in set I
# then applied to subjects in set J
inf.fct.log.or = function(y, t, I=1:length(t), J=I, pi=NULL) {
if (is.null(pi)) pi = mean(t[I])
p1 = mean(y[I][t[I]>0.5]); p0 = mean(y[I][t[I]<0.5])
(t[J]*(y[J]-p1)/(pi*p1*(1-p1)))-((1-t[J])*(y[J]-p0)/((1-pi)*p0*(1-p0)))
}
log.odds.ratio = list(pt.est=pt.est.log.or, inf.fct.avail=TRUE, inf.fct=inf.fct.log.or)
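# quick check of the log.odds.ratio components (illustrative toy data,
# not part of the original example): p1 = 2/3 and p0 = 1/3, so the
# log-odds-ratio is 2*log(2)
yy = c(1,0,1, 1,0,0); tt = c(1,1,1, 0,0,0)
pt.est.log.or(yy, tt)           # 2*log(2), approximately 1.386
mean(inf.fct.log.or(yy, tt))    # the estimated influence function averages to 0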
# the wmw method with an arbitrary h (default = h0)
# Agresti definition of h
h0 = function(y1, y0) as.numeric(y1>y0)-as.numeric(y1<y0)
# Mann-Whitney definition of h
h1 = function(y1, y0) as.numeric(y1>y0)+0.5*as.numeric(y1==y0)
# point estimate
pt.est.wmw = function(y, t, h=h0) mean(outer(y[t>0.5], y[t<0.5], FUN=h))
# influence function estimated from subjects in set I
# then applied to subjects in set J
inf.fct.wmw = function(y, t, I=1:length(t), J=I, pi=NULL, h=h0) {
if (is.null(pi)) pi = mean(t[I])
theta = pt.est.wmw(y[I],t[I],h=h)
m = length(J); inf = numeric(m)
for (k in 1:m) {
if (t[J[k]]>0.5) {
inf[k] = (mean(h(y[J[k]],y[I]))-theta)/pi
} else {
inf[k] = (mean(h(y[I],y[J[k]]))-theta)/(1-pi)
}
}
inf
}
wmw = list(pt.est=pt.est.wmw, inf.fct.avail=TRUE, inf.fct=inf.fct.wmw)
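# quick check of pt.est.wmw (illustrative toy data, not part of the
# original example): treated y = (3,2), control y = (1,2)
yy = c(3,2, 1,2); tt = c(1,1, 0,0)
pt.est.wmw(yy, tt)          # h0: mean of (1,1,1,0) = 0.75
pt.est.wmw(yy, tt, h=h1)    # h1: mean of (1,1,1,0.5) = 0.875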