Nothing
#' Single-index models with a surface-link (main function)
#'
#' \code{simsl} is the wrapper function for fitting a single-index model with a surface-link (SIMSL).
#' The function estimates a linear combination (a single-index) of baseline covariates X, and models a nonlinear interactive structure between the single-index and a treatment variable defined on a continuum, by estimating a smooth link function on the index-treatment domain. The resulting \code{simsl} object can be used to estimate an optimal dose rule for a new patient with baseline clinical information.
#'
#' SIMSL captures the effect of covariates via a single-index and their interaction with the treatment via a 2-dimensional smooth link function.
#' Interaction effects are determined by shapes of the link surface.
#' The SIMSL allows comparing different individual treatment levels and constructing individual treatment rules,
#' as functions of a biomarker signature (single-index), efficiently utilizing information on patient’s characteristics.
#' The resulting \code{simsl} object can be used to estimate an optimal dose rule for a new patient with baseline clinical information.
#'
#'
#' @param y a n-by-1 vector of treatment outcomes; y is a member of the exponential family; any distribution supported by \code{mgcv::gam}; y can also be an ordinal categorial response with \code{R} categories taking a value from 1 to \code{R}.
#' @param A a n-by-1 vector of treatment variable; each element is assumed to take a value on a continuum.
#' @param X a n-by-p matrix of baseline covarates.
#' @param Xm a n-by-q design matrix associated with an X main effect model; the defult is \code{NULL} and it is taken as a vector of zeros
#' @param family specifies the distribution of y; e.g., "gaussian", "binomial", "poisson"; can be any family supported by \code{mgcv::gam}; can also be "ordinal", for an ordinal categorical response y.
#' @param R the number of response categories for the case of family = "ordinal".
#' @param bs basis type for the treatment (A) and single-index domains, respectively; the defult is "ps" (p-splines); any basis supported by \code{mgcv::gam} can be used, e.g., "cr" (cubic regression splines); see \code{mgcv::s} for detail.
#' @param k basis dimension for the treatment (A) and single-index domains, respectively.
#' @param m a length 2 list (e.g., m=list(c(2,3), c(2,2))), for the treatment (A) and single-index domains, respectively, where each element specifies the order of basis and penalty (note, for bs="ps", c(2,3) means a 2nd order P-spline basis (cubic spline) and a 3rd order difference penalty; the default "NA" sets c(2,2) for each domain); see \code{mgcv::s} for details.
#' @param sp a vector of smoothing parameters; Smoothing parameters must be supplied in the order that the smooth terms appear in the model formula (i.e., A, and then the single-index); negative elements indicate that the parameter should be estimated, and hence a mixture of fixed and estimated parameters is possible; see \code{mgcv::gam} for detail.
#' @param knots a list containing user-specified knot values to be used for basis construction, for the treatment (A) and single-index domains, respectively.
#' @param sep.A.effect If \code{TRUE}, the g term of SIMSL is further decomposed into: the A main effect + the A-by-X interaction effect; the default is \code{FALSE}.
#' @param mc a length 2 vector indicating which marginals (i.e., A and the single-index, respectively) should have centering (i.e., the sum-to-zero) constraints applied; the default is \code{mc = c(TRUE, FALSE)} (see \code{mgcv::te} for detail of the constraint), which is sufficient for the so-called "orthogonality" constraint of the SIMSL.
#' @param method the smoothing parameter estimation method; "GCV.Cp" to use GCV for unknown scale parameter and Mallows' Cp/UBRE/AIC for known scale; any method supported by \code{mgcv::gam} can be used.
#' @param beta.ini an initial value for \code{beta.coef}; a p-by-1 vector; the defult is \code{NULL}, in which case a linear model estimate is used.
#' @param ind.to.be.positive for identifiability of the solution \code{beta.coef}, the user can restrict the jth (e.g., j=1) component of \code{beta.coef} to be positive; by default, we match the "overall" sign of \code{beta.coef} with that of the linear estimate (i.e., the initial estimate), by restricting the inner product between the two to be positive.
#' @param random.effect if \code{TRUE}, as part of the main effects, the user can incorporate z-specific random intercepts.
#' @param z a factor that specifies the random intercepts when \code{random.effect = TRUE}.
#' @param gamma increase this beyond 1 to produce smoother models. \code{gamma} multiplies the effective degrees of freedom in the GCV or UBRE/AIC (see \code{mgcv::gam} for detail); the default is 1.
#' @param pen.order 0 indicates the ridge penalty; 1 indicates the 1st difference penalty; 2 indicates the 2nd difference penalty, used in a penalized least squares (LS) estimation of \code{beta.coef}.
#' @param lambda a regularization parameter associated with the penalized LS for \code{beta.coef} update.
#' @param max.iter an integer specifying the maximum number of iterations for \code{beta.coef} update.
#' @param eps.iter a value specifying the convergence criterion of algorithm.
#' @param trace.iter if \code{TRUE}, trace the estimation process and print the differences in \code{beta.coef}.
#' @param center.X if \code{TRUE}, center X to have zero mean.
#' @param scale.X if \code{TRUE}, scale X to have unit variance.
#' @param uncons.final.fit if \code{TRUE}, once the convergence in the estimates of \code{beta.coef} is reached, include the main effect associated with the fitted single-index (beta.coef'X) to the final surface-link estimate.
#' @param bootstrap if \code{TRUE}, compute bootstrap confidence intervals for the single-index coefficients, \code{beta.coef}; the default is \code{FALSE}.
#' @param boot.conf a value specifying the confidence level of the bootstrap confidence intervals; the defult is \code{boot.conf = 0.95}.
#' @param nboot when \code{bootstrap=TRUE}, a value specifying the number of bootstrap replications.
#' @param seed when \code{bootstrap=TRUE}, randomization seed used in bootstrap resampling.
#'
#'
#' @return a list of information of the fitted SIMSL including
#' \item{beta.coef}{ the estimated single-index coefficients.} \item{g.fit}{a \code{mgcv:gam} object containing information about the estimated 2-dimensional link function.} \item{beta.ini}{the initial value used in the estimation of \code{beta.coef}} \item{beta.path}{solution path of \code{beta.coef} over the iterations} \item{d.beta}{records the change in \code{beta.coef} over the solution path, \code{beta.path}} \item{X.scale}{sd of pretreatment covariates X} \item{X.center}{mean of pretreatment covariates X} \item{A.range}{range of the observed treatment variable A} \item{p}{number of baseline covariates X} \item{n}{number of subjects} \item{boot.ci}{\code{boot.conf}-level bootstrap CIs (LB, UB) associated with \code{beta.coef}} \item{boot.mat}{a (nboot x p) matrix of bootstrap estimates of \code{beta.coef}}
#'
#' @author Park, Petkova, Tarpey, Ogden
#' @import mgcv stats
#' @seealso \code{pred.simsl}, \code{fit.simsl}
#' @export
#'
#' @examples
#'
#'set.seed(1234)
#'n.test <- 500
#'
#'
#'## simulation 1
#'# generate training data
#'p <- 30
#'n <- 200
#'X <- matrix(runif(n*p,-1,1),ncol=p)
#'A <- runif(n,0,2)
#'D_opt <- 1 + 0.5*X[,2] + 0.5*X[,1]
#'mean.fn <- function(X, D_opt, A){ 8 + 4*X[,1] - 2*X[,2] - 2*X[,3] - 25*((D_opt-A)^2) }
#'mu <- mean.fn(X, D_opt, A)
#'y <- rnorm(length(mu),mu,1)
#'# fit SIMSL
#'simsl.obj <- simsl(y=y, A=A, X=X)
#'
#'# generate testing data
#'X.test <- matrix(runif(n.test*p,-1,1),ncol=p)
#'A.test <- runif(n.test,0,2)
#'f_opt.test <- 1 + 0.5*X.test[,2] + 0.5*X.test[,1]
#'pred <- pred.simsl(simsl.obj, newX= X.test) # make prediction based on the estimated SIMSL
#'value <- mean(8 + 4*X.test[,1] - 2*X.test[,2] - 2*X.test[,3] - 25*((f_opt.test- pred$trt.rule)^2))
#'value # "value" of the estimated treatment rule; the "oracle" value is 8.
#'
#'
#'## simulation 2
#'p <- 10
#'n <- 400
#'# generate training data
#'X <- matrix(runif(n*p,-1,1),ncol=p)
#'A <- runif(n,0,2)
#'f_opt <- I(X[,1] > -0.5)*I(X[,1] < 0.5)*0.6 + 1.2*I(X[,1] > 0.5) +
#' 1.2*I(X[,1] < -0.5) + X[,4]^2 + 0.5*log(abs(X[,7])+1) - 0.6
#'mu <- 8 + 4*cos(2*pi*X[,2]) - 2*X[,4] - 8*X[,5]^3 - 15*abs(f_opt-A)
#'y <- rnorm(length(mu),mu,1)
#'Xq <- cbind(X, X^2) # include a quadratic term
#'# fit SIMSL
#'simsl.obj <- simsl(y=y, A=A, X=Xq)
#'
#'# generate testing data
#'X.test <- matrix(runif(n.test*p,-1,1),ncol=p)
#'A.test <- runif(n.test,0,2)
#'f_opt.test <- I(X.test[,1] > -0.5)*I(X.test[,1] < 0.5)*0.6 + 1.2*I(X.test[,1] > 0.5) +
#' 1.2*I(X.test[,1] < -0.5) + X.test[,4]^2 + 0.5*log(abs(X.test[,7])+1) - 0.6
#'Xq.test <- cbind(X.test, X.test^2)
#'pred <- pred.simsl(simsl.obj, newX= Xq.test) # make prediction based on the estimated SIMSL
#'value <- mean(8 + 4*cos(2*pi*X.test[,2]) - 2*X.test[,4] - 8*X.test[,5]^3 -
#' 15*abs(f_opt.test-pred$trt.rule))
#'value # "value" of the estimated treatment rule; the "oracle" value is 8.
#'
#'
#'\donttest{
#' ### air pollution data application
#' data(chicago); head(chicago)
#' chicago <- chicago[,-3][complete.cases(chicago[,-3]), ]
#' chicago <- chicago[-c(2856:2859), ] # get rid of the gross outliers in y
#' chicago <- chicago[-which.max(chicago$pm10median), ] # get rid of the gross outliers in x
#'
#' # create lagged variables
#' lagard <- function(x,n.lag=5) {
#' n <- length(x); X <- matrix(NA,n,n.lag)
#' for (i in 1:n.lag) X[i:n,i] <- x[i:n-i+1]
#' X
#' }
#' chicago$pm10 <- lagard(chicago$pm10median)
#' chicago <- chicago[complete.cases(chicago), ]
#' # create season varaible
#' chicago$time.day <- round(chicago$time %% 365)
#'
#' # fit SIMSL for modeling the season-by-pm10 interactions on their effects on outcomes
#' simsl.obj <- simsl(y=chicago$death, A=chicago$time.day, X=chicago[,7], bs=c("cc","ps"),
#' ind.to.be.positive = 1, family="poisson", method = "REML",
#' bootstrap =FALSE) # bootstrap = TRUE
#' simsl.obj$beta.coef # the estimated single-index coefficients
#' summary(simsl.obj$g.fit)
#' #round(simsl.obj$boot.ci,3)
#' mgcv::vis.gam(simsl.obj$g.fit, view=c("A","single.index"), theta=-135, phi = 30,
#' color="heat", se=2,ylab = "single-index", zlab = " ",
#' main=expression(paste("Interaction surface g")))
#'
#'
#'
#' ### Warfarin data application
#' data(warfarin)
#' X <- warfarin$X
#' A <- warfarin$A
#' y <- -abs(warfarin$INR - 2.5) # the target INR is 2.5
#' X[,1:3] <- scale(X[,1:3]) # standardize continuous variables
#'
#' # Estimate the main effect, using an additive model
#' mu.fit <- mgcv::gam(y-mean(y) ~ X[, 4:13] +
#' s(X[,1], k=5, bs="ps")+
#' s(X[,2], k=5, bs="ps") +
#' s(X[,3], k=5, bs="ps"), method="REML")
#' summary(mu.fit)
#' mu.hat <- predict(mu.fit)
#' # fit SIMSL
#' simsl.obj <- simsl(y, A, X, Xm= mu.hat, scale.X = FALSE, center.X=FALSE, method="REML",
#' bootstrap = FALSE) # bootstrap = TRUE
#' simsl.obj$beta.coef
#' #round(simsl.obj$boot.ci,3)
#' mgcv::vis.gam(simsl.obj$g.fit, view=c("A","single.index"), theta=52, phi = 18,
#' color="heat", se=2, ylab = "single-index", zlab = "Y",
#' main=expression(paste("Interaction surface g")))
#'}
#'
simsl <- function(y, # a n-by-1 vector of treatment outcomes; y is a member of the exponential family; any distribution supported by mgcv::gam.
A, # a n-by-1 vector of treatment (i.e., dose) variable (assumed to be on continumm).
X, # a n-by-p matrix of covariates.
Xm = NULL, # a n-by-q design matrix assocaited with the X main effect.
family = "gaussian", # specifies the distribution of y; e.g., "gaussian", "binomial", "poisson"; can be any family supported by mgcv::gam; can also be "ordinal", for an ordianal categorical response y.
R = NULL, # for the case of family = "ordinal", the number of response catergories.
bs = c("ps", "ps"), # basis type for the treatment (A) and the single-index domains, respectively; the defult is "ps" (p-splines); any basis supported by mgcv::gam can be used, e.g., "cr" (cubic regression splines).
k = c(8, 8), # basis dimension for the treatment (A) and the single-index domains, respectively.
m = list(NA, NA), # a list of length 2 (e.g., m=list(c(2,3), c(2,2))), each element pecifying the order of basis and penality (for bs="ps", c(2,3) specifies a 2nd order P-spline basis (cubic spline) with a 3rd order difference penalty; the default (i.e., "NA") sets c(2,2) to each domain), for the treatment (A) and the single-index domains, respectively; see mgcv::s for details.
sp = NULL, # a vector of smoothing parameters; Smoothing parameters must be supplied in the order that the smooth terms appear in the model formula (i.e., A and then the single-index); negative elements indicate that the parameter should be estimated, and hence a mixture of fixed and estimated parameters is possible; see mgcv::gam for detail.
knots = NULL, # a list containing user-specified knot values to be used for basis construction, for the treatment (A) and the single-index domains, respectively.
sep.A.effect = FALSE, # If TRUE, the g term of SIMSL is further decomposed into: the A main effect term + the A-by-X interaction effect term.
mc = c(TRUE, FALSE), # a length 2 vector indicating which marginals (i.e., A and the single-index, respectively) should have centering (i.e., sum-to-zero) constraints applied; the default is mc = c(TRUE, FALSE) (see mgcv::te for detail of the constraint), which is sufficient for the so-called "orthogonality" constraint of the SIMSL.
method = "GCV.Cp", # the smoothing parameter estimation method; can be "GCV.Cp", "REML" and "ML"; see mgcv::gam for detail.
beta.ini = NULL, # an initial value for beta.coef; a p-by-1 vector; the defult is NULL.
ind.to.be.positive = NULL, # for identifiability of the solution beta.coef, we can restrict the jth (e.g., j=1) component of beta.coef to be positive; by default, we match the "overall" sign of the single-index with that of the linear estimate (i.e., the initial estimate), by keeping the inner product between the two to be positive.
random.effect = FALSE, # if TRUE, incorporate the z-specific random intercepts, as part of the main effects.
z = NULL, # a factor that specifies random intercepts, if random.effect = TRUE.
gamma =1, # increase this beyond 1 to produce smoother models. gamma multiplies the effective degrees of freedom in the GCV or UBRE/AIC (see mgcv::gam for detail).
pen.order = 0, # pen.order 0 indicates the ridge penalty; 1 indicates the 1st difference penalty; 2 indicates the 2nd difference penalty, used in a penalized least squares (LS) estimation of beta.coef.
lambda = 0, # a regularziation parameter associated with the penalized LS of beta.coef.
max.iter = 10, # an integer specifying the maximum number of iterations for beta.coef update.
eps.iter = 0.01, # a value specifying the convergence criterion of algorithm.
trace.iter = TRUE, # if TRUE, trace the estimation process and print the differences in beta.coef.
center.X=TRUE, # if TRUE, center X to have zero mean.
scale.X=TRUE, # if TRUE, scale X to have unit variance.
uncons.final.fit = TRUE, # if TRUE, once the convergece in the estimates of beta.coef is reached, include the main effect associated with the fitted single-index (beta.coef'X) to the final surface-link estimate.
bootstrap = FALSE, # if TRUE, compute bootstrap confidence intervals for the single-index coefficients, beta.coef; the default is FALSE.
nboot= 200, # when bootstrap=TRUE, a value specifying the number of bootstrap replications.
boot.conf = 0.95, #a value specifying the confidence level of the bootstrap confidence intervals; the defult is boot.conf = 0.95.
seed= 1357) # when bootstrap=TRUE, randomization seed used in bootstrap resampling.
{
simsl.obj <- fit.simsl(y=y, A=A, X=X, Xm=Xm,
family=family, R=R,
bs =bs, k = k, m=m, knots=knots, sp=sp,
sep.A.effect=sep.A.effect,
mc=mc, method=method,
beta.ini = beta.ini,
ind.to.be.positive=ind.to.be.positive,
random.effect=random.effect, z=z,
pen.order = pen.order, lambda = lambda,
max.iter = max.iter, gamma=gamma, trace.iter=trace.iter,
center.X= center.X, scale.X= scale.X,
uncons.final.fit= uncons.final.fit)
boot.mat = boot.ci <- NULL
if(bootstrap){
set.seed(seed)
indices <- 1:simsl.obj$n
if(is.null(Xm)) Xm <- rep(0,simsl.obj$n)
if(is.null(z)) z <- rep(0,simsl.obj$n)
Xm <- as.matrix(Xm)
boot.mat <- matrix(0, nboot, simsl.obj$p)
for(i in 1:nboot){
boot.indices <- sample(indices, simsl.obj$n, replace = TRUE)
tmp <- fit.simsl(y=y[boot.indices], A = A[boot.indices], X = X[boot.indices,],
Xm = Xm[boot.indices,],
family=family, R=R,
bs =bs, k = k, m=m, knots=knots, sp= sp,
sep.A.effect=sep.A.effect,
mc=mc, method= method,
beta.ini = beta.ini,
ind.to.be.positive=ind.to.be.positive,
random.effect= random.effect, z=z[boot.indices],
pen.order = pen.order, lambda = lambda, max.iter = max.iter, gamma=gamma, trace.iter=trace.iter,
center.X= center.X, scale.X= scale.X,
uncons.final.fit = uncons.final.fit
)$beta.coef
if(simsl.obj$beta.coef%*%tmp > simsl.obj$beta.coef%*%(-tmp)){
boot.mat[i,] <- tmp
}else{
boot.mat[i,] <- -tmp
}
}
boot.mat.abs <- abs(boot.mat)
var.t0.abs <- apply(boot.mat.abs, 2, var)
boot.ci <- cbind(simsl.obj$beta.coef-qnorm((1+boot.conf)/2)*sqrt(var.t0.abs),
simsl.obj$beta.coef+qnorm((1+boot.conf)/2)*sqrt(var.t0.abs))
boot.ci <- cbind(simsl.obj$beta.coef, boot.ci, (boot.ci[,1] > 0 | boot.ci[,2] < 0) )
colnames(boot.ci) <- c("coef", "LB", "UB", " ***")
rownames(boot.ci) <- colnames(X)
}
simsl.obj$boot.mat <- boot.mat
simsl.obj$boot.ci <- boot.ci
return(simsl.obj)
}
#' Single-index models with a surface-link (workhorse function)
#'
#' \code{fit.simsl} is the workhorse function for Single-index models with a surface-link (SIMSL).
#'
#' The function estimates a linear combination (a single-index) of covariates X, and captures a nonlinear interactive structure between the single-index and the treatment defined on a continuum via a smooth surface-link on the index-treatment domain.
#'
#' SIMSL captures the effect of covariates via a single-index and their interaction with the treatment via a 2-dimensional smooth link function.
#' Interaction effects are determined by shapes of the link function.
#' The model allows comparing different individual treatment levels and constructing individual treatment rules,
#' as functions of a biomarker signature (single-index), efficiently utilizing information on patient’s characteristics.
#' The resulting \code{simsl} object can be used to estimate an optimal dose rule for a new patient with pretreatment clinical information.
#'
#' @param y a n-by-1 vector of treatment outcomes; y is a member of the exponential family; any distribution supported by \code{mgcv::gam}; y can also be an ordinal categorial response with \code{R} categories taking a value from 1 to \code{R}.
#' @param A a n-by-1 vector of treatment variable; each element is assumed to take a value on a continuum.
#' @param X a n-by-p matrix of baseline covarates.
#' @param Xm a n-by-q design matrix associated with an X main effect model; the defult is \code{NULL} and it is taken as a vector of zeros
#' @param family specifies the distribution of y; e.g., "gaussian", "binomial", "poisson"; can be any family supported by \code{mgcv::gam}; can also be "ordinal", for an ordinal categorical response y.
#' @param R the number of response categories for the case of family = "ordinal".
#' @param bs basis type for the treatment (A) and single-index domains, respectively; the defult is "ps" (p-splines); any basis supported by \code{mgcv::gam} can be used, e.g., "cr" (cubic regression splines); see \code{mgcv::s} for detail.
#' @param k basis dimension for the treatment (A) and single-index domains, respectively.
#' @param m a length 2 list (e.g., m=list(c(2,3), c(2,2))), for the treatment (A) and single-index domains, respectively, where each element specifies the order of basis and penalty (note, for bs="ps", c(2,3) means a 2nd order P-spline basis (cubic spline) and a 3rd order difference penalty; the default "NA" sets c(2,2) for each domain); see \code{mgcv::s} for details.
#' @param sp a vector of smoothing parameters; Smoothing parameters must be supplied in the order that the smooth terms appear in the model formula (i.e., A, and then the single-index); negative elements indicate that the parameter should be estimated, and hence a mixture of fixed and estimated parameters is possible; see \code{mgcv::gam} for detail.
#' @param knots a list containing user-specified knot values to be used for basis construction, for the treatment (A) and single-index domains, respectively.
#' @param sep.A.effect If \code{TRUE}, the g term of SIMSL is further decomposed into: the A main effect + the A-by-X interaction effect; the default is \code{FALSE}.
#' @param mc a length 2 vector indicating which marginals (i.e., A and the single-index, respectively) should have centering (i.e., the sum-to-zero) constraints applied; the default is \code{mc = c(TRUE, FALSE)} (see \code{mgcv::te} for detail of the constraint), which is sufficient for the so-called "orthogonality" constraint of the SIMSL.
#' @param method the smoothing parameter estimation method; "GCV.Cp" to use GCV for unknown scale parameter and Mallows' Cp/UBRE/AIC for known scale; any method supported by \code{mgcv::gam} can be used.
#' @param beta.ini an initial value for \code{beta.coef}; a p-by-1 vector; the defult is \code{NULL}, in which case a linear model estimate is used.
#' @param ind.to.be.positive for identifiability of the solution \code{beta.coef}, the user can restrict the jth (e.g., j=1) component of \code{beta.coef} to be positive; by default, we match the "overall" sign of \code{beta.coef} with that of the linear estimate (i.e., the initial estimate), by restricting the inner product between the two to be positive.
#' @param random.effect if \code{TRUE}, as part of the main effects, the user can incorporate z-specific random intercepts.
#' @param z a factor that specifies the random intercepts when \code{random.effect = TRUE}.
#' @param gamma increase this beyond 1 to produce smoother models. \code{gamma} multiplies the effective degrees of freedom in the GCV or UBRE/AIC (see \code{mgcv::gam} for detail); the default is 1.
#' @param pen.order 0 indicates the ridge penalty; 1 indicates the 1st difference penalty; 2 indicates the 2nd difference penalty, used in a penalized least squares (LS) estimation of \code{beta.coef}.
#' @param lambda a regularization parameter associated with the penalized LS for \code{beta.coef} update.
#' @param max.iter an integer specifying the maximum number of iterations for \code{beta.coef} update.
#' @param eps.iter a value specifying the convergence criterion of algorithm.
#' @param trace.iter if \code{TRUE}, trace the estimation process and print the differences in \code{beta.coef}.
#' @param center.X if \code{TRUE}, center X to have zero mean.
#' @param scale.X if \code{TRUE}, scale X to have unit variance.
#' @param uncons.final.fit if \code{TRUE}, once the convergence in the estimates of \code{beta.coef} is reached, include the main effect associated with the fitted single-index (beta.coef'X) to the final surface-link estimate.
#'
#'
#' @return a list of information of the fitted SIMSL including
#' \item{beta.coef}{ the estimated single-index coefficients.} \item{g.fit}{a \code{mgcv:gam} object containing information about the estimated 2-dimensional link function as well as the X main effect model.} \item{beta.ini}{the initial value used in the estimation of \code{beta.coef}} \item{beta.path}{solution path of \code{beta.coef} over the iterations} \item{d.beta}{records the change in \code{beta.coef} over the solution path, \code{beta.path}} \item{X.scale}{sd of pretreatment covariates X} \item{X.center}{mean of pretreatment covariates X} \item{A.range}{range of the observed treatment variable A} \item{p}{number of baseline covariates X} \item{n}{number of subjects}
#'
#' @author Park, Petkova, Tarpey, Ogden
#' @import mgcv
#' @seealso \code{pred.simsl}, \code{fit.simsl}
#' @export
#'
fit.simsl <- function(y, # a n-by-1 vector of treatment outcomes; y is a member of the exponential family; any distribution supported by mgcv::gam.
A, # a n-by-1 vector of treatment (i.e., dose) variable (assumed to be on continumm).
X, # a n-by-p matrix of covariates.
Xm = NULL, # a n-by-q design matrix assocaited with the X main effect.
family = "gaussian", # specifies the distribution of y; e.g., "gaussian", "binomial", "poisson"; can be any family supported by mgcv::gam; can also be "ordinal", for an ordianal categorical response y.
R = NULL, # for the case of family = "ordinal", the number of response catergories.
bs = c("ps", "ps"), # basis type for the treatment (A) and the single-index domains, respectively; the defult is "ps" (p-splines); any basis supported by mgcv::gam can be used, e.g., "cr" (cubic regression splines).
k = c(8, 8), # basis dimension for the treatment (A) and the single-index domains, respectively.
m = list(NA, NA), # a list of length 2 (e.g., m=list(c(2,3), c(2,2))), each element pecifying the order of basis and penality (for bs="ps", c(2,3) specifies a 2nd order P-spline basis (cubic spline) with a 3rd order difference penalty; the default (i.e., "NA") sets c(2,2) to each domain), for the treatment (A) and the single-index domains, respectively; see mgcv::s for details.
sp = NULL, # a vector of smoothing parameters; Smoothing parameters must be supplied in the order that the smooth terms appear in the model formula (i.e., A and then the single-index); negative elements indicate that the parameter should be estimated, and hence a mixture of fixed and estimated parameters is possible; see mgcv::gam for detail.
knots = NULL, # a list containing user-specified knot values to be used for basis construction, for the treatment (A) and the single-index domains, respectively.
sep.A.effect = FALSE, # If TRUE, the g term of SIMSL is further decomposed into: the A main effect term + the A-by-X interaction effect term.
mc = c(TRUE, FALSE), # a length 2 vector indicating which marginals (i.e., A and the single-index, respectively) should have centering (i.e., sum-to-zero) constraints applied; the default is mc = c(TRUE, FALSE) (see mgcv::te for detail of the constraint), which is sufficient for the so-called "orthogonality" constraint of the SIMSL.
method = "GCV.Cp", # the smoothing parameter estimation method; can be "GCV.Cp", "REML" and "ML"; see mgcv::gam for detail.
beta.ini = NULL, # an initial value for beta.coef; a p-by-1 vector; the defult is NULL.
ind.to.be.positive = NULL, # for identifiability of the solution beta.coef, we can restrict the jth (e.g., j=1) component of beta.coef to be positive; by default, we match the "overall" sign of the single-index with that of the linear estimate (i.e., the initial estimate), by keeping the inner product between the two to be positive.
random.effect = FALSE, # if TRUE, incorporate the z-specific random intercepts, as part of the main effects.
z = NULL, # a factor that specifies random intercepts, if random.effect = TRUE.
gamma =1, # increase this beyond 1 to produce smoother models. gamma multiplies the effective degrees of freedom in the GCV or UBRE/AIC (see mgcv::gam for detail).
pen.order = 0, # pen.order 0 indicates the ridge penalty; 1 indicates the 1st difference penalty; 2 indicates the 2nd difference penalty, used in a penalized least squares (LS) estimation of beta.coef.
lambda = 0, # a regularziation parameter associated with the penalized LS of beta.coef.
max.iter = 10, # an integer specifying the maximum number of iterations for beta.coef update.
eps.iter = 0.01, # a value specifying the convergence criterion of algorithm.
trace.iter = TRUE, # if TRUE, trace the estimation process and print the differences in beta.coef.
center.X=TRUE, # if TRUE, center X to have zero mean.
scale.X=TRUE, # if TRUE, scale X to have unit variance.
uncons.final.fit = TRUE) # if TRUE, once the convergece in the estimates of beta.coef is reached, include the main effect associated with the fitted single-index (beta.coef'X) to the final surface-link estimate.
{
y <- as.vector(y)
n <- length(y)
p <- ncol(X)
A.range <- range(A, na.rm = TRUE) # the observed range of A
## Center and scale X
Xc <- scale(X, center = center.X, scale = scale.X)
X.center <- attr(Xc, "scaled:center")
X.scale <- attr(Xc, "scaled:scale")
## If not provided by the user, the efficiency augmentation vector (corresponding to the X main effect) is set to be a zero vector.
if(is.null(Xm)) Xm <- rep(0,n)
## Specify a penalty matrix associated with the penalized least squares for estimating beta.coef.
D <- diag(p); if(pen.order != 0) for(j in 1:pen.order) D <- diff(D);
Pen <- sqrt(lambda)*D
## special case of an ordinal categorical response
if(family=="ordinal"){
if(is.null(R)) R <- length(unique(y))
family=ocat(R=R)
}
## initialize the single-index coefficient and the single-index.
if(is.null(beta.ini)){
Ac <- A; if(mc[1]) Ac <- A-mean(A)
if(random.effect){
tmp <- gam(y~ Xm + s(z, bs="re") + s(A, k=k[1], bs=bs[1], m=m[[1]]) + Ac:Xc, family = family, knots = knots)$coef # specify the basis type bs and the form of the penalty m
}else{
tmp <- gam(y~ Xm + s(A, k=k[1], bs=bs[1], m=m[[1]]) + Ac:Xc, family = family, knots = knots)$coef # specify the basis type bs and the form of the penalty m
}
beta.ini <- tmp[grep("Ac:" ,names(tmp))]
}
beta.ini[which(is.na(beta.ini))] <- 0
beta.coef <- beta.ini/sqrt(sum(beta.ini^2)) # enforce unit L2 norm
if(!is.null(ind.to.be.positive)){
if(beta.coef[ind.to.be.positive] < 0) beta.coef <- -1*beta.coef # for the (sign) identifiability
}
single.index <- as.vector(Xc %*% beta.coef)
## the following chunk initializes the surface link, through the ti() term.
if(sep.A.effect){ # if sep.A.effect==TRUE, in ti(), "mc" should be c(T, T), to exclude both A and X main effects from the ti() term.
mc <- c(TRUE, TRUE)
if(!is.null(sp)) sp <- c(sp[1], sp)
if(random.effect){
g.fit <- gam(y ~ Xm + s(z, bs="re") + s(A, bs=bs[1], k=k[1], m=m[[1]]) + ti(A, single.index, bs=bs, k=k, m=m, mc=mc),
knots = knots, family = family, gamma=gamma, sp=sp, method=method)
}else{
g.fit <- gam(y ~ Xm + s(A, bs=bs[1], k=k[1], m=m[[1]]) + ti(A, single.index, bs=bs, k=k, m=m, mc=mc),
knots = knots, family = family, gamma=gamma, sp=sp, method=method)
}
}else{ # if sep.A.effect==FALSE, in ti(), "mc" should be c(T, F) (which is the default mode), to allow the A main effect (and exclude only the X main effect) in the ti() term.
if(random.effect){
g.fit <- gam(y ~ Xm + s(z, bs="re") + ti(A, single.index, bs=bs, k=k, m=m, mc=mc), # this models the heterogenous A effect as a function of the single-index
knots = knots, family = family, gamma=gamma, sp=sp, method=method)
}else{
g.fit <- gam(y ~ Xm + ti(A, single.index, bs=bs, k=k, m=m, mc=mc), # this models the heterogenous A effect as a function of the single-index
knots = knots, family = family, gamma=gamma, sp=sp, method=method)
}
}
beta.path <- beta.coef
d.beta <- NULL
## Start iteration
for (it in 2:max.iter)
{
# take the 1st deriavative of the 2D smooths, w.r.t. the single.index.`
g.der <- der.link(g.fit)
## Update beta.coef and intercept through lsfit
# adjusted responses, adjusted for the nonlinearity associated with the smooth
y.star <- residuals(g.fit, type="working") + g.der*single.index
# adjusetd covariates, adjusted for the nonlinearity of the smooth
X.tilda <- diag(g.der) %*% Xc
nix <- rep(0, nrow(D))
X.p <- rbind(X.tilda, Pen)
y.p <- c(y.star, nix)
# perform a (penalized) WLS
beta.fit <- stats::lsfit(X.p, y.p, wt =c(g.fit$weights, (nix + 1)))
beta.fit$coefficients
# for the identifiability
beta.new <- beta.fit$coef[-1]/sqrt(sum(beta.fit$coef[-1]^2))
if(is.null(ind.to.be.positive)){
if(beta.ini %*%beta.new < 0) beta.new <- -1*beta.new
}else{
if(beta.new[ind.to.be.positive] < 0) beta.new <- -1*beta.new # for the (sign) identifiability
}
beta.path <- rbind(beta.path, beta.new)
## Check the convergence of beta
d.beta <- c(d.beta, sum((beta.new-beta.coef)^2))
if(trace.iter){
cat("iter:", it, " "); cat(" difference in beta: ", d.beta[(it-1)], "\n")
}
beta.coef <- beta.new
single.index <- as.vector(Xc %*% beta.coef)
if (d.beta[(it-1)] < eps.iter)
break
# Update the surface-link for the heterogenous A effects, subject to the "orthogonality" constraint
if(sep.A.effect){ # if sep.A.effect==TRUE, in ti(), "mc" should be c(T, T), to exclude both A and X main effects from the ti() term.
if(random.effect){
g.fit <- gam(y ~ Xm + s(z, bs="re") + s(A, bs=bs[1], k=k[1], m=m[[1]]) + ti(A, single.index, bs=bs, k=k, m=m, mc=mc),
knots = knots, family = family, gamma=gamma, sp=sp, method=method)
}else{
g.fit <- gam(y ~ Xm + s(A, bs=bs[1], k=k[1], m=m[[1]]) + ti(A, single.index, bs=bs, k=k, m=m, mc=mc),
knots = knots, family = family, gamma=gamma, sp=sp, method=method)
}
}else{ # if sep.A.effect==FALSE, in ti(), "mc" should be c(T, F) (which is the default mode), to allow the A main effect (and exclude only the X main effect) in the ti() term.
if(random.effect){
g.fit <- gam(y ~ Xm + s(z, bs="re") + ti(A, single.index, bs=bs, k=k, m=m, mc=mc), # this models the heterogenous A effect as a function of the single-index
knots = knots, family = family, gamma=gamma, sp=sp, method=method)
}else{
g.fit <- gam(y ~ Xm + ti(A, single.index, bs=bs, k=k, m=m, mc=mc), # this models the heterogenous A effect as a function of the single-index
knots = knots, family = family, gamma=gamma, sp=sp, method=method)
}
}
}
if(uncons.final.fit){
if(length(sp)>2) sp <- sp[-1]
g.fit <- gam(y ~ Xm + te(A, single.index, bs=bs, k=k, m=m), knots = knots, family = family, gamma=gamma, sp=sp, method=method)
}
results <- list(beta.coef = beta.coef,
beta.ini = beta.ini, d.beta=d.beta, beta.path=beta.path,
g.fit= g.fit,
beta.fit=beta.fit,
X.scale=X.scale, X.center = X.center, y=y, A=A, X=X, Xm = Xm,
z=z, random.effect=random.effect,
A.range=A.range, p=p, n=n, bs=bs, k=k, m=m, mc=mc, gamma=gamma)
class(results) <- c("simsl", "list")
return(results)
}
#' A subfunction used in estimation
#'
#' This function computes the 1st derivative of the surface-link function with respect to the argument associated with the pure interaction effect term of the smooth, using finite difference.
#'
#' @param g.fit a \code{mgcv::gam} object
#' @param eps a small finite difference used in numerical differentiation.
#' @seealso \code{fit.simsl}, \code{simsl}
#'
der.link <- function(g.fit, #arg.number=3,
eps=10^(-4))
{
m.terms <- attr(stats::terms(g.fit), "term.labels")
newD <- stats::model.frame(g.fit)[, m.terms, drop = FALSE]
newD
newDF <- data.frame(newD) # needs to be a data frame for predict
X0 <- predict.gam(g.fit, newDF, type = "lpmatrix")
newDF[,"single.index"] <- newDF[,"single.index"] + eps
X1 <- predict.gam(g.fit, newDF, type = "lpmatrix")
Xp <- (X1 - X0) / eps
Xi <- Xp * 0
want <- grep("A,single.index", colnames(X1)) # take only the pure interaction effect term
Xi[, want] <- Xp[, want]
g.der <- as.vector(Xi %*% stats::coef(g.fit)) # the first derivative of the link function
g.der
return(g.der)
}
#' SIMSL prediction function
#'
#' This function makes predictions from an estimated SIMSL, given a (new) set of covariates.
#' The function returns a set of predicted outcomes given the treatment values in a dense grid of treatment levels for each individual, and a recommended treatment level (assuming a larger value of the outcome is better).
#'
#' @param simsl.obj a \code{simsl} object
#' @param newX a (n-by-p) matrix of new values for the covariates X at which predictions are to be made.
#' @param newA a (n-by-L) matrix of new values for the treatment A at which predictions are to be made.
#' @param newXm a (n-by-q) matrix of new values for the covariates associated with the fitted main effect Xm at which predictions are to be made.
#' @param single.index a length n vector specifying new values for the single-index at which predictions are to be made; the default is \code{NULL}.
#' @param L when \code{newA=NULL}, a value specifying the length of the grid of A at which predictions are to be made.
#' @param type the type of prediction required; the default "response" is on the scale of the response variable; the alternative "link" is on the scale of the linear predictors.
#' @param maximize the default is \code{TRUE}, assuming a larger value of the outcome is better; if \code{FALSE}, a smaller value is assumed to be prefered.
#'
#' @return
#' \item{pred.new}{a (n-by-L) matrix of predicted values; each column represents a treatment dose.}
#' \item{trt.rule}{a (n-by-1) vector of suggested treatment assignments}
#'
#'
#' @author Park, Petkova, Tarpey, Ogden
#' @seealso \code{simsl},\code{fit.simsl}
#' @export
#'
pred.simsl <- function(simsl.obj, newX=NULL, newA =NULL, newXm =NULL, single.index=NULL, L=50, type = "link", maximize=TRUE)
{
#if(!inherits(simsl.obj, "simsl")) # checks input
# stop("obj must be of class `simsl'")
if(is.null(single.index)){
if(is.null(newX)){
newX <- simsl.obj$X
if(is.null(newXm)) newXm <- simsl.obj$Xm
}else{
if(is.null(newXm)){
if(is.matrix(simsl.obj$Xm)){ newXm <- matrix(0, nrow(newX), ncol(simsl.obj$Xm)) }
else{ newXm <- rep(0, nrow(newX)) }
}
}
if(ncol(newX) != simsl.obj$p) stop("newX needs to be of p columns ")
if(is.null(simsl.obj$X.scale)){
if(is.null(simsl.obj$X.center)){
newX.scaled <- scale(newX, center = rep(0,simsl.obj$p), scale = rep(1,simsl.obj$p))
}else{
newX.scaled <- scale(newX, center = simsl.obj$X.center, scale = rep(1,simsl.obj$p))
}
}else{
newX.scaled <- scale(newX, center = simsl.obj$X.center, scale = simsl.obj$X.scale)
}
single.index <- newX.scaled %*% simsl.obj$beta.coef
}else{
if(is.null(newXm)){
if(is.matrix(simsl.obj$Xm)){ newXm <- matrix(0, length(single.index), ncol(simsl.obj$Xm)) }
else{ newXm <- rep(0, length(single.index)) }
}
}
# compute treatment (A)-specific predicted outcomes
if(is.null(newA)){
A.grid <- seq(from =simsl.obj$A.range[1], to =simsl.obj$A.range[2], length.out =L)
newA <- matrix(A.grid, length(single.index), L, byrow = TRUE)
}else{
newA <- as.matrix(newA)
L <- ncol(newA)
}
pred.new <- matrix(0, length(single.index), L)
for(a in 1:L){
if(simsl.obj$random.effect){
newD <- list(Xm= newXm, A= newA[,a], single.index=single.index,z=rep(simsl.obj$g.fit$model$z[1], length(single.index)))
}else{
newD <- list(Xm= newXm, A= newA[,a], single.index=single.index)
}
pred.new[ ,a] <- predict.gam(simsl.obj$g.fit, newD, type =type)
rm(newD)
}
# compute optimal treatment assignment
if(maximize){
opt.trt.index <- apply(pred.new, 1, which.max)
}else{
opt.trt.index <- apply(pred.new, 1, which.min)
}
trt.rule <- rep(NA, nrow(pred.new))
for(i in 1:nrow(pred.new)){
trt.rule[i] <- newA[i, opt.trt.index[i]]
}
return(list(trt.rule = trt.rule, pred.new = pred.new))
}
######################################################################
## END OF THE FILE
######################################################################
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.