R/ictreg.R

#' Item Count Technique
#' 
#' Function to conduct multivariate regression analyses of survey data with the
#' item count technique, also known as the list experiment and the unmatched
#' count technique.
#' 
#' This function allows the user to perform regression analysis on data from
#' the item count technique, also known as the list experiment and the
#' unmatched count technique.
#' 
#' Three list experiment designs are accepted by this function: the standard
#' design; the multiple sensitive item standard design; and the modified design
#' proposed by Corstange (2009).
#' 
#' For the standard design, three methods are implemented in this function: the
#' linear model; the Maximum Likelihood (ML) estimation via the
#' Expectation-Maximization (EM) algorithm; and the nonlinear least squares
#' (NLS) estimation with the two-step procedure, the latter two proposed in
#' Imai (2011). In addition, an ML estimator is available that accounts for two
#' types of dishonest responses, "ceiling" and "floor" liars. The ceiling
#' model, floor model, or both, as described in Blair and Imai (2010), can be
#' activated using the \code{ceiling} and \code{floor} options. The constrained
#' and unconstrained ML models presented in Imai (2011) are available through
#' the \code{constrained} option. For the no-liars models, the user can
#' indicate whether overdispersion is present in the data using the
#' \code{overdispersed} option, which controls whether a beta-binomial or
#' binomial model is used in the EM algorithm to model the item counts.
#' 
#' The modified design and the multiple sensitive item design are automatically
#' detected by the function, and only the binomial model without overdispersion
#' is available.
#' 
#' @aliases ictreg ict list
#' @param formula An object of class "formula": a symbolic description of the
#' model to be fitted.
#' @param data A data frame containing the variables in the model.
#' @param treat Name of treatment indicator as a string. For single sensitive
#' item models, this refers to a binary indicator, and for multiple sensitive
#' item models it refers to a multi-valued variable with zero representing the
#' control condition. This can be an integer (with 0 for the control group) or
#' a factor (with "control" for the control group).
#' @param J Number of non-sensitive (control) survey items.
#' @param method Method for regression, either \code{ml} for the Maximum
#' Likelihood (ML) estimation with the Expectation-Maximization algorithm;
#' \code{lm} for linear model estimation; or \code{nls} for the Non-linear
#' Least Squares (NLS) estimation with the two-step procedure.
#' @param weights Name of the weights variable as a string, if weighted
#' regression is desired. Not implemented for the ceiling/floor models,
#' multiple sensitive item design, or for the modified design.
#' @param h Auxiliary data functionality. Optional named numeric vector with
#' length equal to number of groups. Names correspond to group labels and
#' values correspond to auxiliary moments.
#' @param group Auxiliary data functionality. Optional character vector of
#' group labels with length equal to number of observations.
#' @param matrixMethod Auxiliary data functionality. Procedure for estimating
#' optimal weighting matrix for generalized method of moments. One of
#' "efficient" for two-step feasible and "cue" for continuously updating.
#' Default is "efficient". Only relevant if \code{h} and \code{group} are
#' specified.
#' @param overdispersed Indicator for the presence of overdispersion. If
#' \code{TRUE}, the beta-binomial model is used in the EM algorithm; if
#' \code{FALSE}, the binomial model is used. Not relevant for the \code{nls} or
#' \code{lm} methods.
#' @param constrained A logical value indicating whether the control group
#' parameters are constrained to be equal. Not relevant for the \code{nls} or
#' \code{lm} methods.
#' @param floor A logical value indicating whether the floor liar model should
#' be used to adjust for the possible presence of respondents dishonestly
#' reporting a negative preference for the sensitive item among those who hold
#' negative views of all the non-sensitive items.
#' @param ceiling A logical value indicating whether the ceiling liar model
#' should be used to adjust for the possible presence of respondents
#' dishonestly reporting a negative preference for the sensitive item among
#' those who hold affirmative views of all the non-sensitive items.
#' @param ceiling.fit Fit method for the M step in the EM algorithm used to fit
#' the ceiling liar model. \code{glm} uses standard logistic regression, while
#' \code{bayesglm} uses logistic regression with a weakly informative prior
#' over the parameters.
#' @param floor.fit Fit method for the M step in the EM algorithm used to fit
#' the floor liar model. \code{glm} uses standard logistic regression, while
#' \code{bayesglm} uses logistic regression with a weakly informative prior
#' over the parameters.
#' @param ceiling.formula Covariates to include in ceiling liar model. These
#' must be a subset of the covariates used in \code{formula}.
#' @param floor.formula Covariates to include in floor liar model. These must
#' be a subset of the covariates used in \code{formula}.
#' @param fit.start Fit method for starting values for standard design
#' \code{ml} model. The options are \code{lm}, \code{glm}, and \code{nls},
#' which use OLS, logistic regression, and non-linear least squares to generate
#' starting values, respectively. The default is \code{lm}.
#' @param fit.nonsensitive Fit method for the non-sensitive item fit for the
#' \code{nls} method and the starting values for the \code{ml} method for the
#' \code{modified} design. Options are \code{glm} and \code{nls}, and the
#' default is \code{nls}.
#' @param multi.condition For the multiple sensitive item design, covariates
#' representing the estimated count of affirmative responses for each
#' respondent can be included directly as a level variable by choosing
#' \code{level}, or as indicator variables for each value but one by choosing
#' \code{indicators}. The default is \code{none}.
#' @param maxIter Maximum number of iterations for the Expectation-Maximization
#' algorithm of the ML estimation.  The default is 5000.
#' @param verbose a logical value indicating whether model diagnostics are
#' printed out during fitting.
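#' @param fit.sensitive Fit method for the sensitive item model in the M step
#' of the EM algorithm. \code{glm} uses standard logistic regression, while
#' \code{bayesglm} uses logistic regression with a weakly informative prior
#' over the parameters.
#' @param robust A logical value indicating whether the robust estimation
#' functionality is used. Requires \code{method} to be \code{ml} or \code{nls};
#' not yet supported for multiple sensitive item designs.
#' @param error Measurement error model. One of \code{"none"} (the default),
#' \code{"topcode"} for top-coded errors, or \code{"uniform"} for uniform
#' errors.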
#' @param ... further arguments to be passed to NLS regression commands.
#' @return \code{ictreg} returns an object of class "ictreg".  The function
#' \code{summary} is used to obtain a table of the results.  The object
#' \code{ictreg} is a list that contains the following components.  Some of
#' these elements are not available depending on which method is used
#' (\code{lm}, \code{nls} or \code{ml}), which design is used (\code{standard},
#' (\code{modified}), whether multiple sensitive items are included
#' (\code{multi}), and whether the constrained model is used (\code{constrained
#' = TRUE}).
#' 
#' \item{par.treat}{point estimate for effect of covariate on item count fitted
#' on treatment group} \item{se.treat}{standard error for estimate of effect of
#' covariate on item count fitted on treatment group} \item{par.control}{point
#' estimate for effect of covariate on item count fitted on control group}
#' \item{se.control}{standard error for estimate of effect of covariate on item
#' count fitted on control group} \item{coef.names}{variable names as defined
#' in the data frame} \item{design}{call indicating whether the \code{standard}
#' design as proposed in Imai (2011) or the \code{modified} design as proposed
#' in Corstange (2009) is used} \item{method}{call of the method used}
#' \item{overdispersed}{call indicating whether data is overdispersed}
#' \item{constrained}{call indicating whether the constrained model is used}
#' \item{boundary}{call indicating whether the floor/ceiling boundary models
#' are used} \item{multi}{indicator for whether multiple sensitive items were
#' included in the data frame} \item{call}{the matched call} \item{data}{the
#' \code{data} argument} \item{x}{the design matrix} \item{y}{the response
#' vector} \item{treat}{the vector indicating treatment status} \item{J}{Number
#' of non-sensitive (control) survey items set by the user or detected.}
#' \item{treat.labels}{a vector of the names used by the \code{treat} vector
#' for the sensitive item or items. These are the names from the \code{treat}
#' indicator if it is a factor, or the number of the item if it is numeric.}
#' \item{control.label}{a vector of the names used by the \code{treat} vector
#' for the control items. These are the names from the \code{treat} indicator
#' if it is a factor, or the number of the item if it is numeric.}
#' 
#' For the maximum likelihood models, an additional output object is included:
#' \item{pred.post}{posterior predicted probability of answering "yes" to the
#' sensitive item. These are the weights from the EM algorithm.}
#' 
#' For the floor/ceiling models, several additional output objects are
#' included: \item{ceiling}{call indicating whether the assumption of no
#' ceiling liars is relaxed, and ceiling parameters are estimated}
#' \item{par.ceiling}{point estimate for effect of covariate on whether
#' respondents who answered affirmatively to all non-sensitive items and hold a
#' true affirmative opinion toward the sensitive item lied and reported a
#' negative response to the sensitive item} \item{se.ceiling}{standard error
#' for estimate of effect of covariate on whether respondents who answered
#' affirmatively to all non-sensitive items and hold a true affirmative opinion
#' toward the sensitive item lied and reported a negative response to the
#' sensitive item} \item{floor}{call indicating whether the assumption of no
#' floor liars is relaxed, and floor parameters are estimated}
#' \item{par.floor}{point estimate for effect of covariate on whether
#' respondents who answered negatively to all non-sensitive items and hold a
#' true affirmative opinion toward the sensitive item lied and reported a
#' negative response to the sensitive item} \item{se.floor}{standard error
#' for estimate of effect of covariate on whether respondents who answered
#' negatively to all non-sensitive items and hold a true affirmative opinion
#' toward the sensitive item lied and reported a negative response to the
#' sensitive item} \item{coef.names.ceiling}{variable names from the ceiling
#' liar model fit, if applicable} \item{coef.names.floor}{variable names from
#' the floor liar model fit, if applicable}
#' 
#' For the multiple sensitive item design, the \code{par.treat} and
#' \code{se.treat} vectors are returned as lists of vectors, one for each
#' sensitive item.
#' 
#' For the unconstrained model, the \code{par.control} and \code{se.control}
#' output is replaced by: \item{par.control.phi0}{point estimate for effect of
#' covariate on item count fitted on control group}
#' \item{se.control.phi0}{standard error for estimate of effect of covariate on
#' item count fitted on control group} \item{par.control.phi1}{point estimate
#' for effect of covariate on item count fitted on control group}
#' \item{se.control.phi1}{standard error for estimate of effect of covariate on
#' item count fitted on control group}
#' 
#' Depending upon the estimator requested by the user, model fit statistics are
#' also included: \item{llik}{the log likelihood of the model, if \code{ml} is
#' used} \item{resid.se}{the residual standard error, if \code{nls} or
#' \code{lm} are used. This will be a scalar if the standard design was used,
#' and a vector if the multiple sensitive item design was used}
#' \item{resid.df}{the residual degrees of freedom, if \code{nls} or \code{lm}
#' are used. This will be a scalar if the standard design was used, and a
#' vector if the multiple sensitive item design was used}
#' 
#' When using the auxiliary data functionality, the following objects are
#' included: \item{aux}{logical value indicating whether estimation
#' incorporates auxiliary moments} \item{nh}{integer count of the number of
#' auxiliary moments} \item{wm}{procedure used to estimate the optimal weight
#' matrix} \item{J.stat}{numeric value of the Sargan-Hansen overidentifying
#' restriction test statistic} \item{overid.p}{corresponding p-value for the
#' Sargan-Hansen test}
#' @author Graeme Blair, UCLA, \email{[email protected]}
#' and Kosuke Imai, Princeton University, \email{[email protected]}
#' @seealso \code{\link{predict.ictreg}} for fitted values
#' @references Blair, Graeme and Kosuke Imai. (2012) ``Statistical Analysis of
#' List Experiments.'' Political Analysis. Forthcoming. Available at
#' \url{http://imai.princeton.edu/research/listP.html}
#' 
#' Imai, Kosuke. (2011) ``Multivariate Regression Analysis for the Item Count
#' Technique.'' Journal of the American Statistical Association, Vol. 106, No.
#' 494 (June), pp. 407-416. Available at
#' \url{http://imai.princeton.edu/research/list.html}
#' @keywords models regression
#' @examples
#' 
#' 
#' data(race)
#' 
#' set.seed(1)
#' 
#' # Calculate list experiment difference in means
#' 
#' diff.in.means.results <- ictreg(y ~ 1, data = race, 
#' 	       	      treat = "treat", J=3, method = "lm")
#' 
#' summary(diff.in.means.results)
#' 
#' # Fit linear regression
#' # Replicates Table 1 Columns 1-2 Imai (2011); note that age is divided by 10
#' 
#' lm.results <- ictreg(y ~ south + age + male + college, data = race, 
#' 	       	      treat = "treat", J=3, method = "lm")
#' 
#' summary(lm.results)
#' 
#' # Fit two-step non-linear least squares regression
#' # Replicates Table 1 Columns 3-4 Imai (2011); note that age is divided by 10
#' 
#' nls.results <- ictreg(y ~ south + age + male + college, data = race, 
#' 	       	      treat = "treat", J=3, method = "nls")
#' 
#' summary(nls.results)
#' 
#' \dontrun{
#' 
#' # Fit EM algorithm ML model with constraint
#' # Replicates Table 1 Columns 5-6, Imai (2011); note that age is divided by 10
#' 
#' ml.constrained.results <- ictreg(y ~ south + age + male + college, data = race, 
#' 		       	  	 treat = "treat", J=3, method = "ml", 
#' 				 overdispersed = FALSE, constrained = TRUE)
#' 
#' summary(ml.constrained.results)
#' 
#' # Fit EM algorithm ML model with no constraint
#' # Replicates Table 1 Columns 7-10, Imai (2011); note that age is divided by 10
#' 
#' ml.unconstrained.results <- ictreg(y ~ south + age + male + college, data = race, 
#' 			    	   treat = "treat", J=3, method = "ml", 
#' 				   overdispersed = FALSE, constrained = FALSE)
#' 
#' summary(ml.unconstrained.results)
#' 
#' # Fit EM algorithm ML model for multiple sensitive items
#' # Replicates Table 3 in Blair and Imai (2010)
#' 
#' multi.results <- ictreg(y ~ male + college + age + south + south:age, treat = "treat", 
#' 	      	 	J = 3, data = multi, method = "ml", 
#' 			multi.condition = "level")
#' 
#' summary(multi.results)
#' 
#' # Fit standard design ML model
#' # Replicates Table 7 Columns 1-2 in Blair and Imai (2010)
#' 
#' noboundary.results <- ictreg(y ~ age + college + male + south, treat = "treat",
#' 		      	     J = 3, data = affirm, method = "ml", 
#' 			     overdispersed = FALSE)
#' 
#' summary(noboundary.results)
#' 
#' # Fit standard design ML model with ceiling effects alone
#' # Replicates Table 7 Columns 3-4 in Blair and Imai (2010)
#' 
#' ceiling.results <- ictreg(y ~ age + college + male + south, treat = "treat", 
#' 		   	  J = 3, data = affirm, method = "ml", fit.start = "nls",
#' 			  ceiling = TRUE, ceiling.fit = "bayesglm",
#' 			  ceiling.formula = ~ age + college + male + south)
#' 
#' summary(ceiling.results)
#' 
#' # Fit standard design ML model with floor effects alone
#' # Replicates Table 7 Columns 5-6 in Blair and Imai (2010)
#' 
#' floor.results <- ictreg(y ~ age + college + male + south, treat = "treat", 
#' 	      	 	J = 3, data = affirm, method = "ml", fit.start = "glm", 
#' 			floor = TRUE, floor.fit = "bayesglm",
#' 			floor.formula = ~ age + college + male + south)
#' 
#' summary(floor.results)
#' 
#' # Fit standard design ML model with floor and ceiling effects
#' # Replicates Table 7 Columns 7-8 in Blair and Imai (2010)
#' 
#' both.results <- ictreg(y ~ age + college + male + south, treat = "treat", 
#' 	     	       J = 3, data = affirm, method = "ml", 
#' 		       floor = TRUE, ceiling = TRUE, 
#' 		       floor.fit = "bayesglm", ceiling.fit = "bayesglm",
#' 		       floor.formula = ~ age + college + male + south,
#' 		       ceiling.formula = ~ age + college + male + south)
#' 
#' summary(both.results)
#' 
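#' # Fit the NLS top-coded measurement error model
#' # (a sketch based on the NLS error-model code path; the 'error' argument
#' # takes "none", "topcode", or "uniform")
#' 
#' topcode.results <- ictreg(y ~ south + age + male + college, data = race,
#'                           treat = "treat", J = 3, method = "nls",
#'                           error = "topcode")
#' 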
#' }
#' 
ictreg <- function(formula, data = parent.frame(), treat = "treat", J, method = "ml", weights, 
                   h = NULL, group = NULL, matrixMethod = "efficient", robust = FALSE, error = "none", 
                   overdispersed = FALSE, constrained = TRUE, floor = FALSE, ceiling = FALSE, 
                   ceiling.fit = "glm", floor.fit = "glm", ceiling.formula = ~ 1, floor.formula = ~ 1, 
                   fit.start = "lm", fit.sensitive = "glm", 
                   fit.nonsensitive = "nls", multi.condition = "none", maxIter = 5000, verbose = FALSE, ...){
  
  ictreg.call <- match.call()

  # set up data frame, with support for standard and modified responses
  mf <- match.call(expand.dots = FALSE)
  
  # make all other call elements null in mf <- NULL in next line
  mf$method <- mf$maxIter <- mf$verbose <- mf$fit.start <- mf$J <- mf$design <- 
    mf$treat <- mf$weights <- mf$constrained <- mf$fit.sensitive <- mf$overdispersed <- mf$floor <- 
    mf$ceiling <- mf$ceiling.fit <- mf$fit.nonsensitive <- mf$floor.fit <- 
    mf$multi.condition <- mf$floor.formula <- mf$ceiling.formula <- mf$h <-
    mf$group <- mf$matrixMethod <- mf$robust <- mf$error <- NULL
  mf[[1]] <- as.name("model.frame")
  mf$na.action <- 'na.pass'
  mf <- eval.parent(mf)
  
  if(floor == TRUE | ceiling == TRUE) {
    boundary <- TRUE
  } else {
    boundary <- FALSE
  }
  
  # define design, response data frames
  x.all <- model.matrix.default(attr(mf, "terms"), mf)
  y.all <- model.response(mf)
  
  if (is.matrix(y.all)) {
    design <- "modified"
  } else {
    design <- "standard"
  }
  
  if(missing("weights") == FALSE) {
      weighted <- TRUE
      w.all <- data[, paste(weights)]
  } else {
      weighted <- FALSE
      w.all <- rep(1, length(y.all))
  }

  if (method == "oneway") fit.start <- "lm"
  if (method == "nls") fit.start <- "nls"
  
  # extract number of non-sensitive items from the response matrix
  if(design=="modified") J <- ncol(y.all) - 1
  
  # set -777 code for treatment status
#  if(design=="modified"){
#    for(j in 1:J) y.all[,j] <- ifelse(!is.na(y.all[,J+1]), -777, y.all[,j])
#    y.all[,J+1] <- ifelse(is.na(y.all[,J+1]), -777, y.all[,J+1])
#  }
  
  # list-wise missing deletion
  na.x <- apply(is.na(x.all), 1, sum)
  if(design=="standard") na.y <- is.na(y.all)
  if(design=="modified") {
    na.y <- apply(is.na(y.all[,1:J]), 1, sum)
    na.y[is.na(y.all[,J+1]) == FALSE] <- 0
  }
  na.w <- is.na(w.all)

  ## treatment indicator for subsetting the dataframe
  if(design=="standard") t <- data[na.x==0 & na.y==0 & na.w==0, paste(treat)]
  if(design=="modified") t <- as.numeric(is.na(y.all[na.x == 0 & na.y == 0 & na.w==0, J+1]) == FALSE)

  if (is.factor(t)) {

    levels(t) <- tolower(levels(t))
    
    if (length(which(levels(t) == "control")) == 1) {
      t <- relevel(t, ref = "control")
    } else {
      warning("Note: using the first level as the control condition, but it is not labeled 'control'.")
    }
        
    condition.labels <- levels(t)
    t <- as.numeric(t) - 1
    treatment.labels <- condition.labels[2:length(condition.labels)]
    control.label <- condition.labels[1]
    
  } else {
    ##condition.labels <- as.character(paste("Sensitive Item",sort(unique(t))))
    ##condition.labels[which(condition.labels == "Sensitive Item 0")] <- "control"
    ##treatment.labels <- condition.labels[condition.labels != "control"]

    condition.labels <- sort(unique(t))
    treatment.labels <- condition.labels[condition.labels != 0]
    control.label <- 0
  }
  
  # list wise delete
  if(design=="standard") y.all <- y.all[na.x==0 & na.y==0 & na.w==0]
  if(design=="modified") y.all <- y.all[na.x==0 & na.y==0 & na.w==0,]
  x.all <- x.all[na.x==0 & na.y==0 & na.w==0, , drop = FALSE]
  w.all <- w.all[na.x==0 & na.y==0 & na.w==0]

  ## so that the output data has the same dimension as x.all and y.all
  data <- data[na.x==0 & na.y==0 & na.w==0, , drop = FALSE]
  
  # set up data objects for y and x for each group from user input
  x.treatment <- x.all[t != 0, , drop = FALSE]
  y.treatment <- subset(y.all, t != 0)
  w.treatment <- subset(w.all, t != 0)
  x.control <- x.all[t == 0 , , drop = FALSE]
  y.control <- subset(y.all, t==0)
  w.control <- subset(w.all, t==0)

  if(design=="standard" & missing("J")) {
    J <- max(y.treatment) - 1
    cat(paste("Note: number of control items set to", J, "\n"))
  }

  condition.values <- sort(unique(t))
  treatment.values <- 1:length(condition.values[condition.values!=0])
  
  if(length(treatment.values) > 1) {
    multi <- TRUE
  } else {
    multi <- FALSE
  }

  if(weighted == TRUE) {
    if(design == "modified")
      stop("Weighted list experiment regression is not supported (yet) for the modified design.")
    if(boundary == TRUE)
      stop("Weighted list experiment regression is not supported (yet) for the ceiling/floor design.")
    if(multi == TRUE)
      stop("Weighted list experiment regression is not supported (yet) for the multi-item design.")
  }

  # robust functionality -- check conditions
  if (robust) {
    if (multi.condition != "none") 
      stop("The robust ML functionality is not yet supported for multiple sensitive item designs.")
    if (!(method == "ml" | method == "nls"))
      stop("You must specify method as 'ml' or 'nls' to use the robust functionality.")
  }

  # auxiliary data functionality -- check conditions
  aux.check <- !(is.null(h) & is.null(group))

  if (aux.check) {
    if (multi.condition != "none") 
      stop("The auxiliary data functionality is not yet supported for multiple sensitive item designs.")
    if (method != "nls")
      stop("The auxiliary data functionality is currently supported only for the nonlinear least squares method.")
    if (robust)
      stop("The auxiliary data functionality is not yet compatible with the robust functionality.")
  }

  n <- nrow(x.treatment) + nrow(x.control)

  coef.names <- colnames(x.all)

  nPar <- ncol(x.all)

  intercept.only <- ncol(x.all)==1 & sum(x.all[,1]==1) == n

  if (intercept.only == TRUE) {
    x.vars <- "1"
  } else {
    x.vars <- coef.names[-1]
  }
  
  logistic <- function(x) exp(x)/(1+exp(x))
  logit <- function(x) return(log(x)-log(1-x))

  if (design == "standard" & method == "lm") {
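    
    ## Linear regression estimator: the design matrix stacks the covariates and
    ## their interactions with indicators for each treatment condition, so the
    ## interaction coefficients (par.treat) estimate covariate effects on the
    ## sensitive item(s) and the base coefficients (par.control) the effects on
    ## the control item count. Heteroskedasticity-consistent standard errors
    ## are taken from vcovHC().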
    
    treat <- matrix(NA, nrow = n, ncol = length(treatment.values))
    
    for (m in 1:length(treatment.values)) 
      treat[ , m] <- as.numeric(t == m)
      
    x.all.noint <- x.all[, -1, drop = FALSE]

    x.all.lm <- x.all
    for (m in 1:length(treatment.values))
      x.all.lm <- cbind(x.all.lm, x.all * treat[, m])

    if (intercept.only == TRUE) {
      fit.lm <- lm(y.all ~ treat, weights = w.all)
    } else {
      ## fit.lm <- lm(y.all ~ x.all.noint * treat)
      fit.lm <- lm(y.all ~ x.all.lm - 1, weights = w.all)
    }
    
    vcov <- vcovHC(fit.lm)

    par.control <- coef(fit.lm)[1:length(coef.names)]
    se.control <- sqrt(diag(vcov))[1:length(coef.names)]

    names(par.control) <- names(se.control) <- coef.names
    
    par.treat <- se.treat <- list()
    
    for (m in 1:length(treatment.values)) {
      if (intercept.only == TRUE) {
        par.treat[[m]] <- coef(fit.lm)[nPar + m]
        se.treat[[m]] <- sqrt(diag(vcov))[nPar + m]
        names(par.treat[[m]]) <- names(se.treat[[m]]) <- "(Intercept)"
      } else {
        
        par.treat[[m]] <- coef(fit.lm)[(nPar + (m-1) * nPar + 1) : (nPar + m * nPar)]
        se.treat[[m]] <- sqrt(diag(vcov))[(nPar + (m-1) * nPar + 1) : (nPar + m * nPar)]
        
        names(par.treat[[m]]) <- names(se.treat[[m]]) <- coef.names
      }
      
    }

    if(multi == FALSE) {
      par.treat <- par.treat[[1]]
      se.treat <- se.treat[[1]]
    }
    
    sum.fit.lm <- summary(fit.lm)
      
    resid.se <- sum.fit.lm$sigma
    resid.df <- sum.fit.lm$df[2]
    
    if (multi == TRUE) {
      return.object <- list(par.treat=par.treat, se.treat=se.treat,
                            par.control=par.control, se.control=se.control,
                            vcov=vcov, treat.values = treatment.values, treat.labels = treatment.labels,
                            J=J, coef.names=coef.names,
                            resid.se = resid.se, resid.df = resid.df,  design = design, method = method,
                            multi = multi, boundary = boundary, call = match.call(),
                            data = data, x = x.all, y = y.all, treat = t)
      if(weighted == TRUE)
          return.object$weights <- w.all
    } else {
      return.object <- list(par.treat=par.treat, se.treat=se.treat,
                            par.control=par.control, se.control=se.control,
                            vcov=vcov, J=J,  coef.names=coef.names, resid.se = resid.se,
                            resid.df = resid.df,  design = design, method = method,
                            multi = multi, boundary = boundary, call = match.call(),
                            data = data, x = x.all, y = y.all, treat = t)
      if(weighted == TRUE)
          return.object$weights <- w.all
    }
    
  } else if (design=="standard" & method != "lm") {
    
    if (!(method == "nls" & error != "none")) {
      
      ##
      # Run two-step estimator
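      ##
      ## Step 1 fits the control-item model h(X) to the control group; step 2
      ## forms the adjusted outcome Y_i - J * h(X_i) for treated respondents
      ## and fits the sensitive-item model g(X) to it. vcov.twostep.std() below
      ## computes the asymptotic variance via the method-of-moments sandwich
      ## formula G^{-1} F (G^{-1})' / n, where F is the block-diagonal outer
      ## product of the two sets of moment conditions and G their Jacobian; the
      ## off-diagonal block of G^{-1} propagates the first-step (control fit)
      ## uncertainty into the second-step (treatment fit) standard errors.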
      
      vcov.twostep.std <- function(treat.fit, control.fit, J, y, treat, x) {
        
        n <- length(y)
        y1 <- y[treat == 1]
        y0 <- y[treat == 0]
        x1 <- x[treat == 1, , drop = FALSE]
        x0 <- x[treat == 0, , drop = FALSE]
        delta <- coef(treat.fit)
        gamma <- coef(control.fit)
        
        m1 <- c((y1 - J*logistic(x1 %*% gamma) - logistic(x1 %*% delta)) *
                  logistic(x1 %*% delta)/(1+exp(x1 %*% delta))) * x1
        m0 <- c((y0 - J*logistic(x0 %*% gamma)) * J *
                  logistic(x0 %*% gamma)/(1+exp(x0 %*% gamma))) * x0
        Em1 <- t(m1) %*% m1 / n
        Em0 <- t(m0) %*% m0 / n
        F <- adiag(Em1, Em0)
        Gtmp <- c(logistic(x1 %*% delta)/(1 + exp(x1 %*% delta))) * x1
        G1 <- -t(Gtmp) %*% Gtmp / n  
        Gtmp <- c(sqrt(J*logistic(x1 %*% delta)*logistic(x1 %*% gamma)/
                         ((1 + exp(x1 %*% delta))*(1 + exp(x1 %*% gamma))))) * x1
        G2 <- -t(Gtmp) %*% Gtmp / n
        Gtmp <- c(J*logistic(x0 %*% gamma)/(1 + exp(x0 %*% gamma))) * x0
        G3 <- -t(Gtmp) %*% Gtmp / n
        invG1 <- solve(G1, tol = 1e-20)
        invG3 <- solve(G3, tol = 1e-20)
        invG <- rbind(cbind(invG1, - invG1 %*% G2 %*% invG3),
                      cbind(matrix(0, ncol = ncol(G1), nrow = nrow(G3)), invG3))
        
        return(invG %*% F %*% t(invG) / n)
        
      }
      
      ## fit to the control group
      fit.glm.control <- glm(cbind(y.control, J-y.control) ~ x.control - 1,
                             family = binomial(logit), weights = w.control)
      
      coef.glm.control <- coef(fit.glm.control)
      names(coef.glm.control) <- paste("beta", 1:length(coef.glm.control), sep = "")
      
      if(fit.start == "nls") {
        if(is.null(ictreg.call$control)){
          fit.control <- nls( as.formula(paste("I(y.control/J) ~ logistic(x.control %*% c(",
                                               paste(paste("beta", 1:length(coef.glm.control),
                                                           sep=""), collapse= ","), "))")) ,
                              start = coef.glm.control, weights = w.control, control =
                                nls.control(maxiter=maxIter, warnOnly=TRUE), ... = ...)
        } else {
          fit.control <- nls( as.formula(paste("I(y.control/J) ~ logistic(x.control %*% c(",
                                               paste(paste("beta", 1:length(coef.glm.control),
                                                           sep=""), collapse= ","), "))")) ,
                              weights = w.control,
                              start = coef.glm.control, ... = ...)
        }
        
        fit.control.coef <- summary(fit.control)$parameters[,1]
        
      } else if (fit.start == "glm") {
        fit.control <- fit.glm.control
        fit.control.coef <- coef(fit.glm.control)
      } else if (fit.start == "lm") {
        fit.control <- lm(y.control ~ x.control - 1, weights = w.control)
        fit.control.coef <- coef(fit.control)
      }
      
      fit.treat <- vcov.nls <- se.twostep <- par.treat.nls.std <- list()
      
      for(m in 1:length(treatment.values)){
        
        curr.treat <- t[t!=0] == treatment.values[m]
        
        x.treatment.curr <- x.treatment[curr.treat, , drop = F]
        w.treatment.curr <- w.treatment[curr.treat]
        
        ## calculate the adjusted outcome    
        y.treatment.pred <- y.treatment[curr.treat] - logistic(x.treatment.curr
                                                               %*% fit.control.coef)*J
        
        ## fit to the treated		
        y.treatment.pred.temp <- ifelse(y.treatment.pred>1, 1, y.treatment.pred)
        y.treatment.pred.temp <- ifelse(y.treatment.pred.temp<0, 0, y.treatment.pred.temp)
        
        alpha <- mean(y.treatment.pred.temp)
        
        y.treatment.start <- ifelse(y.treatment.pred.temp >
                                      quantile(y.treatment.pred.temp, alpha), 1, 0)
        
        try(fit.glm.treat <- glm(cbind(y.treatment.start, 1 - y.treatment.start) ~ x.treatment.curr - 1,
                                 family = binomial(logit), weights = w.treatment.curr), silent = F)
        try(coef.glm.treat <- coef(fit.glm.treat), silent = T)
        try(names(coef.glm.treat) <- paste("beta", 1:length(coef.glm.treat), sep = ""), silent = T)
        
        if(fit.start == "nls") {
          if(exists("coef.glm.treat")) {
            if (is.null(ictreg.call$control)) {
              fit.treat[[m]] <- nls( as.formula(paste("y.treatment.pred ~ logistic(x.treatment.curr %*% c(",
                                                      paste(paste("beta", 1:length(coef.glm.treat),
                                                                  sep=""), collapse= ","), "))")) ,
                                     start = coef.glm.treat, weights = w.treatment.curr, control =
                                       nls.control(maxiter = maxIter, warnOnly = TRUE), ... = ...)
            } else {
              fit.treat[[m]] <- nls( as.formula(paste("y.treatment.pred ~ logistic(x.treatment.curr %*% c(",
                                                      paste(paste("beta", 1:length(coef.glm.treat),
                                                                  sep=""), collapse= ","), "))")) ,
                                     start = coef.glm.treat, weights = w.treatment.curr, ... = ...)
            }
          } else {
            if (is.null(ictreg.call$control)) {
              fit.treat[[m]] <- nls( as.formula(paste("y.treatment.pred ~ logistic(x.treatment.curr %*% c(",
                                                      paste(paste("beta", 1:length(coef.glm.treat),
                                                                  sep=""), collapse= ","), "))")),
                                     weights = w.treatment.curr,
                                     control = nls.control(maxiter = maxIter, warnOnly = TRUE),
                                     ... = ...)
            } else {
              fit.treat[[m]] <- nls( as.formula(paste("y.treatment.pred ~ logistic(x.treatment.curr %*% c(",
                                                      paste(paste("beta", 1:length(coef.glm.treat),
                                                                  sep=""), collapse= ","), "))")) ,
                                     weights = w.treatment.curr,
                                     ... = ...)
            }
          }
          
        } else if (fit.start == "glm") {
          fit.treat[[m]] <- fit.glm.treat
        } else if (fit.start == "lm") {
          fit.treat[[m]] <- lm(y.treatment.pred ~ x.treatment.curr - 1, weights = w.treatment.curr)
        }
        
        sample.curr <- t==0 | t==treatment.values[m]
        
        vcov.nls[[m]] <- vcov.twostep.std(fit.treat[[m]], fit.control, J,
                                          y.all[sample.curr], t[sample.curr]>0, x.all[sample.curr, , drop = FALSE])
        
        se.twostep[[m]] <- sqrt(diag(vcov.nls[[m]]))
        par.treat.nls.std[[m]] <- coef(fit.treat[[m]])
        
      }
      
      if(multi == FALSE) {
        fit.treat <- fit.treat[[1]]
        vcov.nls <- vcov.nls[[1]]
        se.twostep <- se.twostep[[1]]
        par.treat.nls.std <- par.treat.nls.std[[1]]
      }
      
      par.control.nls.std <- coef(fit.control)
      
    } else if (error == "topcode" & method == "nls") {
      
      ## NLS top-coded error model: minimize the sum of squared residuals
      ##   sum_i [ Y_i - pJ - T_i{ p + (1 - p) E(Z_i | X_i) } - (1 - p) E(Y_i* | X_i) ]^2
      ## where E(Z_i | X_i) = g(X_i), E(Y_i* | X_i) = J h(X_i), and the error
      ## probability p is estimated on the logit scale, p = exp(pstar) / (1 + exp(pstar)).
      
      sse_nls_topcoded <- function(par, J, y, treat, x){
        pstar <- par[1]
        p <- exp(pstar) / (1 + exp(pstar))
        
        k <- ncol(x)
        coef.h <- par[2:(k + 1)]
        coef.g <- par[(k + 2):(2 * k + 1)]
        gX <- logistic(x %*% coef.g)
        hX <- logistic(x %*% coef.h)

        sse <- sum((y - (p * J + treat * (p + (1 - p) * gX) + (1 - p) * J * hX)) ^ 2)
        
        return(sse)
      }
      
      k <- ncol(x.all)
      
      start <- runif(k*2 + 1, min = -.5, max = .5)
      
      NLSfit <- optim(par = start, 
                      fn = sse_nls_topcoded, J = J, y = y.all,
                      treat = t, x = x.all, hessian = TRUE, control = list(maxit = maxIter))
      
      vcov.nls <- solve(NLSfit$hessian, tol = 1e-20)
      se.nls <- sqrt(diag(vcov.nls))
      
      par.treat <- NLSfit$par[(k + 2):(2 * k + 1)]
      par.control <- NLSfit$par[2:(k + 1)]
      se.treat <- se.nls[(k + 2):(2 * k + 1)]
      se.control <- se.nls[2:(k + 1)]
      
      p.est <- logistic(NLSfit$par[1])
      p.ci <- c("lwr" = logistic(NLSfit$par[1] - qnorm(.975)*se.nls[1]),
                "upr" = logistic(NLSfit$par[1] + qnorm(.975)*se.nls[1]))
      
    } else if (error == "uniform" & method == "nls") {
      
      ## NLS uniform error model: minimize the sum of squared residuals with
      ## conditional mean
      ##   E(Y_i | X_i, T_i) = (1 - T_i) p0 J / 2
      ##                       + T_i { p1 (J + 1) / 2 + (1 - p1) E(Z_i | X_i) }
      ##                       + { (1 - T_i)(1 - p0) + T_i (1 - p1) } E(Y_i* | X_i)
      ## where E(Z_i | X_i) = g(X_i) and E(Y_i* | X_i) = J h(X_i).
      
      ## For the error parameters p0 and p1, we use the logit transformation,
      ## defining p* = log{p / (1 - p)} and substituting p = exp(p*) / (1 + exp(p*))
      ## into the mean function above, so the optimization is unconstrained.
      
      sse_nls_uniform <- function(par, J, y, treat, x){
        p0star <- par[1]
        p1star <- par[2]
        p0 <- exp(p0star) / (1 + exp(p0star))
        p1 <- exp(p1star) / (1 + exp(p1star))
        
        k <- ncol(x)
        coef.h <- par[3:(k + 2)]
        coef.g <- par[(k + 3):(2 * k + 2)]
        gX <- logistic(x %*% coef.g)
        hX <- logistic(x %*% coef.h)
        
        sse <- sum((y - ((p0 * (1 - treat) * J) / 2 + 
                    treat * ((p1 * (J + 1)) / 2 + 
                               (1 - p1) * gX) +
                    ((1 - treat) * (1 - p0) + 
                       treat * (1 - p1)) * J * hX)) ^ 2)
        
        return(sse)
      }
      
      k <- ncol(x.all)
      
      start <- runif(k*2 + 2, min = -.5, max = .5)
      
      NLSfit <- optim(par = start, 
                      fn = sse_nls_uniform, J = J, y = y.all,
                      treat = t, x = x.all, hessian = TRUE, control = list(maxit = maxIter))
      
      vcov.nls <- solve(NLSfit$hessian, tol = 1e-20)
      se.nls <- sqrt(diag(vcov.nls))
      
      par.treat <- NLSfit$par[(k + 3):(2 * k + 2)]
      par.control <- NLSfit$par[3:(k + 2)]
      se.treat <- se.nls[(k + 3):(2 * k + 2)]
      se.control <- se.nls[3:(k + 2)]
      
      p0.est <- logistic(NLSfit$par[1])
      p0.ci <- c("lwr" = logistic(NLSfit$par[1] - qnorm(.975)*se.nls[1]),
                "upr" = logistic(NLSfit$par[1] + qnorm(.975)*se.nls[1]))
      
      p1.est <- logistic(NLSfit$par[2])
      p1.ci <- c("lwr" = logistic(NLSfit$par[2] - qnorm(.975)*se.nls[2]),
                "upr" = logistic(NLSfit$par[2] + qnorm(.975)*se.nls[2]))
      
    }
      
    if(method=="ml" & robust == FALSE) {
      
      if(boundary == FALSE) {
      
        if (multi == FALSE) {

          coef.control.start <- par.control.nls.std
          coef.treat.start <- par.treat.nls.std
          
          ## ##
          ## Run EM estimator if requested
          
          ##
          ## observed-data log-likelihood for beta binomial
          ##
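          ## The parameter vector stacks coef.h0 and rho.h0 (control-item model
          ## for respondents with Z_i = 0), coef.h1 and rho.h1 (for Z_i = 1),
          ## and coef.g (sensitive-item model); with const = TRUE, h0 and h1
          ## share one set of coefficients and one overdispersion parameter.
          ## The log-likelihood (here and in the binomial version below) sums
          ## four pieces: treated respondents with y = 0 (so Z_i = 0), treated
          ## with y = J + 1 (so Z_i = 1), treated with interior counts, and
          ## controls, with Z_i marginalized out in the last two cases.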
          obs.llik.std <- function(par, J, y, treat, x, wt, const = FALSE, fit.sensitive) {
            
            k <- ncol(x)
            if (const) {
              coef.h0 <- coef.h1 <- par[1:k]
              rho.h0 <- rho.h1 <- logistic(par[(k+1)])
              coef.g <- par[(k+2):(2*k+1)]
            } else {
              coef.h0 <- par[1:k]
              rho.h0 <- logistic(par[(k+1)])
              coef.h1 <- par[(k+2):(2*k+1)]
              rho.h1 <- logistic(par[(2*k+2)])
              coef.g <- par[(2*k+3):(3*k+2)]
            }
            
            h0X <- logistic(x %*% coef.h0)
            h1X <- logistic(x %*% coef.h1)
            gX <- logistic(x %*% coef.g)
            
            ind10 <- ((treat == 1) & (y == 0))
            ind1J1 <- ((treat == 1) & (y == (J+1)))
            ind1y <- ((treat == 1) & (y > 0) & (y < (J+1)))
            
            if (sum(ind10) > 0) {
              p10 <- sum(wt[ind10] * (log(1-gX[ind10]) + dBB(x = 0, bd = J, mu = h0X[ind10],
                                                sigma = rho.h0/(1-rho.h0), log = TRUE)))
            } else {
              p10 <- 0
            }
            
            if (sum(ind1J1) > 0) {
              p1J1 <- sum(wt[ind1J1] * (log(gX[ind1J1]) + dBB(J, bd = J, mu = h1X[ind1J1],
                                                sigma = rho.h1/(1-rho.h1), log = TRUE)))
            } else {
              p1J1 <- 0
            }
            
            if (sum(ind1y) > 0) {
              p1y <- sum(wt[ind1y] * (log(gX[ind1y]*dBB(y[ind1y]-1, bd = J, mu = h1X[ind1y], sigma =
                                           rho.h1/(1-rho.h1), log = FALSE) + (1-gX[ind1y])*
                             dBB(y[ind1y], bd = J, mu = h0X[ind1y],
                                 sigma = rho.h0/(1-rho.h0), log = FALSE))))
            } else {
              p1y <- 0
            }
            
            if (sum(treat == 0) > 0) {
              p0y <- sum(wt[!treat] * (log(gX[!treat]*dBB(y[!treat], bd = J, mu = h1X[!treat],
                                            sigma = rho.h1/(1-rho.h1), log = FALSE) +
                             (1-gX[!treat])*dBB(y[!treat], bd = J, mu = h0X[!treat],
                                                sigma = rho.h0/(1-rho.h0), log = FALSE))))
            } else {
              p0y <- 0
            }
            
            if (fit.sensitive == "bayesglm") {
              p.prior.sensitive <- sum(dcauchy(x = coef.g, scale = rep(2.5, length(coef.g)), log = TRUE))
            } else {
              p.prior.sensitive <- 0
            }
            
            return(p10+p1J1+p1y+p0y+p.prior.sensitive)
            
          }
          
          ##
          ##  Observed data log-likelihood for binomial
          ##
          obs.llik.binom.std <- function(par, J, y, treat, x, wt, const = FALSE, fit.sensitive) {
            
            k <- ncol(x)
            if (const) {
              coef.h0 <- coef.h1 <- par[1:k]
              coef.g <- par[(k+1):(2*k)]
            } else {
              coef.h0 <- par[1:k]
              coef.h1 <- par[(k+1):(2*k)]
              coef.g <- par[(2*k+1):(3*k)]
            }
            
            h0X <- logistic(x %*% coef.h0)
            h1X <- logistic(x %*% coef.h1)
            gX <- logistic(x %*% coef.g)
            
            ind10 <- ((treat == 1) & (y == 0))
            ind1J1 <- ((treat == 1) & (y == (J+1)))
            ind1y <- ((treat == 1) & (y > 0) & (y < (J+1)))
            
            if (sum(ind10) > 0) {
              p10 <- sum(wt[ind10] * (log(1-gX[ind10]) + dbinom(x = 0, size = J, prob = h0X[ind10], log = TRUE)))
            } else {
              p10 <- 0
            }
            if (sum(ind1J1) > 0) {
              p1J1 <- sum(wt[ind1J1] * (log(gX[ind1J1]) + dbinom(J, size = J, prob = h1X[ind1J1], log = TRUE)))
            } else {
              p1J1 <- 0
            }
            if (sum(ind1y) > 0) {
              p1y <- sum(wt[ind1y] * (log(gX[ind1y]*dbinom(y[ind1y]-1, size = J, prob = h1X[ind1y],
                                              log = FALSE) + (1-gX[ind1y])*
                             dbinom(y[ind1y], size = J, prob = h0X[ind1y], log = FALSE))))
            } else {
              p1y <- 0
            }
            
            if (sum(treat == 0) > 0) {
              p0y <- sum(wt[!treat] * (log(gX[!treat]*dbinom(y[!treat], size = J, prob = h1X[!treat], log = FALSE) +
                             (1-gX[!treat])*dbinom(y[!treat], size = J, prob = h0X[!treat], log = FALSE))))
            } else {
              p0y <- 0
            }
            
            if (fit.sensitive == "bayesglm") {
              p.prior.sensitive <- sum(dcauchy(x = coef.g, scale = rep(2.5, length(coef.g)), log = TRUE))
            } else {
              p.prior.sensitive <- 0
            }
            
            return(p10+p1J1+p1y+p0y+p.prior.sensitive)
          }
          
          ##
          ## EM algorithm
          ##
          
          ## Estep for beta-binomial
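          ## Computes the posterior probability w_i = Pr(Z_i = 1 | Y_i, T_i, X_i)
          ## by Bayes' rule, used as the E-step weight; treated respondents with
          ## Y_i = 0 or Y_i = J + 1 reveal Z_i deterministically (w = 0 and 1,
          ## respectively). The binomial version below is identical with dbinom
          ## in place of dBB.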
          Estep.std <- function(par, J, y, treat, x, const = FALSE) {
            
            k <- ncol(x)
            if (const) {
              coef.h0 <- coef.h1 <- par[1:k]
              rho.h0 <- rho.h1 <- logistic(par[(k+1)])
              coef.g <- par[(k+2):(2*k+1)]
            } else {
              coef.h0 <- par[1:k]
              rho.h0 <- logistic(par[(k+1)])
              coef.h1 <- par[(k+2):(2*k+1)]
              rho.h1 <- logistic(par[(2*k+2)])
              coef.g <- par[(2*k+3):(3*k+2)]
            }
            
            h0X <- logistic(x %*% coef.h0)
            h1X <- logistic(x %*% coef.h1)
            gX <- logistic(x %*% coef.g)
            
            ind <- !((treat == 1) & ((y == 0) | (y == (J+1))))
            w <- rep(NA, length(y))
            w[ind] <- gX[ind]*dBB((y-treat)[ind], bd = J, mu = h1X[ind], sigma = rho.h1/(1-rho.h1), log = FALSE) /
              (gX[ind]*dBB((y-treat)[ind], bd = J, mu = h1X[ind], sigma = rho.h1/(1-rho.h1), log = FALSE) +
               (1-gX[ind])*dBB(y[ind], bd = J, mu = h0X[ind], sigma = rho.h0/(1-rho.h0), log = FALSE))
            
            w[(treat == 1) & (y == 0)] <- 0
            w[(treat == 1) & (y == (J+1))] <- 1
            
            return(w)
          }
          
          
          ## Estep for binomial
          Estep.binom.std <- function(par, J, y, treat, x, const = FALSE) {
            
            k <- ncol(x)
            if (const) {
              coef.h0 <- coef.h1 <- par[1:k]
              coef.g <- par[(k+1):(2*k)]
            } else {
              coef.h0 <- par[1:k]
              coef.h1 <- par[(k+1):(2*k)]
              coef.g <- par[(2*k+1):(3*k)]
            }
            
            h0X <- logistic(x %*% coef.h0)
            h1X <- logistic(x %*% coef.h1)
            gX <- logistic(x %*% coef.g)
            
            ind <- !((treat == 1) & ((y == 0) | (y == (J+1))))
            w <- rep(NA, length(y))
            w[ind] <- gX[ind]*dbinom((y-treat)[ind], size = J, prob = h1X[ind], log = FALSE) /
              (gX[ind]*dbinom((y-treat)[ind], size = J, prob = h1X[ind], log = FALSE) +
               (1-gX[ind])*dbinom(y[ind], size = J, prob = h0X[ind], log = FALSE))
            w[(treat == 1) & (y == 0)] <- 0
            w[(treat == 1) & (y == (J+1))] <- 1
            
            return(w)
          }
          
          ## Mstep 1: weighted MLE for logistic regression
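          ## Each observation enters twice, once as a "success" with weight w_i
          ## and once as a "failure" with weight 1 - w_i, so the weighted
          ## logistic fit maximizes the expected complete-data log-likelihood of
          ## the sensitive-item model; survey weights (wt) multiply the EM
          ## weights.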
          wlogit.fit.std <- function(y, treat, x, w, par = NULL, wt, fit.sensitive) {
              ## wt is survey weights
              
            yrep <- rep(c(1,0), each = length(y))
            xrep <- rbind(x, x)
            wrep <- c(w, 1-w)
            wtrep <- c(wt, wt)
            
            if(fit.sensitive == "glm"){
              fit <- glm(cbind(yrep, 1-yrep) ~ xrep - 1, weights = wrep * wtrep, family = binomial(logit), start = par)
            } else if (fit.sensitive == "bayesglm"){
              fit <- bayesglm(cbind(yrep, 1-yrep) ~ xrep - 1,
                                        weights = wrep * wtrep, family = binomial(logit),
                                        start = par, control = glm.control(maxit = maxIter), scaled = F)
            } else {
              error("Please choose 'glm' or 'bayesglm' for fit.sensitive.")
            }
            
            return(fit)
            
          }
          
          ## Mstep 2: weighted MLE for beta-binomial regression
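          ## fit0 is the control-item model for the Z = 0 component (weights
          ## 1 - w) and fit1 for the Z = 1 component (weights w); treated
          ## respondents have 1 subtracted from y in fit1 since one affirmative
          ## answer is attributed to the sensitive item. Treated cases with
          ## y = J + 1 are excluded from fit0 and those with y = 0 from fit1
          ## (their weights would be zero).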
          wbb.fit.std <- function(J, y, treat, x, w, par0 = NULL, par1 = NULL) {
            
            Not0 <- ((treat == 1) & (y == (J+1)))
            y0 <- y[!Not0]
            x0 <- x[!Not0,]
            w0 <- 1-w[!Not0]
            fit0 <- vglm(cbind(y0, J-y0) ~ x0, betabinomial, weights = w0, coefstart = par0)
            
            Not1 <- ((treat == 1) & (y == 0))
            y1 <- y
            y1[treat == 1] <- y1[treat == 1] - 1
            y1 <- y1[!Not1]
            x1 <- x[!Not1, ]
            w1 <- w[!Not1]
            fit1 <- vglm(cbind(y1, J-y1) ~ x1, betabinomial, weights = w1, coefstart = par1)
            
            return(list(fit1 = fit1, fit0 = fit0))
          }
          
          ## Mstep2: weighted MLE for binomial regression
          wbinom.fit.std <- function(J, y, treat, x, w, par0 = NULL, par1 = NULL) {
            
            Not0 <- ((treat == 1) & (y == (J+1)))
            y0 <- y[!Not0]
            x0 <- x[!Not0,]
            w0 <- 1-w[!Not0]
            fit0 <- glm(cbind(y0, J-y0) ~ x0, family = binomial(logit), weights = w0, start = par0)
            
            Not1 <- ((treat == 1) & (y == 0))
            y1 <- y
            y1[treat == 1] <- y1[treat == 1] - 1
            y1 <- y1[!Not1]
            x1 <- x[!Not1,]
            w1 <- w[!Not1]
            fit1 <- glm(cbind(y1, J-y1) ~ x1, family = binomial(logit), weights = w1, start = par1)
            
            return(list(fit1 = fit1, fit0 = fit0))
          }
          
          ##
          ## Running the EM algorithm
          ##
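          ## Iterate the E and M steps until the observed-data log-likelihood
          ## increases by no more than 1e-8 or maxIter is reached; standard
          ## errors are then obtained from the inverse of the negative Hessian
          ## of the observed-data log-likelihood, computed numerically at the
          ## EM solution via optim() with maxit = 0.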
                    
          if(constrained==F) {
            
            ## Run unconstrained model
            
            if (overdispersed==T) {
              par <- c(rep(c(coef.control.start, 0.5), 2), coef.treat.start)
            } else {
              par <- c(rep(coef.control.start, 2), coef.treat.start)
            }
            
            ## max number of iterations
            pllik <- -Inf
            
            if (overdispersed==T) {
              llik <- obs.llik.std(par, J = J, y = y.all, treat = t, x = x.all, wt = w.all, fit.sensitive = fit.sensitive)
            } else {
              llik <- obs.llik.binom.std(par, J = J, y = y.all, treat = t, x = x.all, wt = w.all, fit.sensitive = fit.sensitive)
            }
            
            Not0 <- (t & (y.all == (J+1)))
            Not1 <- (t & (y.all == 0))
            
            counter <- 0
            while (((llik - pllik) > 10^(-8)) & (counter < maxIter)) {
              
              if(overdispersed==T) {
                
                w <- Estep.std(par, J, y.all, t, x.all)
                
                lfit <- wlogit.fit.std(y.all, t, x.all, w, par = par[(nPar*2+3):length(par)], wt = w.all, fit.sensitive = fit.sensitive)
              
                y1 <- y.all
                
                y1[t==1] <- y1[t==1]-1

                if (intercept.only == TRUE) {
                  fit0 <- vglm(cbind(y.all[!Not0], J-y.all[!Not0]) ~ 1,
                               betabinomial, weights = (1-w[!Not0]) * w.all[!Not0], coefstart = par[1:2])
                  fit1 <- vglm(cbind(y1[!Not1], J-y1[!Not1]) ~ 1,
                               betabinomial, weights = w[!Not1] * w.all[!Not1],
                               coefstart = par[3:4])
                  par <- c(coef(fit0), coef(fit1), coef(lfit))
                  
                } else {
                  fit0 <- vglm(cbind(y.all[!Not0], J-y.all[!Not0]) ~ x.all[!Not0, -1, drop = FALSE],
                               betabinomial, weights = (1-w[!Not0]) * w.all[!Not0],
                               coefstart = par[c(1,(nPar+1),2:(nPar))])
                  fit1 <- vglm(cbind(y1[!Not1], J-y1[!Not1]) ~ x.all[!Not1, -1, drop = FALSE] ,
                               betabinomial, weights = w[!Not1] * w.all[!Not1],
                               coefstart = par[c((nPar+2),(2*nPar+2),(nPar+3):(2*nPar+1))])
                  par <- c(coef(fit0)[c(1,3:(nPar+1),2)], coef(fit1)[c(1,3:(nPar+1),2)], coef(lfit))
                }
                                
              } else {
                
                w <- Estep.binom.std(par, J, y.all, t, x.all)

                lfit <- wlogit.fit.std(y.all, t, x.all, w, par = par[(nPar*2+1):length(par)], wt = w.all, fit.sensitive = fit.sensitive)
                
                fit0 <- glm(cbind(y.all[!Not0], J-y.all[!Not0]) ~ x.all[!Not0,] - 1,
                            family = binomial(logit), weights = (1-w[!Not0]) * w.all[!Not0], start = par[1:(nPar)])
                
                y1 <- y.all
                
                y1[t==1] <- y1[t==1]-1
                
                fit1 <- glm(cbind(y1[!Not1], J-y1[!Not1]) ~ x.all[!Not1,] - 1,
                            family = binomial(logit), weights = w[!Not1] * w.all[!Not1],
                            start = par[(nPar+1):(2*nPar)])
                
                par <- c(coef(fit0), coef(fit1), coef(lfit))
                
              }
              
              pllik <- llik
              
              if(verbose==T)
                cat(paste(counter, round(llik, 4), "\n"))
              
              if (overdispersed==T) {
                llik <- obs.llik.std(par, J = J, y = y.all, treat = t, x = x.all, wt = w.all, fit.sensitive = fit.sensitive)
              } else {
                llik <- obs.llik.binom.std(par, J = J, y = y.all, treat = t, x = x.all, wt = w.all, fit.sensitive = fit.sensitive)
              }
              
              counter <- counter + 1
              
              if (llik < pllik) 
                warning("log-likelihood is not monotonically increasing.")
              ## NOTE CHANGED FROM STOP()
              
              if(counter == (maxIter-1))
                warning("number of iterations exceeded maximum in ML.")
              
            }
            
            ## getting  standard errors
            
            if (overdispersed==T) {
              MLEfit <- optim(par, obs.llik.std, method = "BFGS", J = J, y = y.all,
                              treat = t, x = x.all, wt = w.all, fit.sensitive = fit.sensitive, hessian = TRUE, control = list(maxit = 0))
              vcov.mle <- solve(-MLEfit$hessian, tol = 1e-20)
              se.mle <- sqrt(diag(vcov.mle))
            } else {
              MLEfit <- optim(par, obs.llik.binom.std, method = "BFGS", J = J, y = y.all,
                              treat = t, x = x.all, wt = w.all, fit.sensitive = fit.sensitive, hessian = TRUE, control = list(maxit = 0))
              vcov.mle <- solve(-MLEfit$hessian, tol = 1e-20)
              se.mle <- sqrt(diag(vcov.mle))
            }
            
          } else { # end of unconstrained model
            
            ## constrained model
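            ## Under the constraint h0 = h1, the M step pools all observations
            ## into a single (beta-)binomial fit: each treated row enters twice,
            ## once with y - 1 and weight w (the Z = 1 component) and once with
            ## y and weight 1 - w (the Z = 0 component), while control rows get
            ## weight 1; survey weights multiply these EM weights.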

            if (overdispersed==T) {
              par <- c(coef.control.start, 0.5, coef.treat.start)
            } else {
              par <- c(coef.control.start, coef.treat.start)
            }
            
            pllik.const <- -Inf
            
            if (overdispersed==T) {
              llik.const <- obs.llik.std(par, J = J, y = y.all, treat = t,
                                         x = x.all, wt = w.all, fit.sensitive = fit.sensitive, const = TRUE)
            } else {
              llik.const <- obs.llik.binom.std(par, J = J, y = y.all, treat = t,
                                               x = x.all, wt = w.all, fit.sensitive = fit.sensitive, const = TRUE)
            }
            
            counter <- 0
            while (((llik.const - pllik.const) > 10^(-8)) & (counter < maxIter)) {
              
              if (overdispersed==T) {
                
                w <- Estep.std(par, J, y.all, t, x.all, const = TRUE)
                
                lfit <- wlogit.fit.std(y.all, t, x.all, w, par = par[(nPar+2):(nPar*2+1)], wt = w.all, fit.sensitive = fit.sensitive)
                
              } else {
                
                w <- Estep.binom.std(par, J, y.all, t, x.all, const = TRUE)
                
                lfit <- wlogit.fit.std(y.all, t, x.all, w, par = par[(nPar+1):(nPar*2)], wt = w.all, fit.sensitive = fit.sensitive)
                
              }
              
              y.var <- as.character(formula)[[2]]
 
              data.all <- as.data.frame(cbind(y.all, x.all))
              names(data.all)[1] <- y.var
              
              dtmp <- rbind(cbind(data.all, w, t, w.all), cbind(data.all, w, t, w.all)[t==1, ])
              
              dtmp[((dtmp$t == 1) & ((1:nrow(dtmp)) <= n)), paste(y.var)] <-
                dtmp[((dtmp$t == 1) & ((1:nrow(dtmp)) <= n)), paste(y.var)] - 1
              
              dtmp$w[((dtmp$t == 1) & ((1:nrow(dtmp)) > n))] <-
                1 - dtmp$w[((dtmp$t == 1) & ((1:nrow(dtmp)) > n))]
              
              dtmp$w[dtmp$t == 0] <- 1

              dtmp$w <- dtmp$w * dtmp$w.all
              
              dtmp <- dtmp[dtmp$w > 0, ]
              
              if (overdispersed==T) {

                if (intercept.only == TRUE) {

                  fit <- vglm(as.formula(paste("cbind(", y.var, ", J-", y.var, ") ~ 1")),
                              betabinomial, weights = dtmp$w,
                              coefstart = par[1:2], data = dtmp)
                  
                  par <- c(coef(fit), coef(lfit))

                } else {
                  
                  fit <- vglm(as.formula(paste("cbind(", y.var, ", J-", y.var, ") ~ ",
                                               paste(x.vars, collapse=" + "))),
                              betabinomial, weights = dtmp$w,
                              coefstart = par[c(1,(nPar+1),2:(nPar))], data = dtmp)
                  
                  par <- c(coef(fit)[c(1,3:(nPar+1),2)], coef(lfit))

                }
                  
              } else {

                ## GB: This is where the error comes in for survey weights.

                fit <- glm(as.formula(paste("cbind(", y.var, ", J-", y.var, ") ~ ",
                                            paste(x.vars, collapse=" + "))),
                           family = binomial(logit), weights = dtmp$w,
                           start = par[1:(nPar)], data = dtmp)
                
                par <- c(coef(fit), coef(lfit))
                
              }
              
              pllik.const <- llik.const
              
              if(verbose==T)
                cat(paste(counter, round(llik.const, 4), "\n"))
              
              if (overdispersed==T) {
                llik.const <- obs.llik.std(par, J = J, y = y.all, treat = t,
                                           x = x.all, wt = w.all, const = TRUE, fit.sensitive = fit.sensitive)
              } else {
                llik.const <- obs.llik.binom.std(par, J = J, y = y.all, treat = t,
                                                 x = x.all, wt = w.all, const = TRUE, fit.sensitive = fit.sensitive)
              }
              
              counter <- counter + 1
              if (llik.const < pllik.const)
                warning("log-likelihood is not monotonically increasing.")
              ## NOTE CHANGED FROM STOP()
              
              if(counter == (maxIter-1))
                warning("number of iterations exceeded maximum in ML")
              
            }
            
            if (overdispersed==T) {
              
              MLEfit <- optim(par, obs.llik.std, method = "BFGS", J = J,
                              y = y.all, treat = t, x = x.all, wt = w.all, const = TRUE, fit.sensitive = fit.sensitive,
                              hessian = TRUE, control = list(maxit = 0))
              
              vcov.mle <- solve(-MLEfit$hessian, tol = 1e-20)
              se.mle <- sqrt(diag(vcov.mle))
              
            } else {
              
              MLEfit <- optim(par, obs.llik.binom.std, method = "BFGS", J = J,
                              y = y.all, treat = t,  x = x.all, wt = w.all, const = TRUE, fit.sensitive = fit.sensitive,
                              hessian = TRUE, control = list(maxit = 0))
              
              vcov.mle <- solve(-MLEfit$hessian, tol = 1e-20)
              se.mle <- sqrt(diag(vcov.mle))
              
            }

          } # end of constrained model

        } else if (multi == TRUE) {

          ##
          ## Start multi-treatment standard design model
          
          ## test start values
          ## par <- rep(0, 3*k + 2*(J+1))
            
          obs.llik.multi <- function(par, J, y, treat, x, multi = "none") {

            treat.values <- sort(unique(treat[treat!=0]))
            k <- ncol(x)

            ## pull out all g coefs, including alpha_yt and beta_t
            par.g <- par[(k+1):length(par)]

            ## pull out h coefs and construct h vector
            coef.h <- par[1:k]
            hX <- logistic(x %*% coef.h)

            ## separate coefs for alpha_yt and beta_t and construct
            ## g_t vector for each treatment condition by filling in a
            ## matrix with columns representing possible values of Y_i(0),
            ## rows individuals, and it is in a list where g_t = gX[[t]]
            coef.g.b <- coef.g.a <- gX <- list()
            for (m in 1:length(treat.values)) {

              if (multi == "indicators") {
                coef.g.b[[m]] <- par.g[((m-1)*(k+J)+1):((m-1)*(k+J)+k)]
                coef.g.a[[m]] <- c(par.g[((m-1)*(k+J)+k+1):((m-1)*(k+J)+k+J)],0)
              } else if (multi == "level") {
                coef.g.b[[m]] <- par.g[((m-1)*(k+1)+1):((m-1)*(k+1)+k)]
                coef.g.a[[m]] <- par.g[((m-1)*(k+1)+k+1)]
              } else if (multi == "none") {
                coef.g.b[[m]] <- par.g[((m-1)*k+1):((m-1)*k+k)]
              }

              gX[[m]] <- matrix(NA, nrow = length(y), ncol = J+1)

              if (multi == "indicators") {
                for(y.iter in 0:J) {
                  gX[[m]][,y.iter + 1] <- logistic(coef.g.a[[m]][y.iter + 1] + x %*% coef.g.b[[m]])
                }
              } else if (multi == "level") {
                for(y.iter in 0:J) {
                  gX[[m]][,y.iter + 1] <- logistic(y.iter * coef.g.a[[m]] + x %*% coef.g.b[[m]])
                }
              } else if (multi == "none") {
                for(y.iter in 0:J) {
                  gX[[m]][,y.iter + 1] <- logistic(x %*% coef.g.b[[m]])
                }
              }
            }

            ## contribution from control group
            
            ind0y <- (treat==0)
            p0y <- sum(dbinom(x = y[ind0y], size = J, prob = hX[ind0y], log = TRUE))

            ## contribution from treatment groups
            
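            ## for y strictly between 0 and J + 1, the observed count mixes two
            ## latent states:
            ## Pr(Y = y | t = m) = g_m(y-1) Bin(y-1; J, h) + {1 - g_m(y)} Bin(y; J, h),
            ## where g_m(.) may vary in the control count Y_i(0); the boundary
            ## cases y = 0 and y = J + 1 are handled separately below.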
            pm0 <- pmJ1 <- rep(NA, length(treat.values))
            pmy <- matrix(NA, nrow = length(treat.values), ncol = J)

            for (m in 1:length(treat.values)) {
              
              indm0 <- ((treat == m) & (y == 0))
              pm0[m] <- sum(dbinom(x = 0, size = J, prob = hX[indm0], log = TRUE) +
                            log(1 - gX[[m]][indm0, 0 + 1]))

              indmJ1 <- ((treat == m) & (y == (J+1)))
              pmJ1[m] <- sum(dbinom(x = J, size = J, prob = hX[indmJ1], log = TRUE) +
                             log(gX[[m]][indmJ1, J + 1]))

              for (y.iter in 1:J) {
                indmy <- ((treat == m) & (y == y.iter))
                pmy[m,y.iter] <- sum(log(gX[[m]][indmy, (y.iter - 1) + 1] *
                                         dbinom(x = y.iter - 1, size = J,
                                                prob = hX[indmy], log = FALSE) +
                 (1-gX[[m]][indmy, y.iter + 1]) * dbinom(x = y.iter, size = J,
                                                         prob = hX[indmy], log = FALSE)))
              }
                         
            }

            return(p0y + sum(pm0) + sum(pmJ1) + sum(pmy))
            
          }

          Estep.multi <- function(par, J, y, treat, x, m, multi = "none") {

            treat.values <- sort(unique(treat[treat!=0]))
            k <- ncol(x)

            ## pull out all g coefs, including alpha_yt and beta_t
            par.g <- par[(k+1):length(par)]

            ## pull out h coefs and construct h vector
            coef.h <- par[1:k]
            hX <- logistic(x %*% coef.h)

            if (multi == "indicators") {
              coef.g.b <- par.g[((m-1)*(k+J)+1):((m-1)*(k+J)+k)]
              coef.g.a <- c(par.g[((m-1)*(k+J)+k+1):((m-1)*(k+J)+k+J)],0)
            } else if (multi == "level") {
              coef.g.b <- par.g[((m-1)*(k+1)+1):((m-1)*(k+1)+k)]
              coef.g.a <- par.g[((m-1)*(k+1)+k+1)]
            } else if (multi == "none") {
              coef.g.b <- par.g[((m-1)*k+1):((m-1)*k+k)]
            }

            ## construct g_t matrix, where columns represent values of y_i(0),
            ## since g_t varies in y_i(0) through alpha_ty. Rows are indivs.

            ## in constrained model, g_t is constant across y_i(0) values
            
            gX <- matrix(NA, nrow = length(y), ncol = J+1)

            if (multi == "indicators") {
              for(y.iter in 0:J) {
                gX[,y.iter + 1] <- logistic(coef.g.a[y.iter + 1] + x %*% coef.g.b)
              }
            } else if (multi == "level") {
              for(y.iter in 0:J) {
                gX[,y.iter + 1] <- logistic(y.iter * coef.g.a + x %*% coef.g.b)
              }
            } else if (multi == "none") {
              for(y.iter in 0:J) {
                gX[,y.iter + 1] <- logistic(x %*% coef.g.b)
              }
            }

            w <- rep(NA, length(y))

            ind0 <- ((y==0) & (treat==m))
            w[ind0] <- 0

            indJ1 <- ((y==(J+1)) & (treat==m))
            w[indJ1] <- 1

            ## for values of Y_i(0) not 0 or (J+1), construct the weight by
            ## looping through all possible values of Y_i(0) and adding weights
            ## for individuals with that Y_i(0)
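            ## each weight is the posterior probability, by Bayes' rule,
            ## Pr(Z = 1 | Y = y) = g(y-1) Bin(y-1; J, h) /
            ##   {g(y-1) Bin(y-1; J, h) + (1 - g(y)) Bin(y; J, h)}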
            
            for (y.iter in 1:J) {
              
              indother <- ((ind0==FALSE) & (indJ1==FALSE) & (y==y.iter) & (treat==m))
              w[indother] <- (gX[indother, (y.iter - 1) + 1] *
                              dbinom(x = y.iter - 1, size = J, prob = hX[indother],
                                     log = FALSE)) /
                                (gX[indother, (y.iter - 1) + 1] *
                                 dbinom(x = y.iter - 1, size = J, prob = hX[indother],
                                        log = FALSE) +
                                 (1 - gX[indother, y.iter + 1]) *
                                 dbinom(x = y.iter, size = J, prob = hX[indother],
                                        log = FALSE) )
              
            }
            
            return(w)
            
          }

          ##
          ## start EM algorithm

          ## get starting values from NLS
          
          par <- par.control.nls.std
          if (multi.condition == "none") {
            par <- c(par, do.call(c, par.treat.nls.std))
          } else if (multi.condition == "indicators") {
            for (m in 1:length(treatment.values)){
              par <- c(par, par.treat.nls.std[[m]], rep(0, J))
            }
          } else if (multi.condition == "level") {
            for (m in 1:length(treatment.values)){
              par <- c(par, par.treat.nls.std[[m]], 0)
            }
          }
           
          pllik <- -Inf
          
          llik <- obs.llik.multi(par, J = J, y = y.all, treat = t, x = x.all,
                                 multi = multi.condition)

          ##
          ## create temporary datasets for M step

          ## start fit data for H including control group data
          
          ytmpH <- y.all[t==0]
          xtmpH <- x.all[t==0, , drop = FALSE]
          
          ## now loop over treatment conditions to construct data for g_t fit
          ## and add treatment condition data (m times) to data for h fit

          ytmpG <- xtmpG <- dvtmpG <- list()

          for (m in 1:length(treatment.values)){

            ##
            ## set up temporary data for g fit
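            ## each treated observation enters twice: once with the sensitive
            ## item counted (y - 1, indicator dv = 1) and once without
            ## (y, dv = 0); the E-step weights w and 1 - w attach in the loop below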
            
            ytmpG[[m]] <- c(y.all[t==m]-1, y.all[t==m])
            xtmpG[[m]] <- rbind(x.all[t==m, , drop = FALSE], x.all[t==m, , drop = FALSE])
            dvtmpG[[m]] <- c(rep(1, sum(t==m)), rep(0, sum(t==m)))

            if (multi.condition == "indicators") {
              
              ## create matrix of indicator variables for non-sensitive y_i(0) value
              ## omits indicator for value == J              
              y.ind <- matrix(NA, nrow = sum(t==m) * 2, ncol = J)
              for (y.iter in 0:(J-1))
                y.ind[, y.iter + 1] <- ytmpG[[m]] == y.iter
              
              ## x vector for g_t fit includes covariates and y_i(0) indicator matrix
              ## the constrained model does not include these indicators in the g_t M fits              
              xtmpG[[m]] <- cbind(xtmpG[[m]], y.ind)
              
            } else if (multi.condition == "level") {

              xtmpG[[m]] <- cbind(xtmpG[[m]], ytmpG[[m]])
              
            }
            
            ##
            ## set up temporary data for h fit
            
            ytmpH <- c(ytmpH, y.all[t==m]-1, y.all[t==m])
            xtmpH <- rbind(xtmpH, x.all[t==m, , drop = FALSE], x.all[t==m, , drop = FALSE])
            
          }

          counter <- 0
          while (((llik - pllik) > 10^(-8)) & (counter < maxIter)) {
            
            wtmpH <- rep(1, sum(t==0))
            
            lfit <- list()
            
            for (m in 1:length(treatment.values)){

              ## get E step values for this treatment condition
              w <- Estep.multi(par, J, y.all, t, x.all, m, multi = multi.condition)

              ## create weight vectors for temporary data for fits

              wtmpG <- c(w[t==m], 1-w[t==m])

              ## for h, we are adding weights to the vector of weights = 1 created earlier
              
              wtmpH <- c(wtmpH, w[t==m], 1-w[t==m])

              if (multi.condition == "none")
                par.start <- par[(nPar + (m-1) * nPar + 1):(nPar + (m-1) * nPar + nPar)]
              else if (multi.condition == "indicators")
                par.start <- par[(nPar + (m - 1) * (nPar + J) + 1):
                                 (nPar + (m - 1) * (nPar + J) + nPar + J)]
              else if (multi.condition == "level")
                par.start <- par[(nPar + (m - 1) * (nPar + 1) + 1):
                                 (nPar + (m - 1) * (nPar + 1) + nPar + 1)]

              ## run g_t fits
              lfit[[m]] <- glm(cbind(dvtmpG[[m]][wtmpG>0], 1 - dvtmpG[[m]][wtmpG>0]) ~
                               xtmpG[[m]][wtmpG>0, , drop = FALSE] - 1, weights = wtmpG[wtmpG>0],
                               family = binomial(logit), start = par.start,
                               control = glm.control(maxit = maxIter))

            }

            ## run h fit
            fit <- glm(cbind(ytmpH[wtmpH>0], J - ytmpH[wtmpH>0]) ~ xtmpH[wtmpH>0, , drop = FALSE] - 1,
                       family = binomial(logit), weights = wtmpH[wtmpH>0], start = par[1:(nPar)])

            ## put coefficients back together
            par <- coef(fit)
            par.treat <- list()
            for (m in 1:length(treatment.values)) {
               par <- c(par, coef(lfit[[m]]))
               par.treat[[m]] <- coef(lfit[[m]])
            }

            pllik <- llik
            
            if(verbose==T)
              cat(paste(counter, round(llik, 4), "\n"))
            
            llik <- obs.llik.multi(par, J = J, y = y.all, treat = t, x = x.all,
                                   multi = multi.condition)
            
            counter <- counter + 1
            if (llik < pllik)
              warning("log-likelihood is not monotonically increasing.")
            ## NOTE CHANGED FROM STOP()
            
            if(counter == (maxIter-1))
              warning("number of iterations exceeded maximum in ML")
            
          }

          MLEfit <- optim(par, obs.llik.multi, method = "BFGS", J = J, y = y.all, treat = t,
                          x = x.all, hessian = TRUE, control = list(maxit = 0),
                          multi = multi.condition)
          
          vcov.mle <- solve(-MLEfit$hessian, tol = 1e-20)
          
          se.mle <- sqrt(diag(vcov.mle))
  
        } # end of multi treat regression
        
      } else if (boundary == TRUE) {
                
        coef.control.start <- par.control.nls.std
        coef.treat.start <- par.treat.nls.std

        ##
        ##  Observed data log-likelihood for binomial
        ##
        
        obs.llik.binom.boundary <- function(par, J, y, treat, x, x.floor, x.ceiling, ceiling.fit,
                                            floor.fit, ceiling, floor) {
          
          k <- ncol(x)
          k.ceiling <- ncol(x.ceiling)
          k.floor <- ncol(x.floor)
          
          coef.h <- par[1:k]
          coef.g <- par[(k+1):(2*k)]
          coef.ql <- par[(2*k+1):(2*k + k.floor)] 
          coef.qu <- par[(2*k+k.floor+1):(2*k+k.floor+k.ceiling)] 
          
          hX <- logistic(x %*% coef.h)
          gX <- logistic(x %*% coef.g)
          quX <- logistic(x.ceiling %*% coef.qu)
          qlX <- logistic(x.floor %*% coef.ql)
          
          ind0y <- ((treat == 0) & (y < (J+1)))
          ind10 <- ((treat == 1) & (y == 0))
          ind11 <- ((treat == 1) & (y == 1))
          ind1y <- ((treat == 1) & (y > 1) & (y < J))
          ind1J <- ((treat == 1) & (y == J))
          ind1J1 <- ((treat == 1) & (y == (J+1)))
          
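          ## likelihood cells: control respondents contribute a binomial term;
          ## treated y = 0 mixes non-trait respondents with floor liars, y = J
          ## mixes truthful counts with ceiling liars, and y = J + 1 can only
          ## come from truthful trait-holders who are not ceiling liars.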
          p0y <- p10 <- p11 <- p1y <- p1J <- p1J1 <- 0
          
          if(sum(ind0y) > 0){
            p0y <- sum(dbinom(x = y[ind0y], size = J, prob = hX[ind0y], log = TRUE))
          }
          
          if(sum(ind10) > 0){
            p10 <- sum(log((1-gX[ind10]) + gX[ind10] * qlX[ind10])
                       + dbinom(x = 0, size = J, prob = hX[ind10], log = TRUE))
          }
          
          if (sum(ind11) > 0) {
            p11 <- sum(log( gX[ind11] * (1-qlX[ind11]) *
                           dbinom(x = 0, size = J, prob = hX[ind11], log = FALSE) +
                           (1-gX[ind11]) * dbinom(x = 1, size = J, prob = hX[ind11], log = FALSE) ))
          }
          
          if (sum(ind1y) > 0) {
            p1y <- sum(log( gX[ind1y] * dbinom(x = y[ind1y]-1, size = J, prob = hX[ind1y], log = FALSE) +
                           (1-gX[ind1y]) * dbinom(x = y[ind1y], size = J, prob = hX[ind1y], log = FALSE) ))
          }
          
          if (sum(ind1J) > 0) {
            p1J <- sum(log( gX[ind1J] * (dbinom(x = J-1, size = J, prob = hX[ind1J], log = FALSE) +
                                         quX[ind1J] * dbinom(x = J, size = J, prob = hX[ind1J], log = FALSE)) +
                           (1-gX[ind1J]) * dbinom(x = J, size = J, prob = hX[ind1J], log = FALSE) ))
          }
          
          if (sum(ind1J1) > 0) {
            p1J1 <- sum(log(1-quX[ind1J1]) + log(gX[ind1J1]) + dbinom(x = J, size = J, prob = hX[ind1J1], log = TRUE))
          }
          
          if (ceiling.fit=="bayesglm" & ceiling == TRUE) {
            p.prior.ceiling <- sum(dcauchy(x = coef.qu,
                                           scale = rep(2.5, length(coef.qu)), log = TRUE))
          } else {
            p.prior.ceiling <- 0
          }
          
          if (floor.fit=="bayesglm" & floor == TRUE) {
            p.prior.floor <- sum(dcauchy(x = coef.ql,
                                         scale = rep(2.5, length(coef.ql)), log = TRUE))
          } else {
            p.prior.floor <- 0
          }
					
          return(p0y  + p10  + p11  + p1y  + p1J  + p1J1 + p.prior.ceiling + p.prior.floor)
        }
				
        ##
        ##  Observed data log-likelihood for binomial for optim -- shortened par
        ##
        obs.llik.binom.optim.boundary <- function(par, J, y, treat, x, x.floor, x.ceiling,
                                                  ceiling.fit, floor.fit, floor, ceiling) {
          
          k <- ncol(x)
          k.ceiling <- ncol(x.ceiling)
          k.floor <- ncol(x.floor)
          coef.h <- par[1:k]
          coef.g <- par[(k+1):(2*k)]
          if(floor==TRUE) {
            coef.ql <- par[(2*k+1):(2*k + k.floor)]
            if(ceiling==TRUE){
              coef.qu <- par[(2*k+k.floor+1):(2*k+k.floor+k.ceiling)] 
            } else {
              coef.qu <- c(-Inf, rep(0, (k.ceiling-1)))
            }
          } else {
            coef.ql <- c(-Inf, rep(0, (k.floor-1)))
            if(ceiling==TRUE){
              coef.qu <- par[(2*k+1):(2*k + k.ceiling)] 
            } else {
              coef.qu <- c(-Inf, rep(0, (k.ceiling-1)))
            }
          }
          
          hX <- logistic(x %*% coef.h)
          gX <- logistic(x %*% coef.g)
          quX <- logistic(x.ceiling %*% coef.qu)
          qlX <- logistic(x.floor %*% coef.ql)
          
          ind0y <- ((treat == 0) & (y < (J+1)))
          ind10 <- ((treat == 1) & (y == 0))
          ind11 <- ((treat == 1) & (y == 1))
          ind1y <- ((treat == 1) & (y > 1) & (y < J))
          ind1J <- ((treat == 1) & (y == J))
          ind1J1 <- ((treat == 1) & (y == (J+1)))
          
          p0y <- p10 <- p11 <- p1y <- p1J <- p1J1 <- 0
          
          if(sum(ind0y) > 0){
            p0y <- sum(dbinom(x = y[ind0y], size = J, prob = hX[ind0y], log = TRUE))
          }
          
          if (sum(ind10) > 0) {
            p10 <- sum(log((1-gX[ind10]) + gX[ind10] * qlX[ind10])
                       + dbinom(x = 0, size = J, prob = hX[ind10], log = TRUE))
          }
          
          if (sum(ind11) > 0) {
            p11 <- sum(log( gX[ind11] * (1-qlX[ind11]) *
                           dbinom(x = 0, size = J, prob = hX[ind11], log = FALSE) +
                           (1-gX[ind11]) * dbinom(x = 1, size = J, prob = hX[ind11], log = FALSE) ))
          }
          
          if (sum(ind1y) > 0) {
            p1y <- sum(log( gX[ind1y] * dbinom(x = y[ind1y]-1, size = J, prob = hX[ind1y], log = FALSE) +
                           (1-gX[ind1y]) * dbinom(x = y[ind1y], size = J, prob = hX[ind1y], log = FALSE) ))
          }
          
          if (sum(ind1J) > 0) {
            p1J <- sum(log( gX[ind1J] * (dbinom(x = J-1, size = J, prob = hX[ind1J], log = FALSE) +
                                         quX[ind1J] * dbinom(x = J, size = J, prob = hX[ind1J], log = FALSE)) +
                           (1-gX[ind1J]) * dbinom(x = J, size = J, prob = hX[ind1J], log = FALSE) ))
          }
          
          if(sum(ind1J1) > 0){
            p1J1 <- sum(log(1-quX[ind1J1]) + log(gX[ind1J1]) + dbinom(x = J, size = J, prob = hX[ind1J1], log = TRUE))
          }
          
          if(ceiling.fit=="bayesglm" & ceiling==TRUE) {
            p.prior.ceiling <- sum(dcauchy(x = coef.qu,
                                         scale = rep(2.5, length(coef.qu)), log = TRUE))
          } else {
            p.prior.ceiling <- 0
          }
          
          if(floor.fit=="bayesglm" & floor==TRUE) {
            p.prior.floor <- sum(dcauchy(x = coef.ql,
                                         scale = rep(2.5, length(coef.ql)), log = TRUE))
          } else {
            p.prior.floor <- 0
          }
          
          return(p0y  + p10  + p11  + p1y  + p1J  + p1J1 + p.prior.ceiling + p.prior.floor)
          
        }
        
        ##
        ## EM algorithm
        ##
        
        ## Estep for beta-binomial
        Estep.boundary <- function(par, J, y, treat, x, product) {

          ## not currently used
          ## needs to be updated for various changes, including ceiling/floor formulas
          
          k <- ncol(x)
          coef.h <- par[1:k]
          rho.h <- logistic(par[(k+1)])
          coef.g <- par[(k+2):(2*k+1)]
          coef.ql <- par[(2*k+2):(3*k+1)] 
          coef.qu <- par[(3*k+2):(4*k+1)] 
          
          hX <- logistic(x %*% coef.h)
          gX <- logistic(x %*% coef.g)
          quX <- logistic(x %*% coef.qu)
          qlX <- logistic(x %*% coef.ql)
          
          ind0 <- (y==0)
          ind1 <-  (y==1)
          indJ <- (y==J)
          indJ1 <- (y==(J+1))
          indother <- (y > 1 & y < J)
          
          w <- rep(NA, length(y))
          
          if(product == FALSE){
            
            w[ind0] <- dBB(0, bd = J, mu = hX[ind0], sigma = rho.h/(1-rho.h), log = FALSE)*gX[ind0]*qlX[ind0] /
              (dBB(0, bd = J, mu = hX[ind0], sigma = rho.h/(1-rho.h), log = FALSE) * (gX[ind0]*qlX[ind0] + (1-gX[ind0]) ) )
            
            w[ind1] <- (dBB(0, bd = J, mu = hX[ind1], sigma = rho.h/(1-rho.h), log = FALSE) *
                        gX[ind1] * (1-qlX[ind1])) /
              (dBB(0, bd = J, mu = hX[ind1], sigma = rho.h/(1-rho.h), log = FALSE) *
               gX[ind1] * (1-qlX[ind1]) +
               dBB(1, bd = J, mu = hX[ind1], sigma = rho.h/(1-rho.h), log = FALSE) * (1-gX[ind1]))
            
            w[indother] <- gX[indother]*dBB((y-1)[indother], bd = J, mu = hX[indother], sigma = rho.h/(1-rho.h), log = FALSE) /
              (gX[indother]*dBB((y-1)[indother], bd = J, mu = hX[indother], sigma = rho.h/(1-rho.h), log = FALSE) +
               (1-gX[indother])*dBB(y[indother], bd = J, mu = hX[indother], sigma = rho.h/(1-rho.h), log = FALSE))
            
            w[indJ] <- (gX[indJ] * (dBB(J-1, bd = J, mu = hX[indJ], sigma = rho.h/(1-rho.h), log = FALSE) +
                                    quX[indJ] * dBB(J, bd = J, mu = hX[indJ], sigma = rho.h/(1-rho.h), log = FALSE))) /
              (gX[indJ] * (dBB(J-1, bd = J, mu = hX[indJ], sigma = rho.h/(1-rho.h), log = FALSE) +
                           quX[indJ] * dBB(J, bd = J, mu = hX[indJ], sigma = rho.h/(1-rho.h), log = FALSE)) +
               (1-gX[indJ]) * dBB(J, bd = J, mu = hX[indJ], sigma = rho.h/(1-rho.h), log = FALSE))
            
            w[indJ1] <- 1
            
          }
          
          if(product == TRUE){
            
            w[ind0] <- 0
            
            w[ind1] <- (dBB(0, bd = J, mu = hX[ind1], sigma = rho.h/(1-rho.h), log = FALSE) *
                        gX[ind1] * (1-qlX[ind1])) /
              (dBB(0, bd = J, mu = hX[ind1], sigma = rho.h/(1-rho.h), log = FALSE) *
               gX[ind1] * (1-qlX[ind1]) +
               dBB(1, bd = J, mu = hX[ind1], sigma = rho.h/(1-rho.h), log = FALSE) * (1-gX[ind1]))
            
            w[indother] <- (gX[indother] * dBB((y-1)[indother], bd = J, mu = hX[indother], sigma = rho.h/(1-rho.h), log = FALSE)) /
              (gX[indother] * dBB((y-1)[indother], bd = J, mu = hX[indother], sigma = rho.h/(1-rho.h), log = FALSE) +
               (1-gX[indother]) * dBB(y[indother], bd = J, mu = hX[indother], sigma = rho.h/(1-rho.h), log = FALSE))
            
            w[indJ] <- (gX[indJ] * dBB(J - 1, bd = J, mu = hX[indJ], sigma = rho.h/(1-rho.h), log = FALSE)) /
              (gX[indJ] * (dBB(J - 1, bd = J, mu = hX[indJ], sigma = rho.h/(1-rho.h), log = FALSE) +
                           quX[indJ] * dBB(J, bd = J, mu = hX[indJ], sigma = rho.h/(1-rho.h), log = FALSE)) +
               (1-gX[indJ]) * dBB(J, bd = J, mu = hX[indJ], sigma = rho.h/(1-rho.h), log = FALSE))
            
            w[indJ1] <- 1
            
          }
          
          return(w)
          
        }
        
        ## Estep for binomial
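        ## (returns posterior weights: with product = FALSE, the posterior
        ## probability of the sensitive trait; with product = TRUE, the joint
        ## posterior of the trait and a truthfully shifted response y = y0 + 1,
        ## so w - w.prod is the posterior probability of a ceiling/floor lie)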
        Estep.binom.boundary <- function(par, J, y, treat, x, x.floor, x.ceiling, product) {
          
          k <- ncol(x)
          k.ceiling <- ncol(x.ceiling)
          k.floor <- ncol(x.floor)
          coef.h <- par[1:k]
          coef.g <- par[(k+1):(2*k)]
          coef.ql <- par[(2*k+1):(2*k+k.floor)] 
          coef.qu <- par[(2*k+k.floor+1):(2*k+k.floor+k.ceiling)] 
          
          hX <- logistic(x %*% coef.h)
          gX <- logistic(x %*% coef.g)
          quX <- logistic(x.ceiling %*% coef.qu)
          qlX <- logistic(x.floor %*% coef.ql)
          
          ind0 <- (y==0)
          ind1 <-  (y==1)
          indJ <- (y==J)
          indJ1 <- (y==(J+1))
          indother <- (y > 1 & y < J)
          
          w <- rep(NA, length(y))
          
          if(product == FALSE){
            
            w[ind0] <- (dbinom(x = 0, size = J, prob = hX[ind0], log = FALSE) * gX[ind0] * qlX[ind0]) /
              (dbinom(x = 0, size = J, prob = hX[ind0], log = FALSE) *
               (gX[ind0] * qlX[ind0] + (1-gX[ind0])))
            
            w[ind1] <- (dbinom(x = 0, size = J, prob = hX[ind1], log = FALSE) * gX[ind1] * (1-qlX[ind1])) /
              (dbinom(x = 0, size = J, prob = hX[ind1], log = FALSE) * gX[ind1] * (1-qlX[ind1]) +
               dbinom(x = 1, size = J, prob = hX[ind1], log = FALSE) * (1-gX[ind1]))
            
            w[indother] <- (gX[indother] * dbinom(x = y[indother]-1, size = J, prob = hX[indother], log = FALSE)) /
              (gX[indother] * dbinom(x = y[indother]-1, size = J, prob = hX[indother], log = FALSE) +
               (1-gX[indother]) * dbinom(x = y[indother], size = J, prob = hX[indother], log = FALSE))
            
            w[indJ] <- (gX[indJ] * (dbinom(x = J-1, size = J, prob = hX[indJ], log = FALSE) +
                                    quX[indJ] * dbinom(x = J, size = J, prob = hX[indJ], log = FALSE))) /
              (gX[indJ] * (dbinom(x = J-1, size = J, prob = hX[indJ], log = FALSE) +
                           quX[indJ] * dbinom(x = J, size = J, prob = hX[indJ], log = FALSE)) +
               (1-gX[indJ]) * dbinom(x = J, size = J, prob = hX[indJ], log = FALSE))
            
            w[indJ1] <- 1
            
          }
          
          if(product == TRUE){
            
            w[ind0] <- 0
            
            w[ind1] <- (dbinom(x = 0, size = J, prob = hX[ind1], log = FALSE) * gX[ind1] * (1-qlX[ind1])) /
              (dbinom(x = 0, size = J, prob = hX[ind1], log = FALSE) * gX[ind1] * (1-qlX[ind1]) +
               dbinom(x = 1, size = J, prob = hX[ind1], log = FALSE) * (1-gX[ind1]))
            
            w[indother] <- (gX[indother] * dbinom(x = y[indother]-1, size = J, prob = hX[indother], log = FALSE)) /
              (gX[indother] * dbinom(x = y[indother]-1, size = J, prob = hX[indother], log = FALSE) +
               (1-gX[indother]) * dbinom(x = y[indother], size = J, prob = hX[indother], log = FALSE))
            
            w[indJ] <- (gX[indJ] * dbinom(x = J-1, size = J, prob = hX[indJ], log = FALSE)) /
              (gX[indJ] * (dbinom(x = J-1, size = J, prob = hX[indJ], log = FALSE) +
                           quX[indJ] * dbinom(x = J, size = J, prob = hX[indJ], log = FALSE)) +
               (1-gX[indJ]) * dbinom(x = J, size = J, prob = hX[indJ], log = FALSE))
        
            w[indJ1] <- 1
            
          }
          
          return(w)
          
        }
        
        ## Mstep 1: weighted MLE for logistic regression
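        ## (each observation enters twice, with outcome 1 and weight w and
        ## outcome 0 and weight 1 - w, so the weighted logit maximizes the
        ## expected complete-data log-likelihood for the latent trait)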
        wlogit.fit.boundary <- function(y, x, w, par = NULL, maxIter = 50000) {
          yrep <- rep(c(1,0), each = length(y))
          xrep <- rbind(x, x)
          wrep <- c(w, 1-w)
          return(glm(cbind(yrep, 1-yrep) ~ xrep - 1, weights = wrep, family = binomial(logit), start = par, control = glm.control(maxit = maxIter)))
          
        }
        
        ## Mstep2: weighted MLE for binomial regression
        wbinom.fit.boundary <- function(J, y, treat, x, w, par0 = NULL, par1 = NULL) {
          
          Not0 <- ((treat == 1) & (y == (J+1)))
          y0 <- y[!Not0]
          x0 <- x[!Not0,]
          w0 <- 1-w[!Not0]
          fit0 <- glm(cbind(y0, J-y0) ~ x0, family = binomial(logit), weights = w0, start = par0)
          
          Not1 <- ((treat == 1) & (y == 0))
          y1 <- y
          y1[treat == 1] <- y1[treat == 1] - 1
          y1 <- y1[!Not1]
          x1 <- x[!Not1,]
          w1 <- w[!Not1]
          fit1 <- glm(cbind(y1, J-y1) ~ x1, family = binomial(logit), weights = w1, start = par1)
          
          return(list(fit1 = fit1, fit0 = fit0))
        }
        
        ##
        ## Running the EM algorithm
        ##

        if (overdispersed == T)
          stop("The ceiling-floor liar model does not support overdispersion. Set overdispersed = F.")
                
        y.var <- as.character(formula)[[2]]

        x.ceiling.all <- model.matrix(as.formula(ceiling.formula), as.data.frame(x.all))
        x.ceiling.treatment <- model.matrix(as.formula(ceiling.formula), as.data.frame(x.treatment))
        x.ceiling.control <- model.matrix(as.formula(ceiling.formula), as.data.frame(x.control))
        
        nPar.ceiling <- ncol(x.ceiling.all)
        
        intercept.only.ceiling <- ncol(x.ceiling.all)==1 & sum(x.ceiling.all[,1]==1) == n
        
        coef.names.ceiling <- colnames(x.ceiling.all)
        
        if (intercept.only.ceiling == TRUE) {
          x.vars.ceiling <- "1"
        } else {
          x.vars.ceiling <- coef.names.ceiling[-1]
        }
        
        x.floor.all <- model.matrix(as.formula(floor.formula), as.data.frame(x.all))
        x.floor.treatment <- model.matrix(as.formula(floor.formula), as.data.frame(x.treatment))
        x.floor.control <- model.matrix(as.formula(floor.formula), as.data.frame(x.control))
        nPar.floor <- ncol(x.floor.all)
        
        intercept.only.floor <- ncol(x.floor.all)==1 & sum(x.floor.all[,1]==1) == n
        
        coef.names.floor <- colnames(x.floor.all)
        
        if (intercept.only.floor == TRUE) {
          x.vars.floor <- "1"
        } else {
          x.vars.floor <- coef.names.floor[-1]
        }
        
        data.all <- as.data.frame(cbind(y.all, x.all))
        names(data.all)[1] <- y.var
        
        data.treatment <- as.data.frame(cbind(y.treatment, x.treatment))
        names(data.treatment)[1] <- y.var
        
        ## set start values
        
        data.ceiling.all <- as.data.frame(cbind(y.all, x.ceiling.all))
        names(data.ceiling.all)[1] <- y.var
       
        if (ceiling == TRUE){
    
          data.ceiling.treatment <- as.data.frame(cbind(y.treatment, x.ceiling.treatment))
          names(data.ceiling.treatment)[1] <- y.var
          
          ## use the ceiling design frame, parallel to the floor branch below
          dtmpS <- data.ceiling.treatment[data.ceiling.treatment[,c(y.var)]==J |
                                          data.ceiling.treatment[,c(y.var)]==(J+1), ]
          
          dtmpS$dv <- dtmpS[,c(y.var)] == J
          
          coef.ceiling.start <- coef(glm(as.formula(paste("cbind(dv, 1-dv) ~ ",
                                                          paste(x.vars.ceiling, collapse=" + "))),
                                         family = binomial(logit), data = dtmpS,
                                         control = glm.control(maxit = maxIter)))
        } else {
          coef.ceiling.start <- c(-Inf, rep(0, (nPar.ceiling-1)))
        }

        data.floor.all <- as.data.frame(cbind(y.all, x.floor.all))
        names(data.floor.all)[1] <- y.var
        
        if (floor == TRUE){
          
          data.floor.treatment <- as.data.frame(cbind(y.treatment, x.floor.treatment))
          names(data.floor.treatment)[1] <- y.var
          
          dtmpS <- data.floor.treatment[data.floor.treatment[,c(y.var)]==0 |
                                        data.floor.treatment[,c(y.var)]==1, ]
          
          dtmpS$dv <- dtmpS[,c(y.var)] == 1
          
          coef.floor.start <- coef(glm(as.formula(paste("cbind(dv, 1-dv) ~ ",
                                                        paste(x.vars.floor, collapse=" + "))),
                                       family = binomial(logit), data = dtmpS,
                                       control = glm.control(maxit = maxIter)))
          
        } else {
          coef.floor.start <- c(-Inf, rep(0, (nPar.floor-1)))
        }
        
        par <- c(coef.control.start, coef.treat.start, coef.floor.start, coef.ceiling.start)
       
        ## calculate starting log likelihood

        pllik <- -Inf
        
        llik <- obs.llik.binom.boundary(par, J = J, y = y.all, treat = t, x = x.all,
                                        x.ceiling = x.ceiling.all, x.floor = x.floor.all,
                                        ceiling.fit = ceiling.fit, floor.fit = floor.fit,
                                        ceiling = ceiling, floor = floor)
        
        ## begin EM iterations
        
        counter <- 0
        while (((llik - pllik) > 10^(-8)) & (counter < maxIter)) {
          
          w <- Estep.binom.boundary(par, J, y.all, t, x.all,
                                    x.floor.all, x.ceiling.all, product = FALSE)
          w.prod <- Estep.binom.boundary(par, J, y.all, t, x.all,
                                         x.floor.all, x.ceiling.all, product = TRUE)
          
          lfit <- wlogit.fit.boundary(y.all[t==1], x.all[t==1, , drop = FALSE], w[t==1],
                             par = par[(nPar+1):(nPar*2)], maxIter = maxIter)
          
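          ## stack four copies for the control-item fit: controls (weight 1),
          ## treated liars (w - w.prod, count unchanged since liars' reports
          ## equal their true control count), treated without the trait (1 - w),
          ## and treated truthful trait-holders (w.prod, count shifted down by 1)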
          dtmp <- rbind(data.all[t==0, ], data.all[t==1, ], data.all[t==1, ], data.all[t==1, ])
          
          dtmp$w <- c(rep(1, sum(t==0)), (w-w.prod)[t==1], (1-w)[t==1], w.prod[t==1])
          
          dtmp[(sum(c(t==0, t==1, t==1))+1):nrow(dtmp),paste(y.var)] <-
            dtmp[(sum(c(t==0, t==1, t==1))+1):nrow(dtmp),paste(y.var)] - 1
          
          dtmp <- dtmp[dtmp$w > 0, ]
          
          ## temporary data for ceiling logit
          dtmpC <- rbind(data.ceiling.all[t==1 & y.all==(J+1),], data.ceiling.all[t==1 & y.all==J, ])
          
          dtmpC[,paste(y.var)] <- c(rep(0,sum(t==1 & y.all==(J+1))), rep(1,sum(t==1 & y.all==J)))
          
          dtmpC$w <- c( w.prod[t==1 & y.all==(J+1)], (w - w.prod)[t==1 & y.all==J])
          
          dtmpC <- dtmpC[dtmpC$w > 0, ]
          
          ## temporary data for floor logit
          dtmpF <- rbind(data.floor.all[t==1 & y.all==1,], data.floor.all[t==1 & y.all==0, ])
          
          dtmpF[,paste(y.var)] <- c(rep(0,sum(t==1 & y.all==1)), rep(1,sum(t==1 & y.all==0)))
          
          dtmpF$w <- c(w.prod[t==1 & y.all==1], (w-w.prod)[t==1 & y.all==0])
          
          dtmpF <- dtmpF[dtmpF$w > 0, ]
                      
          fit <- glm(as.formula(paste("cbind(", y.var, ", J-", y.var, ") ~ ",
                                      paste(x.vars, collapse=" + "))),
                     family = binomial(logit), weights = dtmp$w, start = par[1:(nPar)],
                     data = dtmp)
          
          if (ceiling==TRUE) {
            ##coef.qufit.start <- par[(3*nPar+1):(4*nPar)]
            coef.qufit.start <- par[(2*nPar + nPar.floor + 1):(2*nPar + nPar.floor + nPar.ceiling)]
          } else {
            coef.qufit.start <- rep(0, nPar.ceiling)
          }

          if (ceiling == TRUE) {
            if (ceiling.fit=="glm") {
              qufit <- glm(as.formula(paste("cbind(", y.var, ", 1-", y.var, ") ~ ",
                                            paste(x.vars.ceiling, collapse=" + "))),
                           weights = dtmpC$w, family = binomial(logit),
                           start = coef.qufit.start, data = dtmpC,
                           control = glm.control(maxit = maxIter))
            } else if(ceiling.fit=="bayesglm") {
              
              if (intercept.only.ceiling == F) {
                qufit <- bayesglm(as.formula(paste("cbind(", y.var, ", 1-", y.var, ") ~ ",
                                                   paste(x.vars.ceiling, collapse=" + "))),
                                  weights = dtmpC$w, family = binomial(logit),
                                  start = coef.qufit.start, data = dtmpC,
                                  control = glm.control(maxit = maxIter), scaled = F)
                
              } else {
                qufit <- bayesglm(as.formula(paste("cbind(", y.var, ", 1-", y.var, ") ~ 1")),
                                  weights = dtmpC$w, family = binomial(logit),
                                  start = coef.qufit.start, data = dtmpC,
                                  control = glm.control(maxit = maxIter), scaled = F)
                
              }
            }
          }
          
          if(floor==TRUE) {
            ##coef.qlfit.start <- par[(2*nPar+1):(3*nPar)]
            coef.qlfit.start <- par[(2*nPar+1):(2*nPar + nPar.floor)]
          } else {
            coef.qlfit.start <- rep(0, nPar.floor)
          }

          if (floor == TRUE) {
            if(floor.fit=="glm") {
              qlfit <- glm(as.formula(paste("cbind(", y.var, ", 1-", y.var, ") ~ ",
                                            paste(x.vars.floor, collapse=" + "))), weights = dtmpF$w,
                           family = binomial(logit), start = coef.qlfit.start, data = dtmpF,
                           control = glm.control(maxit = maxIter))
            } else if(floor.fit=="bayesglm") {
              if (intercept.only.floor == F) {
                qlfit <- bayesglm(as.formula(paste("cbind(", y.var, ", 1-", y.var, ") ~ ",
                                                   paste(x.vars.floor, collapse=" + "))),
                                  weights = dtmpF$w, family = binomial(logit),
                                  start = coef.qlfit.start, data = dtmpF,
                                  control = glm.control(maxit = maxIter), scaled = F)
              } else {
                qlfit <- bayesglm(as.formula(paste("cbind(", y.var, ", 1-", y.var, ") ~ 1")),
                                  weights = dtmpF$w, family = binomial(logit),
                                  start = coef.qlfit.start, data = dtmpF,
                                  control = glm.control(maxit = maxIter), scaled = F)
              }
            }
          }
          
          ## reset parameters based on floor / ceiling options
          
          if(floor==TRUE & ceiling==TRUE) {
            par <- c(coef(fit), coef(lfit),coef(qlfit), coef(qufit))
          } else if(floor==FALSE & ceiling==TRUE) {
            par <- c(coef(fit), coef(lfit),  c(-Inf, rep(0, (nPar.floor-1))), coef(qufit))
          } else if(floor==TRUE & ceiling==FALSE) {
            par <- c(coef(fit), coef(lfit),coef(qlfit),  c(-Inf, rep(0, (nPar.ceiling-1))))
          }
          
          pllik <- llik
          
          if(verbose==T) cat(paste(counter, round(llik, 4), "\n"))
          
          llik <- obs.llik.binom.boundary(par, J = J, y = y.all, treat = t, x = x.all,
                                          x.ceiling = x.ceiling.all, x.floor = x.floor.all,
                                          ceiling.fit = ceiling.fit, floor.fit = floor.fit,
                                          ceiling = ceiling, floor = floor)
          			  
          counter <- counter + 1
          if (llik < pllik)
            warning("log-likelihood is not monotonically increasing.")
          ## NOTE CHANGED FROM STOP()
          
          if(counter == (maxIter-1))
            warning("number of iterations exceeded maximum in ML")
				  				  
        }
          
        if(floor==FALSE & ceiling==TRUE) {
          par <- par[c((1:(2*nPar)), (((2*nPar + nPar.floor)+1):(2*nPar + nPar.floor + nPar.ceiling)))]
        } else if(floor==TRUE & ceiling==FALSE) {
          par <- par[1:(2*nPar + nPar.floor)]
        }
        
        MLEfit <- optim(par, obs.llik.binom.optim.boundary, method = "BFGS", J = J, y = y.all,
                        treat = t,  x = x.all, x.ceiling = x.ceiling.all, x.floor = x.floor.all,
                        ceiling.fit = ceiling.fit,
                        floor.fit  = floor.fit, floor = floor, ceiling = ceiling,
                        hessian = TRUE, control = list(maxit = 0))
        
        if(ceiling.fit=="bayesglm" & ceiling==TRUE) {
          p.prior.ceiling <- sum(dcauchy(x = coef(qufit),
                                         scale = rep(2.5, length(coef(qufit))), log = TRUE))
        } else {
          p.prior.ceiling <- 0
        }
        
        if(floor.fit=="bayesglm" & floor==TRUE) {
          p.prior.floor <- sum(dcauchy(x = coef(qlfit),
                                       scale = rep(2.5, length(coef(qlfit))), log = TRUE))
        } else {
          p.prior.floor <- 0
        }
        
        llik <- llik - p.prior.floor - p.prior.ceiling
        
        vcov.mle <- solve(-MLEfit$hessian, tol = 1e-20)
        se.mle <- sqrt(diag(vcov.mle))
        
      } # end of boundary ml
      
    } # end of em algorithm
    
  } else if (design=="modified") {

    ## end of standard design
    
    ##
    ## Run two-step estimator for modified design
    
    ## fit to the control group
    
    fit.control <- list()
    par.control.list <- list()
    pi.pred <- matrix(NA, nrow(x.treatment), J)
    
    for (j in 1:J) {
      
      fit.glm.control <- glm(y.control[,j] ~ x.control - 1, family = binomial(logit), weights = w.control)
      coef.glm.control <- coef(fit.glm.control)
      names(coef.glm.control) <- paste("beta", 1:length(coef.glm.control), sep = "")
      
      if (fit.nonsensitive == "glm") {
        fit.control[[j]] <- fit.glm.control
      } else if (fit.nonsensitive == "nls") {

        tmpY <- y.control[, j]
        
        fit.control[[j]] <- nls( as.formula(paste("tmpY ~ logistic( x.control %*% c(",
                                                  paste(paste("beta", 1:length(coef.glm.control),
                                                              sep=""), collapse= ","), "))")) ,
                                start = coef.glm.control,
                                weights = w.control,
                                control = nls.control(maxiter=maxIter, warnOnly=TRUE))
      }
      
      par.control.list[[j]] <- coef(fit.control[[j]])
      
      pi.pred[,j] <- logistic(x.treatment %*% par.control.list[[j]])
      
    }
    
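    ## subtract the predicted control-item total from each treated respondent's
    ## observed total; the residual estimates the sensitive-item response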
    y.treatment.pred <- y.treatment[,J+1] - apply(pi.pred, 1, sum)
    
    y.treatment.pred.temp <- ifelse(y.treatment.pred > 1, 1, y.treatment.pred)
    y.treatment.pred.temp <- ifelse(y.treatment.pred.temp < 0, 0, y.treatment.pred.temp)
    
    alpha <- mean(y.treatment.pred.temp)
    
    y.treatment.start <- ifelse(y.treatment.pred.temp >=
                                quantile(y.treatment.pred.temp, alpha), 1, 0)
    
    try(fit.glm.control <- glm(cbind(y.treatment.start, 1 - y.treatment.start) ~ x.treatment - 1, family = binomial(logit),
                               weights = w.treatment))
    try(coef.glm.control <- coef(fit.glm.control), silent = T)
    try(names(coef.glm.control) <- paste("beta", 1:length(coef.glm.control), sep = ""), silent = T)
    
    if(exists("coef.glm.control")) {
      try(fit.treat <- nls( as.formula(paste("y.treatment.pred ~ logistic( x.treatment %*% c(",
                                             paste(paste("beta", 1:length(coef.glm.control),sep=""),
                                                   collapse= ","), "))")) , start = coef.glm.control,
                           weights = w.treatment,
                           control = nls.control(maxiter=maxIter, warnOnly=TRUE)), silent = T)
    } else {
      ## coef.glm.control does not exist in this branch, so build the
      ## parameter names from the design matrix and let nls pick its default
      ## starting values
      try(fit.treat <- nls( as.formula(paste("y.treatment.pred ~ logistic( x.treatment %*% c(",
                                             paste(paste("beta", 1:ncol(x.treatment),sep=""),
                                                   collapse= ","), "))")) ,
                           weights = w.treatment,
                           control = nls.control(maxiter=maxIter, warnOnly=TRUE)), silent = T)
    }

    if(!exists("fit.treat"))
      fit.treat <- lm(y.treatment.pred ~ x.treatment - 1, weights = w.treatment)
      
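    ## two-step sandwich variance for the modified design: H and M are the
    ## Jacobians of the treated and control moment conditions, L the
    ## cross-derivative block, and F the empirical variance of the stacked
    ## moments, yielding invG %*% F %*% t(invG) / n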
    vcov.twostep.modified <- function(coef.treat, coef.control, J,
                                      x1, y1, x0, y0, fit.nonsensitive = "nls") {
      
      nPar <- length(coef.treat)
      
      n <- length(c(y1,y0))
      
      Htmp <- c(logistic(x1 %*% coef.treat) / (1 + exp(x1 %*% coef.treat))) * x1
      
      H <- - t(Htmp) %*% Htmp / n
      
      L <- matrix(NA, ncol=(J*nPar), nrow=nPar)
      
      for(j in 1:J){
        
        Ltmp <- c(sqrt(logistic(x1 %*% coef.control[[j]])*
                       logistic(x1 %*% coef.treat)/((1 + exp(x1 %*% coef.control[[j]]))*
                                                    (1 + exp(x1 %*% coef.treat))))) * x1
        
        L[,((j-1)*nPar+1):((j)*nPar)] <- -t(Ltmp) %*% Ltmp / n
        
      }
      
      M <- matrix(0, ncol = (J*nPar), nrow = (J*nPar))
      
      if (fit.nonsensitive=="glm") {
        
        for(j in 1:J){
        			
          Mtmp <- sqrt(c(exp(x0 %*% coef.control[[j]])/(1+exp(x0 %*% coef.control[[j]])) -
                         exp(2*x0 %*% coef.control[[j]])/(1+exp(x0 %*% coef.control[[j]]))^2))*
                           x0
          
          M[((j-1)*nPar+1):((j)*nPar),((j-1)*nPar+1):((j)*nPar)] <- -t(Mtmp) %*% Mtmp / n
          
        }
        
      } else if (fit.nonsensitive == "nls") {
        
        for(j in 1:J){
          
          Mtmp <- c(logistic(x0 %*% coef.control[[j]])/(1 + exp(x0 %*% coef.control[[j]]))) * x0
          
          M[((j-1)*nPar+1):((j)*nPar),((j-1)*nPar+1):((j)*nPar)] <- -t(Mtmp) %*% Mtmp / n
          
        }
        
      }
      
      invH <- solve(H, tol = 1e-20)
      invM <- solve(M, tol = 1e-20)
      
      invG <- rbind( cbind(invH, -invH %*% L %*% invM),
                    cbind(matrix(0, nrow(invM), ncol(invH)), invM)  ) 
      
      gtmp <- matrix(NA, nrow=nrow(x1), ncol=J)
      for(j in 1:J)
        gtmp[,j] <- logistic(x1 %*% coef.control[[j]])
      
      gtmpsum <- as.vector(apply(gtmp, 1, sum))
      
      g <- c((y1[,J+1] - gtmpsum - logistic(x1 %*% coef.treat))*
             logistic(x1 %*% coef.treat)/(1+exp(x1 %*% coef.treat))) * x1
      
      gg <- t(g) %*% g / n
      
      h <- list()
      
      if (fit.nonsensitive == "glm"){
        for (j in 1:J) {
          h[[j]] <- c((y0[,j] - logistic(x0 %*% coef.control[[j]]))) * x0
        }
      } else if (fit.nonsensitive == "nls") {
        for (j in 1:J) {
          h[[j]] <- c((y0[,j] - logistic(x0 %*% coef.control[[j]]))*
                      logistic(x0 %*% coef.control[[j]])/(1+exp(x0 %*% coef.control[[j]]))) * x0
        }	
      }
      
      Fh <- matrix(NA, nrow = J*nPar, ncol = J*nPar)
      for(j in 1:J) {
        for(k in 1:J){
          Fh[((j-1)*nPar+1):(j*nPar), ((k-1)*nPar+1):(k*nPar)] <- t(h[[j]]) %*% h[[k]] / n
        }
      }
      
      F <- adiag(gg, Fh)
    
      return(invG %*% F %*% t(invG) / n)
      
    }
    
    par.treat.nls.mod <- coef(fit.treat)
    par.control.nls.mod <- do.call(c, par.control.list)
    
    if(method == "nls") {
      vcov.twostep <- vcov.twostep.modified(par.treat.nls.mod, par.control.list,
                                            J, x.treatment, y.treatment, x.control, y.control,
                                            fit.nonsensitive)
      se.twostep <- sqrt(diag(vcov.twostep))
    }
    
    ##
    ## Run maximum likelihood method
    
    if(method == "ml"){

      coef.control.start <- par.control.nls.mod
      coef.treat.start <- par.treat.nls.mod
      
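      ## recursion for the Poisson-binomial distribution: R(0) = 1 and
      ## R(k) = (1/k) * sum_{i=1}^{k} (-1)^(i+1) * T_i * R(k-i), with
      ## T_i = sum_j {p_j/(1-p_j)}^i, so that Pr(S = y) = R(y) * prod_j (1-p_j)
      ## for S a sum of independent Bernoulli(p_j) draws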
      R.poisbinomR <- function(k, p) {								
        colSum <- 0
        for(i in 1:k){
          if((k-i)>=1) colSum <- colSum + (-1)^(i+1) * sum( (p/(1-p))^i ) * R.poisbinomR(k - i, p)
          if((k-i)==0) colSum <- colSum + (-1)^(i+1) * sum( (p/(1-p))^i )
          if((k-i)<0) colSum <- colSum + 0						
        }
        if(k>0) return((1/k) * colSum)	
        else if(k==0) return(1)
      }
      
      ## Poisson-Binomial density function
      ## takes y scalar and p vector of Bernoulli success probabilities
      dpoisbinomR <- function(y, p) {
        
        return( R.poisbinomR(y, p) * prod(1 - p) )
        
      }
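      ## e.g., for p = c(0.1, 0.5, 0.9), dpoisbinomR(2, p) gives
      ## 0.1*0.5*0.1 + 0.1*0.5*0.9 + 0.9*0.5*0.9 = 0.455,
      ## the probability of exactly two successes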
      
      dpoisbinomC <- function(y, p) {
        
        k <- length(p)
        return(.C("dpoisbinom", as.integer(y), as.integer(k), as.double(p),
                  res = double(1), PACKAGE = "list")$res)
        
      }
      
      R.poisbinomC <- function(k, p) {
        
        return(.C("RpoisbinomReturn", as.integer(k), as.double(p),
                  as.integer(length(p)), res = double(1), PACKAGE = "list")$res)
        
      }
      
      R.poisbinomE <- function(k, p) {
        
        maxk <- max(k)
        return(.C("RpoisbinomEff", as.integer(maxk), 
                  as.double(p), as.integer(length(p)), Rs = double(maxk+1),
                  PACKAGE = "list")$Rs[k+1])
        
      }
      
      dpoisbinomE <- function(y, p) {
        
        return(R.poisbinomE(y, p)*prod(1-p))
        
      }
      
      R.poisbinomM <- function(k, p) {
        
        m <- ncol(p)
        if (m != length(k))
          stop("the dimension of k match with the column dimension of p")
        return(.C("RpoisbinomEffMatrix", as.integer(k), as.integer(max(k)),
                  as.double(p), as.integer(nrow(p)), as.integer(m),
                  Rs = double(m), PACKAGE = "list")$Rs)
        
      }
      
      dpoisbinomM <- function(y, p) {
        
        return(R.poisbinomM(y, p)*apply(1-p, 2, prod))
        
      }
      
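      ## observed-data log-likelihood for the modified design: treated
      ## respondents contribute a Poisson-binomial term for their total over
      ## the J + 1 items; control respondents contribute independent Bernoulli
      ## terms for each of the J observed control items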
      obs.llik.modified <- function(par, y, X, treat, wt) {
        
        J <- ncol(y) - 1
        
        nPar <- length(par) / (J+1)
        
        n.treat <- nrow(y[treat==1,])
        
        x.treat <- X[treat==1,]
        y.treat <- y[treat==1,]
        
        pi <- matrix(NA, ncol = nrow(X), nrow = J+1)
        for(j in 1:(J+1))
          pi[j,] <- logistic(X %*% par[((j-1)*nPar+1):(j*nPar)])
        
        llik.treat <- sum(wt[treat==1] * log(dpoisbinomM(y.treat[,J+1], pi[,treat==1])))
        
        x.control <- X[treat==0,]
        y.control <- y[treat==0,]
        
        llik.control <- 0
        for(j in 1:J)
          llik.control <- llik.control + sum(wt[treat==0] * (y.control[,j]*log(pi[j,treat==0]) +
                                             (1-y.control[,j])*log(1-pi[j,treat==0])))
        
        llik <- llik.treat + llik.control
        
        return(llik)
        
      }
      
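      ## E-step for the modified design: by Bayes' rule,
      ## Pr(item j = 1 | total = y) = p_j * R_{-j}(y-1) / {(1-p_j) * R(y)},
      ## computed with the Poisson-binomial recursions; totals of 0 or J + 1
      ## pin down all items exactly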
      Estep.modified <- function(par, y, X, treat) {
        
        J <- ncol(y) - 1
        
        nPar <- length(par) / (J+1)
        
        x.treat <- X[treat==1, , drop = FALSE]
        y.treat <- y[treat==1, , drop = FALSE]
        
        n.treat <- nrow(x.treat)
        
        pi <- matrix(NA, ncol = n.treat, nrow = J+1)
        for(j in 1:(J+1))
          pi[j,] <- logistic(x.treat %*% par[((j-1)*nPar+1):(j*nPar)])
        
        all0.cond <- y.treat[,J+1]==0
        all1.cond <- y.treat[,J+1]==(J+1)
        either.cond <- all0.cond==TRUE | all1.cond==TRUE
        
        rpb <- matrix(NA, nrow = n.treat, ncol = J+1)
        rpb.inv <- matrix(NA, nrow = n.treat, ncol = J+1)
        
        rpb.inv.vector <- R.poisbinomM(y.treat[!either.cond, J+1], pi[,!either.cond])
        
        for(j in 1:(J+1)){
          rpb[!either.cond ,j] <- R.poisbinomM(y.treat[!either.cond,J+1]-1, pi[-j,!either.cond])
          rpb.inv[!either.cond,j] <- rpb.inv.vector
        }
        
        w <- t(pi) * rpb / ((1-t(pi)) * rpb.inv)
        w[all0.cond, ] <- matrix(0, ncol = ncol(w), nrow = sum(all0.cond))
        w[all1.cond, ] <- matrix(1, ncol = ncol(w), nrow = sum(all1.cond))
        
        return(w)
        
      }
      
      par <- c(coef.control.start, coef.treat.start)
      
      nPar <- length(par) / (J+1)
      
      pllik <- -Inf
      
      llik <- obs.llik.modified(par, y.all, x.all, t, wt = w.all)
      
      counter <- 0
      while (((llik - pllik) > 10^(-8)) & (counter < maxIter)) {
        
        w <- Estep.modified(par, y.all, x.all, t)
        
        y.var <- as.character(formula)[[2]]
                
        coef.list <- list()
        
        for(j in 1:(J+1)){
          
          dtmp <- as.data.frame(rbind(cbind(1, x.treatment, w[,j], 1),
                                      cbind(y.control[,j], x.control, 1, 0),
                                      cbind(0, x.treatment, 1-w[,j], 1)))

          dtmp.weights <- c(w.treatment, w.control, w.treatment)

          if (intercept.only == TRUE)
            names(dtmp) <- c("y","(Intercept)","w","treat")
          else
            names(dtmp) <- c("y","(Intercept)",x.vars,"w","treat")
          
          if(j==(J+1)) {
              ## subset the weights with the same indicator before dtmp is redefined
              keep <- dtmp$treat==1
              dtmp <- dtmp[keep,]
              dtmp.weights <- dtmp.weights[keep]
          }
          
          if (j<(J+1)) {
            trials <- 1
          } else {
            trials <- j
          }
          
          fit <- glm(as.formula(paste("cbind(y, 1-y) ~ ", paste(x.vars, collapse=" + "))),
                     family = binomial(logit), weights = dtmp$w * dtmp.weights,
                     start = par[((j-1)*nPar+1):(j*nPar)], data = dtmp)
          
          coef.list[[j]] <- coef(fit)
          
        }
        
        par <- do.call(c, coef.list)
        
        pllik <- llik
        
        if(verbose==T)
          cat(paste(counter, round(llik, 4), "\n"))
        
        llik <- obs.llik.modified(par, y.all, x.all, t, wt = w.all)
        
        counter <- counter + 1
        
        if (llik < pllik)
          warning("log-likelihood is not monotonically increasing.")
        ## NOTE CHANGED FROM STOP()
        
        if(counter == (maxIter-1))
          warning("number of iterations exceeded maximum in ML.")
        
      } # end ml while loop
      
      MLEfit <- optim(par, fn = obs.llik.modified, method = "BFGS", y = y.all, X = x.all,
                      treat = t, wt = w.all, hessian = TRUE, control = list(maxit = 0))
      
      vcov.mle <- solve(-MLEfit$hessian, tol = 1e-20)
      se.mle <- sqrt(diag(vcov.mle))
      
    } # end ml for modified design
    
  } # end modified design
  
  # one-step estimator
  if (method == "onestep") {

    if(error!="none") stop("Measurement error models not implemented for this method.")
    if(robust) stop("Robust functionality not implemented for this method.")
    if(!is.null(h)|!is.null(group)) stop("Auxiliary info functionality not implemented for this method.")

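    ## shrink the NLS two-step estimates toward zero for starting values,
    ## presumably for numerical stability of the GMM objective below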
    start.par <- c(coef(fit.treat), fit.control.coef)/5

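    ## one-step GMM objective: stack the treated and control moment conditions
    ## m1 and m0 and minimize the quadratic form with the inverse empirical
    ## second-moment matrix as the weighting matrix (n comes from the
    ## enclosing environment)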
    nls.objective <- function(params, J, Tr, Y, X) {
      
      k  <- length(params)/2
      beta  <- params[1:k]
      gamma <- params[(k + 1):(2*k)]

      Xb <- X %*% beta
      Xg <- X %*% gamma

      tmp1 <- c(Tr * (Y - logistic(Xb) - J * logistic(Xg)) * logistic(Xb)/(1 + exp(Xb))) * X
      tmp0 <- c((1 - Tr) * (Y - J * logistic(Xg)) * J * logistic(Xg)/(1 + exp(Xg))) * X
    
      m1 <- colMeans(tmp1)
      m0 <- colMeans(tmp0)

      W  <- ginv(adiag(t(tmp1) %*% tmp1/n, t(tmp0) %*% tmp0/n))
      
      as.numeric(t(c(m1, m0)) %*% W %*% c(m1, m0))

    }

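    ## sandwich variance for the one-step estimator, analogous to
    ## vcov.twostep.modified above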
    vcov.twostep.std <- function(params, J, Tr, Y, X) {
    
        n  <- length(Y)
        k  <- length(params)/2
        beta  <- params[1:k]
        gamma <- params[(k + 1):(2*k)]

        Xb <- X %*% beta
        Xg <- X %*% gamma

        tmp1 <- c(Tr * (Y - logistic(Xb) - J * logistic(Xg)) * logistic(Xb)/(1 + exp(Xb))) * X
        tmp0 <- c((1 - Tr) * (Y - J * logistic(Xg)) * J * logistic(Xg)/(1 + exp(Xg))) * X
        Em1 <- t(tmp1) %*% tmp1 / n
        Em0 <- t(tmp0) %*% tmp0 / n
        F <- adiag(Em1, Em0)
        Gtmp <- c(Tr * logistic(Xb)/(1 + exp(Xb))) * X
        G1 <- -t(Gtmp) %*% Gtmp / n  
        Gtmp <- c(sqrt(J*Tr*logistic(Xb)*logistic(Xg)/
                       ((1 + exp(Xb))*(1 + exp(Xg))))) * X
        G2 <- -t(Gtmp) %*% Gtmp / n
        Gtmp <- c(J*(1-Tr)*logistic(Xg)/(1 + exp(Xg))) * X
        G3 <- -t(Gtmp) %*% Gtmp / n
        invG1 <- ginv(G1)
        invG3 <- ginv(G3)
        invG <- rbind(cbind(invG1, - invG1 %*% G2 %*% invG3),
                      cbind(matrix(0, ncol = ncol(G1), nrow = nrow(G3)), invG3))
        
        return(invG %*% F %*% t(invG) / n)
        
      }
  
      optim.out <- optim(fn = nls.objective,
        par = start.par, J = J, Tr = t, Y = y.all, X = x.all, 
        control = list(maxit = 5000))

      vcov <- vcov.twostep.std(params = optim.out$par, J = J, Tr = t, Y = y.all, X = x.all)
      rownames(vcov) <- colnames(vcov) <- rep(colnames(x.all), 2)
      k <- ncol(x.all)

      onestep.par.treat <- optim.out$par[1:k]
      onestep.par.control <- optim.out$par[(k+1):(2*k)]
      onestep.vcov <- vcov
      onestep.se.treat <- sqrt(diag(vcov))[1:k]
      onestep.se.control <- sqrt(diag(vcov))[(k+1):(2*k)]
      onestep.convergence <- optim.out$convergence

  }

  # measurement error models
  if (error == "topcode" & method == "ml") {
    
    ####################
    # TOP CODING ERROR #
    ####################
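    # Model: with probability p0 a respondent reports the maximum possible
    # count (top-codes) regardless of the truthful answer; with probability
    # 1 - p0 the usual binomial list-experiment model applies. p0 is
    # parameterized on the log-odds scale (all.pars[2*k + 1]).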
    obs.llik.top <- function(all.pars, formula, data, treat, J) {

      mf <- model.frame(formula, data)
      n  <- nrow(mf)

      Y  <- model.response(mf)
      X  <- model.matrix(formula, data)
      Tr <- data[row.names(mf), treat]
      k  <- ncol(X)
      beta <- all.pars[1:k]
      gamma <- all.pars[(k+1):(2*k)]

      log.odds <- all.pars[2*k + 1]
      p0 <- logistic(log.odds)

      Xb <- X %*% beta
      Xg <- X %*% gamma

      lliks <- ifelse(Y == J + 1, log(p0 + (1 - p0) * logistic(Xb) * logistic(Xg)^J),
        ifelse(Y == J & Tr == 0, log(p0 + (1 - p0) * logistic(Xg)^J),
          ifelse(Y == 0 & Tr == 1, log((1 - p0) * (1 - logistic(Xb)) * (1 - logistic(Xg))^J),
            ifelse(Y %in% 1:J & Tr == 1,
              log((1 - p0) * (logistic(Xb) * choose(J, Y-1) * logistic(Xg)^(Y-1) * (1 - logistic(Xg))^(J-Y+1) +
                (1 - logistic(Xb)) * choose(J, Y) * logistic(Xg)^Y * (1 - logistic(Xg))^(J-Y))),
              log((1 - p0) * choose(J, Y) * logistic(Xg)^Y * (1 - logistic(Xg))^(J-Y))))))

      sum(lliks)

    }
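
    # The likelihood branches cover: Y = J + 1 (treated; top-coding error or a
    # truthful affirmative answer to every item), control Y = J (error or all
    # control items), treated Y = 0, treated 1 <= Y <= J (the usual mixture
    # over the sensitive item), and the remaining control responses.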

    topcodeE <- function(formula, data, treat, J, p0, params) {
      
      mf <- model.frame(formula, data)
      n  <- nrow(mf)

      Y  <- model.response(mf)
      X  <- model.matrix(formula, data)
      Tr <- data[row.names(mf), treat]
      k  <- ncol(X)
      beta <- params[1:k]
      gamma <- params[(k+1):(2*k)]

      Xb <- X %*% beta
      Xg <- X %*% gamma

      # probability of top coding error
      xi <- ifelse(Y == J + 1, 
        p0/(p0 * (1 - logistic(Xb) * logistic(Xg)^J) + logistic(Xb)*logistic(Xg)^J), 
          ifelse(Tr == 0 & Y == J, 
            p0/(p0 * (1 - logistic(Xg)^J) + logistic(Xg)^J), 
              0
            )
        )

      # probability of sensitive trait
      eta <- ifelse(Tr == 1 & Y == 0, 0,
        ifelse(Y == J + 1, 
          ((1-p0) * logistic(Xb) * logistic(Xg)^J + p0 * logistic(Xb))/
            (p0 + (1-p0)*logistic(Xb)*logistic(Xg)^J),
          logistic(Xb) * choose(J, Y-1) * logistic(Xg)^(Y-1) * (1 - logistic(Xg))^(J-Y+1)/ 
            (logistic(Xb) * choose(J, Y-1) * logistic(Xg)^(Y-1) * (1 - logistic(Xg))^(J-Y+1) + 
              (1 - logistic(Xb)) * choose(J, Y) * logistic(Xg)^Y * (1 - logistic(Xg))^(J - Y))
          )
        )

      # eta <- ifelse(Tr == 0, 0, eta)

      # latent number of control items
      yzeta0 <- ifelse(Tr == 0 & Y == J, 
        J * logistic(Xg)^J/(p0 * (1-logistic(Xg)^J) + logistic(Xg)^J), 0)

      yzeta0y <- rep(0, n)
      for (y in 1:(J - 1)) {
        tmp <- ifelse(Tr == 0 & Y == J, 
          y * p0 * choose(J, y) * logistic(Xg)^y * (1 - logistic(Xg))^(J - y)/
            (p0 * (1 - logistic(Xg)^J) + logistic(Xg)^J), 
              ifelse(Tr == 0 & Y == y, y, 0))
        yzeta0y <- yzeta0y + tmp
      }

      yzeta1J <- ifelse(Tr == 1 & Y == J + 1, 
        J * ((1-p0) * logistic(Xb) * logistic(Xg)^J + p0*logistic(Xg)^J)/(p0 + (1-p0)*logistic(Xb)*logistic(Xg)^J), 
          ifelse(Tr == 1 & Y == J, 
        J * (1 - logistic(Xb)) * logistic(Xg)^J/
          ((1 - logistic(Xb)) * logistic(Xg)^J + logistic(Xb) * J * logistic(Xg)^(J-1) * (1 - logistic(Xg))), 
            0))

      yzeta1y <- rep(0, n)
      for (y in 1:(J - 1)) {
        tmp <- ifelse(Tr == 1 & Y == y + 1, 
          y * logistic(Xb)*choose(J, Y-1)*logistic(Xg)^(Y-1)*(1-logistic(Xg))^(J-Y+1)/
          (logistic(Xb)*choose(J, Y-1)*logistic(Xg)^(Y-1)*(1-logistic(Xg))^(J-Y+1) + 
            (1-logistic(Xb))*choose(J, Y)*logistic(Xg)^Y*(1-logistic(Xg))^(J-Y)),
          ifelse(Tr == 1 & Y == y, 
            y * (1-logistic(Xb))*choose(J, Y)*logistic(Xg)^Y*(1-logistic(Xg))^(J-Y)/
          ((1-logistic(Xb))*choose(J, Y)*logistic(Xg)^Y*(1-logistic(Xg))^(J-Y) + 
            logistic(Xb)*choose(J, Y-1)*logistic(Xg)^(Y-1)*(1-logistic(Xg))^(J-Y+1)), 
          ifelse(Tr == 1 & Y == J + 1, 
            (y * p0 * choose(J, y) * logistic(Xg)^y * (1 - logistic(Xg))^(J - y))/
              (p0 + (1-p0)*logistic(Xb)*logistic(Xg)^J), 
                0)))
        yzeta1y <- yzeta1y + tmp

      }

      yzeta <- yzeta0 + yzeta0y + yzeta1J + yzeta1y

      return(list(xi = signif(xi, 10), eta = signif(eta, 10), yzeta = signif(yzeta, 10)))
      
    }
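
    # E step: for each respondent, xi is the posterior probability of a
    # top-coding error, eta the posterior probability of the sensitive trait,
    # and yzeta the expected latent number of affirmative control items, all
    # evaluated at the current parameter values.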

    topcodeM <- function(formula, data, treat, J, xi, eta, yzeta) {

      mf <- model.frame(formula, data)
      n  <- nrow(mf)

      Y  <- model.response(mf)
      X  <- model.matrix(formula, data)
      Tr <- data[row.names(mf), treat]
      k  <- ncol(X)

      if (length(attr(attr(mf, "terms"), "term.labels")) == 0) {

        betaX <- rbind(X[Tr == 1, ], X[Tr == 1, ])
        betaY <- rep(0:1, each = sum(Tr))

        beta.fit <- glm(cbind(betaY, 1 - betaY) ~ 1, weights = c(1 - eta[Tr == 1], eta[Tr == 1]), 
          family = binomial("logit"), control = list(maxit = 5000))

        gammaX <- rbind(X, X)
        gammaY <- rep(0:1, each = n)

        gamma.fit <- glm(cbind(gammaY, 1 - gammaY) ~ 1, weights = c(J - yzeta, yzeta), 
          family = binomial("logit"), control = list(maxit = 5000))

      } else {

        betaX <- rbind(X[Tr == 1, ], X[Tr == 1, ])
        betaY <- rep(0:1, each = sum(Tr))

        beta.fit <- glm(cbind(betaY, 1 - betaY) ~ betaX - 1, weights = c(1 - eta[Tr == 1], eta[Tr == 1]), 
          family = binomial("logit"), control = list(maxit = 5000))

        gammaX <- rbind(X, X)
        gammaY <- rep(0:1, each = n)

        gamma.fit <- glm(cbind(gammaY, 1 - gammaY) ~ gammaX - 1, weights = c(J - yzeta, yzeta), 
          family = binomial("logit"), control = list(maxit = 5000))
      }

      return(list(beta.fit = beta.fit, gamma.fit = gamma.fit, 
        pars = c(coef(beta.fit), coef(gamma.fit)), p0 = mean(xi)))

    }
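
    # M step: weighted logistic regressions in which each observation enters
    # twice, once as a 0 and once as a 1, weighted by its posterior
    # probabilities (eta for the sensitive-item model among the treated,
    # expected control counts yzeta for the control-item model); p0 is updated
    # as the mean posterior error probability. Intercept-only models are fit
    # with ~ 1 instead of the design-matrix form.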

    topcodeEM <- function(formula, data, treat, J, p0 = 0.05, params = NULL, eps = 1e-08, maxIter) {

      mf <- model.frame(formula, data)
      n  <- nrow(mf)

      Y  <- model.response(mf)
      X  <- model.matrix(formula, data)
      Tr <- data[row.names(mf), treat]
      k  <- ncol(X)

      if (is.null(params)) {
        params <- c(coef.treat.start, coef.control.start)
      }

      beta  <- params[1:k]
      gamma <- params[(k+1):(2*k)]

      Xb <- X %*% beta
      Xg <- X %*% gamma

      Estep0 <- topcodeE(formula, data, treat, J, p0, params)
      Mstep0 <- topcodeM(formula, data, treat, J, 
        eta = Estep0$eta, yzeta = Estep0$yzeta, xi = Estep0$xi)

      Estep <- topcodeE(formula, data, treat, J, p0, params)
      Mstep <- topcodeM(formula, data, treat, J, 
        eta = Estep$eta, yzeta = Estep$yzeta, xi = Estep$xi)

      par.holder <- c(Mstep$pars, Mstep$p0)

      iteration <- 1

      while ((iteration == 1 | max(abs(par.holder - c(Mstep$pars, Mstep$p0))) > eps) & 
        iteration <= maxIter) {

        par.holder <- c(Mstep$pars, Mstep$p0)

        obs.llik0 <- obs.llik.top(all.pars = c(Mstep$pars, log(Mstep$p0/(1-Mstep$p0))), 
          formula, data, treat, J)

        Estep <- topcodeE(formula, data, treat, J, p0 = Mstep$p0, params = Mstep$pars)
        Mstep <- topcodeM(formula, data, treat, J, 
          eta = Estep$eta, yzeta = Estep$yzeta, xi = Estep$xi)

        obs.llik1 <- obs.llik.top(all.pars = c(Mstep$pars, log(Mstep$p0/(1-Mstep$p0))), 
          formula, data, treat, J)

        if (signif(obs.llik1) < signif(obs.llik0)) warning("Observed-data likelihood is not monotonically increasing.")
        iteration <- iteration + 1

      }

      if (iteration > maxIter) warning("Maximum number of iterations reached.")

      optim.out <- optim(fn = obs.llik.top, hessian = TRUE, 
        par = c(Mstep$pars, log(Mstep$p0/(1 - Mstep$p0))), 
        formula = formula, data = data, treat = treat, J = J, 
        control = list(fnscale = -1))

      vcov <- vcov.flip <- solve(-optim.out$hessian)
      std.errors <- sqrt(diag(vcov))
      vcov.flip[(1:k), (1:k)] <- vcov[(k+1):(2*k), (k+1):(2*k)] 
      vcov.flip[(k+1):(2*k), (k+1):(2*k)] <- vcov[(1:k), (1:k)]

      return(
        list(
          par.treat   = Mstep$pars[1:k], 
          par.control = Mstep$pars[(k+1):(2*k)],
          se.treat    = std.errors[1:k],
          se.control  = std.errors[(k+1):(2*k)],
          vcov        = vcov.flip, 
          p.est       = mean(Estep$xi),
          p.ci        = c("lwr" = logistic(optim.out$par[2*k+1] - qnorm(0.975)*std.errors[2*k+1]),
            "upr" = logistic(optim.out$par[2*k+1] + qnorm(0.975)*std.errors[2*k+1])), 
          lp.est      = optim.out$par[2*k+1], 
          lp.se       = std.errors[2*k+1], 
          iterations  = iteration, 
          obs.llik    = optim.out$value
          )
        )

    }
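
    # topcodeEM alternates the E and M steps until the parameter vector moves
    # by less than eps, then polishes with optim() (maximizing via
    # fnscale = -1) from the EM solution so that the Hessian yields standard
    # errors and a confidence interval for p0 computed on the log-odds scale
    # and mapped back through logistic().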

    topcode.out <- topcodeEM(formula = formula, data = data, treat = treat, J = J, maxIter = maxIter)
    topcode.par.treat <- topcode.out$par.treat
    topcode.par.control <- topcode.out$par.control
    topcode.vcov <- topcode.out$vcov
    topcode.se.treat <- topcode.out$se.treat
    topcode.se.control <- topcode.out$se.control
    llik.topcode <- topcode.out$obs.llik   
    topcode.p <- topcode.out$p.est
    topcode.p.ci <- topcode.out$p.ci
    topcode.lp <- topcode.out$lp.est
    topcode.lp.se <- topcode.out$lp.se
    topcode.iter <- topcode.out$iterations


  } 

  if (error == "uniform" & method == "ml") {

    ########################
    # UNIFORM CODING ERROR #
    ########################
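    # Model: with probability p1 (treatment) or p0 (control) a respondent
    # answers uniformly at random over the J + 2 or J + 1 admissible counts,
    # independently of the truthful response; otherwise the usual binomial
    # list-experiment model applies.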
    obs.llik.uniform <- function(all.pars, formula, data, treat, J) {

      mf <- model.frame(formula, data)
      n  <- nrow(mf)

      Y  <- model.response(mf)
      X  <- model.matrix(formula, data)
      Tr <- data[row.names(mf), treat]
      k  <- ncol(X)

      beta <- all.pars[1:k]
      gamma <- all.pars[(k+1):(2*k)]

      Xb <- X %*% beta
      Xg <- X %*% gamma

      p0 <- logistic(all.pars[2*k + 1])
      p1 <- logistic(all.pars[2*k + 2])

      lliks <- ifelse(Y == J + 1, 
        log((1 - p1) * logistic(Xb) * logistic(Xg)^J + p1/(J + 2)), 
        ifelse(Y == 0 & Tr == 1, 
          log((1 - p1) * (1 - logistic(Xb)) * (1 - logistic(Xg))^J + p1/(J + 2)), 
        ifelse(Y %in% 1:J & Tr == 1, 
          log((1 - p1) * (logistic(Xb) * choose(J, Y-1) * logistic(Xg)^(Y-1) * (1 - logistic(Xg))^(J-Y+1) + 
            (1 - logistic(Xb)) * choose(J, Y) * logistic(Xg)^Y * (1 - logistic(Xg))^(J-Y)) + p1/(J + 2)), 
        log((1 - p0) * choose(J, Y) * logistic(Xg)^Y * (1 - logistic(Xg))^(J-Y) + p0/(J+1)))))

      sum(lliks)

    }

    uniformE <- function(formula, data, treat, J, p0, p1, params) {

      mf <- model.frame(formula, data)
      n  <- nrow(mf)

      Y  <- model.response(mf)
      X  <- model.matrix(formula, data)
      Tr <- data[row.names(mf), treat]
      k  <- ncol(X)

      beta  <- params[1:k]
      gamma <- params[(k+1):(2*k)]

      Xb <- X %*% beta
      Xg <- X %*% gamma

      # probability of error
      xi <- ifelse(Y == J + 1, p1/(J+2)/(p1/(J+2) + (1-p1) * logistic(Xb) * logistic(Xg)^J), 
        ifelse(Tr == 1 & Y == 0, p1/(J+2)/(p1/(J+2) + (1-p1) * (1 - logistic(Xb)) * (1 - logistic(Xg))^J), 
          ifelse(Tr == 1 & Y %in% 1:J, 
            p1/(J+2)/(p1/(J+2) + (1-p1) * (logistic(Xb) * choose(J, Y-1) * logistic(Xg)^(Y-1) * 
              (1 - logistic(Xg))^(J-Y+1) + (1 - logistic(Xb)) * choose(J, Y) * logistic(Xg)^Y * 
                (1 - logistic(Xg))^(J-Y))), 
            p0/(J+1)/(p0/(J+1) + (1-p0) * choose(J, Y) * logistic(Xg)^Y * (1 - logistic(Xg))^(J-Y)))))

      # probability of sensitive trait
      eta <- ifelse(Y == J + 1, 
         (p1/(J + 2) * logistic(Xb) + (1 - p1) * logistic(Xb) * logistic(Xg)^J)/
          ((1 - p1) * logistic(Xb) * logistic(Xg)^J + p1/(J + 2)), 
        ifelse(Tr == 1 & Y == 0, 
          p1/(J + 2) * logistic(Xb)/
           ((1 - p1) * (1 - logistic(Xb)) * (1 - logistic(Xg))^J + p1/(J + 2)), 
        ifelse(Tr == 1 & Y %in% 1:J, 
          (p1/(J + 2) + (1 - p1) * choose(J, Y - 1) * logistic(Xg)^(Y - 1) * (1 - logistic(Xg))^(J - Y + 1)) * logistic(Xb)/
            (p1/(J + 2) + (1 - p1) * (logistic(Xb) * choose(J, Y - 1) * logistic(Xg)^(Y - 1) * (1 - logistic(Xg))^(J - Y + 1) + 
              (1 - logistic(Xb)) * choose(J, Y) * logistic(Xg)^Y * (1 - logistic(Xg))^(J - Y))), 
            0)
          )
        )

      # expected values for control obs
      yzeta0 <- Y * (p0/(J + 1) + (1 - p0)) * choose(J, Y) * logistic(Xg)^Y * (1 - logistic(Xg))^(J - Y)/
        (p0/(J + 1) + (1 - p0) * choose(J, Y) * logistic(Xg)^Y * (1 - logistic(Xg))^(J - Y))
      for (y in 0:J) {
        yzeta0 <- yzeta0 + ifelse(Y == y, 0, y * 
          p0/(J + 1) * choose(J, y) * logistic(Xg)^y * (1 - logistic(Xg))^(J - y)/
          (p0/(J + 1) + (1 - p0) * choose(J, Y) * logistic(Xg)^Y * (1 - logistic(Xg))^(J - Y)))
      }

      # expected values for treated obs
      yzetaJ1 <- J * ifelse(Y == J + 1, ((1-p1) * logistic(Xb) * logistic(Xg)^J + p1/(J+2) * logistic(Xg)^J)/
        (p1/(J+2) + (1-p1) * logistic(Xb) * logistic(Xg)^J), 
          ifelse(Tr == 1 & Y == J, ((1-p1) * (1 - logistic(Xb)) * logistic(Xg)^J + p1/(J+2) * logistic(Xg)^J)/
            (p1/(J+2) + (1-p1) * (logistic(Xb) * J * logistic(Xg)^(J-1) * (1 - logistic(Xg)) + 
              (1 - logistic(Xb)) * logistic(Xg)^J)), 
          ifelse(Tr == 1 & Y == 0, (p1/(J+2) * logistic(Xg)^J)/ 
            (p1/(J+2) + (1-p1) * (1 - logistic(Xb)) * (1 - logistic(Xg))^J), 
            ifelse(Tr == 1 & Y %in% 1:(J-1), p1/(J+2) * logistic(Xg)^J/
        (p1/(J+2) + (1-p1) * ((1 - logistic(Xb)) * choose(J, Y) * logistic(Xg)^Y * (1 - logistic(Xg))^(J - Y) + 
          logistic(Xb) * choose(J, Y-1) * logistic(Xg)^(Y-1) * (1 - logistic(Xg))^(J-Y+1))), 
              0))))

      yzetay1 <- rep(0, n) 
      for (y in 1:(J - 1)) {
        tmp <- ifelse(Tr == 1 & Y == y + 1, 
          y * (p1/(J + 2) + (1 - p1) * logistic(Xb)) * choose(J, y) * logistic(Xg)^y * (1 - logistic(Xg))^(J - y)/
              (p1/(J + 2) + (1 - p1) * (logistic(Xb) * choose(J, y) * logistic(Xg)^y * (1 - logistic(Xg))^(J - y) + 
                (1 - logistic(Xb)) * choose(J, y + 1) * logistic(Xg)^(y + 1) * (1 - logistic(Xg))^(J - y - 1))), 
        ifelse(Tr == 1 & Y == y, 
          y * (p1/(J + 2) + (1 - p1) * (1 - logistic(Xb))) * choose(J, y) * logistic(Xg)^y * (1 - logistic(Xg))^(J - y)/
              (p1/(J + 2) + (1 - p1) * (logistic(Xb) * choose(J, y - 1) * logistic(Xg)^(y - 1) * (1 - logistic(Xg))^(J - y + 1) + 
                (1 - logistic(Xb)) * choose(J, y) * logistic(Xg)^y * (1 - logistic(Xg))^(J - y))), 
        ifelse(Tr == 1 & Y == 0, 
          y * p1/(J + 2) * choose(J, y) * logistic(Xg)^y * (1 - logistic(Xg))^(J - y)/
              (p1/(J + 2) + (1 - p1) * (1 - logistic(Xb)) * (1 - logistic(Xg))^J), 
        ifelse(Tr == 0, 0, 
          y * p1/(J + 2) * choose(J, y) * logistic(Xg)^y * (1 - logistic(Xg))^(J - y)/
              (p1/(J + 2) + (1 - p1) * (
                (1 - logistic(Xb)) * choose(J, Y) * logistic(Xg)^Y * (1 - logistic(Xg))^(J - Y) + 
                logistic(Xb) * choose(J, Y - 1) * logistic(Xg)^(Y - 1) * (1 - logistic(Xg))^(J - Y + 1)
          ))))))
        yzetay1 <- yzetay1 + tmp
      }

      yzeta1 <- yzetaJ1 + yzetay1

      yzeta <- ifelse(Tr == 0, yzeta0, yzeta1)

      return(list(xi = signif(xi, 10), eta = signif(eta, 10), yzeta = signif(yzeta, 10)))

    }
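
    # E step, as in the top-coding model: xi is the posterior probability of a
    # uniform-error response, eta the posterior probability of the sensitive
    # trait, and yzeta the expected latent control-item count; here errors can
    # occur in both treatment arms.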
    uniformM <- function(formula, data, treat, J, xi, eta, yzeta) {

      mf <- model.frame(formula, data)
      n  <- nrow(mf)

      Y  <- model.response(mf)
      X  <- model.matrix(formula, data)
      Tr <- data[row.names(mf), treat]
      k  <- ncol(X)

      # intercept-only models
      if (length(attr(attr(mf, "terms"), "term.labels")) == 0) {

        betaX <- rbind(X[Tr == 1, ], X[Tr == 1, ])
        betaY <- rep(0:1, each = sum(Tr))

        beta.fit <- glm(cbind(betaY, 1 - betaY) ~ 1, weights = c(1 - eta[Tr == 1], eta[Tr == 1]), 
          family = binomial("logit"), control = list(maxit = 100))

        gammaX <- rbind(X, X)
        gammaY <- rep(0:1, each = n)

        gamma.fit <- glm(cbind(gammaY, 1 - gammaY) ~ 1, weights = c(J - yzeta, yzeta), 
          family = binomial("logit"), control = list(maxit = 100))

      # models with covariates
      } else {

        betaX <- rbind(X[Tr == 1, ], X[Tr == 1, ])
        betaY <- rep(0:1, each = sum(Tr))

        beta.fit <- glm(cbind(betaY, 1 - betaY) ~ betaX - 1, weights = c(1 - eta[Tr == 1], eta[Tr == 1]), 
          family = binomial("logit"), control = list(maxit = 100))

        gammaX <- rbind(X, X)
        gammaY <- rep(0:1, each = n)

        gamma.fit <- glm(cbind(gammaY, 1 - gammaY) ~ gammaX - 1, weights = c(J - yzeta, yzeta), 
          family = binomial("logit"), control = list(maxit = 100))
      }

      return(list(beta.fit = beta.fit, gamma.fit = gamma.fit, 
        pars = c(coef(beta.fit), coef(gamma.fit)), p0 = mean(xi[Tr==0]), p1 = mean(xi[Tr==1])))

    }
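
    # The M step mirrors topcodeM, except that the error rates are updated
    # separately by arm: p0 from the control-group posteriors and p1 from the
    # treatment-group posteriors.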
    uniformEM <- function(formula, data, treat, J, p0 = 0.05, p1 = 0.05, params = NULL, eps = 1e-08, 
      maxIter) {

      mf <- model.frame(formula, data)
      n  <- nrow(mf)

      Y  <- model.response(mf)
      X  <- model.matrix(formula, data)
      Tr <- data[row.names(mf), treat]
      k  <- ncol(X)

      if (is.null(params)) {
        params <- c(coef.treat.start, coef.control.start)
      }

      beta  <- params[1:k]
      gamma <- params[(k+1):(2*k)]

      Xb <- X %*% beta
      Xg <- X %*% gamma

      Estep0 <- uniformE(formula, data, treat, J, p0, p1, params)
      Mstep0 <- uniformM(formula, data, treat, J, 
        eta = Estep0$eta, yzeta = Estep0$yzeta, xi = Estep0$xi)

      Estep <- uniformE(formula, data, treat, J, p0, p1, params)
      Mstep <- uniformM(formula, data, treat, J, 
        eta = Estep$eta, yzeta = Estep$yzeta, xi = Estep$xi)

      par.holder <- c(Mstep$pars, Mstep$p0, Mstep$p1)

      iteration <- 1

      while ((iteration == 1 | max(abs(par.holder - c(Mstep$pars, Mstep$p0, Mstep$p1))) > eps) & 
        iteration <= maxIter) {

        par.holder <- c(Mstep$pars, Mstep$p0, Mstep$p1)

        obs.llik0 <- obs.llik.uniform(
          all.pars = c(Mstep$pars, log(Mstep$p0/(1-Mstep$p0)), log(Mstep$p1/(1-Mstep$p1))), 
          formula, data, treat, J)

        Estep <- uniformE(formula, data, treat, J, 
          p0 = Mstep$p0, p1 = Mstep$p1, params = Mstep$pars)
        Mstep <- uniformM(formula, data, treat, J, 
          eta = Estep$eta, yzeta = Estep$yzeta, xi = Estep$xi)

        obs.llik1 <- obs.llik.uniform(
          all.pars = c(Mstep$pars, log(Mstep$p0/(1-Mstep$p0)), log(Mstep$p1/(1-Mstep$p1))), 
          formula, data, treat, J)

        if (signif(obs.llik1) < signif(obs.llik0)) warning("Observed-data likelihood is not monotonically increasing.")
        iteration <- iteration + 1

      }

      if (iteration > maxIter) warning("Maximum number of iterations reached.")

      optim.out <- optim(fn = obs.llik.uniform, hessian = TRUE, 
        par = c(Mstep$pars, log(Mstep$p0/(1-Mstep$p0)), log(Mstep$p1/(1-Mstep$p1))), 
        formula = formula, data = data, treat = treat, J = J, 
        control = list(fnscale = -1))

      vcov <- vcov.flip <- solve(-optim.out$hessian)
      std.errors <- sqrt(diag(vcov))
      vcov.flip[(1:k), (1:k)] <- vcov[(k+1):(2*k), (k+1):(2*k)] 
      vcov.flip[(k+1):(2*k), (k+1):(2*k)] <- vcov[(1:k), (1:k)]

      return(
        list(
          par.treat   = Mstep$pars[1:k], 
          par.control = Mstep$pars[(k+1):(2*k)],
          se.treat    = std.errors[1:k],
          se.control  = std.errors[(k+1):(2*k)],
          vcov        = vcov.flip, 
          p0.est      = Mstep$p0,
          p0.ci       = c("lwr" = logistic(optim.out$par[2*k+1] - qnorm(0.975)*std.errors[2*k+1]),
            "upr" = logistic(optim.out$par[2*k+1] + qnorm(0.975)*std.errors[2*k+1])), 
          p1.est      = Mstep$p1,
          p1.ci       = c("lwr" = logistic(optim.out$par[2*k+2] - qnorm(0.975)*std.errors[2*k+2]),
            "upr" = logistic(optim.out$par[2*k+2] + qnorm(0.975)*std.errors[2*k+2])), 
          lp0.est     = optim.out$par[2*k+1], 
          lp0.se      = std.errors[2*k+1], 
          lp1.est     = optim.out$par[2*k+2], 
          lp1.se      = std.errors[2*k+2], 
          iterations  = iteration, 
          obs.llik    = optim.out$value
          )
        )

    }
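
    # uniformEM mirrors topcodeEM, with both error rates estimated on the
    # log-odds scale in the final optim() polish.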

    uniform.out <- uniformEM(formula = formula, data = data, treat = treat, J = J, maxIter = maxIter)
    uniform.par.treat <- uniform.out$par.treat
    uniform.par.control <- uniform.out$par.control
    uniform.vcov <- uniform.out$vcov
    uniform.se.treat <- uniform.out$se.treat
    uniform.se.control <- uniform.out$se.control
    llik.uniform <- uniform.out$obs.llik   
    uniform.p0 <- uniform.out$p0.est
    uniform.p0.ci <- uniform.out$p0.ci
    uniform.lp0 <- uniform.out$lp0.est
    uniform.lp0.se <- uniform.out$lp0.se
    uniform.p1 <- uniform.out$p1.est
    uniform.p1.ci <- uniform.out$p1.ci
    uniform.lp1 <- uniform.out$lp1.est
    uniform.lp1.se <- uniform.out$lp1.se
    uniform.iter <- uniform.out$iterations

  }
  # end of measurement error models 
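
  # Usage sketch for the measurement error models (hypothetical data; the
  # variable names are illustrative):
  #   fit <- ictreg(y ~ x1 + x2, data = d, treat = "treat", J = 4,
  #                 method = "ml", error = "topcode")  # or error = "uniform"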

  # robust nls functionality 
  if (method == "nls" & robust) { 

    # gradient matrix of NLS moments
    Gmat <- function(delta, gamma, J, y, treat, x) {
      
      n <- length(y)
      y1 <- y[treat == 1]
      y0 <- y[treat == 0]
      x1 <- x[treat == 1, , drop = FALSE]
      x0 <- x[treat == 0, , drop = FALSE]
      
      Gtmp <- c(logistic(x1 %*% delta)/(1 + exp(x1 %*% delta))) * x1
      G1 <- -t(Gtmp) %*% Gtmp / n  
      Gtmp <- c(sqrt(J*logistic(x1 %*% delta)*logistic(x1 %*% gamma)/
                       ((1 + exp(x1 %*% delta))*(1 + exp(x1 %*% gamma))))) * x1
      G2 <- -t(Gtmp) %*% Gtmp / n
      Gtmp <- c(J*logistic(x0 %*% gamma)/(1 + exp(x0 %*% gamma))) * x0
      G3 <- -t(Gtmp) %*% Gtmp / n
      G <- rbind(cbind(G1, G2),
            cbind(matrix(0, ncol = ncol(G1), nrow = nrow(G3)), G3))
      
      return(G)
      
    }
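
    # Gmat is the Jacobian of the stacked NLS moment conditions with respect
    # to (delta, gamma), mirroring the G blocks used in vcov.twostep.std; it
    # feeds the gradient and variance computations for the robust estimator.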

    # nls objective fn and gradient
    NLSGMM <- function(pars, J, y, treat, x, W) {
        
      n <- length(y)
      
      y1 <- y[treat == 1]
      y0 <- y[treat == 0]
      x1 <- x[treat == 1, , drop = FALSE]
      x0 <- x[treat == 0, , drop = FALSE]

      delta <- pars[1:ncol(x)]
      gamma <- pars[(ncol(x) + 1):(2 * ncol(x))]

      # NLS moments
      m1 <- c((y1 - J * logistic(x1 %*% gamma) - logistic(x1 %*% delta)) * 
        logistic(x1 %*% delta)/(1 + exp(x1 %*% delta))) * x1
      m0 <- c((y0 - J * logistic(x0 %*% gamma)) * 
        J * logistic(x0 %*% gamma)/(1 + exp(x0 %*% gamma))) * x0

      # difference-in-means moment
      dim <- mean(y1) - mean(y0)
      m.dim <- dim - logistic(x %*% delta)

      M <- c(colSums(m1), colSums(m0), sum(m.dim))/n
      as.numeric(t(M) %*% W %*% M)

    }
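
    # NLSGMM augments the two NLS moment conditions with a third moment that
    # ties the average predicted prevalence logistic(x %*% delta) to the
    # difference-in-means estimate, and evaluates the GMM criterion M' W M.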

    NLSGMM.Grad <-