# Commands to generate the help files:
# load("data/trade.RData")
# roxygen2::roxygenise(roclets = "rd")
#' Fixed effects maximum likelihood models
#'
#' This function estimates maximum likelihood models (e.g., Poisson or Logit) and efficiently handles any number of fixed effects (i.e. cluster variables). It further allows for right hand sides that are non-linear in parameters.
#'
#' @param fml A formula. This formula gives the linear formula to be estimated (it is similar to a \code{lm} formula), for example: \code{fml = z~x+y}. To include cluster variables, you can either 1) insert them in this formula using a pipe (e.g. \code{fml = z~x+y|cluster1+cluster2}), or 2) use the argument \code{cluster}. To include a non-linear in parameters element, you must use the argument \code{NL.fml}.
#' @param NL.fml A formula. If provided, this formula represents the non-linear part of the right hand side (RHS). Note that contrary to the \code{fml} argument, the coefficients must explicitly appear in this formula. For instance, it can be \code{~a*log(b*x + c*x^3)}, where \code{a}, \code{b}, and \code{c} are the coefficients to be estimated. Note that only the RHS of the formula is to be provided, and NOT the left hand side.
#' @param data A data.frame containing the necessary variables to run the model. The variables of the non-linear right hand side of the formula are identified by the names of this \code{data.frame}. Note that no \code{NA} is allowed in the variables used in the estimation. Can also be a matrix.
#' @param family Character scalar. The likelihood family. The possible values are "poisson" (Poisson model with log-link, the default), "negbin" (Negative Binomial model with log-link), "logit" (logit model with logit-link), and "gaussian" (Gaussian model).
#' @param cluster Character vector. The names of one or more variables within the dataset to be used as clusters. These variables should contain the identifier of each observation (e.g., think of them as panel identifiers).
#' @param na.rm Logical, default is \code{FALSE}. If the variables necessary for the estimation contain NAs and \code{na.rm = TRUE}, then all observations containing NA are removed prior to estimation and a warning message is raised detailing the number of observations removed.
#' @param useAcc Default is \code{TRUE}. Whether an acceleration algorithm (Irons and Tuck iterations) should be used to obtain the cluster coefficients when there are two or more clusters.
#' @param NL.start (For NL models only) A list of starting values for the non-linear parameters. ALL the parameters are to be named and given a starting value. Example: \code{NL.start=list(a=1,b=5,c=0)}. There is one exception: if all parameters are to be given the same starting value, you can use the argument \code{NL.start.init}.
#' @param lower (For NL models only) A list. The lower bound for each of the non-linear parameters that requires one. Example: \code{lower=list(b=0,c=0)}. Beware, if the estimated parameter is at its lower bound, then asymptotic theory cannot be applied and the standard-error of the parameter cannot be estimated because the gradient will not be null. In other words, when at its upper/lower bound, the parameter is considered as 'fixed'.
#' @param upper (For NL models only) A list. The upper bound for each of the non-linear parameters that requires one. Example: \code{upper=list(a=10,c=50)}. Beware, if the estimated parameter is at its upper bound, then asymptotic theory cannot be applied and the standard-error of the parameter cannot be estimated because the gradient will not be null. In other words, when at its upper/lower bound, the parameter is considered as 'fixed'.
#' @param env (For NL models only) An environment. You can provide an environment in which the non-linear part will be evaluated. (May be useful for some particular non-linear functions.)
#' @param NL.start.init (For NL models only) Numeric scalar. If the argument \code{NL.start} is not provided, or only partially filled (i.e. there remain non-linear parameters with no starting value), then the starting value of all remaining non-linear parameters is set to \code{NL.start.init}.
#' @param offset A formula. An offset can be added to the estimation. It should be a formula of the form (for example) ~0.5*x**2. This offset is linearly added to the elements of the main formula \code{fml}. Note that when using the argument \code{NL.fml}, you can directly add the offset there.
#' @param nl.gradient (For NL models only) A formula. The user can provide a function that computes the gradient of the non-linear part. The formula should be of the form \code{~f0(a1,x1,a2,x2)}. The important point is that it should be able to be evaluated by: \code{eval(nl.gradient[[2]], env)} where \code{env} is the working environment of the algorithm (which contains all variables and parameters). The function should return a list or a data.frame whose names are the non-linear parameters.
#' @param linear.start Numeric named vector. The starting values of the linear part.
#' @param jacobian.method Character scalar. Provides the method used to numerically compute the jacobian of the non-linear part. Can be either \code{"simple"} or \code{"Richardson"}. Default is \code{"simple"}. See the help of \code{\link[numDeriv]{jacobian}} for more information.
#' @param useHessian Logical. Should the Hessian be computed in the optimization stage? Default is \code{TRUE}.
#' @param opt.control List of elements to be passed to the optimization method \code{\link[stats]{nlminb}}. See the help page of \code{\link[stats]{nlminb}} for more information.
#' @param cores Integer, default is 1. Number of threads to be used (accelerates the algorithm via the use of openMP routines). This is particularly efficient for the negative binomial and logit models, less so for the Gaussian and Poisson likelihoods (unless for very large datasets).
#' @param verbose Integer, default is 0. It represents the level of information that should be reported during the optimization process. If \code{verbose=0}: nothing is reported. If \code{verbose=1}: the value of the coefficients and the likelihood are reported. If \code{verbose=2}: in addition to the information of \code{verbose=1}, the computing time of the null model, the cluster coefficients, and the Hessian is reported.
#' @param theta.init Positive numeric scalar. The starting value of the dispersion parameter if \code{family="negbin"}. By default, the algorithm uses as a starting value the theta obtained from the model with only the intercept.
#' @param precision.cluster Precision used to obtain the fixed-effects (i.e. cluster coefficients). Defaults to \code{1e-5}. It corresponds to the maximum absolute difference allowed between two iterations. Argument \code{precision.cluster} cannot be lower than \code{10000*.Machine$double.eps}.
#' @param itermax.cluster Maximum number of iterations in the step obtaining the fixed-effects (only in use for 2+ clusters). Default is 10000.
#' @param itermax.deriv Maximum number of iterations in the step obtaining the derivative of the fixed-effects (only in use for 2+ clusters). Default is 5000.
#' @param showWarning Logical, default is \code{TRUE}. Whether warnings should be displayed (concerns warnings relating to: convergence state, collinearity issues and observation removal due to only 0/1 outcomes or presence of NA values).
#' @param ... Not currently used.
#'
#' @details
#' This function estimates maximum likelihood models where the conditional expectations are as follows:
#'
#' Gaussian likelihood:
#' \deqn{E(Y|X)=X\beta}{E(Y|X) = X*beta}
#' Poisson and Negative Binomial likelihoods:
#' \deqn{E(Y|X)=\exp(X\beta)}{E(Y|X) = exp(X*beta)}
#' where in the Negative Binomial there is the parameter \eqn{\theta}{theta} used to model the variance as \eqn{\mu+\mu^2/\theta}{mu+mu^2/theta}, with \eqn{\mu}{mu} the conditional expectation.
#' Logit likelihood:
#' \deqn{E(Y|X)=\frac{\exp(X\beta)}{1+\exp(X\beta)}}{E(Y|X) = exp(X*beta) / (1 + exp(X*beta))}
#'
#' When there are one or more clusters, the conditional expectation can be written as:
#' \deqn{E(Y|X) = h(X\beta+\sum_{k}\sum_{m}\gamma_{m}^{k}\times C_{im}^{k}),}
#' where \eqn{h(.)} is the function corresponding to the likelihood function as shown before. \eqn{C^k} is the matrix associated to cluster \eqn{k} such that \eqn{C^k_{im}} is equal to 1 if observation \eqn{i} is of category \eqn{m} in cluster \eqn{k} and 0 otherwise.
#'
#' When there are terms that are non-linear in parameters, we can schematically split the set of regressors in two:
#' \deqn{f(X,\beta)=X^1\beta^1 + g(X^2,\beta^2)}
#' with first a linear term and then a non-linear part expressed by the function g. That is, a non-linear term is added to the linear terms (which are \eqn{X\beta}{X*beta} and the cluster coefficients). It is always better (more efficient) to put only the non-linear in parameters terms in the argument \code{NL.fml}, and to add all linear terms in the \code{fml} argument.
#'
#' To estimate only a non-linear formula without even the intercept, you must exclude the intercept from the linear formula by using, e.g., \code{fml = z~0}.
#'
#' The over-dispersion parameter of the Negative Binomial family, theta, is capped at 10,000. If theta reaches this high value, it means that there is no overdispersion.
#'
#' @return
#' An \code{femlm} object.
#' \item{coefficients}{The named vector of coefficients.}
#' \item{coeftable}{The table of the coefficients with their standard errors, z-values and p-values.}
#' \item{loglik}{The loglikelihood.}
#' \item{iterations}{Number of iterations of the algorithm.}
#' \item{n}{The number of observations.}
#' \item{nparams}{The number of parameters of the model.}
#' \item{call}{The call.}
#' \item{fml}{The linear formula of the call.}
#' \item{ll_null}{Log-likelihood of the null model (i.e. with the intercept only).}
#' \item{pseudo_r2}{The adjusted pseudo R2.}
#' \item{message}{The convergence message from the optimization procedures.}
#' \item{sq.cor}{Squared correlation between the dependent variable and the expected predictor (i.e. fitted.values) obtained by the estimation.}
#' \item{hessian}{The Hessian of the parameters.}
#' \item{fitted.values}{The fitted values are the expected value of the dependent variable for the fitted model: that is \eqn{E(Y|X)}.}
#' \item{cov.unscaled}{The variance-covariance matrix of the parameters.}
#' \item{se}{The standard-error of the parameters.}
#' \item{scores}{The matrix of the scores (first derivative for each observation).}
#' \item{family}{The ML family that was used for the estimation.}
#' \item{residuals}{The difference between the dependent variable and the expected predictor.}
#' \item{sumFE}{The sum of the fixed-effects for each observation.}
#' \item{offset}{The offset formula.}
#' \item{NL.fml}{The nonlinear formula of the call.}
#' \item{bounds}{Whether the coefficients were upper or lower bounded. This can only be the case when a non-linear formula is included and the arguments 'lower' or 'upper' are provided.}
#' \item{isBounded}{The logical vector that gives for each coefficient whether it was bounded or not. This can only be the case when a non-linear formula is included and the arguments 'lower' or 'upper' are provided.}
#' \item{clusterNames}{The names of each cluster.}
#' \item{id_dummies}{The list (of length the number of clusters) of the cluster identifiers for each observation.}
#' \item{clusterSize}{The size of each cluster.}
#' \item{obsRemoved}{In the case there were clusters and some observations were removed because of 0/1-only outcomes within a cluster, it gives the row numbers of the observations that were removed.}
#' \item{clusterRemoved}{In the case there were clusters and some observations were removed because of 0/1-only outcomes within a cluster, it gives the list (for each cluster) of the cluster identifiers that were removed.}
#' \item{theta}{In the case of a negative binomial estimation: the overdispersion parameter.}
#'
#' @seealso
#' See also \code{\link[FENmlm]{summary.femlm}} to see the results with the appropriate standard-errors, \code{\link[FENmlm]{getFE}} to extract the cluster coefficients, and the functions \code{\link[FENmlm]{res2table}} and \code{\link[FENmlm]{res2tex}} to visualize the results of multiple estimations.
#'
#' @author
#' Laurent Berge
#'
#' @references
#'
#' Berge, Laurent, 2018, "Efficient estimation of maximum likelihood models with multiple fixed-effects: the R package FENmlm." CREA Discussion Papers, 13 (\url{https://github.com/lrberge/fixest/blob/master/_DOCS/FENmlm_paper.pdf}).
#'
#' For models with multiple fixed-effects:
#'
#' Gaure, Simen, 2013, "OLS with multiple high dimensional category variables", Computational Statistics & Data Analysis 66 pp. 8--18
#'
#' On the unconditional Negative Binomial model:
#'
#' Allison, Paul D and Waterman, Richard P, 2002, "Fixed-Effects Negative Binomial Regression Models", Sociological Methodology 32(1) pp. 247--265
#'
#' @examples
#'
#' #
#' # Linear examples
#' #
#'
#' # Load trade data
#' data(trade)
#'
#' # We estimate the effect of distance on trade => we account for 3 cluster effects
#' # 1) Poisson estimation
#' est_pois = femlm(Euros ~ log(dist_km)|Origin+Destination+Product, trade)
#' # alternative formulation giving the same results:
#' # est_pois = femlm(Euros ~ log(dist_km), trade, cluster = c("Origin", "Destination", "Product"))
#'
#' # 2) Log-Log Gaussian estimation (with same clusters)
#' est_gaus = update(est_pois, log(Euros+1) ~ ., family="gaussian")
#'
#' # 3) Negative Binomial estimation
#' est_nb = update(est_pois, family="negbin")
#'
#' # Comparison of the results using the function res2table
#' res2table(est_pois, est_gaus, est_nb)
#' # Now using two way clustered standard-errors
#' res2table(est_pois, est_gaus, est_nb, se = "twoway")
#'
#' # Comparing different types of standard errors
#' sum_white = summary(est_pois, se = "white")
#' sum_oneway = summary(est_pois, se = "cluster")
#' sum_twoway = summary(est_pois, se = "twoway")
#' sum_threeway = summary(est_pois, se = "threeway")
#'
#' res2table(sum_white, sum_oneway, sum_twoway, sum_threeway)
#'
#'
#' #
#' # Example of Equivalences
#' #
#' \dontrun{
#' # equivalence with glm poisson
#' est_glm <- glm(Euros ~ log(dist_km) + factor(Origin) +
#' factor(Destination) + factor(Product), trade, family = poisson)
#'
#' # coefficient estimates + Standard-error
#' summary(est_glm)$coefficients["log(dist_km)", ]
#' est_pois$coeftable
#'
#' # equivalence with lm
#' est_lm <- lm(log(Euros+1) ~ log(dist_km) + factor(Origin) +
#' factor(Destination) + factor(Product), trade)
#'
#' # coefficient estimates + Standard-error
#' summary(est_lm)$coefficients["log(dist_km)", ]
#' summary(est_gaus, dof_correction = TRUE)$coeftable
#' }
#'
#'
#' #
#' # Non-linear examples
#' #
#'
#' # Generating data for a simple example
#' n = 100
#' x = rnorm(n, 1, 5)**2
#' y = rnorm(n, -1, 5)**2
#' z1 = rpois(n, x*y) + rpois(n, 2)
#' base = data.frame(x, y, z1)
#'
#' # Estimating a 'linear' relation:
#' est1_L = femlm(z1 ~ log(x) + log(y), base)
#' # Estimating the same 'linear' relation using a 'non-linear' call
#' est1_NL = femlm(z1 ~ 1, base, NL.fml = ~a*log(x)+b*log(y), NL.start = list(a=0, b=0))
#' # we compare the estimates with the function res2table (they are identical)
#' res2table(est1_L, est1_NL)
#'
#' # Now generating a non-linear relation (E(z2) = x + y + 1):
#' z2 = rpois(n, x + y) + rpois(n, 1)
#' base$z2 = z2
#'
#' # Estimation using this non-linear form
#' est2_NL = femlm(z2~0, base, NL.fml = ~log(a*x + b*y),
#' NL.start = list(a=1, b=2), lower = list(a=0, b=0))
#' # we can't estimate this relation linearly
#' # => closest we can do:
#' est2_L = femlm(z2~log(x)+log(y), base)
#'
#' # Difference between the two models:
#' res2table(est2_L, est2_NL)
#'
#' # Plotting the fits:
#' plot(x, z2, pch = 18)
#' points(x, fitted(est2_L), col = 2, pch = 1)
#' points(x, fitted(est2_NL), col = 4, pch = 2)
#'
#'
#' # Using a custom Jacobian for the function log(a*x + b*y)
#' myGrad = function(a,x,b,y){
#' s = a*x+b*y
#' data.frame(a = x/s, b = y/s)
#' }
#'
#' est2_NL_grad = femlm(z2~0, base, NL.fml = ~log(a*x + b*y),
#' NL.start = list(a=1,b=2), nl.gradient = ~myGrad(a,x,b,y))
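#'
#'
#' #
#' # Offset example
#' #
#'
#' # A minimal sketch of the 'offset' argument, reusing the 'base' data generated
#' # above: the one-sided formula is evaluated on the data and added linearly to
#' # the predictor (here it constrains the coefficient of log(y) to be 1).
#' est_offset = femlm(z1 ~ log(x), base, offset = ~log(y))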
#'
#'
femlm <- function(fml, data, family=c("poisson", "negbin", "logit", "gaussian"), NL.fml, cluster, na.rm = FALSE, useAcc=TRUE, NL.start, lower, upper, env, NL.start.init, offset, nl.gradient, linear.start=0, jacobian.method=c("simple", "Richardson"), useHessian=TRUE, opt.control=list(), cores = 1, verbose=0, theta.init, precision.cluster, itermax.cluster = 10000, itermax.deriv = 5000, showWarning = TRUE, ...){
# use of the conjugate gradient in the gaussian case to get
# the cluster coefficients
accDeriv = TRUE
jacobian.method <- match.arg(jacobian.method)
family = match.arg(family)
# Some settings (too complicated to be tweaked by the user)
# Nber of evaluations of the NL part to be kept in memory
# Default keeps the two last evaluations
NLsave = 2
# DOTS
dots = list(...)
# Future parameter in development: na.rm
# na.rm = ifelse(is.null(dots$na.rm), FALSE, dots$na.rm)
isNA_sample = FALSE
ptm = proc.time()
# DEPRECATED INFORMATION
# I initially called the cluster dummies... I keep it for compatibility
if(missing(cluster) && "dummy" %in% names(dots)) cluster = dots$dummy
if("linear.fml" %in% names(dots)) stop("Argument 'linear.fml' is deprecated, now use 'fml' in combination with 'NL.fml'.")
if(missing(NL.start) && "start" %in% names(dots)) {
warning("Argument 'start' is deprecated.\nUse 'NL.start' instead.", immediate. = TRUE)
NL.start = dots$start
}
if(missing(NL.start.init) && "start.init" %in% names(dots)) {
warning("Argument 'start.init' is deprecated.\nUse 'NL.start.init' instead.", immediate. = TRUE)
NL.start.init = dots$start.init
}
#
# The clusters => controls + setup
if(!inherits(fml, "formula")) stop("The argument 'fml' must be a formula.")
if(length(fml) != 3) stop("The formula must be two sided.\nEG: a~exp(b/x), or a~0 if there is no linear part.")
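# The 'Formula' package is used to split the formula at the pipe ('|'):
# the second part, if present, contains the cluster variables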
FML = Formula::Formula(fml)
n_rhs = length(FML)[2]
if(n_rhs > 2){
stop("The argument 'fml' cannot contain more than two parts separated by a pipe ('|').")
}
if(n_rhs == 2){
if(missing(cluster) || length(cluster) == 0){
cluster = formula(FML, lhs = 0, rhs = 2)
fml = formula(FML, lhs = 1, rhs = 1)
} else {
stop("To add cluster variables: either include them in argument 'fml' using a pipe ('|'), either use the argument 'cluster'. You cannot use both!")
}
}
# Other parameters
if(!is.null(dots$debug) && dots$debug) verbose = 100
d.hessian = dots$d.hessian
#
# cores argument
FENmlm_CORES = 1
if(!length(cores) == 1 || !is.numeric(cores) || !(cores%%1) == 0 || cores < 1){
stop("The argument 'cores' must be an integer greater than 0 and lower than the number of threads of the computer.")
} else if(cores == 1){
isMulticore = FALSE
} else if(is.na(parallel::detectCores())){
# This can happen...
isMulticore = FALSE
warning("The number of cores has been set to 1 because the function detectCores() could not evaluate the maximum number of nodes.")
} else if(cores > parallel::detectCores()){
stop("The argument 'cores' must be lower or equal to the number of possible threads (equal to ", parallel::detectCores(), ") in this computer.")
} else {
FENmlm_CORES = cores
isMulticore = TRUE
}
famFuns = switch(family,
poisson = ml_poisson(),
negbin = ml_negbin(),
logit = ml_logit(),
gaussian = ml_gaussian())
call = match.call(expand.dots = FALSE)
# cleaning the call from update method
# we drop 'hidden' arguments for a clean call
call$"..." = NULL
if(is.matrix(data)){
if(is.null(colnames(data))){
stop("If argument data is to be a matrix, its columns must be named.")
}
data = as.data.frame(data)
}
# The conversion of the data (due to data.table)
if(!inherits(data, "data.frame")){
stop("The argument 'data' must be a data.frame or a matrix.")
}
if(inherits(data, "data.table")){
# this is a local change only
class(data) = "data.frame"
}
dataNames = names(data)
# The LHS must contain only values in the DF
namesLHS = all.vars(fml[[2]])
if(!all(namesLHS %in% dataNames)) stop("Some elements on the LHS of the formula are not in the dataset:\n", paste0(namesLHS[!namesLHS %in% dataNames], collapse = ", "))
# Now the nonlinear part:
isNA_NL = FALSE # initialisation of NAs flag (FALSE is neutral)
if(!missing(NL.fml) && !is.null(NL.fml)){
if(!inherits(NL.fml, "formula")) stop("Argument 'NL.fml' must be a formula.")
isNonLinear = TRUE
nl.call = NL.fml[[length(NL.fml)]]
allnames = all.vars(nl.call)
nonlinear.params = allnames[!allnames %in% dataNames]
nonlinear.varnames = allnames[allnames %in% dataNames]
if(length(nonlinear.params) == 0){
warning("As there is no parameter to estimate in argument 'NL.fml', this argument is ignored.\nIf you want to add an offset, use argument 'offset'.")
}
# Control for NAs
if(anyNA(data[, nonlinear.varnames])){
if(!na.rm){
# Default
varWithNA = nonlinear.varnames[which(apply(data[, nonlinear.varnames, FALSE], 2, anyNA))]
text = show_vars_limited_width(varWithNA)
stop("Some variables in 'NL.fml' contain NA. NAs are not supported, please remove them first (or use na.rm). FYI, the variables are:\n", text, call. = FALSE)
} else {
# If Na.rm => we keep track of the NAs
isNA_NL = is.na(rowSums(data[, nonlinear.varnames, drop = FALSE]))
isNA_sample = TRUE
}
}
} else {
isNonLinear = FALSE
nl.call = 0
allnames = nonlinear.params = nonlinear.varnames = character(0)
}
# The dependent variable: lhs==left_hand_side
lhs = as.numeric(as.vector(eval(fml[[2]], data)))
# creation of the environment
if(missing(env)) env <- new.env()
else stopifnot(inherits(env, "environment"))
#
# First check
#
isNA_y = FALSE # initialisation of NAs flag (FALSE is neutral)
if(anyNA(lhs)){
if(!na.rm){
# Default behavior
stop("The left hand side of the fomula has NA values. Please provide data without NA (or use na.rm).")
} else {
# If Na.rm => we keep track of the NAs
isNA_y = is.na(lhs)
isNA_sample = TRUE
}
# The controls are repeated, this time on the cleaned y
# It's a bit "ugly" but this way we don't recreate unnecessary vectors
# We check that the dep var is not a constant
lhs_clean = lhs[!isNA_y]
# we check the var is not a constant
if(var(lhs_clean) == 0){
stop("The dependent variable is a constant. Estimation cannot be done.")
}
if(family %in% c("poisson", "negbin") & any(lhs_clean<0)) stop("Negative values of the dependant variable \nare not allowed for the \"", family, "\" family.", sep="")
if(family %in% c("logit") & !all(lhs_clean==0 | lhs_clean==1)) stop("The dependant variable has values different from 0 or 1.\nThis is not allowed with the \"logit\" family.")
} else if(any(!is.finite(lhs))){
stop("The dependent variable contains non-finite values.")
} else {
# regular controls when there is no NA
# We check that the dep var is not a constant
if(var(lhs) == 0){
stop("The dependent variable is a constant. Estimation cannot be done.")
}
if(family %in% c("poisson", "negbin") & any(lhs<0)) stop("Negative values of the dependant variable \nare not allowed for the \"", family, "\" family.", sep="")
if(family %in% c("logit") & !all(lhs==0 | lhs==1)) stop("The dependant variable has values different from 0 or 1.\nThis is not allowed with the \"logit\" family.")
}
#
# Controls and setting of the linear part:
#
isLinear = FALSE
linear.varnames = all.vars(fml[[3]])
if(length(linear.varnames) > 0 || attr(terms(fml), "intercept") == 1){
isLinear = TRUE
linear.fml = fml
}
isNA_L = FALSE # initialisation of NAs flag (FALSE is neutral)
if(isLinear){
if(!all(linear.varnames %in% dataNames)) stop(paste("In 'fml', some variables are not in the data:\n", paste(linear.varnames[!linear.varnames%in%dataNames], collapse=', '), ".", sep=""))
if(!missing(cluster) && length(cluster) != 0){
# if dummies are provided, we make sure there is an
# intercept so that factors can be handled properly
linear.fml = update(linear.fml, ~.+1)
}
#
# We construct the linear matrix
#
# we look at whether there are factor-like variables to be evaluated
# if there is factors => model.matrix
types = sapply(data[, dataNames %in% linear.varnames, FALSE], class)
if(grepl("factor", deparse(linear.fml)) || any(types %in% c("character", "factor"))){
useModel.matrix = TRUE
} else {
useModel.matrix = FALSE
}
if(useModel.matrix){
# linear.mat = stats::model.matrix(linear.fml, data)
# to catch the NAs
linear.mat = stats::model.matrix(linear.fml, stats::model.frame(linear.fml, data, na.action=na.pass))
} else {
# just to check => will give error if not proper formula
linear.mat = stats::model.matrix(linear.fml, data[1:10, ])
linear.mat = prepare_matrix(linear.fml, data)
}
linear.params <- colnames(linear.mat)
# N_linear <- length(linear.params)
if(anyNA(linear.mat)){
if(!na.rm){
# Default behavior: no NA tolerance
quiNA = apply(linear.mat, 2, anyNA)
whoIsNA = linear.params[quiNA]
text = show_vars_limited_width(whoIsNA)
stop("Evaluation of the linear part returns NA. NAs are not supported, please remove them before running this function (or use na.rm). FYI the variables with NAs are:\n", text)
} else {
# If Na.rm => we keep track of the NAs
isNA_L = is.na(rowSums(linear.mat))
isNA_sample = TRUE
}
}
if(!is.numeric(linear.start)) stop("'linear.start' must be numeric!")
} else {
linear.params <- linear.start <- linear.varnames <- NULL
useModel.matrix = FALSE
}
params <- c(nonlinear.params, linear.params)
lparams <- length(params)
varnames <- c(nonlinear.varnames, linear.varnames)
# Note: the set of non-linear parameters may be empty
isNL = length(nonlinear.params) > 0
#
# Handling Clusters ####
#
isDummy = FALSE
if(!is.null(dots$clusterFromUpdate) && dots$clusterFromUpdate){
# Cluster information coming from the update method
# means that there is no modification of past clusters
object = dots$object
# we retrieve past information
dum_all = object$id_dummies
obs2remove = object$obsRemoved
dummyOmises = object$clusterRemoved
cluster = object$clusterNames
nbCluster = object$clusterSize
Q = length(nbCluster)
# # If obsRemoved => need to modify the data base
# # This is done later only once
# if(length(obs2remove) > 0){
# data = data[-obs2remove, ]
#
# # We recreate the linear matrix
# if(isLinear) {
# if(useModel.matrix){
# # means there are factors
# linear.mat = stats::model.matrix(linear.fml, data)
# } else {
# linear.mat = linear.mat[-obs2remove, ]
# }
# }
#
# lhs = eval(fml[[2]], data)
# }
# We still need to recreate some objects though
isDummy = TRUE
names(dum_all) = NULL
# dumMat_cpp = matrix(unlist(dum_all), ncol = Q) - 1
dum_names = sum_y_all = obs_per_cluster_all = list()
for(i in 1:Q){
k = nbCluster[i]
dum = dum_all[[i]]
if(length(obs2remove) > 0){
sum_y_all[[i]] = cpp_tapply_vsum(k, lhs[-obs2remove], dum)
} else {
sum_y_all[[i]] = cpp_tapply_vsum(k, lhs, dum)
}
obs_per_cluster_all[[i]] = cpp_table(k, dum)
dum_names[[i]] = attr(dum_all[[i]], "clust_names")
}
#
# Re-ordering the CC
#
if(any(nbCluster != sort(nbCluster, decreasing = TRUE))){
new_order = order(nbCluster, decreasing = TRUE)
reorder = order(new_order)
nbCluster = nbCluster[new_order]
cluster = cluster[new_order]
dum_all = dum_all[new_order]
dum_names = dum_names[new_order]
sum_y_all = sum_y_all[new_order]
obs_per_cluster_all = obs_per_cluster_all[new_order]
} else {
reorder = 1:Q
}
} else if(!missing(cluster) && length(cluster)!=0){
# The main cluster construction
isDummy = TRUE
isClusterFml = FALSE
if(is.character(cluster) && any(!cluster %in% names(data))){
var_problem = cluster[!cluster%in%names(data)]
stop("The argument 'cluster' must be variable names! Cluster(s) not in the data: ", paste0(var_problem, collapse = ", "), ".")
} else if(!is.character(cluster)){
# means the cluster is a formula
cluster_fml = cluster
# we check that the cluster variables are indeed in the data
cluster_vars = all.vars(cluster)
if(!all(cluster_vars %in% names(data))){
var_problem = cluster_vars[!cluster_vars %in% names(data)]
stop("The following 'cluster' variable", ifelse(length(var_problem) == 1, " is", "s are"), " not in the data: ", paste0(var_problem, collapse = ", "), ".")
}
cluster_mat = model.frame(cluster_fml, data, na.action = NULL)
# we change cluster to become a vector of characters
cluster = names(cluster_mat)
isClusterFml = TRUE
} else {
# the clusters are valid cluster names
cluster_mat = data[, cluster, drop = FALSE]
}
# We change factors to character
isFactor = sapply(cluster_mat, is.factor)
if(any(isFactor)){
for(i in which(isFactor)){
cluster_mat[[i]] = as.character(cluster_mat[[i]])
}
}
isNA_cluster = FALSE
if(anyNA(cluster_mat)){
if(!na.rm){
# Default behavior, NA not allowed
var_problem = cluster[sapply(cluster_mat, anyNA)]
stop("The cluster variables contain NAs, this is not allowed (or use na.rm).\nFYI, the clusters with NA are: ", paste0(var_problem, collapse = ", "), ".")
} else {
# If Na.rm => we keep track of the NAs
isNA_cluster = apply(cluster_mat, MARGIN = 1, anyNA)
isNA_sample = TRUE
}
}
if(isNA_sample){
# we remove NAs from the clusters only
# after, we'll remove them from the data too
isNA_full = isNA_y | isNA_L | isNA_NL | isNA_cluster
nbNA = sum(isNA_full)
nobs = nrow(data)
if(nbNA == nobs){
stop("All observations contain NAs. Estimation cannot be done.")
}
message_NA = paste0(numberFormatNormal(nbNA), " observations removed because of NA values. (Breakup: LHS: ", numberFormatNormal(sum(isNA_y)), ", RHS: ", numberFormatNormal(sum(isNA_L + isNA_NL)), ", Cluster: ", numberFormatNormal(sum(isNA_cluster)), ")")
# we drop the NAs from the cluster matrix
cluster_mat = cluster_mat[!isNA_full, , drop = FALSE]
obs2remove_NA = which(isNA_full)
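# index_noNA maps the row numbers of the NA-cleaned sample back to the original data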
index_noNA = (1:nobs)[!isNA_full]
# we change the LHS variable
lhs = as.numeric(eval(fml[[2]], data[-obs2remove_NA, ]))
}
Q = length(cluster)
dum_all = dum_names = list()
sum_y_all = obs_per_cluster_all = list()
obs2remove = c()
dummyOmises = list()
for(i in 1:Q){
dum_raw = cluster_mat[[cluster[i]]]
# to avoid "unclassed" values exceeding the real number of classes, we re-factor the cluster
dum_names[[i]] = thisNames = getItems(dum_raw)
dum = quickUnclassFactor(dum_raw)
dum_all[[i]] = dum
k = length(thisNames)
# We delete "all zero" outcome
sum_y_all[[i]] = sum_y_clust = cpp_tapply_vsum(k, lhs, dum)
obs_per_cluster_all[[i]] = n_perClust = cpp_table(k, dum)
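# Rationale: for the Poisson/negbin families, a cluster category whose outcomes are all 0
# has a cluster coefficient diverging to -Inf (its observations are perfectly fitted);
# same for the logit when all outcomes within a category are 0 or all are 1
# => these observations must be dropped for the estimation to be well defined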
if(family %in% c("poisson", "negbin")){
qui = which(sum_y_clust==0)
} else if(family == "logit"){
qui = which(sum_y_clust==0 | sum_y_clust==n_perClust)
} else if(family == "gaussian"){
qui = NULL
}
if(length(qui) > 0){
# We first delete the data:
dummyOmises[[i]] = thisNames[qui]
obs2remove = unique(c(obs2remove, which(dum %in% qui)))
} else {
dummyOmises[[i]] = character(0)
}
}
# We remove the problems
if(length(obs2remove)>0){
# update of the cluster matrix
cluster_mat = cluster_mat[-obs2remove, , drop = FALSE]
# update of the lhs
lhs = lhs[-obs2remove]
# Then we recreate the dummies
for(i in 1:Q){
dum_raw = cluster_mat[[cluster[i]]]
dum_names[[i]] = getItems(dum_raw)
dum = quickUnclassFactor(dum_raw)
dum_all[[i]] = dum
k = length(dum_names[[i]])
# We also recreate these values
sum_y_all[[i]] = cpp_tapply_vsum(k, lhs, dum)
obs_per_cluster_all[[i]] = cpp_table(k, dum)
}
# Then the warning message
nb_missing = sapply(dummyOmises, length)
message_cluster = paste0(paste0(nb_missing, collapse = "/"), " cluster", ifelse(sum(nb_missing) == 1, "", "s"), " (", length(obs2remove), " observations) removed because of only ", ifelse(family=="logit", "zero/one", "zero"), " outcomes.")
if(isNA_sample){
if(showWarning) warning(message_NA, "\n ", message_cluster)
} else {
if(showWarning) warning(message_cluster)
}
names(dummyOmises) = cluster
} else if(isNA_sample){
if(showWarning) warning(message_NA)
}
if(isNA_sample){
# we update the value of obs2remove (will contain both NA and removed bc of outcomes)
if(length(obs2remove) > 0){
obs2remove_cluster = index_noNA[obs2remove]
} else {
obs2remove_cluster = c()
}
obs2remove = sort(c(obs2remove_NA, obs2remove_cluster))
}
#
# We re-order the clusters
#
nbCluster = sapply(dum_all, max)
if(any(nbCluster != sort(nbCluster, decreasing = TRUE))){
new_order = order(nbCluster, decreasing = TRUE)
reorder = order(new_order)
nbCluster = nbCluster[new_order]
cluster = cluster[new_order]
dum_all = dum_all[new_order]
dum_names = dum_names[new_order]
sum_y_all = sum_y_all[new_order]
obs_per_cluster_all = obs_per_cluster_all[new_order]
} else {
reorder = 1:Q
}
} else {
# There is no cluster
Q = 0
# NA management is needed to create obs2remove
if(isNA_sample){
# after, we'll remove them from the data too
isNA_full = isNA_y | isNA_L | isNA_NL
nbNA = sum(isNA_full)
nobs = nrow(data)
if(nbNA == nobs){
stop("All observations contain NAs. Estimation cannot be done.")
}
if(showWarning) warning(numberFormatNormal(nbNA), " observations removed because of NA values. (Breakup: LHS: ", numberFormatNormal(sum(isNA_y)), ", RHS: ", numberFormatNormal(sum(isNA_L + isNA_NL)), ")")
# we drop the NAs from the cluster matrix
obs2remove = which(isNA_full)
} else {
obs2remove = c()
}
}
# NA & problem management
if(length(obs2remove) > 0){
# we kick out the problems (both NA related and cluster related)
data = data[-obs2remove, ]
# We recreate the linear matrix and the LHS
if(isLinear) {
if(useModel.matrix){
# means there are factors
linear.mat = stats::model.matrix(linear.fml, data)
} else {
linear.mat = linear.mat[-obs2remove, ]
}
}
lhs = as.numeric(eval(fml[[2]], data))
}
# If presence of clusters => we exclude the intercept
if(Q > 0){
# If there is a linear intercept, we withdraw it
# We drop the intercept:
if(isLinear && "(Intercept)" %in% colnames(linear.mat)){
var2remove = which(colnames(linear.mat) == "(Intercept)")
if(ncol(linear.mat) == length(var2remove)){
isLinear = FALSE
linear.params = NULL
params <- nonlinear.params
lparams <- length(params)
varnames <- nonlinear.varnames
} else{
linear.mat = linear.mat[, -var2remove, drop=FALSE]
linear.params <- colnames(linear.mat)
# N_linear <- length(linear.params)
params <- c(nonlinear.params, linear.params)
lparams <- length(params)
varnames <- c(nonlinear.varnames, linear.varnames)
}
}
}
#
# Checks for MONKEY TEST
#
if(lparams==0 & Q==0) stop("No parameter to be estimated.")
if(!is.logical(useHessian)) stop("'useHessian' must be of type 'logical'!")
#
# Controls: The non linear part
#
if(isNL){
if(missing(NL.start.init)){
if(missing(NL.start)) stop("There must be starting values for NL parameters. Please use argument NL.start (or NL.start.init).")
if(typeof(NL.start)!="list") stop("NL.start must be a list.")
if(any(!nonlinear.params %in% names(NL.start))) stop(paste("Some NL parameters have no starting values:\n", paste(nonlinear.params[!nonlinear.params %in% names(NL.start)], collapse=", "), ".", sep=""))
# we restrict NL.start to the nonlinear.params
NL.start = NL.start[nonlinear.params]
} else {
if(length(NL.start.init)>1) stop("NL.start.init must be a scalar.")
if(!is.numeric(NL.start.init)) stop("NL.start.init must be numeric!")
if(!is.finite(NL.start.init)) stop("Infinite values as starting values, you must be kidding me...")
if(missing(NL.start)){
NL.start <- list()
NL.start[nonlinear.params] <- NL.start.init
} else {
if(typeof(NL.start)!="list") stop("NL.start must be a list.")
if(any(!names(NL.start) %in% params)) stop(paste("Some parameters in 'NL.start' are not in the formula:\n", paste(names(NL.start)[!names(NL.start) %in% params], collapse=", "), ".", sep=""))
missing.params <- nonlinear.params[!nonlinear.params%in%names(NL.start)]
NL.start[missing.params] <- NL.start.init
}
}
} else {
NL.start <- list()
}
#
# The upper and lower limits
#
if(!missing(lower) && !is.null(lower)){
if(typeof(lower)!="list"){
stop("'lower' MUST be a list.")
}
lower[params[!params %in% names(lower)]] <- -Inf
lower <- unlist(lower[params])
} else {
lower <- rep(-Inf, lparams)
names(lower) <- params
}
if(!missing(upper) && !is.null(upper)){
if(typeof(upper)!="list"){
stop("'upper' MUST be a list.")
}
upper[params[!params %in% names(upper)]] <- Inf
upper <- unlist(upper[params])
} else {
upper <- rep(Inf, lparams)
names(upper) <- params
}
lower <- c(lower)
upper <- c(upper)
#
# Controls: user defined gradient
#
if(!missing(nl.gradient)){
isGradient = TRUE
if(!inherits(nl.gradient, "formula") || length(nl.gradient)==3) stop("'nl.gradient' must be a one-sided formula, e.g. ~f0(a1, x1, a2, x2), with f0 computing the gradient.")
} else {
isGradient = FALSE
}
if(!is.null(d.hessian)){
hessianArgs = list(d=d.hessian)
} else hessianArgs = NULL
assign("hessianArgs", hessianArgs, env)
#
# Offset
#
offset.value = 0
if(!missing(offset) && !is.null(offset)){
# control
if(!inherits(offset, "formula")){
stop("Argument 'offset' must be a formula (e.g. ~ 1+x^2).")
}
if(length(offset) != 2){
stop("Argument 'offset' must be a formula of the type (e.g.): ~ 1+x^2.")
}
offset.call = offset[[length(offset)]]
vars.offset = all.vars(offset.call)
if(any(!vars.offset %in% dataNames)){
var_missing = vars.offset[!vars.offset %in% dataNames]
stop("Some variables in the argument 'offset' are not in the data:\n", paste0(var_missing, collapse = ", "), ".")
}
offset.value = eval(offset.call, data)
if(anyNA(offset.value)){
stop("Evaluating the argument 'offset' lead to NA values.")
}
}
assign("offset.value", offset.value, env)
#
# PRECISION
#
n = length(lhs)
# The main precision
if(!missing(precision.cluster) && !is.null(precision.cluster)){
if(!length(precision.cluster)==1 || !is.numeric(precision.cluster) || precision.cluster <= 0 || precision.cluster >1){
stop("If provided, argument 'precision.cluster' must be a strictly positive scalar lower than 1.")
} else if(precision.cluster < 10000*.Machine$double.eps){
stop("Argument 'precision.cluster' cannot be lower than ", signif(10000*.Machine$double.eps))
}
eps.cluster = precision.cluster
} else {
eps.cluster = 1e-5 # min(10**-(log10(n) + Q/3), 1e-5)
}
if(!is.numeric(itermax.cluster) || length(itermax.cluster) > 1 || itermax.cluster < 1){
stop("Argument itermax.cluster must be an integer greater than 0.")
}
if(!is.numeric(itermax.deriv) || length(itermax.deriv) > 1 || itermax.deriv < 1){
stop("Argument itermax.deriv must be an integer greater than 0.")
}
# other precisions
eps.NR = ifelse(is.null(dots$eps.NR), eps.cluster/100, dots$eps.NR)
eps.deriv = ifelse(is.null(dots$eps.deriv), 1e-4, dots$eps.deriv)
# Initial checks are done
nonlinear.params <- names(NL.start) #=> in the order the user wants
# starting values of all parameters (initialized with the NL ones):
start = NL.start
# control for the linear start => we can provide coefficients
# from past estimations. Coefficients that are not provided are set
# to 0
if(length(linear.start)>1){
what = linear.start[linear.params]
what[is.na(what)] = 0
linear.start = what
}
start[linear.params] <- linear.start
params <- names(start)
start <- unlist(start)
start <- c(start)
lparams <- length(params)
names(start) <- params
# The right order of upper and lower
upper = upper[params]
lower = lower[params]
#
# The MODEL0 => to get the init of the theta for the negbin
#
# Bad location => rethink the design of the code
assign(".famFuns", famFuns, env)
assign(".family", family, env)
assign("nobs", length(lhs), env)
assign(".lhs", lhs, env)
assign(".isMulticore", isMulticore, env)
assign(".CORES", FENmlm_CORES, env)
assign(".verbose", verbose, env)
if(missing(theta.init)){
theta.init = NULL
} else {
if(!is.null(theta.init) && (!is.numeric(theta.init) || length(theta.init)!=1 || theta.init<=0)){
stop("the argument 'theta.init' must be a strictly positive scalar.")
}
}
model0 <- get_model_null(env, theta.init)
theta.init = model0$theta
# For the negative binomial:
if(family == "negbin"){
params = c(params, ".theta")
start = c(start, theta.init)
names(start) = params
upper = c(upper, 10000)
lower = c(lower, 1e-3)
}
# We send the data to be used into a new environment
if(isLinear) assign("linear.mat", linear.mat, env)
if(isGradient) assign(".call_gradient", nl.gradient[[2]], env)
####
#### Sending to the env ####
####
useExp_clusterCoef = family %in% c("poisson")
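# For the Poisson family only, the cluster coefficients are handled in exponential form
# (they enter the predictor multiplicatively) and are converted back with a log at the end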
# The dummies
assign("isDummy", isDummy, env)
if(isDummy){
assign(".dum_all", dum_all, env)
assign(".nbCluster", nbCluster, env)
assign(".sum_y", sum_y_all, env)
assign(".tableCluster", obs_per_cluster_all, env)
assign(".familyConv", family, env) # new family -- used in convergence, can be modified
if(Q > 1 || family %in% c("negbin")){
# for the derivatives
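# (the cluster ids are 1-based in R, the C++ routines expect 0-based indices, hence the -1)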
dumMat_cpp = matrix(unlist(dum_all), ncol = Q) - 1
assign(".dumMat_cpp", dumMat_cpp, env)
}
# the saved dummies
if(!is.null(dots$clusterStart)){
# information on starting values coming from update method
doExp = ifelse(useExp_clusterCoef, exp, I)
if(dots$clusterFromUpdate || length(obs2remove) == 0){
# Means it's the full cluster properly given
assign(".savedDummy", doExp(dots$clusterStart), env)
} else {
assign(".savedDummy", doExp(dots$clusterStart[-obs2remove]), env)
}
} else if(useExp_clusterCoef){
assign(".savedDummy", rep(1, length(lhs)), env)
} else {
assign(".savedDummy", rep(0, length(lhs)), env)
}
# TO DELETE when new cpp functions fully implemented
assign(".orderCluster", NULL, env)
if(isMulticore || family %in% c("negbin", "logit")){
# we also add this variable used in cpp
orderCluster_all = list()
for(i in 1:Q){
orderCluster_all[[i]] = order(dum_all[[i]]) - 1
}
# orderCluster_mat = do.call("cbind", orderCluster_all)
assign(".orderCluster", orderCluster_all, env)
}
# New cpp functions
# This is where we send the elements needed for convergence in cpp
assign(".dum_vector", as.integer(unlist(dum_all) - 1), env)
assign(".tableCluster_vector", as.integer(unlist(obs_per_cluster_all)), env)
assign(".sum_y_vector", unlist(sum_y_all), env)
if(family == "gaussian" && Q >= 2){
assign(".invTableCluster", 1/unlist(obs_per_cluster_all))
} else {
assign(".invTableCluster", 1L)
}
if(family %in% c("negbin", "logit")){
assign(".cumtable_vector", as.integer(unlist(lapply(obs_per_cluster_all, cumsum))), env)
assign(".obsCluster_vector", as.integer(unlist(orderCluster_all)), env)
} else {
# we need to assign it anyway
assign(".cumtable_vector", 1L, env)
assign(".obsCluster_vector", 1L, env)
}
}
# basic NL
envNL = new.env()
assign("isNL", isNL, env)
if(isNL){
for(var in nonlinear.varnames) assign(var, data[[var]], envNL)
for(var in nonlinear.params) assign(var, start[var], envNL)
}
assign("envNL", envNL, env)
assign(".nl.call", nl.call, env)
assign("isGradient", isGradient, env)
# other
assign(".lhs", lhs, env)
assign("isLinear", isLinear, env)
assign("linear.params", linear.params, env)
assign("nonlinear.params", nonlinear.params, env)
assign("params", params, env)
assign("nobs", length(lhs), env)
assign(".verbose", verbose, env)
assign("jacobian.method", jacobian.method, env)
assign(".famFuns", famFuns, env)
assign(".family", family, env)
assign(".iter", 0, env)
# To handle the values of mu:
assign(".coefMu", list(), env)
assign(".valueMu", list(), env)
assign(".valueExpMu", list(), env)
assign(".wasUsed", TRUE, env)
# For the values of the non-linear Jacobian
assign(".JC_nbSave", 0, env)
assign(".JC_nbMaxSave", 1, env)
assign(".JC_savedCoef", list(), env)
assign(".JC_savedValue", list(), env)
# PRECISION
assign(".eps.cluster", eps.cluster, env)
assign(".eps.NR", eps.NR, env)
assign(".eps.deriv", eps.deriv, env)
# ITERATIONS
assign(".itermax.cluster", itermax.cluster, env)
assign(".itermax.deriv", itermax.deriv, env)
# OTHER
assign(".useAcc", useAcc, env)
assign(".warn_0_Hessian", FALSE, env)
assign(".warn_overfit_logit", FALSE, env)
# To monitor how the clusters are computed (if the problem is difficult or not)
assign(".firstIterCluster", 1e10, env) # the number of iterations in the first run
assign(".firstRunCluster", TRUE, env) # flag for first enrty in get_dummies
assign(".iterCluster", 1e10, env) # the previous number of cluster iterations
assign(".evolutionLL", Inf, env) # diff b/w two successive LL
assign(".pastLL", 0, env)
assign(".iterLastPrecisionIncrease", 0, env) # the last iteration when precision was increased
assign(".nbLowIncrease", 0, env) # number of successive evaluations with very low LL increments
assign(".nbIterOne", 0, env) # nber of successive evaluations with only 1 iter to get the clusters
assign(".difficultConvergence", FALSE, env)
# Same for derivatives
assign(".derivDifficultConvergence", FALSE, env)
assign(".firstRunDeriv", TRUE, env) # flag for first entry in derivative
assign(".accDeriv", TRUE, env) # Logical: flag for accelerating deriv
assign(".iterDeriv", 1e10, env) # number of iterations in the derivatives step
#
# if there is only the intercept and cluster => we estimate only the clusters
#
if(!isLinear && !isNonLinear && Q>0){
if(family == "negbin"){
stop("To estimate the negative binomial model, you need at least one variable. (The estimation of the model with only the clusters is not implemented.)")
}
results = femlm_only_clusters(env, model0, cluster, dum_names)
results$call = match.call()
results$fml = fml
if(length(obs2remove)>0){
results$obsRemoved = obs2remove
results$clusterRemoved = dummyOmises
}
results$onlyCluster = TRUE
return(results)
}
# We test the starting values to inform the user
if(isNL){
mu = NULL
try(mu <- eval(nl.call, envir = envNL), silent = FALSE)
if(is.null(mu)){
# the non linear part could not be evaluated - ad hoc message
stop("The non-linear part (NL.fml) could not be evaluated. There may be a problem in 'NL.fml'.")
}
# DEPREC: the NL part should return stg the same length
# # the special case of the constant
# if(length(mu) == 1){
# mu = rep(mu, nrow(data))
# }
# mu must be a numeric vector
if(!is.vector(mu) || !is.numeric(mu)){
stop("Evaluation of NL.fml should return a numeric vector. (This is currently not the case)")
}
# Handling NL.fml errors
if(length(mu) != nrow(data)){
stop("Evaluation of NL.fml leads to ", length(mu), " observations while there are ", nrow(data), " observations in the data base. They should be of the same lenght.")
}
if(anyNA(mu)){
stop("Evaluating NL.fml leads to NA values (which are forbidden). Maybe it's a problem with the starting values, maybe it's another problem.")
}
} else {
mu = eval(nl.call, envir = envNL)
}
# We save the values of the non-linear part
assign(".nbMaxSave", NLsave, env) # maximum number of values to save
assign(".nbSave", 1, env) # total number of saved values
assign(".savedCoef", list(start[nonlinear.params]), env)
assign(".savedValue", list(mu), env)
if(isLinear) {
mu <- mu + c(linear.mat%*%unlist(start[linear.params]))
}
if(anyNA(mu)){
stop("Evaluating the left hand side leads to NA values.")
}
# Check of the user-defined gradient, if given
if(isGradient){
for(nom in nonlinear.params) assign(nom, start[nom], env)
test <- eval(nl.gradient[[2]], envir=env)
if(!inherits(test, c("list", "data.frame"))) stop("The function called by 'nl.gradient' must return an object of type 'list' or 'data.frame'.")
if(!all(nonlinear.params%in%names(test))) stop(paste("The gradient must return a value for each parameter. Some are missing:\n", paste(nonlinear.params[!nonlinear.params%in%names(test)], collapse=", "), ".", sep=""))
if(!all(names(test)%in%nonlinear.params)) warning(paste("Some values given by 'nl.gradient' are not in the parameters:\n", paste(names(test)[!names(test)%in%nonlinear.params], collapse=", "), ".", sep=""))
if(mean(sapply(test[nonlinear.params], length)) != length(lhs)) stop("Strange, the length of the vectors returned by 'nl.gradient' does not match the data.")
#we save 1 gradient:
jacob.mat = as.matrix(test[nonlinear.params])
assign(".JC_nbSave", 1, env)
assign(".JC_savedCoef", list(start[nonlinear.params]), env)
assign(".JC_savedValue", list(jacob.mat), env)
}
# Setting up the gradient computation
gradient = femlm_gradient
hessian <- NULL
if(useHessian) hessian <- femlm_hessian
# GIVE PARAMS
if(!is.null(dots$give.params) && dots$give.params) return(list(coef=start, env=env))
if(verbose >= 2) cat("Setup in ", (proc.time() - ptm)[3], "s\n", sep="")
#
# Maximizing the likelihood
#
opt <- NULL
opt <- stats::nlminb(start=start, objective=femlm_ll, env=env, lower=lower, upper=upper, gradient=gradient, hessian=hessian, control=opt.control)
if(is.null(opt)){
stop("Could not achieve maximization.")
}
convStatus = TRUE
warningMessage = ""
if(!opt$message %in% c("X-convergence (3)", "relative convergence (4)", "both X-convergence and relative convergence (5)")){
# warning("[femlm] The optimization algorithm did not converge, the results are not reliable. Use function diagnostic() to see what's wrong.", call. = FALSE)
warningMessage = " The optimization algorithm did not converge, the results are not reliable."
convStatus = FALSE
}
####
#### After Maximization ####
####
coef <- opt$par
# The Hessian
hessian = femlm_hessian(coef, env=env)
# we add the names of the non linear variables in the hessian
if(isNonLinear || family == "negbin"){
dimnames(hessian) = list(params, params)
}
# we create the Hessian without the bounded parameters
hessian_noBounded = hessian
# Handling the bounds
if(!isNonLinear){
NL.fml = NULL
bounds = NULL
isBounded = NULL
} else {
# we report the bounds & if the estimated parameters are bounded
upper_bound = upper[nonlinear.params]
lower_bound = lower[nonlinear.params]
# 1: are the estimated parameters at their bounds?
coef_NL = coef[nonlinear.params]
isBounded = rep(FALSE, length(params))
isBounded[1:length(coef_NL)] = (coef_NL == lower_bound) | (coef_NL == upper_bound)
# 2: we save the bounds
upper_bound_small = upper_bound[is.finite(upper_bound)]
lower_bound_small = lower_bound[is.finite(lower_bound)]
bounds = list()
if(length(upper_bound_small) > 0) bounds$upper = upper_bound_small
if(length(lower_bound_small) > 0) bounds$lower = lower_bound_small
if(length(bounds) == 0){
bounds = NULL
}
# 3: we update the Hessian (basically, we drop the bounded element)
if(any(isBounded)){
hessian_noBounded = hessian[-which(isBounded), -which(isBounded), drop = FALSE]
boundText = ifelse(coef_NL == upper_bound, "Upper bounded", "Lower bounded")[isBounded]
attr(isBounded, "type") = boundText
}
}
# Variance
var <- NULL
try(var <- solve(hessian_noBounded), silent = TRUE)
if(is.null(var)){
warningMessage = paste(warningMessage, "The information matrix is singular (likely presence of collinearity). Use function diagnostic() to pinpoint collinearity problems.")
var = hessian_noBounded*NA
se = diag(var)
} else {
se = diag(var)
se[se < 0] = NA
se = sqrt(se)
}
# Warning message
if(nchar(warningMessage) > 0){
if(showWarning) warning("[femlm]:", warningMessage)
}
# To handle the bounded coefficient, we set its SE to NA
if(any(isBounded)){
se = se[params]
names(se) = params
}
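# z-values and two-sided p-values from the asymptotic normal approximation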
zvalue <- coef/se
pvalue <- 2*pnorm(-abs(zvalue))
# We add the information on the bound for the se & update the var to drop the bounded vars
se_format = se
if(any(isBounded)){
se_format[!isBounded] = decimalFormat(se_format[!isBounded])
se_format[isBounded] = boundText
}
coeftable <- data.frame("Estimate"=coef, "Std. Error"=se_format, "z value"=zvalue, "Pr(>|z|)"=pvalue, stringsAsFactors = FALSE)
names(coeftable) <- c("Estimate", "Std. Error", "z value", "Pr(>|z|)")
row.names(coeftable) <- params
attr(se, "type") = attr(coeftable, "type") = "Standard"
mu_both = get_mu(coef, env, final = TRUE)
mu = mu_both$mu
exp_mu = mu_both$exp_mu
# pseudo R2 computation
loglik <- -opt$objective # minus because the function minimizes
ll_null <- model0$loglik
# degrees of freedom
df_k = length(coef)
if(isDummy) df_k = df_k + sum(sapply(dum_all, max) - 1) + 1
# dummies are constrained, so they don't have full dof (one value must be dropped for uniqueness)
# this is an approximation: in some cases there can be more than one reference. But a good approx.
pseudo_r2 <- 1 - (loglik-df_k)/(ll_null-1)
# Computing the residuals
expected.predictor = famFuns$expected.predictor(mu, exp_mu, env)
residuals = lhs - expected.predictor
# squared correlation
if(sd(expected.predictor) == 0){
sq.cor = NA
} else {
sq.cor = stats::cor(lhs, expected.predictor)**2
}
# The scores
scores = femlm_scores(coef, env)
if(isNonLinear){
# we add the names of the non linear params in the score
colnames(scores) = params
}
res <- list(coefficients=coef, coeftable=coeftable, loglik=loglik, iterations=opt$iterations, n=length(lhs), nparams=df_k, call=call, fml=fml, ll_null=ll_null, pseudo_r2=pseudo_r2, message=opt$message, convStatus=convStatus, sq.cor=sq.cor, fitted.values=expected.predictor, hessian=hessian, cov.unscaled=var, se=se, scores=scores, family=family, residuals=residuals)
# Other optional elements
if(!missing(offset)){
res$offset = offset
}
# The value of mu (if cannot be recovered from fitted())
if(family == "logit"){
qui_01 = expected.predictor %in% c(0, 1)
if(any(qui_01)){
res$mu = mu
}
} else if(family %in% c("poisson", "negbin")){
qui_0 = expected.predictor == 0
if(any(qui_0)){
res$mu = mu
}
}
if(!is.null(NL.fml)){
res$NL.fml = NL.fml
if(!is.null(bounds)){
res$bounds = bounds
res$isBounded = isBounded
}
}
# Dummies
if(isDummy){
dummies = attr(mu, "mu_dummies")
if(useExp_clusterCoef){
dummies = rpar_log(dummies, env)
}
res$sumFE = dummies
# res$clusterNames = cluster
#
# id_dummies = list()
# for(i in 1:length(cluster)){
# dum = dum_all[[i]]
# attr(dum, "clust_names") = as.character(dum_names[[i]])
# id_dummies[[cluster[i]]] = dum
# }
res$clusterNames = cluster[reorder]
id_dummies = list()
for(i in reorder){
dum = dum_all[[i]]
attr(dum, "clust_names") = as.character(dum_names[[i]])
id_dummies[[cluster[i]]] = dum
}
res$id_dummies = id_dummies
# clustSize = sapply(id_dummies, max)
clustSize = nbCluster[reorder]
names(clustSize) = res$clusterNames
res$clusterSize = clustSize
}
# Observations removed (either NA or clusters)
if(length(obs2remove)>0){
res$obsRemoved = obs2remove
if(isDummy && any(lengths(dummyOmises) > 0)){
res$clusterRemoved = dummyOmises
}
}
if(family == "negbin"){
theta = coef[".theta"]
res$theta = theta
if(theta > 1000){
warning("Very high value of theta (", theta, "). There is no sign of overdisperion, you may consider a Poisson model.")
}
}
class(res) = "femlm"
if(verbose > 0){
cat("\n")
}
return(res)
}
femlm_only_clusters <- function(env, model0, cluster, dum_names){
# Estimation with only the cluster coefficients
#
# 1st step => computing the dummies
#
nobs = get("nobs", env)
family = get(".family", env)
offset.value = get("offset.value", env)
if(family == "negbin"){
coef[[".theta"]] = model0$theta
} else {
coef = list()
}
# indicator of whether we compute the exp(mu)
useExp = family %in% c("poisson", "logit", "negbin")
useExp_clusterCoef = family %in% c("poisson")
# mu, using the offset
mu_noDum = offset.value
if(length(mu_noDum) == 1) mu_noDum = rep(mu_noDum, nobs)
# we create the exp of mu => used for later functions
exp_mu_noDum = NULL
if(useExp_clusterCoef){
exp_mu_noDum = rpar_exp(mu_noDum, env)
}
dummies = getDummies(mu_noDum, exp_mu_noDum, env, coef)
exp_mu = NULL
if(useExp_clusterCoef){
# despite being called mu, it is in fact exp(mu)!!!
exp_mu = exp_mu_noDum*dummies
mu = rpar_log(exp_mu, env)
} else {
mu = mu_noDum + dummies
if(useExp){
exp_mu = rpar_exp(mu, env)
}
}
#
# 2nd step => saving information
#
dum_all = get(".dum_all", env)
famFuns = get(".famFuns", env)
lhs = get(".lhs", env)
# The log likelihoods
loglik = famFuns$ll(lhs, mu, exp_mu, env, coef)
ll_null = model0$loglik
# degrees of freedom
df_k = sum(sapply(dum_all, max) - 1) + 1
pseudo_r2 = 1 - loglik/ll_null # NON Adjusted
# Computing the residuals
expected.predictor = famFuns$expected.predictor(mu, exp_mu, env)
residuals = lhs - expected.predictor
# squared correlation
if(sd(expected.predictor) == 0){
sq.cor = NA
} else {
sq.cor = stats::cor(lhs, expected.predictor)**2
}
# naive R2
naive.r2 = 1 - sum(residuals**2) / sum((lhs - mean(lhs))**2)
res = list(loglik=loglik, n=length(lhs), nparams=df_k, ll_null=ll_null, pseudo_r2=pseudo_r2, naive.r2=naive.r2, sq.cor=sq.cor, expected.predictor=expected.predictor, residuals=residuals, family=family)
#
# Information on the dummies
if(useExp_clusterCoef){
dummies = rpar_log(dummies, env)
}
res$sumFE = dummies
res$clusterNames = cluster
id_dummies = list()
for(i in 1:length(cluster)){
dum = dum_all[[i]]
attr(dum, "clust_names") = as.character(dum_names[[i]])
id_dummies[[cluster[i]]] = dum
}
res$id_dummies = id_dummies
clustSize = sapply(dum_all, max)
names(clustSize) = cluster
res$clusterSize = clustSize
if(family == "negbin"){
theta = coef[".theta"]
res$theta = theta
}
res$convStatus = TRUE
class(res) = "femlm"
return(res)
}
femlm_hessian <- function(coef, env){
# Computes the hessian
verbose = get(".verbose", env)
if(verbose >= 2) ptm = proc.time()
params <- get("params", env)
names(coef) <- params
nonlinear.params <- get("nonlinear.params", env)
k <- length(nonlinear.params)
isNL <- get("isNL", env)
hessianArgs = get("hessianArgs", env)
famFuns = get(".famFuns", env)
family = get(".family", env)
y = get(".lhs", env)
isDummy = get("isDummy", env)
mu_both = get_savedMu(coef, env)
mu = mu_both$mu
exp_mu = mu_both$exp_mu
jacob.mat = get_Jacobian(coef, env)
ll_d2 = famFuns$ll_d2(y, mu, exp_mu, coef)
if(isDummy){
dxi_dbeta = deriv_xi(jacob.mat, ll_d2, env, coef)
jacob.mat = jacob.mat + dxi_dbeta
} else dxi_dbeta = 0
hessVar = crossprod(jacob.mat, jacob.mat * ll_d2)
if(isNL){
# we get the 2nd derivatives
z = numDeriv::genD(evalNLpart, coef[nonlinear.params], env=env, method.args = hessianArgs)$D[, -(1:k), drop=FALSE]
ll_dl = famFuns$ll_dl(y, mu, exp_mu, coef=coef, env=env)
id_r = rep(1:k, 1:k)
id_c = c(sapply(1:k, function(x) 1:x), recursive=TRUE)
H = matrix(0, nrow=k, ncol=k)
H[cbind(id_r, id_c)] = H[cbind(id_c, id_r)] = colSums(z*ll_dl) # symmetric fill
} else H = 0
# we add the missing (non-linear) part
if(isNL) hessVar[1:k, 1:k] = hessVar[1:k, 1:k] + H
if(family=="negbin"){
theta = coef[".theta"]
ll_dx_dother = famFuns$ll_dx_dother(y, mu, exp_mu, coef, env)
if(isDummy){
dxi_dother = deriv_xi_other(ll_dx_dother, ll_d2, env, coef)
} else {
dxi_dother = 0
}
# computing the second derivatives with respect to theta
h.theta.L = famFuns$hess.thetaL(theta, jacob.mat, y, dxi_dbeta, dxi_dother, ll_d2, ll_dx_dother)
hessVar = cbind(hessVar, h.theta.L)
h.theta = famFuns$hess_theta_part(theta, y, mu, exp_mu, dxi_dother, ll_dx_dother, ll_d2, env)
hessVar = rbind(hessVar, c(h.theta.L, h.theta))
}
if(anyNA(hessVar)){
stop("NaN in the Hessian, can be due to a possible overfitting problem.\nIf so, to have an idea of what's going on, you can reduce the value of the argument 'rel.tol' of the nlminb algorithm using the argument 'opt.control = list(rel.tol=?)' with ? the new value.")
}
# warn_0_Hessian = get(".warn_0_Hessian", env)
# if(!warn_0_Hessian && any(diag(hessVar) == 0)){
# # We apply the warning only once
# var_problem = params[diag(hessVar) == 0]
# warning("Some elements of the diagonal of the hessian are equal to 0: likely presence of collinearity. FYI the problematic variables are: ", paste0(var_problem, collapse = ", "), ".", immediate. = TRUE)
# assign(".warn_0_Hessian", TRUE, env)
# }
# if(verbose >= 2) cat("Hessian: ", (proc.time()-ptm)[3], "s\n", sep="")
# return the Hessian of -ll, since the optimizer minimizes -ll
- hessVar
}
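# Illustration (not used by the estimation): how the index pair (id_r, id_c)
# built above fills a symmetric k x k matrix from the vector of
# lower-triangle values (here vals stands for colSums(z*ll_dl)).
sketch_symmetric_fill = function(k, vals){
id_r = rep(1:k, 1:k)
id_c = c(sapply(1:k, function(x) 1:x), recursive = TRUE)
H = matrix(0, nrow = k, ncol = k)
# assigning both (r, c) and (c, r) yields the full symmetric matrix
H[cbind(id_r, id_c)] = H[cbind(id_c, id_r)] = vals
H
}
# e.g. sketch_symmetric_fill(2, 1:3) gives the matrix rbind(c(1, 2), c(2, 3))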
femlm_gradient <- function(coef, env){
# cat("gradient:\n") ; print(as.vector(coef))
params = get("params", env)
names(coef) = params
nonlinear.params = get("nonlinear.params", env)
linear.params = get("linear.params", env)
famFuns = get(".famFuns", env)
family = get(".family", env)
y = get(".lhs", env)
mu_both = get_savedMu(coef, env)
mu = mu_both$mu
exp_mu = mu_both$exp_mu
# computing the Jacobian
res <- list() # stores the results
# cat("\tgetting jacobian")
# ptm = proc.time()
jacob.mat = get_Jacobian(coef, env)
# cat("in", (proc.time()-ptm)[3], "s.\n")
# cat("\tComputing gradient ")
# ptm = proc.time()
# res = famFuns$grad(jacob.mat, y, mu, env, coef)
res = getGradient(jacob.mat, y, mu, exp_mu, env, coef)
# cat("in", (proc.time()-ptm)[3], "s.\n")
names(res) = c(nonlinear.params, linear.params)
if(family=="negbin"){
theta = coef[".theta"]
res[".theta"] = famFuns$grad.theta(theta, y, mu, exp_mu, env)
}
return(-unlist(res[params]))
}
femlm_scores <- function(coef, env){
# Computes the scores (Jacobian)
params = get("params", env)
names(coef) <- params
famFuns = get(".famFuns", env)
family = get(".family", env)
y = get(".lhs", env)
mu_both = get_savedMu(coef, env)
mu = mu_both$mu
exp_mu = mu_both$exp_mu
jacob.mat = get_Jacobian(coef, env)
scores = getScores(jacob.mat, y, mu, exp_mu, env, coef)
if(family=="negbin"){
theta = coef[".theta"]
score.theta = famFuns$scores.theta(theta, y, mu, exp_mu, env)
scores = cbind(scores, score.theta)
}
return(scores)
}
femlm_ll <- function(coef, env){
# Log likelihood
# misc funs
iter = get(".iter", env) + 1
assign(".iter", iter, env)
pastLL = get(".pastLL", env)
verbose = get(".verbose", env)
ptm = proc.time()
if(verbose >= 1){
coef_names = sapply(names(coef), charShorten, width = 10)
coef_line = paste0(coef_names, ": ", signif(coef), collapse = " -- ")
cat("\nIter", iter, "- Coefficients:", coef_line, "\n")
}
# computing the LL
famFuns = get(".famFuns", env)
family = get(".family", env)
y <- get(".lhs", env)
if(any(is.na(coef))) stop("Divergence... (some coefs are NA)\nTry option verbose=2 to figure out the problem.")
mu_both = get_mu(coef, env)
mu = mu_both$mu
exp_mu = mu_both$exp_mu
# for the NEGBIN, we add the coef
ll = famFuns$ll(y, mu, exp_mu, env, coef)
evolutionLL = ll - pastLL
assign(".evolutionLL", evolutionLL, env)
assign(".pastLL", ll, env)
if(iter == 1) evolutionLL = "--"
if(verbose >= 1) cat("LL = ", ll, " (", (proc.time()-ptm)[3], "s)\tevol = ", evolutionLL, "\n", sep = "")
if(ll==(-Inf)) return(1e308)
return(-ll) # return -ll because the optimization routine minimizes
}
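# Illustration (not run): since femlm_ll returns -ll, it can be passed
# directly to a minimizer. A hypothetical call, assuming 'start' holds named
# starting values and 'env' the estimation environment:
# opt = stats::nlminb(start = start, objective = femlm_ll, gradient = femlm_gradient, env = env)
# opt$par        # the ML coefficients
# -opt$objective # the maximized log-likelihood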
evalNLpart = function(coef, env){
# cat("Enter evalNLpart : ", as.vector(coef), "\n")
# function that evaluates the non-linear part
isNL = get("isNL", env)
if(!isNL) return(0)
envNL = get("envNL", env)
nonlinear.params <- get("nonlinear.params", env)
nl.call <- get(".nl.call", env)
nbSave = get(".nbSave", env)
nbMaxSave = get(".nbMaxSave", env)
savedCoef = get(".savedCoef", env)
savedValue = get(".savedValue", env)
if(!is.null(names(coef))){
coef = coef[nonlinear.params]
} else if (length(coef) != length(nonlinear.params)){
stop("Problem with the length of the NL coefficients.")
}
if(nbMaxSave == 0){
for(var in nonlinear.params) assign(var, coef[var], envNL)
y_nl <- eval(nl.call, envir = envNL)
# we check problems
if(anyNA(y_nl)){
stop("Evaluation of non-linear part returns NAs. The coefficients were: ", paste0(nonlinear.params, " = ", signif(coef[nonlinear.params], 3)), ".")
}
return(y_nl)
}
# the cache may be empty on the first call: guard as in get_NL_Jacobian
if(nbSave > 0) for(i in nbSave:1){
# the most recent values are stored last
if(all(coef == savedCoef[[i]])){
return(savedValue[[i]])
}
}
# If the value does not exist yet, we save it
# the most recent values are stored last
for(var in nonlinear.params) assign(var, coef[var], envNL)
y_nl = eval(nl.call, envir = envNL)
# we check problems
if(anyNA(y_nl)){
stop("Evaluation of non-linear part returns NAs. The coefficients were: ", paste0(nonlinear.params, " = ", signif(coef[nonlinear.params], 3)), ".")
}
if(nbSave < nbMaxSave){
savedCoef[[nbSave + 1]] = coef
savedValue[[nbSave + 1]] = y_nl
assign(".nbSave", nbSave + 1, env)
} else if(nbMaxSave > 1){
tmp = list()
tmp[[nbSave]] = coef
tmp[1:(nbSave-1)] = savedCoef[2:nbSave]
savedCoef = tmp
tmp = list()
tmp[[nbSave]] = y_nl
tmp[1:(nbSave-1)] = savedValue[2:nbSave]
savedValue = tmp
} else{
savedCoef = list(coef)
savedValue = list(y_nl)
}
# cat("computed NL part:", as.vector(coef), "\n")
assign(".savedCoef", savedCoef, env)
assign(".savedValue", savedValue, env)
return(y_nl)
}
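# Illustration (not used by the estimation): the save/lookup logic above is
# a small fixed-size memoization cache. A generic sketch of the same idea,
# with all names local to the sketch:
sketch_memo_eval = function(coef, savedCoef, savedValue, fun, nbMaxSave = 2){
# the most recent entries are stored last
for(i in rev(seq_along(savedCoef))){
if(all(coef == savedCoef[[i]])) return(list(value = savedValue[[i]], savedCoef = savedCoef, savedValue = savedValue))
}
value = fun(coef)
if(length(savedCoef) >= nbMaxSave){
# drop the oldest entry to make room
savedCoef = savedCoef[-1]
savedValue = savedValue[-1]
}
# the caller stores the updated cache back into its environment
list(value = value, savedCoef = c(savedCoef, list(coef)), savedValue = c(savedValue, list(value)))
}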
get_mu = function(coef, env, final = FALSE){
# This function computes the RHS of the equation
# mu_L => to save one matrix multiplication
isNL = get("isNL", env)
isLinear = get("isLinear", env)
isDummy = get("isDummy", env)
nobs = get("nobs", env)
params = get("params", env)
family = get(".family", env)
offset.value = get("offset.value", env)
names(coef) = params
# useExp: whether the family uses exp(mu) in the likelihoods:
# this is useful because we then have to compute it only once (saves computing time)
# useExp_clusterCoef: whether we use the exponential of mu to obtain the cluster coefficients
# if TRUE, the dummy will be equal
# to exp(mu_dummies) despite being named mu_dummies
useExp = family %in% c("poisson", "logit", "negbin")
useExp_clusterCoef = family %in% c("poisson")
# For managing mu:
coefMu = get(".coefMu", env)
valueMu = get(".valueMu", env)
valueExpMu = get(".valueExpMu", env)
wasUsed = get(".wasUsed", env)
if(wasUsed){
coefMu = valueMu = valueExpMu = list()
assign(".wasUsed", FALSE, env)
}
if(length(coefMu)>0){
for(i in 1:length(coefMu)){
if(all(coef==coefMu[[i]])){
return(list(mu = valueMu[[i]], exp_mu = valueExpMu[[i]]))
}
}
}
if(isNL){
muNL = evalNLpart(coef, env)
} else muNL = 0
if(isLinear){
linear.params = get("linear.params", env)
linear.mat = get("linear.mat", env)
mu_L = c(linear.mat %*% coef[linear.params])
} else mu_L = 0
mu_noDum = muNL + mu_L + offset.value
# Detection of overfitting issues with the logit model:
if(family == "logit"){
warn_overfit_logit = get(".warn_overfit_logit", env)
if(!warn_overfit_logit && max(abs(mu_noDum)) >= 300){
# overfitting => now finding the precise cause
# browser()
if(!isNL || (isLinear && max(abs(mu_L)) >= 100)){
# we create the matrix with the coefficients to find out the guy
mat_L_coef = linear.mat * matrix(coef[linear.params], nrow(linear.mat), length(linear.params), byrow = TRUE)
max_var = apply(abs(mat_L_coef), 2, max)
best_suspect = linear.params[which.max(max_var)]
warning("in femlm(): Likely presence of an overfitting problem. One suspect variable is: ", best_suspect, ".", immediate. = TRUE, call. = FALSE)
} else {
warning("in femlm(): Likely presence of an overfitting problem due to the non-linear part.", immediate. = TRUE, call. = FALSE)
}
assign(".warn_overfit_logit", TRUE, env)
}
}
# we create the exp of mu => used for later functions
exp_mu_noDum = NULL
if(useExp_clusterCoef){
exp_mu_noDum = rpar_exp(mu_noDum, env)
}
if(isDummy){
# we get back the last dummy
mu_dummies = getDummies(mu_noDum, exp_mu_noDum, env, coef, final)
} else {
if(useExp_clusterCoef){
mu_dummies = 1
} else {
mu_dummies = 0
}
}
# We add the value of the dummy to mu and we compute the exp if necessary
exp_mu = NULL
if(useExp_clusterCoef){
# despite being called mu_dummies, it is in fact exp(mu_dummies)!!!
exp_mu = exp_mu_noDum*mu_dummies
mu = rpar_log(exp_mu, env)
} else {
mu = mu_noDum + mu_dummies
if(useExp){
exp_mu = rpar_exp(mu, env)
}
}
if(isDummy){
# BEWARE, if useExp_clusterCoef, it is equal to exp(mu_dummies)
attr(mu, "mu_dummies") = mu_dummies
}
if(length(mu) == 1) mu = rep(mu, nobs) # a scalar mu (no variables) is expanded to full length
# we save the value of mu:
coefMu = append(coefMu, list(coef))
valueMu = append(valueMu, list(mu))
valueExpMu = append(valueExpMu, list(exp_mu))
assign(".coefMu", coefMu, env)
assign(".valueMu", valueMu, env)
assign(".valueExpMu", valueExpMu, env)
return(list(mu = mu, exp_mu = exp_mu))
}
get_savedMu = function(coef, env){
# This function gets the mu without computation
# It follows a LL evaluation
coefMu = get(".coefMu", env)
valueMu = get(".valueMu", env)
valueExpMu = get(".valueExpMu", env)
assign(".wasUsed", TRUE, env)
if(length(coefMu)>0) for(i in 1:length(coefMu)) if(all(coef==coefMu[[i]])){
# cat("coef nb:", i, "\n")
return(list(mu = valueMu[[i]], exp_mu = valueExpMu[[i]]))
}
stop("Problem in \"get_savedMu\":\n gradient did not follow LL evaluation.")
}
get_Jacobian = function(coef, env){
# retrieves the Jacobian of the "rhs"
params <- get("params", env)
names(coef) <- params
isNL <- get("isNL", env)
isLinear <- get("isLinear", env)
isGradient = get("isGradient", env)
if(isNL){
nonlinear.params = get("nonlinear.params", env)
jacob.mat = get_NL_Jacobian(coef[nonlinear.params], env)
} else jacob.mat = c()
if(isLinear){
linear.mat = get("linear.mat", env)
if(is.null(dim(jacob.mat))){
jacob.mat = linear.mat
} else {
jacob.mat = cbind(jacob.mat, linear.mat)
}
}
return(jacob.mat)
}
get_NL_Jacobian = function(coef, env){
# retrieves the Jacobian of the non linear part
#cat("In NL JAC:\n")
#print(coef)
nbSave = get(".JC_nbSave", env)
nbMaxSave = get(".JC_nbMaxSave", env)
savedCoef = get(".JC_savedCoef", env)
savedValue = get(".JC_savedValue", env)
nonlinear.params <- get("nonlinear.params", env)
coef = coef[nonlinear.params]
if(nbSave>0) for(i in nbSave:1){
# the most recent values are stored last
if(all(coef == savedCoef[[i]])){
# cat("Saved value:", as.vector(coef), "\n")
return(savedValue[[i]])
}
}
# If the value does not exist yet, we save it
# the most recent values are stored last
isGradient <- get("isGradient", env)
if(isGradient){
call_gradient <- get(".call_gradient", env)
#we send the coef in the environment
for(var in nonlinear.params) assign(var, coef[var], env)
jacob.mat <- eval(call_gradient, envir=env)
jacob.mat <- as.matrix(as.data.frame(jacob.mat[nonlinear.params]))
} else {
jacobian.method <- get("jacobian.method", env)
jacob.mat <- numDeriv::jacobian(evalNLpart, coef, env=env, method=jacobian.method)
}
#Controls:
if(anyNA(jacob.mat)){
qui <- which(apply(jacob.mat, 2, function(x) anyNA(x)))
variables <- nonlinear.params[qui]
stop("ERROR: The Jacobian of the nonlinear part has NA!\nThis concerns the following variables:\n", paste(variables, sep=" ; "))
}
# Saving
if(nbSave<nbMaxSave){
savedCoef[[nbSave+1]] = coef
savedValue[[nbSave+1]] = jacob.mat
assign(".JC_nbSave", nbSave+1, env)
} else if(nbMaxSave>1){
tmp = list()
tmp[[nbSave]] = coef
tmp[1:(nbSave-1)] = savedCoef[2:nbSave]
savedCoef = tmp
tmp = list()
tmp[[nbSave]] = jacob.mat
tmp[1:(nbSave-1)] = savedValue[2:nbSave]
savedValue = tmp
} else{
savedCoef = list(coef)
savedValue = list(jacob.mat)
}
# print(colSums(jacob.mat))
# cat("computed NL Jacobian:", as.vector(coef), "\n")
# print(savedCoef)
assign(".JC_savedCoef", savedCoef, env)
assign(".JC_savedValue", savedValue, env)
return(jacob.mat)
}
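# Illustration (not run): the two branches above either evaluate a
# user-supplied analytic gradient or fall back on numerical differentiation.
# A self-contained example of the numerical fallback on a toy non-linear
# part a*log(x) + b*x (all names local to the example):
# x = 1:10
# nl_fun = function(coef) coef["a"] * log(x) + coef["b"] * x
# numDeriv::jacobian(nl_fun, c(a = 1, b = 0.5)) # one column per parameter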
get_model_null <- function(env, theta.init){
# I have the closed form of the ll0
famFuns = get(".famFuns", env)
family = get(".family", env)
N = get("nobs", env)
y = get(".lhs", env)
verbose = get(".verbose", env)
ptm = proc.time()
# one of the elements to be returned
theta = NULL
if(family == "poisson"){
# There is a closed form
if(".lfactorial" %in% names(env)){
lfact = get(".lfactorial", env)
} else {
# lfactorial(x) == lgamma(x+1)
# lfact = sum(lfactorial(y))
lfact = sum(rpar_lgamma(y + 1, env))
assign(".lfactorial", lfact, env)
}
sy = sum(y)
constant = log(sy / N)
# loglik = sy*log(sy) - sy*log(N) - sy - sum(lfactorial(y))
loglik = sy*log(sy) - sy*log(N) - sy - lfact
} else if(family == "gaussian"){
# there is a closed form
constant = mean(y)
ss = sum( (y - constant)**2 )
sigma = sqrt( ss / N )
loglik = -1/2/sigma^2*ss - N*log(sigma) - N*log(2*pi)/2
} else if(family == "logit"){
# there is a closed form
sy = sum(y)
constant = log(sy) - log(N - sy)
loglik = sy*log(sy) - sy*log(N-sy) - N*log(N) + N*log(N-sy)
} else if(family=="negbin"){
if(".lgamma" %in% names(env)){
lgamm = get(".lgamma", env)
} else {
# lgamm = sum(lgamma(y + 1))
lgamm = sum(rpar_lgamma(y + 1, env))
assign(".lgamma", lgamm, env)
}
sy = sum(y)
constant = log(sy / N)
mean_y = mean(y)
invariant = sum(y*constant) - lgamm
if(is.null(theta.init)){
theta.guess = max(mean_y**2 / max((var(y) - mean_y), 1e-4), 0.05)
} else {
theta.guess = theta.init
}
# We set a floor of 0.05 because, when the guess is too close to 0, convergence isn't great
opt <- nlminb(start=theta.guess, objective=famFuns$ll0_theta, y=y, gradient=famFuns$grad0_theta, lower=1e-3, mean_y=mean_y, invariant=invariant, hessian = famFuns$hess0_theta, env=env)
loglik = -opt$objective
theta = opt$par
}
if(verbose >= 2) cat("Null model in ", (proc.time()-ptm)[3], "s. ", sep ="")
return(list(loglik=loglik, constant=constant, theta = theta))
}
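# Illustration (not run): the Poisson branch above uses the closed form of
# the constant-only log-likelihood, which can be checked against dpois:
# y = c(0, 1, 3, 2, 5)
# N = length(y); sy = sum(y)
# ll_closed = sy*log(sy) - sy*log(N) - sy - sum(lfactorial(y))
# ll_direct = sum(dpois(y, lambda = mean(y), log = TRUE))
# all.equal(ll_closed, ll_direct) # TRUE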
getGradient = function(jacob.mat, y, mu, exp_mu, env, coef, ...){
famFuns = get(".famFuns", env)
ll_dl = famFuns$ll_dl(y, mu, exp_mu, coef=coef, env=env)
c(crossprod(jacob.mat, ll_dl))
}
getScores = function(jacob.mat, y, mu, exp_mu, env, coef, ...){
famFuns = get(".famFuns", env)
isDummy = get("isDummy", env)
ll_dl = famFuns$ll_dl(y, mu, exp_mu, coef=coef, env=env)
scores = jacob.mat* ll_dl
if(isDummy){
ll_d2 = famFuns$ll_d2(y, mu, exp_mu, coef=coef, env=env)
dxi_dbeta = deriv_xi(jacob.mat, ll_d2, env, coef)
scores = scores + dxi_dbeta * ll_dl
}
return(as.matrix(scores))
}
getDummies = function(mu, exp_mu, env, coef, final = FALSE){
# function built to get all the dummy variables
# We retrieve past dummies (that are likely to be good
# starting values)
mu_dummies = get(".savedDummy", env)
family = get(".family", env)
eps.cluster = get(".eps.cluster", env)
verbose = get(".verbose", env)
if(verbose >= 2) ptm = proc.time()
#
# Dynamic precision
#
iterCluster = get(".iterCluster", env)
evolutionLL = get(".evolutionLL", env)
nobs = get("nobs", env)
iter = get(".iter", env)
iterLastPrecisionIncrease = get(".iterLastPrecisionIncrease", env)
nbIterOne = get(".nbIterOne", env)
if(iterCluster <= 2){
nbIterOne = nbIterOne + 1
} else { # we reinitialise
nbIterOne = 0
}
assign(".nbIterOne", nbIterOne, env)
# number of times the LL almost didn't increase
nbLowIncrease = get(".nbLowIncrease", env)
if(evolutionLL/nobs < 1e-8){
nbLowIncrease = nbLowIncrease + 1
} else { # we reinitialise
nbLowIncrease = 0
}
assign(".nbLowIncrease", nbLowIncrease, env)
if(!final && eps.cluster > .Machine$double.eps*10000 && iterCluster <= 2 && nbIterOne >= 2 && nbLowIncrease >= 2 && (iter - iterLastPrecisionIncrease) >= 3){
eps.cluster = eps.cluster/10
if(verbose >= 2) cat("Precision increased to", eps.cluster, "\n")
assign(".eps.cluster", eps.cluster, env)
assign(".iterLastPrecisionIncrease", iter, env)
# If the precision increases, we must also increase the precision of the dummies!
if(family %in% c("negbin", "logit")){
assign(".eps.NR", eps.cluster / 100, env)
}
# we also set acceleration to on
assign(".useAcc", TRUE, env)
} else if(final){
# we don't need ultra precision for these last dummies
eps.cluster = eps.cluster * 10**(iterLastPrecisionIncrease != 0)
if(family %in% c("negbin", "logit")){
assign(".eps.NR", eps.cluster / 100, env)
}
}
iterMax = get(".itermax.cluster", env)
nbCluster = get(".nbCluster", env)
Q = length(nbCluster)
# whether we use the exponential of mu
useExp_clusterCoef = family %in% c("poisson")
if(useExp_clusterCoef){
mu_in = exp_mu * mu_dummies
} else {
mu_in = mu + mu_dummies
}
#
# Computing the optimal mu
#
useAcc = get(".useAcc", env)
carryOn = FALSE
# Finding the complexity of the problem
firstRunCluster = get(".firstRunCluster", env)
if(firstRunCluster && Q >= 3){
# First iteration: we check if the problem is VERY difficult (for Q = 3+)
useAcc = TRUE
assign(".useAcc", TRUE, env)
res = convergence(coef, mu_in, env, iterMax = 15)
if(res$iter == 15){
assign(".difficultConvergence", TRUE, env)
carryOn = TRUE
}
} else if(useAcc){
res = convergence(coef, mu_in, env, iterMax)
if(res$iter <= 2){
# if almost no iteration => no acceleration next time
assign(".useAcc", FALSE, env)
}
} else {
res = convergence(coef, mu_in, env, iterMax = 15)
if(res$iter == 15){
carryOn = TRUE
}
}
if(carryOn){
# the problem is difficult => acceleration on
useAcc = TRUE
assign(".useAcc", TRUE, env)
res = convergence(coef, res$mu_new, env, iterMax)
}
mu_new = res$mu_new
iter = res$iter
#
# Retrieving the value of the dummies
#
if(useExp_clusterCoef){
mu_dummies = mu_new / exp_mu
} else {
mu_dummies = mu_new - mu
}
# Warning messages if necessary:
if(iter == iterMax) warning("[Getting cluster coefficients] iteration limit reached (", iterMax, ").", call. = FALSE, immediate. = TRUE)
assign(".iterCluster", iter, env)
# we save the dummy:
assign(".savedDummy", mu_dummies, env)
if(verbose >= 2){
acc_info = ifelse(useAcc, "+Acc. ", "-Acc. ")
cat("Cluster Coef.: ", (proc.time()-ptm)[3], "s (", acc_info, "iter:", iter, ")\t", sep = "")
}
# we update the flag
assign(".firstRunCluster", FALSE, env)
mu_dummies
}
deriv_xi = function(jacob.mat, ll_d2, env, coef){
# Derivative of the cluster coefficients
# data:
iterMax = get(".itermax.deriv", env)
nbCluster = get(".nbCluster", env)
Q = length(nbCluster)
verbose = get(".verbose", env)
if(verbose >= 2) ptm = proc.time()
#
# initialisation of dxi_dbeta
#
if(Q >= 2){
# We set the initial values for the first run
if(!".sum_deriv" %in% names(env)){
# init of the sum of the derivatives => 0
dxi_dbeta = matrix(0, nrow(jacob.mat), ncol(jacob.mat))
} else {
dxi_dbeta = get(".sum_deriv", env)
}
} else {
# no need if only 1, direct solution
dxi_dbeta = NULL
}
#
# Computing the optimal dxi_dbeta
#
accDeriv = get(".accDeriv", env)
carryOn = FALSE
# Finding the complexity of the problem
firstRunDeriv = get(".firstRunDeriv", env)
if(firstRunDeriv){
# set accDeriv: we use information on cluster deriv
iterCluster = get(".iterCluster", env)
diffConv = get(".difficultConvergence", env)
if(iterCluster < 20 & !diffConv){
accDeriv = FALSE
assign(".accDeriv", FALSE, env)
}
}
if(firstRunDeriv && accDeriv && Q >= 3){
# First iteration: we check if the problem is VERY difficult (for Q = 3+)
assign(".accDeriv", TRUE, env)
res = dconvergence(dxi_dbeta, jacob.mat, ll_d2, env, iterMax = 15)
if(res$iter == 15){
assign(".derivDifficultConvergence", TRUE, env)
carryOn = TRUE
}
} else if(accDeriv){
res = dconvergence(dxi_dbeta, jacob.mat, ll_d2, env, iterMax)
if(res$iter <= 10){
# if almost no iteration => no acceleration next time
assign(".accDeriv", FALSE, env)
}
} else {
res = dconvergence(dxi_dbeta, jacob.mat, ll_d2, env, iterMax = 50)
if(res$iter == 50){
carryOn = TRUE
}
}
if(carryOn){
# the problem is difficult => acceleration on
accDeriv = TRUE
assign(".accDeriv", TRUE, env)
res = dconvergence(res$dxi_dbeta, jacob.mat, ll_d2, env, iterMax)
}
dxi_dbeta = res$dxi_dbeta
iter = res$iter
if(iter == iterMax) warning("[Getting cluster derivatives] Maximum iterations reached (", iterMax, ").")
assign(".firstRunDeriv", FALSE, env)
assign(".sum_deriv", dxi_dbeta, env)
if(verbose >= 2){
acc_info = ifelse(accDeriv, "+Acc. ", "-Acc. ")
cat(" Derivatives: ", (proc.time()-ptm)[3], "s (", acc_info, "iter:", iter, ")\n", sep = "")
}
return(dxi_dbeta)
}
deriv_xi_other = function(ll_dx_dother, ll_d2, env, coef){
# derivative of the dummies wrt another parameter (e.g. theta)
dumMat_cpp = get(".dumMat_cpp", env)
nbCluster = get(".nbCluster", env)
dum_all = get(".dum_all", env)
eps.deriv = get(".eps.deriv", env)
tableCluster_all = get(".tableCluster", env)
orderCluster_all = get(".orderCluster", env)
Q = length(dum_all)
iterMax = 5000
if(Q==1){
dum = dum_all[[1]]
k = max(dum)
S_Jmu = cpp_tapply_vsum(k, ll_dx_dother, dum)
S_mu = cpp_tapply_vsum(k, ll_d2, dum)
dxi_dother = - S_Jmu[dum] / S_mu[dum]
} else {
# The cpp way:
N = length(ll_d2)
# We set the initial values for the first run
if(!".sum_deriv_other" %in% names(env)){
init = rep(0, N)
} else {
init = get(".sum_deriv_other", env)
}
dxi_dother <- cpp_partialDerivative_other(iterMax, Q, N, epsDeriv = eps.deriv, ll_d2, ll_dx_dother, init, dumMat_cpp, nbCluster)
# we save the values
assign(".sum_deriv_other", dxi_dother, env)
}
as.matrix(dxi_dother)
}
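# Illustration (not used by the estimation): the Q == 1 branch above is the
# implicit-function-theorem formula for the derivative of the cluster
# coefficients. A base-R sketch of the two cpp_tapply_vsum calls, assuming
# dum takes integer values 1..k:
sketch_dxi_dother_single = function(ll_dx_dother, ll_d2, dum){
S_Jmu = tapply(ll_dx_dother, dum, sum)
S_mu = tapply(ll_d2, dum, sum)
as.numeric(- S_Jmu[dum] / S_mu[dum])
}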
####
#### Convergence ####
####
convergence = function(coef, mu_in, env, iterMax){
# computes the new mu wrt the cluster coefficients
nbCluster = get(".nbCluster", env)
Q = length(nbCluster)
useAcc = get(".useAcc", env)
diffConv = get(".difficultConvergence", env)
if(useAcc && diffConv && Q > 2){
# in case of complex cases: it's more efficient
# to initialize the first two clusters
res = conv_acc(coef, mu_in, env, iterMax, only2 = TRUE)
mu_in = res$mu_new
}
if(Q == 1){
mu_new = conv_single(coef, mu_in, env)
iter = 1
} else if(Q >= 2){
# Dynamic setting of acceleration
if(!useAcc){
res = conv_seq(coef, mu_in, env, iterMax = iterMax)
} else if(useAcc){
res = conv_acc(coef, mu_in, env, iterMax = iterMax)
}
mu_new = res$mu_new
iter = res$iter
}
# we return a list with: new mu and iterations
list(mu_new = mu_new, iter = iter)
}
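# Illustration (not used by the estimation): the accelerated path relies on
# Irons-Tuck-type fixed-point acceleration. A generic R sketch for a
# fixed-point map G (here standing for one full cluster-coefficient update):
sketch_irons_tuck = function(X, G, iterMax = 100, tol = 1e-8){
for(iter in 1:iterMax){
GX = G(X)
GGX = G(GX)
delta_GX = GGX - GX
delta2_X = delta_GX - (GX - X)
ssq = sum(delta2_X^2)
# the acceleration step; falls back on GGX when the denominator vanishes
X_new = if(ssq == 0) GGX else GGX - sum(delta_GX * delta2_X) / ssq * delta_GX
if(max(abs(X_new - X)) < tol) return(list(X = X_new, iter = iter))
X = X_new
}
list(X = X, iter = iterMax)
}
# e.g. sketch_irons_tuck(1, cos) converges to the fixed point of cos()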
conv_single = function(coef, mu_in, env){
# convergence for a single cluster
# it returns: the new mu (NOT mu_dummies)
# Loading all the required variables
lhs = get(".lhs", env)
nbCluster = get(".nbCluster", env)
dum_vector = get(".dum_vector", env)
tableCluster_vector = get(".tableCluster_vector", env)
sum_y_vector = get(".sum_y_vector", env)
cumtable_vector = get(".cumtable_vector", env)
obsCluster_vector = get(".obsCluster_vector", env)
nbThreads = get(".CORES", env)
eps.NR = get(".eps.NR", env)
family = get(".familyConv", env)
family_nb = switch(family, poisson=1, negbin=2, logit=3, gaussian=4, lpoisson=5)
theta = ifelse(family == "negbin", coef[".theta"], 1)
mu_new = update_mu_single_cluster(family = family_nb, nb_cluster = nbCluster, theta = theta, diffMax_NR = eps.NR, mu_in = mu_in, lhs = lhs, sum_y = sum_y_vector, dum = dum_vector, obsCluster = obsCluster_vector, table = tableCluster_vector, cumtable = cumtable_vector, nbThreads = nbThreads)
return(mu_new)
}
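# Illustration (not used by the estimation): for the Poisson family with a
# single cluster, the cluster coefficients have the closed form
# gamma_g = log( sum_{i in g} y_i / sum_{i in g} exp(mu_i) ).
# A base-R sketch of the resulting mu update, assuming dum takes integer
# values 1..k:
sketch_poisson_single_update = function(y, mu, dum){
sum_y = tapply(y, dum, sum)
sum_exp = tapply(exp(mu), dum, sum)
as.numeric(mu + log(sum_y / sum_exp)[dum])
}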
conv_seq = function(coef, mu_in, env, iterMax){
# convergence of cluster coef without acceleration
# Now all in cpp
# Loading all the required variables
lhs = get(".lhs", env)
nbCluster = get(".nbCluster", env)
dum_vector = get(".dum_vector", env)
tableCluster_vector = get(".tableCluster_vector", env)
sum_y_vector = get(".sum_y_vector", env)
cumtable_vector = get(".cumtable_vector", env)
obsCluster_vector = get(".obsCluster_vector", env)
nbThreads = get(".CORES", env)
eps.cluster = get(".eps.cluster", env)
eps.NR = get(".eps.NR", env)
family = get(".familyConv", env)
family_nb = switch(family, poisson=1, negbin=2, logit=3, gaussian=4, lpoisson=5)
theta = ifelse(family == "negbin", coef[".theta"], 1)
Q = length(nbCluster)
if(family == "lpoisson"){
# we transform mu_in into a non-exponential (log) form
mu_in = log(mu_in)
}
if(Q == 2 & family == "poisson"){
# Required Variables
setup_poisson_fixedcost(env)
info = get(".fixedCostPoisson", env)
res = cpp_conv_seq_poi_2(n_i = info$n_i, n_j = info$n_j, n_cells = info$n_cells, index_i = info$index_i, index_j = info$index_j, order = info$order, dum_vector = dum_vector, sum_y_vector = sum_y_vector, iterMax = iterMax, diffMax = eps.cluster, exp_mu_in = mu_in)
} else if(Q == 2 & family == "gaussian"){
# Required variables
setup_gaussian_fixedcost(env)
info = get(".fixedCostGaussian", env)
invTableCluster_vector = get(".invTableCluster", env)
res = cpp_conv_seq_gau_2(n_i = info$n_i, n_j = info$n_j, n_cells = info$n_cells, r_mat_row = info$mat_row, r_mat_col = info$mat_col, r_mat_value_Ab = info$mat_value_Ab, r_mat_value_Ba = info$mat_value_Ba, dum_vector = dum_vector, lhs = lhs, invTableCluster_vector = invTableCluster_vector, iterMax = iterMax, diffMax = eps.cluster, mu_in = mu_in)
} else {
res = cpp_conv_seq_gnl(family = family_nb, iterMax = iterMax, diffMax = eps.cluster, diffMax_NR = eps.NR, theta = theta, lhs = lhs, nb_cluster_all = nbCluster, mu_init = mu_in, dum_vector = dum_vector, tableCluster_vector = tableCluster_vector, sum_y_vector = sum_y_vector, cumtable_vector = cumtable_vector, obsCluster_vector = obsCluster_vector, nbThreads = nbThreads)
}
if(family == "lpoisson"){
# we transform mu_new back to the exponential form
res$mu_new = exp(res$mu_new)
}
return(res)
}
conv_acc = function(coef, mu_in, env, iterMax, only2 = FALSE){
# convergence of cluster coef with acceleration
# Now all in cpp
# Loading all the required variables
lhs = get(".lhs", env)
nbCluster = get(".nbCluster", env)
dum_vector = get(".dum_vector", env)
tableCluster_vector = get(".tableCluster_vector", env)
sum_y_vector = get(".sum_y_vector", env)
cumtable_vector = get(".cumtable_vector", env)
obsCluster_vector = get(".obsCluster_vector", env)
nbThreads = get(".CORES", env)
eps.cluster = get(".eps.cluster", env)
eps.NR = get(".eps.NR", env)
family = get(".familyConv", env)
family_nb = switch(family, poisson=1, negbin=2, logit=3, gaussian=4, lpoisson=5)
theta = ifelse(family == "negbin", coef[".theta"], 1)
if(only2){
# means we compute the CC of the first two FE
# we recreate the values we send
nbCluster = nbCluster[1:2]
nb_keep = sum(nbCluster)
tableCluster_vector = tableCluster_vector[1:nb_keep]
sum_y_vector = sum_y_vector[1:nb_keep]
cumtable_vector = cumtable_vector[1:nb_keep]
dum_vector = dum_vector[1:(2*length(lhs))]
obsCluster_vector = obsCluster_vector[1:(2*length(lhs))]
}
Q = length(nbCluster)
if(family == "lpoisson"){
# we transform mu_in into a non-exponential (log) form
mu_in = log(mu_in)
}
if(Q == 2 & family == "poisson"){
# Required Variables
setup_poisson_fixedcost(env)
info = get(".fixedCostPoisson", env)
res = cpp_conv_acc_poi_2(n_i = info$n_i, n_j = info$n_j, n_cells = info$n_cells, index_i = info$index_i, index_j = info$index_j, order = info$order, dum_vector = dum_vector, sum_y_vector = sum_y_vector, iterMax = iterMax, diffMax = eps.cluster, exp_mu_in = mu_in)
} else if(Q == 2 & family == "gaussian"){
# Required variables
setup_gaussian_fixedcost(env)
info = get(".fixedCostGaussian", env)
invTableCluster_vector = get(".invTableCluster", env)
res = cpp_conv_acc_gau_2(n_i = info$n_i, n_j = info$n_j, n_cells = info$n_cells, r_mat_row = info$mat_row, r_mat_col = info$mat_col, r_mat_value_Ab = info$mat_value_Ab, r_mat_value_Ba = info$mat_value_Ba, dum_vector = dum_vector, lhs = lhs, invTableCluster_vector = invTableCluster_vector, iterMax = iterMax, diffMax = eps.cluster, mu_in = mu_in)
} else {
res = cpp_conv_acc_gnl(family = family_nb, iterMax = iterMax, diffMax = eps.cluster, diffMax_NR = eps.NR, theta = theta, lhs = lhs, nb_cluster_all = nbCluster, mu_init = mu_in, dum_vector = dum_vector, tableCluster_vector = tableCluster_vector, sum_y_vector = sum_y_vector, cumtable_vector = cumtable_vector, obsCluster_vector = obsCluster_vector, nbThreads = nbThreads)
}
if(family == "poisson" && res$any_negative_poisson){
# we need to switch to log poisson
assign(".familyConv", "lpoisson", env)
verbose = get(".verbose", env)
if(verbose >= 3) cat("Switch to log-poisson (to cope with high valued FEs).\n")
res = conv_acc(coef, mu_in, env, iterMax, only2)
# we switch back to original poisson
assign(".familyConv", "poisson", env)
}
if(family == "lpoisson"){
# we transform mu_new back to the exponential form
res$mu_new = exp(res$mu_new)
}
return(res)
}
####
#### Convergence Deriv cpp ####
####
dconvergence = function(dxi_dbeta, jacob.mat, ll_d2, env, iterMax){
nbCluster = get(".nbCluster", env)
Q = length(nbCluster)
accDeriv = get(".accDeriv", env)
derivDiffConv = get(".derivDifficultConvergence", env)
if(accDeriv && derivDiffConv && Q > 2){
# in case of complex cases: it's more efficient
# to initialize the first two clusters
res = dconv_acc(dxi_dbeta, jacob.mat, ll_d2, env, iterMax, only2 = TRUE)
dxi_dbeta = res$dxi_dbeta
}
if(Q == 1){
# single-cluster case computed in cpp
dxi_dbeta = dconv_single(jacob.mat, ll_d2, env)
iter = 1
} else {
# The convergence algorithms
if(accDeriv){
res = dconv_acc(dxi_dbeta, jacob.mat, ll_d2, env, iterMax)
dxi_dbeta = res$dxi_dbeta
iter = res$iter
} else {
res = dconv_seq(dxi_dbeta, jacob.mat, ll_d2, env, iterMax)
dxi_dbeta = res$dxi_dbeta
iter = res$iter
}
}
return(list(dxi_dbeta = dxi_dbeta, iter = iter))
}
dconv_single = function(jacob.mat, ll_d2, env){
# data:
jacob_vector = as.vector(jacob.mat)
n_vars = ncol(jacob.mat)
nb_cluster_all = get(".nbCluster", env)
dum_vector = get(".dum_vector", env)
nb_coef = nb_cluster_all[[1]]
dxi_dbeta = update_deriv_single(n_vars, nb_coef, ll_d2, jacob_vector, dum_vector)
return(dxi_dbeta)
}
dconv_seq = function(dxi_dbeta, jacob.mat, ll_d2, env, iterMax){
# Parameters
jacob_vector = as.vector(jacob.mat)
n_vars = ncol(jacob.mat)
nb_cluster_all = get(".nbCluster", env)
dum_vector = get(".dum_vector", env)
deriv_init_vector = as.vector(dxi_dbeta)
eps.deriv = get(".eps.deriv", env)
Q = length(nb_cluster_all)
if(Q == 2){
setup_poisson_fixedcost(env)
info = get(".fixedCostPoisson", env)
res <- cpp_derivconv_seq_2(iterMax = iterMax, diffMax = eps.deriv, n_vars = n_vars, nb_cluster_all = nb_cluster_all, n_cells = info$n_cells, index_i = info$index_i, index_j = info$index_j, order = info$order, ll_d2 = ll_d2, jacob_vector = jacob_vector, deriv_init_vector = deriv_init_vector, dum_vector = dum_vector)
} else {
res <- cpp_derivconv_seq_gnl(iterMax = iterMax, diffMax = eps.deriv, n_vars, nb_cluster_all, ll_d2, jacob_vector, deriv_init_vector, dum_vector)
}
return(list(dxi_dbeta = res$dxi_dbeta, iter = res$iter))
}
dconv_acc = function(dxi_dbeta, jacob.mat, ll_d2, env, iterMax, only2 = FALSE){
# Parameters
jacob_vector = as.vector(jacob.mat)
n_vars = ncol(jacob.mat)
nb_cluster_all = get(".nbCluster", env)
dum_vector = get(".dum_vector", env)
deriv_init_vector = as.vector(dxi_dbeta)
eps.deriv = get(".eps.deriv", env)
if(only2){
# we update everything needed
nb_cluster_all = nb_cluster_all[1:2]
dum_vector = dum_vector[1:(2*nrow(jacob.mat))]
}
Q = length(nb_cluster_all)
if(Q == 2){
setup_poisson_fixedcost(env)
info = get(".fixedCostPoisson", env)
res <- cpp_derivconv_acc_2(iterMax = iterMax, diffMax = eps.deriv, n_vars = n_vars, nb_cluster_all = nb_cluster_all, n_cells = info$n_cells, index_i = info$index_i, index_j = info$index_j, order = info$order, ll_d2 = ll_d2, jacob_vector = jacob_vector, deriv_init_vector = deriv_init_vector, dum_vector = dum_vector)
} else {
res <- cpp_derivconv_acc_gnl(iterMax = iterMax, diffMax = eps.deriv, n_vars = n_vars, nb_cluster_all = nb_cluster_all, ll_d2 = ll_d2, jacob_vector = jacob_vector, deriv_init_vector = deriv_init_vector, dum_vector = dum_vector)
}
return(list(dxi_dbeta = res$dxi_dbeta, iter = res$iter))
}
####
#### Misc FE ####
####
setup_poisson_fixedcost = function(env){
# We set this up only once
if(".fixedCostPoisson" %in% names(env)){
return(NULL)
}
ptm = proc.time()
dum_all = get(".dum_all",env)
dum_A = as.integer(dum_all[[1]])
dum_B = as.integer(dum_all[[2]])
myOrder = order(dum_A, dum_B)
index_i = dum_A[myOrder] - 1L
index_j = dum_B[myOrder] - 1L
n_cells = get_n_cells(index_i, index_j)
res = list(n_i = max(dum_A), n_j = max(dum_B), n_cells = n_cells, index_i = index_i, index_j = index_j, order = myOrder - 1L)
assign(".fixedCostPoisson", res, env)
verbose = get(".verbose", env)
if(verbose >= 2) cat("Poisson fixed-cost setup: ", (proc.time()-ptm)[3], "s\n", sep = "")
}
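# Illustration (not used by the estimation): get_n_cells (defined elsewhere
# in the package) counts the non-empty (cluster1, cluster2) cells. On the
# 0-based, lexicographically ordered indices built above, a base-R sketch of
# that computation would be:
sketch_n_cells = function(index_i, index_j){
n = length(index_i)
if(n == 0) return(0L)
# a new cell starts whenever one of the two ordered indices changes
sum(c(TRUE, index_i[-1] != index_i[-n] | index_j[-1] != index_j[-n]))
}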
setup_gaussian_fixedcost = function(env){
# We set this up only once
if(".fixedCostGaussian" %in% names(env)){
return(NULL)
}
ptm = proc.time()
lhs = get(".lhs", env)
invTableCluster_vector = get(".invTableCluster", env)
dum_vector = get(".dum_vector", env) # already minus 1
dum_all = get(".dum_all", env)
dum_i = as.integer(dum_all[[1]])
dum_j = as.integer(dum_all[[2]])
n_i = max(dum_i)
n_j = max(dum_j)
myOrder = order(dum_i, dum_j)
index_i = dum_i[myOrder] - 1L
index_j = dum_j[myOrder] - 1L
n_cells = get_n_cells(index_i, index_j)
res = cpp_fixed_cost_gaussian(n_i, n_cells, index_i, index_j, myOrder - 1L, invTableCluster_vector, dum_vector)
res$n_i = n_i
res$n_j = n_j
res$n_cells = n_cells
assign(".fixedCostGaussian", res, env)
verbose = get(".verbose", env)
if(verbose >= 2) cat("Gaussian fixed-cost setup: ", (proc.time()-ptm)[3], "s\n", sep = "")
}
####
#### Parallel Functions ####
####
# In this section, we define the wrappers that dispatch to the parallelized (cpp) versions of common functions
rpar_exp = function(x, env){
# fast exponentiation
isMulticore = get(".isMulticore", env)
FENmlm_CORES = get(".CORES", env)
if(!isMulticore){
# simple exponentiation
return(exp(x))
} else {
# parallelized one
return(cpppar_exp(x, FENmlm_CORES))
}
}
rpar_log = function(x, env){
# fast log
isMulticore = get(".isMulticore", env)
FENmlm_CORES = get(".CORES", env)
if(!isMulticore){
# simple log
return(log(x))
} else {
# parallelized one
return(cpppar_log(x, FENmlm_CORES))
}
}
rpar_lgamma = function(x, env){
# fast lgamma
isMulticore = get(".isMulticore", env)
FENmlm_CORES = get(".CORES", env)
if(!isMulticore){
# lgamma via cpp is faster
return(cpp_lgamma(x))
} else {
# parallelized one
return(cpppar_lgamma(x, FENmlm_CORES))
}
}
rpar_digamma = function(x, env){
isMulticore = get(".isMulticore", env)
FENmlm_CORES = get(".CORES", env)
if(!isMulticore){
# digamma via cpp is as fast => no need
return(digamma(x))
} else {
# parallelized one
return(cpppar_digamma(x, FENmlm_CORES))
}
}
rpar_trigamma = function(x, env){
isMulticore = get(".isMulticore", env)
FENmlm_CORES = get(".CORES", env)
if(!isMulticore){
# trigamma via cpp is as fast => no need
return(trigamma(x))
} else {
# parallelized one
return(cpppar_trigamma(x, FENmlm_CORES))
}
}
rpar_log_a_exp = function(a, mu, exp_mu, env){
# compute log_a_exp in a fast way
isMulticore = get(".isMulticore", env)
FENmlm_CORES = get(".CORES", env)
if(!isMulticore){
# cpp is faster
return(cpp_log_a_exp(a, mu, exp_mu))
} else {
# parallelized one
return(cpppar_log_a_exp(FENmlm_CORES, a, mu, exp_mu))
}
}
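# Illustration (not used by the estimation): given its name and arguments,
# cpp_log_a_exp presumably computes log(a + exp(mu)) (a term appearing e.g.
# in the negbin likelihood) in a numerically stable way. A base-R sketch
# under that assumption, reusing the precomputed exp_mu:
sketch_log_a_exp = function(a, mu, exp_mu){
# for large mu, log(a + exp(mu)) = mu + log1p(a * exp(-mu)) avoids overflow
ifelse(mu < 25, log(a + exp_mu), mu + log1p(a * exp(-mu)))
}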