lsm: Estimation of the log Likelihood of the Saturated Model

lsm	R Documentation

Estimation of the log Likelihood of the Saturated Model

Description

When the outcome variable Y takes only the values 0 or 1, the function lsm() estimates the log-likelihood of the saturated model. This model is characterized by Llinas (2006, ISSN:2389-8976) in Section 2.3 through Assumptions 1 and 2. If Y is dichotomous and the data are grouped into J populations, lsm() is the recommended function because it works well for any number K of explanatory variables.
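The quantity described above can be illustrated in base R without the package. The sketch below is the textbook formula for the saturated log-likelihood of grouped binary data, not necessarily the package's internal code. It assumes that each distinct value of the explanatory variable defines one population, and the helper `xlogp` is a hypothetical name introduced only for this illustration. The data are those of example 3 in the Examples section.

```r
# Data from example 3 of the Examples section.
y  <- c(1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1)
x1 <- c(2, 2, 2, 5, 5, 5, 5, 8, 8, 11, 11, 11)

# Assumption: each distinct value of x1 defines one population j.
z_j <- tapply(y, x1, sum)      # number of successes in population j
n_j <- tapply(y, x1, length)   # number of observations in population j
p_tilde <- z_j / n_j           # saturated-model estimate of p_j

# Hypothetical helper: x * log(p), with the convention 0 * log(0) = 0.
xlogp <- function(x, p) ifelse(x == 0, 0, x * log(p))

# Bernoulli log-likelihood summed over the J populations.
loglik_sat <- sum(xlogp(z_j, p_tilde) + xlogp(n_j - z_j, 1 - p_tilde))
print(loglik_sat)
```

Because the saturated model assigns one free probability to each population, no logistic parameters are estimated at this step.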

Usage

lsm(formula, family = binomial, data = environment(formula), ...)

Arguments

`formula`	An expression of the form y ~ model, where y is the outcome variable (binary or dichotomous: its values are 0 or 1).
`family`	an optional function, for example binomial.
`data`	an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which `lsm()` is called.
`...`	further arguments passed to or from other methods.

Details

An expression of the form y ~ model is interpreted as a specification that the response y is modelled by a linear predictor specified symbolically by model (systematic component). Such a model consists of a series of terms separated by + operators. The terms themselves consist of variable and factor names separated by : operators. Such a term is interpreted as the interaction of all the variables and factors appearing in the term. Here, y is the outcome variable (binary or dichotomous: its values are 0 or 1).
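As a small illustration of how such a formula is read, the base R functions `terms()` and `model.matrix()` show which terms a symbolic model expands to. The variable names `y`, `x1`, `x2` and the data frame `d` are hypothetical and serve only this sketch; whether lsm() builds its design matrix in exactly this way is an assumption.

```r
# Hypothetical formula with two main effects and their interaction.
f <- y ~ x1 + x2 + x1:x2

# terms() parses the formula without evaluating any variable.
term_labels <- attr(terms(f), "term.labels")
print(term_labels)

# With data, model.matrix() builds the columns of the linear predictor.
d <- data.frame(y = c(0, 1, 1, 0), x1 = c(1, 2, 3, 4), x2 = c(0, 1, 0, 1))
X <- model.matrix(f, data = d)
print(colnames(X))
```

The interaction column `x1:x2` is the elementwise product of `x1` and `x2` when both are numeric.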

Value

lsm returns an object of class "lsm".

An object of class "lsm" is a list containing at least the following components:

`coefficients`	Vector of coefficient estimates (intercept and slopes).
`coef`	Vector of coefficient estimates (intercept and slopes).
`Std.Error`	Vector of the standard errors of the coefficients (intercept and slopes).
`ExpB`	Vector with the exponential of the coefficients (intercept and slopes).
`Wald`	Value of the Wald statistic (with chi-squared distribution).
`DF`	Degrees of freedom for the Chi-squared distribution.
`P.value`	P-value calculated with the Chi-squared distribution.
`Log_Lik_Complete`	Estimation of the log likelihood in the complete model.
`Log_Lik_Null`	Estimation of the log likelihood in the null model.
`Log_Lik_Logit`	Estimation of the log likelihood in the logistic model.
`Log_Lik_Saturate`	Estimation of the log likelihood in the saturated model.
`Populations`	Number of populations in the saturated model.
`Dev_Null_vs_Logit`	Value of the test statistic (Hypothesis: null vs logistic models).
`Dev_Logit_vs_Complete`	Value of the test statistic (Hypothesis: logistic vs complete models).
`Dev_Logit_vs_Saturate`	Value of the test statistic (Hypothesis: logistic vs saturated models).
`Df_Null_vs_Logit`	Degrees of freedom for the test statistic’s distribution (Hypothesis: null vs logistic models).
`Df_Logit_vs_Complete`	Degrees of freedom for the test statistic’s distribution (Hypothesis: logistic vs complete models).
`Df_Logit_vs_Saturate`	Degrees of freedom for the test statistic’s distribution (Hypothesis: logistic vs saturated models).
`P.v_Null_vs_Logit`	P-value for the hypothesis test: null vs logistic models.
`P.v_Logit_vs_Complete`	P-value for the hypothesis test: logistic vs complete models.
`P.v_Logit_vs_Saturate`	P-value for the hypothesis test: logistic vs saturated models.
`Logit`	Vector with the log-odds.
`p_hat_complete`	Vector with the probabilities that the outcome variable takes the value 1, given the `jth` population (estimated with the complete model and without the logistic model).
`p_hat_null`	Vector with the probabilities that the outcome variable takes the value 1, given the `jth` population (estimated with the null model and without the logistic model).
`p_j`	Vector with the probabilities that the outcome variable takes the value 1, given the `jth` population (estimated with the logistic model).
`odd`	Vector with the values of the odds in each `jth` population.
`OR`	Vector with the values of the odds ratio for each coefficient of the variables.
`z_j`	Vector with the values of each `Zj` (the sum of the observations in the `jth` population).
`n_j`	Vector with the `nj` (the number of the observations in each `jth` population).
`p_j_tilde`	Vector with the estimation of each `pj` (the probability of success in the `jth` population) in the saturated model (without estimating the logistic parameters).
`v_j`	Vector with the variance of the Bernoulli variables in the `jth` population.
`m_j`	Vector with the expected values of `Zj` in the `jth` population.
`V_j`	Vector with the variances of `Zj` in the `jth` population.
`V`	Variance and covariance matrix of `Z`, the vector that contains all the `Zj`.
`S_p`	Score vector in the saturated model.
`I_p`	Information matrix in the saturated model.
`Zast_j`	Vector with the values of the standardized variable of `Zj`.
`mcov`	Variance and covariance matrix for coefficient estimates.
`mcor`	Correlation matrix for coefficient estimates.
`Esm`	Data frame with estimates in the saturated model. It contains for each population `j`: the value of the explanatory variables, `nj`, `Zj`, `pj` and Log-Likelihood `Lj_tilde`.
`Elm`	Data frame with estimates in the logistic model. It contains for each population `j`: the value of the explanatory variables, `nj`, `Zj`, `pj`, Log-Likelihood `Lj`, `Logit_pj` and the variance of logit (`var.logit`).
`call`	The original call that was used to fit the lsm model.
`data`	The data environment.
`...`	Additional arguments to be passed to methods.
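To make the comparison between the logistic and the saturated models concrete, the following sketch reproduces a likelihood-ratio (deviance) test with `stats::glm()` only. It is an illustration under stated assumptions, not the package's implementation: the saturated model is obtained by treating each distinct value of `x1` as its own population through `factor(x1)`, and the names `fit_logit`, `fit_sat`, `dev`, `df` and `p_value` are hypothetical. The data are those of example 3 in the Examples section.

```r
# Data from example 3 of the Examples section.
y  <- c(1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1)
x1 <- c(2, 2, 2, 5, 5, 5, 5, 8, 8, 11, 11, 11)

# Logistic model: intercept and one slope.
fit_logit <- glm(y ~ x1, family = binomial)

# Assumed saturated model: one free probability per distinct value of x1.
fit_sat <- glm(y ~ factor(x1), family = binomial)

loglik_logit <- as.numeric(logLik(fit_logit))
loglik_sat   <- as.numeric(logLik(fit_sat))

# Deviance statistic and its chi-squared reference distribution.
dev <- 2 * (loglik_sat - loglik_logit)
df  <- length(coef(fit_sat)) - length(coef(fit_logit))  # J - (K + 1)
p_value <- pchisq(dev, df = df, lower.tail = FALSE)

print(c(deviance = dev, df = df, p.value = p_value))
```

Fitting the saturated model through `factor(x1)` only works cleanly when no population has all zeros or all ones; otherwise the direct formula based on `Zj` and `nj` is the safer route.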

Author(s)

Dr. rer. nat. Humberto LLinás Solano [aut] (Universidad del Norte, Barranquilla-Colombia); MSc. Omar Fábregas Cera [aut] (Universidad del Norte, Barranquilla-Colombia); MSc. Jorge Villalba Acevedo [cre, aut] (Universidad Tecnológica de Bolívar, Cartagena-Colombia).

References

[1] LLinás, H. J. (2006). Precisiones en la teoría de los modelos logísticos. Revista Colombiana de Estadística, 29(2), 239–265. https://revistas.unal.edu.co/index.php/estad/article/view/29310

[2] Hosmer, D.W., Lemeshow, S. and Sturdivant, R.X. (2013). Applied Logistic Regression, 3rd ed., New York: Wiley.

[3] Chambers, J. M. and Hastie, T. J. (1992). Statistical Models in S. Wadsworth & Brooks/Cole.

Examples

#library(lsm)

#1. AGE and Coronary Heart Disease (CHD) Status of 20 subjects:

   #AGE <- c(20,23,24,25,25,26,26,28,28,29,30,30,30,30,30,30,30,32,33,33)
   #CHD <- c(0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0)
   #data <- data.frame(CHD, AGE)
   #lsm(CHD ~ AGE, data = data)  # name the argument: the second positional argument is family

#2. You can also use the dot notation (all other columns of data as explanatory variables):

   #lsm(y ~ ., data = data)

#3. Other example:

   #y <- c(1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1)
   #x1 <- c(2, 2, 2, 5, 5, 5, 5, 8, 8, 11, 11, 11)
   #data <- data.frame(y, x1)
   #ELAINYS <- lsm(y ~ x1, data = data)
   #summary(ELAINYS)

#4. Other example:

   #y <- as.factor(c(1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1))
   #x1 <- as.factor(c(2, 2, 2, 5, 5, 5, 5, 8, 8, 11, 11, 11))
   #data <- data.frame (y, x1)
   #ELAINYS1 <-lsm(y ~ x1, family=binomial, data)
   #summary(ELAINYS1)

lsm documentation built on June 8, 2025, 12:40 p.m.