lsm: Estimation of the log Likelihood of the Saturated Model

View source: R/lsm.R

lsm    R Documentation

Estimation of the log Likelihood of the Saturated Model

Description

When the outcome variable Y takes only the values 0 or 1, the function lsm() estimates the log-likelihood of the saturated model. This model is characterized by LLinás (2006, ISSN:2389-8976) in section 2.3 through assumptions 1 and 2. If Y is dichotomous and the data are grouped into J populations, lsm() is recommended because it works well for all K.
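As a rough illustration of what the saturated log-likelihood is for grouped binary data, the base R sketch below computes it by hand for the data of example 3 on this page. It assumes the textbook estimate p_j = Z_j / n_j for each population and the convention 0 * log(0) = 0; it is not taken from the lsm source, and the helper xlogy is a hypothetical name introduced here.

```r
# Data of example 3 from the Examples section of this page.
y  <- c(1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1)
x1 <- c(2, 2, 2, 5, 5, 5, 5, 8, 8, 11, 11, 11)

# Each distinct value of x1 defines one population j.
z_j <- tapply(y, x1, sum)       # Z_j: number of successes in population j
n_j <- tapply(y, x1, length)    # n_j: number of observations in population j
p_j_tilde <- z_j / n_j          # assumed saturated-model estimate of p_j

# Hypothetical helper: x * log(p), with 0 * log(0) taken as 0.
xlogy <- function(x, p) ifelse(x == 0, 0, x * log(p))

# Contribution of each population to the saturated log-likelihood.
L_j_tilde <- xlogy(z_j, p_j_tilde) + xlogy(n_j - z_j, 1 - p_j_tilde)

log_lik_saturated <- sum(L_j_tilde)
print(log_lik_saturated)
```

If lsm() follows the same definition, this value should correspond to the Log_Lik_Saturate component described under Value.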

Usage

lsm(formula, family = binomial, data = environment(formula), ...)

Arguments

formula

An expression of the form y ~ model, where y is the outcome variable (binary or dichotomous: its values are 0 or 1).

family

an optional function, for example binomial.

data

an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which lsm() is called.

...

further arguments passed to or from other methods.

Details

Estimation of the log Likelihood of the Saturated Model

An expression of the form y ~ model is interpreted as a specification that the response y is modelled by a linear predictor specified symbolically by model (systematic component). Such a model consists of a series of terms separated by + operators. The terms themselves consist of variable and factor names separated by : operators. Such a term is interpreted as the interaction of all the variables and factors appearing in the term. Here, y is the outcome variable (binary or dichotomous: its values are 0 or 1).
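The way R reads such a formula can be inspected with base R alone. The sketch below uses a small data frame whose variables (y, x1, g) are hypothetical and exist only to illustrate the syntax; it lists the terms of the formula and the columns of the design matrix they generate.

```r
# Hypothetical variables, used only to illustrate the formula syntax.
d <- data.frame(
  y  = c(1, 0, 1, 0, 1, 1),
  x1 = c(2, 2, 5, 5, 8, 8),
  g  = factor(c("a", "b", "a", "b", "a", "b"))
)

# Terms are separated by `+`; `x1:g` is the interaction of x1 and g.
f <- y ~ x1 + g + x1:g

# Labels of the terms that make up the linear predictor.
term_labels <- attr(terms(f), "term.labels")
print(term_labels)

# Design matrix implied by the formula (intercept, x1, g, and x1:g columns).
X <- model.matrix(f, data = d)
print(colnames(X))
```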

Value

lsm returns an object of class "lsm".

An object of class "lsm" is a list containing at least the following components:

coefficients

Vector of coefficient estimates (intercept and slopes).

coef

Vector of coefficient estimates (intercept and slopes).

Std.Error

Vector of the standard errors of the coefficients (intercept and slopes).

ExpB

Vector with the exponential of the coefficients (intercept and slopes).

Wald

Value of the Wald statistic (with chi-squared distribution).

DF

Degrees of freedom for the chi-squared distribution.

P.value

P-value calculated with the chi-squared distribution.

Log_Lik_Complete

Estimation of the log likelihood in the complete model.

Log_Lik_Null

Estimation of the log likelihood in the null model.

Log_Lik_Logit

Estimation of the log likelihood in the logistic model.

Log_Lik_Saturate

Estimation of the log likelihood in the saturated model.

Populations

Number of populations in the saturated model.

Dev_Null_vs_Logit

Value of the test statistic (Hypothesis: null vs logistic models).

Dev_Logit_vs_Complete

Value of the test statistic (Hypothesis: logistic vs complete models).

Dev_Logit_vs_Saturate

Value of the test statistic (Hypothesis: logistic vs saturated models).

Df_Null_vs_Logit

Degrees of freedom for the test statistic’s distribution (Hypothesis: null vs logistic models).

Df_Logit_vs_Complete

Degrees of freedom for the test statistic’s distribution (Hypothesis: logistic vs complete models).

Df_Logit_vs_Saturate

Degrees of freedom for the test statistic’s distribution (Hypothesis: logistic vs saturated models).

P.v_Null_vs_Logit

P-value for the hypothesis test: null vs logistic models.

P.v_Logit_vs_Complete

P-value for the hypothesis test: logistic vs complete models.

P.v_Logit_vs_Saturate

P-value for the hypothesis test: logistic vs saturated models.

Logit

Vector with the log-odds.

p_hat_complete

Vector with the probabilities that the outcome variable takes the value 1, given the jth population (estimated with the complete model and without the logistic model).

p_hat_null

Vector with the probabilities that the outcome variable takes the value 1, given the jth population (estimated with the null model and without the logistic model).

p_j

Vector with the probabilities that the outcome variable takes the value 1, given the jth population (estimated with the logistic model).

odd

Vector with the values of the odds in each jth population.

OR

Vector with the values of the odds ratio for each coefficient of the variables.

z_j

Vector with the values of each Zj (the sum of the observations in the jth population).

n_j

Vector with the nj (the number of the observations in each jth population).

p_j_tilde

Vector with the estimation of each pj (the probability of success in the jth population) in the saturated model (without estimating the logistic parameters).

v_j

Vector with the variance of the Bernoulli variables in the jth population.

m_j

Vector with the expected values of Zj in the jth population.

V_j

Vector with the variances of Zj in the jth population.

V

Variance and covariance matrix of Z, the vector that contains all the Zj.

S_p

Score vector in the saturated model.

I_p

Information matrix in the saturated model.

Zast_j

Vector with the values of the standardized variable of Zj.

mcov

Variance and covariance matrix for coefficient estimates.

mcor

Correlation matrix for coefficient estimates.

Esm

Data frame with estimates in the saturated model. It contains for each population j: the value of the explanatory variables, nj, Zj, pj and Log-Likelihood Lj_tilde.

Elm

Data frame with estimates in the logistic model. It contains for each population j: the value of the explanatory variables, nj, Zj, pj, Log-Likelihood Lj, Logit_pj and the variance of logit (var.logit).

call

The original call that was used to fit the lsm model.

data

Data environment.

...

Additional arguments to be passed to methods.
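To make several of the components above more concrete, the following base R sketch computes analogous quantities with glm() for the data of example 3 on this page. It relies on textbook definitions (Wald statistic as the squared ratio of estimate to standard error with 1 degree of freedom, deviance as twice the difference of log-likelihoods, J minus the number of coefficients as degrees of freedom for the logistic vs saturated test). These definitions are an assumption about what lsm() reports, not a description of its source code, and all variable names below are illustrative.

```r
y  <- c(1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1)
x1 <- c(2, 2, 2, 5, 5, 5, 5, 8, 8, 11, 11, 11)

fit  <- glm(y ~ x1, family = binomial)   # logistic model
null <- glm(y ~ 1,  family = binomial)   # null model (intercept only)

# Coefficient-level quantities (compare: coef, Std.Error, ExpB, Wald, DF, P.value).
est    <- coef(fit)
se     <- sqrt(diag(vcov(fit)))
exp_b  <- exp(est)
wald   <- (est / se)^2                               # assumed Wald definition
p_wald <- pchisq(wald, df = 1, lower.tail = FALSE)

# Null vs logistic (compare: Dev_Null_vs_Logit, Df_Null_vs_Logit, P.v_Null_vs_Logit).
ll_null  <- as.numeric(logLik(null))
ll_logit <- as.numeric(logLik(fit))
dev_null_vs_logit <- 2 * (ll_logit - ll_null)
df_null_vs_logit  <- length(est) - 1
p_null_vs_logit   <- pchisq(dev_null_vs_logit, df_null_vs_logit,
                            lower.tail = FALSE)

# Saturated model: one probability per distinct value of x1.
z_j <- tapply(y, x1, sum)
n_j <- tapply(y, x1, length)
p_j_tilde <- z_j / n_j
xlogy  <- function(x, p) ifelse(x == 0, 0, x * log(p))  # 0 * log(0) taken as 0
ll_sat <- sum(xlogy(z_j, p_j_tilde) + xlogy(n_j - z_j, 1 - p_j_tilde))

# Logistic vs saturated (compare: Dev_Logit_vs_Saturate and related components).
dev_logit_vs_sat <- 2 * (ll_sat - ll_logit)
df_logit_vs_sat  <- length(z_j) - length(est)
p_logit_vs_sat   <- pchisq(dev_logit_vs_sat, df_logit_vs_sat,
                           lower.tail = FALSE)

print(data.frame(est, se, exp_b, wald, p_wald))
print(c(dev_null_vs_logit = dev_null_vs_logit,
        dev_logit_vs_sat  = dev_logit_vs_sat))
```

Comparing these numbers with the output of summary() on an lsm fit of the same data is one way to check whether the assumed definitions match the package.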

Author(s)

Dr. rer. nat. Humberto LLinás Solano [aut] (Universidad del Norte, Barranquilla-Colombia); MSc. Omar Fábregas Cera [aut] (Universidad del Norte, Barranquilla-Colombia); MSc. Jorge Villalba Acevedo [cre, aut] (Universidad Tecnológica de Bolívar, Cartagena-Colombia).

References

[1] LLinás, H. J. (2006). Precisiones en la teoría de los modelos logísticos. Revista Colombiana de Estadística, 29(2), 239–265. https://revistas.unal.edu.co/index.php/estad/article/view/29310

[2] Hosmer, D.W., Lemeshow, S. and Sturdivant, R.X. (2013). Applied Logistic Regression, 3rd ed., New York: Wiley.

[3] Chambers, J. M. and Hastie, T. J. (1992). Statistical Models in S. Wadsworth & Brooks/Cole.

See Also

lsm

Examples

library(lsm)

# 1. AGE and Coronary Heart Disease (CHD) status of 20 subjects:

   AGE <- c(20,23,24,25,25,26,26,28,28,29,30,30,30,30,30,30,30,32,33,33)
   CHD <- c(0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0)
   data <- data.frame(CHD, AGE)
   # Pass data by name: the second positional argument of lsm() is family.
   lsm(CHD ~ AGE, data = data)

# 2. You can also use the dot notation (all other columns as predictors):

   lsm(CHD ~ ., data = data)

# 3. Another example:

   y <- c(1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1)
   x1 <- c(2, 2, 2, 5, 5, 5, 5, 8, 8, 11, 11, 11)
   data <- data.frame(y, x1)
   ELAINYS <- lsm(y ~ x1, data = data)
   summary(ELAINYS)

# 4. Another example, with the variables coded as factors:

   y <- as.factor(c(1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 1))
   x1 <- as.factor(c(2, 2, 2, 5, 5, 5, 5, 8, 8, 11, 11, 11))
   data <- data.frame(y, x1)
   ELAINYS1 <- lsm(y ~ x1, family = binomial, data = data)
   summary(ELAINYS1)

lsm documentation built on June 22, 2024, 10:31 a.m.
