ObtainModelEstimates: Estimating a contingency table using model-based approaches

Description

This function provides several alternatives to the IPFP for estimating a multiway table subject to known constraints/totals: maximum likelihood (ML), minimum chi-squared (CHI2) and weighted least squares (WLSQ). Note that the resulting estimators are probabilities.

The covariance matrix of the estimated proportions (as defined by Little and Wu, 1991) is also provided. For the ML method, the covariance matrix defined by Lang (2004) is returned as well.

Usage

ObtainModelEstimates(seed, target.list, target.data, method="ml", 
                     tol.margins = 1e-10, replace.zeros = 1e-10, ...)

Arguments

seed

The initial multi-dimensional array to be updated. Each cell must be non-negative.

target.list

A list of the target margins provided in target.data. Each component of the list is an array whose cells indicate which dimension the corresponding margin relates to.

target.data

A list containing the data of the target margins. Each component of the list is an array storing a margin. The list order must follow the one defined in target.list. Note that the cells of the arrays must be non-negative.

method

Determines the model used to estimate the contingency table. The default is ml (maximum likelihood); the other available options are chi2 (minimum chi-squared) and lsq (least squares).

tol.margins

Tolerance for the margins consistency. Default is 1e-10.

replace.zeros

Constant added to each zero cell found in the seed, as the procedures require strictly positive cells. Default is 1e-10.

...

Additional parameters that can be passed to control the optimisation process (see solnp from the package Rsolnp).
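
The effect of replace.zeros described above can be pictured with a few lines of base R. This is only a sketch of the idea, not the package's internal code; the seed values and the names seed.example and seed.positive are hypothetical.

```r
# Hypothetical 2 x 2 x 2 seed containing a single zero cell
seed.example <- array(c(80, 0, 20, 20, 40, 35, 35, 30), dim = c(2, 2, 2))

# Same default as the replace.zeros argument
replace.zeros <- 1e-10

# Add the constant to every zero cell so that all cells become strictly positive
seed.positive <- seed.example
seed.positive[seed.positive == 0] <- seed.positive[seed.positive == 0] + replace.zeros

# The smallest cell is now the constant itself
print(min(seed.positive))
```

Non-zero cells are left unchanged, so the constant only matters where the seed would otherwise make the estimation procedures undefined.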

Value

A list containing the final estimated table, the covariance matrix of the estimated proportions, and other convergence information.

x.hat

Array of the estimated table frequencies.

p.hat

Array of the estimated table probabilities.

error.margins

For each list element of target.data, error.margins gives the maximum absolute deviation between the element and the corresponding estimated margin. The deviations should be close to zero; otherwise the target margins are not met.

solnp.res

The estimation process uses the solnp optimisation function from the R package Rsolnp and solnp.res is the corresponding object returned by the solver.

conv

A boolean indicating whether the algorithm converged to a solution.

method

The selected method for estimation.

call

The matched call.
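
The quantity reported in error.margins can be reproduced by hand from an estimated table. The sketch below uses base R only and takes the ML estimates printed in the example output of this page. It assumes that the V1.V2.V3 labels of the printout follow R's default fill order (first index moving fastest) and that Vector2Array fills arrays with the last index moving fastest, hence byrow = TRUE for the two-way margin. The names est12, est3 and dev are illustrative.

```r
# ML estimates copied from the example output, first index moving fastest
x.hat <- array(c(1269.2017, 934.3825, 613.0915, 1183.3243,
                 730.7983, 565.6175, 386.9085, 616.6757),
               dim = c(2, 2, 2))

# Target margins of the example; byrow = TRUE mirrors the assumed
# "last index moves fastest" fill order of Vector2Array
target12 <- matrix(c(2000, 1000, 1500, 1800), nrow = 2, byrow = TRUE)
target3 <- c(4000, 2300)

# Margins implied by the estimate: sum over the dimensions not listed
est12 <- apply(x.hat, c(1, 2), sum)
est3 <- apply(x.hat, 3, sum)

# Maximum absolute deviation for each target margin
dev <- c(max(abs(est12 - target12)), max(abs(est3 - target3)))
print(dev)
```

Both deviations are zero up to the rounding of the printed estimates, which is what a converged fit should show.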

Note

Note that if the margins given in target.list are not consistent (i.e. the sums of their cells are not equal), the input data are normalised by considering probabilities instead of frequencies.
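
The normalisation mentioned in this note can be pictured with base R. The sketch below only illustrates the idea and is not the package's internal code; the margin m3 is a made-up value chosen so that its total differs from that of m12, and the tolerance mirrors the default of tol.margins.

```r
# Two target margins whose totals disagree (6300 versus 6000)
m12 <- matrix(c(2000, 1000, 1500, 1800), nrow = 2, byrow = TRUE)
m3 <- c(4000, 2000)  # hypothetical, deliberately inconsistent with m12

# Consistency check on the totals, using the default tolerance of tol.margins
totals <- c(sum(m12), sum(m3))
consistent <- diff(range(totals)) <= 1e-10

# When the totals differ, work with probabilities: each margin is divided
# by its own total so that both sum to one
p12 <- m12 / sum(m12)
p3 <- m3 / sum(m3)

print(consistent)
print(c(sum(p12), sum(p3)))
```

After this rescaling both margins describe distributions on the same scale, so they can be imposed jointly even though the original counts disagree.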

Author(s)

Thomas Suesse

Maintainer: Johan Barthelemy <johan@uow.edu.au>.

References

Lang, J.B. (2004) Multinomial-Poisson homogeneous models for contingency tables. Annals of Statistics 32(1): 340-383.

Little, R. J., Wu, M. M. (1991) Models for contingency tables with known margins when target and sampled populations differ. Journal of the American Statistical Association 86 (413): 87-95.

See Also

See the solnp function documentation in the package Rsolnp for details of the solnp.res object returned by the function.

Examples

# load the package providing ObtainModelEstimates and Vector2Array
library(mipfp)

# set up an initial 3-way table of dimension (2 x 2 x 2)
seed <- Vector2Array(c(80, 60, 20, 20, 40, 35, 35, 30), dim = c(2, 2, 2))

# building target margins
margins12 <- c(2000, 1000, 1500, 1800)
margins12.array <- Vector2Array(margins12, dim=c(2, 2))
margins3 <- c(4000, 2300)
margins3.array <- Vector2Array(margins3, dim = 2)
target.list <- list(c(1, 2), 3)
target.data <- list(margins12.array, margins3.array)

# estimating the new contingency table using the ml method
results.ml <- ObtainModelEstimates(seed, target.list, target.data, 
                                   compute.cov = TRUE)
print(results.ml)

# estimating the new contingency table using the chi2 method
results.chi2 <- ObtainModelEstimates(seed, target.list, target.data, 
                                     method = "chi2", compute.cov = TRUE)
print(results.chi2)

# estimating the new contingency table using the lsq method
results.lsq <- ObtainModelEstimates(seed, target.list, target.data,
                                    method = "lsq", compute.cov = TRUE)
print(results.lsq)

Example output

Loading required package: cmm
Loading required package: Rsolnp
Loading required package: numDeriv

Call:
ObtainModelEstimates(seed = seed, target.list = target.list, 
    target.data = target.data, compute.cov = TRUE)

Method:  ml - convergence:  TRUE 

Estimates:
        
V1.V2.V3  Estimate
   1.1.1 1269.2017
   2.1.1  934.3825
   1.2.1  613.0915
   2.2.1 1183.3243
   1.1.2  730.7983
   2.1.2  565.6175
   1.2.2  386.9085
   2.2.2  616.6757

Call:
ObtainModelEstimates(seed = seed, target.list = target.list, 
    target.data = target.data, method = "chi2", compute.cov = TRUE)

Method:  chi2 - convergence:  TRUE 

Estimates:
        
V1.V2.V3  Estimate
   1.1.1 1229.9127
   2.1.1  925.2633
   1.2.1  626.0346
   2.2.1 1218.7894
   1.1.2  770.0873
   2.1.2  574.7367
   1.2.2  373.9654
   2.2.2  581.2106

Call:
ObtainModelEstimates(seed = seed, target.list = target.list, 
    target.data = target.data, method = "lsq", compute.cov = TRUE)

Method:  lsq - convergence:  TRUE 

Estimates:
        
V1.V2.V3  Estimate
   1.1.1 1397.6662
   2.1.1  938.7294
   1.2.1  574.3193
   2.2.1 1089.2851
   1.1.2  602.3338
   2.1.2  561.2706
   1.2.2  425.6807
   2.2.2  710.7149

mipfp documentation built on May 2, 2019, 6:01 a.m.