View source: R/ipfp_multi_dim.R
This function implements the iterative proportional fitting (IPFP) procedure. This procedure updates an initial N-dimensional array (referred to as the seed) so that it matches given target marginal distributions. Those targets can also be multi-dimensional. The procedure can also estimate a (multi-dimensional) contingency table (encoded as an array) matching a given set of (multi-dimensional) margins. In that case, each cell of the seed must simply be set to 1.
The IPFP is also known as the RAS algorithm in economics and matrix raking or matrix scaling in computer science.
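The procedure described above can be sketched for the 2-way case in base R. This is only an illustrative sketch, not the package's implementation: the function name ipf2d and its arguments row.target and col.target are hypothetical, and it assumes a strictly positive seed and consistent targets.

```r
# Illustrative 2-way IPF in base R (hypothetical helper, not part of the package)
ipf2d <- function(seed, row.target, col.target, iter = 1000, tol = 1e-10) {
  x <- seed
  for (k in seq_len(iter)) {
    x.old <- x
    # scale each row so that the row sums match row.target
    x <- x * (row.target / rowSums(x))
    # scale each column so that the column sums match col.target
    x <- sweep(x, 2, col.target / colSums(x), "*")
    # stop once the largest cell change between two iterations is below tol
    if (max(abs(x - x.old)) < tol) break
  }
  x
}

fit <- ipf2d(matrix(1, 2, 2), row.target = c(50, 50), col.target = c(30, 70))
rowSums(fit)
colSums(fit)
```

Each pass rescales the table along one margin at a time; the loop ends when a full pass no longer changes the cells by more than tol or when iter passes have been made.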
seed |
The initial multi-dimensional array to be updated. Each cell must be non-negative. |
target.list |
A list of dimensions of the marginal target constraints in target.data. |
target.data |
A list containing the data of the target marginal tables. Each
component of the list is an array storing a margin.
The list order must follow the ordering defined in target.list. |
print |
Verbose parameter: if TRUE prints the current iteration number and the associated value of the stopping criterion. Default is FALSE. |
iter |
Stopping criterion. The maximum number of iterations allowed; must be greater than 0. Default is 1000. |
tol |
Stopping criterion. If the maximum absolute difference between two iterations
is lower than the value specified by tol, the algorithm stops. |
tol.margins |
Tolerance for the margins consistency. Default is 1e-10. |
na.target |
If set to TRUE, allows the targets to have NA cells. Note that in that particular case the margins consistency is not checked. |
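To make the relation between seed, target.list and target.data concrete, here is a hedged base R sketch that checks a set of inputs before fitting. The checks shown (shape comparison via apply, totals compared within tol.margins) are assumptions about what a consistency check could look like, not the package's own code; the variable names shapes.ok, totals and consistent are hypothetical.

```r
# Hypothetical inputs, mirroring the 3-way example of this page
seed <- array(1, dim = c(2, 4, 2))
target.list <- list(1, c(2, 3))
target.data <- list(c(50, 16),
                    array(c(10, 4, 22, 4, 4, 4, 4, 14), dim = c(4, 2)))
tol.margins <- 1e-10

# each margin should have the shape of the seed summed over the
# dimensions listed in the matching component of target.list
shapes.ok <- mapply(function(d, m) {
  expected <- dim(as.array(apply(seed, d, sum)))
  isTRUE(all.equal(expected, dim(as.array(m))))
}, target.list, target.data)

# margins are treated as consistent here when their totals agree
# up to tol.margins (an assumption made for this sketch)
totals <- sapply(target.data, sum)
consistent <- max(abs(totals - totals[1])) <= tol.margins
```

Here both margins total 66, so consistent is TRUE and no rescaling to probabilities would be needed.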
A list containing the final updated array as well as other convergence information.
x.hat |
An array with the same dimension of |
p.hat |
An array with the same dimension of |
evol.stp.crit |
The evolution of the stopping criterion over the iterations. |
conv |
A boolean indicating whether the algorithm converged to a solution. |
error.margins |
A list returning, for each margin, the maximum absolute deviation between the desired and generated margin. |
method |
The selected method for estimation (here it will always be |
call |
The matched call. |
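The deviation reported in error.margins can be recomputed by hand from a fitted array. The sketch below is an illustration in base R under stated assumptions: x.hat is a hand-typed 2-way table standing in for a fitted result, and the helper variable err is hypothetical (the function itself returns a list, whereas this sketch returns a plain numeric vector).

```r
# Hand-typed stand-in for a fitted 2-way table (not produced by the package here)
x.hat <- matrix(c(15, 15, 35, 35), nrow = 2, ncol = 2)
target.list <- list(1, 2)
target.data <- list(c(50, 50), c(30, 70))

# for each margin: largest absolute gap between the generated and desired margin
err <- mapply(function(d, m) max(abs(apply(x.hat, d, sum) - m)),
              target.list, target.data)
err
```

Values close to zero for every margin indicate that the fitted table reproduces the targets.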
It is important to note that if the margins given in target.list are not consistent (i.e. the sums of their cells are not equal), the input data is then normalised by considering probabilities instead of frequencies: the cells of the seed are divided by sum(seed); the cells of each margin i of the list target.data are divided by sum(target.data[[i]]).
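The normalisation described in this note can be written out directly in base R. This is a sketch of the arithmetic only, using hypothetical inputs whose totals disagree (100 versus 90); the names seed.p and target.p are introduced here for illustration.

```r
# Hypothetical inputs with inconsistent margin totals (100 and 90)
seed <- array(1, dim = c(2, 2))
target.data <- list(c(50, 50), c(30, 60))

# switch from frequencies to probabilities, as described in the note
seed.p <- seed / sum(seed)
target.p <- lapply(target.data, function(m) m / sum(m))

sapply(target.p, sum)
```

After this step the seed and every margin sum to 1, so the margins are comparable even though the original totals differed.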
Johan Barthelemy.
Maintainer: Johan Barthelemy johan@uow.edu.au.
Bacharach, M. (1965). Estimating Nonnegative Matrices from Marginal Data. International Economic Review (Blackwell Publishing) 6 (3): 294-310.
Bishop, Y. M. M., Fienberg, S. E., Holland, P. W. (1975). Discrete Multivariate Analysis: Theory and Practice. MIT Press. ISBN 978-0-262-02113-5.
Deming, W. E., Stephan, F. F. (1940). On a Least Squares Adjustment of a Sampled Frequency Table When the Expected Marginal Totals are Known. Annals of Mathematical Statistics 11 (4): 427-444.
Fienberg, S. E. (1970). An Iterative Procedure for Estimation in Contingency Tables. Annals of Mathematical Statistics 41 (3): 907-917.
Stephan, F. F. (1942). Iterative method of adjusting frequency tables when expected margins are known. Annals of Mathematical Statistics 13 (2): 166-178.
The documentation of IpfpCov provides details on the determination of the covariance matrices.
ObtainModelEstimates for alternatives to the IPFP.
# Example 1: 2-way table (V1,V2) of dim=(2,2)
# generating an initial 2-way table to be updated
seed.2d <- array(1,dim=c(2,2))
# desired targets (margins) : V1 and V2
target.row <- c(50,50)
target.col <- c(30,70)
# storing the margins in a list
tgt.data.2d <- list(target.row, target.col)
# list of dimensions of each marginal constraint
tgt.list.2d <- list(1,2)
# calling the Ipfp function
res.2d <- Ipfp(seed.2d, tgt.list.2d, tgt.data.2d)
# Example 2: 3-way table (V1,V2,V3) of dim=(2,4,2)
# seed
seed.3d <- array(1,c(2,4,2))
seed.3d[1,1,1] <- 4
seed.3d[1,3,1] <- 10
seed.3d[1,4,2] <- 6
# desired targets (margins) : V1 and (V2,V3)
target.V1 <- c(50, 16)
target.V2.V3 <- array(4, dim=c(4,2))
target.V2.V3[1,1] <- 10
target.V2.V3[3,1] <- 22
target.V2.V3[4,2] <- 14
# storing the margins in a list
tgt.data.3d <- list(target.V1, target.V2.V3)
# list of dimensions of each marginal constraint
tgt.list.3d <- list( 1, c(2,3) )
# calling the Ipfp function
res.3d <- Ipfp(seed.3d, tgt.list.3d, tgt.data.3d, iter=50, print=TRUE, tol=1e-5)
# Example 3: 2-way table (V1,V2) of dim=(2,3) with missing values in the targets
# generating an initial 2-way table to be updated
seed.2d.na <- array(1,dim=c(2,3))
# desired targets (margins) : V1 and V2
target.row.na <- c(40,60)
target.col.na <- c(NA,10,NA)
# storing the margins in a list
tgt.data.2d.na <- list(target.row.na, target.col.na)
# storing the description of target data in a list
tgt.list.2d.na <- list(1,2)
# calling the Ipfp function
res.2d.na <- Ipfp(seed.2d.na, tgt.list.2d.na, tgt.data.2d.na, na.target=TRUE)
Loading required package: cmm
Loading required package: Rsolnp
Loading required package: numDeriv
Margins consistency checked!
... ITER 1
stoping criterion: 10
... ITER 2
stoping criterion: 0
Convergence reached after 2 iterations!
Warning message:
In Ipfp(seed.2d.na, tgt.list.2d.na, tgt.data.2d.na, na.target = TRUE) :
Missing values allowed in the target margins.
Computation of the covariance matrices set to FALSE!