# Tune: Tuning Function for Ecological Inference for Sets of R x C... In RxCEcolInf: 'R x C Ecological Inference With Optional Incorporation of Survey Information'

## Description

This function tunes the Markov chain Monte Carlo (MCMC) algorithm used to fit a hierarchical model to ecological data in which the underlying contingency tables can have any number of rows or columns. The user supplies the data and may specify hyperprior values. The function's primary output is a vector of multipliers, called `rhos`, used to adjust the covariance matrix of the multivariate t_4 distribution that proposes new values of the intermediate-level parameters (denoted THETAS).

## Usage

```
Tune(fstring, data=NULL, num.runs=12, num.iters=10000,
     rho.vec=rep(0.05, ntables), kappa=10, nu=(mu.dim+6), psi=mu.dim,
     mu.vec.0=rep(log((.45/(mu.dim-1))/.55), mu.dim),
     mu.vec.cu=runif(mu.dim, -3, 0), nolocalmode=50,
     sr.probs=NULL, sr.reps=NULL, numscans=1, Diri=100, dof=4,
     debug=1)
```

## Arguments

`fstring` String: model formula of contingency tables' column totals versus row totals. Must be in the specified format (an R character string and NOT a true R formula). See Details and Examples.

`data` Data frame.

`num.runs` Positive integer: The number of runs (each of `num.iters` iterations) for which the tuning algorithm will be implemented.

`num.iters` Positive integer: The number of iterations in each run of the tuning algorithm.

`rho.vec` Vector of dimension I = number of contingency tables = number of rows in `data`: initial values of the multipliers (usually in (0,1)) applied to the covariance matrix of the proposal distribution for the draws of the intermediate-level parameters. The purpose of the `Tune` function is to adjust these values so as to achieve acceptance ratios of between .2 and .5 in the MCMC draws of the `THETA`s.

`kappa` Scalar: The diagonal of the covariance matrix for the (normal) hyperprior distribution for the mu parameter.

`nu` Scalar: The degrees of freedom for the (Inverse-Wishart) hyperprior distribution for the `SIGMA` parameter.

`psi` Scalar: The diagonal of the matrix parameter of the (Inverse-Wishart) hyperprior distribution for the `SIGMA` parameter.

`mu.vec.0` Vector: mean of the (normal) hyperprior distribution for the mu parameter.

`mu.vec.cu` Vector of dimension R*(C-1), where R (C) is the number of rows (columns) in each contingency table: Optional starting values for the mu parameter.

`nolocalmode` Positive integer: How often an alternative drawing method for the contingency table internal cell counts will be used. Use of default value recommended.

`sr.probs` Matrix of dimension I x R: Each value represents the probability of selecting a particular contingency table's row as the row to be calculated deterministically in (product multinomial) proposals for Metropolis draws of the internal cell counts. For example, if R = 3 and row 3 of `sr.probs` = c(.1, .5, .4), then in the third contingency table (corresponding to the third row of `data`), the proposal algorithm for the interior cell counts will calculate the third contingency table's first row deterministically with probability .1, the second row with probability .5, and the third row with probability .4. Use of default (generated internally) recommended.

`sr.reps` Matrix of dimension I x R: Each value represents the number of times the (product multinomial proposal) Metropolis algorithm will be attempted when, in drawing the internal cell counts, the proposal for the corresponding contingency table row is to be calculated deterministically. `sr.reps` has the same structure as `sr.probs`, i.e., position [3,1] of `sr.reps` corresponds to the third contingency table's first row. Use of default (generated internally) recommended.

`numscans` Positive integer: How often the algorithm to draw the contingency table internal cell counts will be implemented before new values of the other parameters are drawn. Use of default value recommended.

`Diri` Positive integer: How often a product Dirichlet proposal distribution will be used to draw the contingency table row probability vectors (the THETAS).

`dof` Positive integer: The degrees of freedom of the multivariate t proposal distribution used in drawing the contingency table row probability vectors (the THETAS).

`debug` Integer: Akin to `verbose` in some packages. If set to 1, certain status information (including rough notification regarding the number of iterations completed) will be written to the screen.
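Several defaults in the Usage signature depend on `mu.dim` and `ntables`, which are not arguments themselves. The sketch below evaluates those defaults by hand for a hypothetical table shape. It assumes that `mu.dim` equals R*(C-1) (inferred from the stated dimension of `mu.vec.cu`) and that `ntables` equals I; the values of `R`, `C`, and `I` are made up for illustration.

```r
# Hypothetical table shape: R = 2 row groups, C = 5 column categories.
R <- 2
C <- 5
I <- 10   # hypothetical number of contingency tables (rows of `data`)

# Assumption: mu.dim = R * (C - 1), inferred from the dimension of mu.vec.cu.
mu.dim <- R * (C - 1)

# Default hyperprior settings, copied from the Usage signature.
nu       <- mu.dim + 6
psi      <- mu.dim
mu.vec.0 <- rep(log((0.45 / (mu.dim - 1)) / 0.55), mu.dim)

# Default initial multipliers, one per contingency table.
rho.vec <- rep(0.05, I)

# Default starting values for mu: uniform draws on (-3, 0).
set.seed(1)
mu.vec.cu <- runif(mu.dim, -3, 0)
```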

## Details

Tune is a necessary precursor function to `Analyze`, the workhorse function in fitting the R x C ecological inference model described in Greiner & Quinn (2009). The details of this model are discussed in the documentation accompanying `Analyze`.

One of the stages of the Gibbs sampler used to fit the Greiner & Quinn ecological inference model involves sampling from the conditional posterior distribution of the vector of probabilities associated with each contingency table (precinct, in voting applications). There are R separate sets of probabilities (each of which must sum to one) associated with each contingency table. Each such θ_r undergoes a multidimensional logistic transformation, using the last (right-most) column as the reference category. This results in R transformed vectors of dimension (C-1); the transformed vectors, denoted omega_r's, are stacked to form a single omega vector corresponding to that contingency table. The omega vectors are assumed to follow (i.i.d.) a multivariate normal distribution.
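The transformation from THETAs to an omega vector can be sketched as follows. This is a minimal illustration, assuming the transformation is the usual log-ratio omega = log(theta_c / theta_C) with the last column as reference; the helper names `to_omega_r` and `to_theta_r` and the probability values are hypothetical and are not part of the package.

```r
# Hypothetical row-probability matrix for one contingency table:
# R = 2 rows, C = 3 columns; each row sums to one.
theta <- rbind(c(0.6, 0.3, 0.1),
               c(0.2, 0.5, 0.3))

# Log-ratio transform of one probability vector, using the last
# (right-most) column as the reference category.
to_omega_r <- function(theta_r) {
  C <- length(theta_r)
  log(theta_r[-C] / theta_r[C])
}

# Inverse transform: recover the probabilities from a (C-1)-vector.
to_theta_r <- function(omega_r) {
  e <- exp(c(omega_r, 0))   # the reference category has log-ratio 0
  e / sum(e)
}

# Stack the R transformed vectors into one omega vector of length R*(C-1).
# apply() returns a (C-1) x R matrix; as.vector() stacks it column by column.
omega <- as.vector(apply(theta, 1, to_omega_r))
```

Applying `to_theta_r` to each block of C-1 elements of `omega` recovers the corresponding row of `theta`.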

The posterior distribution of the THETAs/OMEGAs is in non-standard form. To sample from the posterior, the algorithm uses a Metropolis-Hastings step with a multivariate t_4 proposal distribution. The covariance matrix of this multivariate t_4 must be expanded or shrunk to achieve acceptance ratios of between .2 and .5. Tune implements `num.runs` sets of `num.iters` iterations of the Gibbs sampler. At the end of each set of iterations, Tune examines the acceptance ratios in each precinct and adjusts a shrinkage factor (a scalar that multiplies the covariance matrix of the t_4 proposal) upwards or downwards. When finished, Tune returns a vector of length `I` = the number of contingency tables in `data`. This vector, called `rhos`, should be fed into the `Analyze` function. See Examples here and those accompanying `Analyze`.
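The tuning idea described above can be illustrated on a one-dimensional toy problem. The sketch below is not the package's implementation: the target density, the doubling/halving rule, and the function names `run_chain` and `tune_rho` are all assumptions chosen for illustration. It only shows how a multiplier on the proposal variance can be pushed up when acceptance is too high and down when it is too low.

```r
set.seed(42)

# Toy target: log-density of a standard normal (a stand-in for the
# conditional posterior of one table's omega vector).
log_target <- function(x) -0.5 * x^2

# One run of random-walk Metropolis with a scaled t proposal.
# Returns the acceptance fraction over num.iters iterations.
run_chain <- function(rho, num.iters = 2000, dof = 4) {
  x <- 0
  accepted <- 0
  for (i in seq_len(num.iters)) {
    prop <- x + sqrt(rho) * rt(1, df = dof)
    if (log(runif(1)) < log_target(prop) - log_target(x)) {
      x <- prop
      accepted <- accepted + 1
    }
  }
  accepted / num.iters
}

# Hypothetical adjustment rule (NOT taken from the package source):
# widen the proposal when acceptance exceeds `hi`, narrow it below `lo`.
tune_rho <- function(rho, num.runs = 12, lo = 0.2, hi = 0.5) {
  acc <- numeric(num.runs)
  for (run in seq_len(num.runs)) {
    acc[run] <- run_chain(rho)
    if (acc[run] > hi) rho <- rho * 2   # steps too timid: expand
    if (acc[run] < lo) rho <- rho / 2   # steps too bold: shrink
  }
  list(rho = rho, acc = acc)
}

result <- tune_rho(rho = 0.05)
```

Here `result$acc` plays the role of one row of `acc.t` and `result$rho` the role of one element of `rhos`.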

## Value

A list with the following elements.

`rhos` A vector of length `I` = number of contingency tables: each element of the `rhos` vector is a multiplier used in the proposal distribution for draws from the conditional posterior of the THETAs, as described above. Feed this vector into the `Analyze` function.

`acc.t` Matrix of dimension `I` x `num.runs`: Each column of `acc.t` contains the acceptance fractions for the Metropolis-Hastings algorithm, with a multivariate t_4 proposal distribution, used to draw from the conditional posterior of the `THETA`s. If `Tune` has worked properly, all elements of the final column of this matrix should be between .2 and .5.

`acc.Diri` Matrix of dimension `I` x `num.runs`: Each column of `acc.Diri` contains the acceptance fractions for the Metropolis-Hastings algorithm, with independent Dirichlet proposals, used to draw from the conditional posterior of the `THETA`s. `Tune` does not alter this algorithm.

`vld.NNs` A list of length `num.runs`: Each element of `vld.NNs` is a matrix of dimension `I` by `R`, with each element of the list corresponding to one of the `num.runs` sets of `num.iters` iterations run by `Tune`. To draw from the conditional posterior of the internal cell counts of a contingency table, the `Tune` function draws R-1 vectors of length C from multinomial distributions. It then calculates the counts in the remaining row (denote this row as r') deterministically. This procedure can result in negative values in row r', in which case the overall proposal for the interior cell counts is outside the parameter space (and thus invalid). Each matrix of `vld.NNs` keeps track of the fraction of proposals drawn in this manner that are valid. Each row of such a matrix corresponds to a contingency table. Each column corresponds to a row of that contingency table. Each entry specifies the fraction of multinomial proposals that are valid when the specified contingency table row serves as the r' row. For instance, position [5,2] of a `vld.NNs` matrix holds the fraction of valid proposals for the 5th contingency table when the second contingency table row is the r' row. A value of “NaN” means that `Tune` chose to use a different (slower) method of drawing the internal cell counts because it suspected that the multinomial method would behave badly.

`acc.NNs` A list of length `num.runs`: Same as `vld.NNs`, except the entries represent the fraction of proposals accepted (instead of the fraction that are in the permissible parameter space).
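The multinomial proposal with one deterministically filled row, and the notion of a valid proposal tracked by `vld.NNs`, can be sketched as below. This is a simplified illustration under assumed inputs: the margins, the probabilities in `theta`, and the helper names `propose_counts` and `is_valid` are hypothetical, and the package's actual proposal may differ in its details.

```r
set.seed(7)

# Hypothetical margins for one contingency table (R = 3 rows, C = 2 columns).
row.totals <- c(40, 35, 25)
col.totals <- c(60, 40)   # sums to the same grand total as row.totals

# Hypothetical row probability vectors (THETAs); each row sums to one.
theta <- rbind(c(0.7, 0.3),
               c(0.5, 0.5),
               c(0.6, 0.4))

# Propose interior cell counts: draw every row except `r.det` from a
# multinomial, then fill row `r.det` so that the column totals are matched.
propose_counts <- function(r.det) {
  NN <- matrix(0, nrow = length(row.totals), ncol = length(col.totals))
  for (r in setdiff(seq_along(row.totals), r.det)) {
    NN[r, ] <- rmultinom(1, size = row.totals[r], prob = theta[r, ])
  }
  NN[r.det, ] <- col.totals - colSums(NN)
  NN
}

# A proposal is valid only if the deterministic row has no negative counts.
is_valid <- function(NN) all(NN >= 0)

# Fraction of valid proposals for each choice of deterministic row,
# analogous to one row of a vld.NNs matrix.
vld <- sapply(seq_along(row.totals), function(r.det) {
  mean(replicate(1000, is_valid(propose_counts(r.det))))
})
```

Because the deterministic row is computed from the column totals, every proposal matches both margins by construction; only the sign of the filled-in counts determines validity.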

## Author(s)

D. James Greiner, Paul D. Baines, & Kevin M. Quinn

## References

D. James Greiner & Kevin M. Quinn. 2009. “R x C Ecological Inference: Bounds, Correlations, Flexibility, and Transparency of Assumptions.” J.R. Statist. Soc. A 172:67-81.

## Examples

```
## Not run: 
library(RxCEcolInf)
data(stlouis)
Tune.stlouis <- Tune("Bosley, Roberts, Ribaudo, Villa, NoVote ~ bvap, ovap",
                     data = stlouis,
                     num.iters = 10000,
                     num.runs = 15)
## End(Not run)
```
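Once `Tune` has finished, the Value section suggests checking that the final column of `acc.t` lies between .2 and .5. A minimal sketch of that check follows. The object `tune.out` is a hypothetical stand-in for a `Tune` result, with made-up numbers for four tables and three runs, so that the snippet runs without the package or data.

```r
# Hypothetical stand-in for a Tune() result with I = 4 tables and
# num.runs = 3; the numbers are made up purely for illustration.
tune.out <- list(
  rhos  = c(0.08, 0.05, 0.12, 0.03),
  acc.t = cbind(c(0.71, 0.35, 0.64, 0.12),
                c(0.55, 0.33, 0.48, 0.18),
                c(0.41, 0.34, 0.44, 0.27))
)

# Acceptance fractions from the final run, one per contingency table.
final.acc <- tune.out$acc.t[, ncol(tune.out$acc.t)]

# Indices of tables whose final acceptance fraction is outside [.2, .5].
out.of.range <- which(final.acc < 0.2 | final.acc > 0.5)

if (length(out.of.range) == 0) {
  message("All final acceptance fractions are within [0.2, 0.5].")
} else {
  message("Tables needing more tuning: ",
          paste(out.of.range, collapse = ", "))
}
```

With a real result, the same check would be applied to `Tune.stlouis$acc.t` before passing `Tune.stlouis$rhos` on to `Analyze`.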

RxCEcolInf documentation built on Nov. 6, 2021, 5:07 p.m.