impute.visibility_mle: Estimates each person's personal visibility based on their...
In RDS: Respondent-Driven Sampling

impute.visibility_mle

R Documentation

Description

Estimates each person's personal visibility based on their self-reported degree and the number of their (direct) recruits. It uses the time the person was recruited as a factor in determining the number of recruits they produce.

Usage

impute.visibility_mle(
  rds.data,
  max.coupons = NULL,
  type.impute = c("distribution", "mode", "median", "mean"),
  recruit.time = NULL,
  include.tree = FALSE,
  unit.scale = NULL,
  unit.model = c("cmp", "nbinom"),
  optimism = FALSE,
  guess = NULL,
  reflect.time = TRUE,
  maxit = 100,
  K = NULL,
  verbose = TRUE
)

Arguments

`rds.data`	An rds.data.frame
`max.coupons`	The number of recruitment coupons distributed to each enrolled subject (i.e. the maximum number of recruitees for any subject). By default it is taken from the attribute or data; failing that, the maximum recorded number of coupons is used.
`type.impute`	The type of imputation based on the conditional distribution. It can be `distribution`, `mode`, `median`, or `mean`; the first, the default, is a random draw from the conditional distribution.
`recruit.time`	vector; An optional value for the date/time that the person was interviewed. It needs to resolve to a numeric vector with one element for each row of the data that has a non-missing value of the network variable. If it is a character name of a variable in the data then that variable is used. If it is NULL then the sequence number of the recruit in the data is used. If it is NA then the recruitment time is not used in the model. Otherwise, the recruitment time is used in the model to better predict the visibility of the person.
`include.tree`	logical; If `TRUE`, augment the reported network size by the number of recruits and one for the recruiter (if any). This reflects a more accurate value for the visibility, but is not the self-reported degree. In particular, it typically produces a positive visibility (compared to a possibly zero self-reported degree).
`unit.scale`	numeric; If not `NULL`, it sets the numeric value of the scale parameter of the distribution of the unit sizes. For the negative binomial, it is the multiplier on the variance of the negative binomial compared to a Poisson (via the Poisson-Gamma mixture representation). Sometimes the scale is unnaturally large (e.g. 40), so this gives the option of fixing it (rather than using the MLE of it). The model is fit with the parameter fixed at this passed value.
`unit.model`	The type of distribution for the unit sizes. It can be `nbinom`, meaning a negative binomial. In this case, `unit.scale` is the multiplier on the variance of the negative binomial compared to a Poisson of the same mean. The alternative is `cmp`, meaning a Conway-Maxwell-Poisson distribution. In this case, `unit.scale` is the scale parameter compared to a Poisson of the same mean (values less than one mean under-dispersed and values over one mean over-dispersed). The default is `cmp`.
`optimism`	logical; If `TRUE` then add a term to the model allowing the (proportional) inflation of the self-reported degrees relative to the unit sizes.
`guess`	vector; if not `NULL`, the initial parameter values for the MLE fitting.
`reflect.time`	logical; If `FALSE` then the `recruit.time` is the time before the end of the study (instead of the time since the survey started or chronological time).
`maxit`	integer; The maximum number of iterations in the likelihood maximization. By default it is 100.
`K`	integer; The maximum degree. All self-reported degrees above this are recorded as being at least K. By default it is the 95th percentile of the self-reported network sizes.
`verbose`	logical; if this is `TRUE`, the program will print out additional information.
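
The descriptions of `unit.scale` (for `unit.model = "nbinom"`) and `K` can be illustrated with a small base-R sketch. It does not call the RDS package and is not the package's internal code: the mean `mu`, the multiplier `scale`, and the simulated degrees are hypothetical values chosen for illustration, and the way the package itself computes the percentile is an assumption here.

```r
# Base-R sketch only; all numbers below are made up for illustration.
set.seed(1)

# unit.model = "nbinom": unit.scale read as a variance multiplier.
mu    <- 8                   # hypothetical mean unit size
scale <- 3                   # hypothetical unit.scale: variance = scale * mean
size  <- mu / (scale - 1)    # in rnbinom(), var = mu + mu^2 / size = scale * mu

x <- rnbinom(1e5, mu = mu, size = size)
ratio <- var(x) / mean(x)    # empirical variance-to-mean ratio, close to `scale`

# K: ceiling at the 95th percentile of the self-reported degrees.
degree <- rnbinom(500, mu = 10, size = 2)   # hypothetical self-reported degrees
K <- unname(quantile(degree, 0.95))
degree_capped <- pmin(degree, K)            # degrees above K are treated as "at least K"
```

A multiplier of 1 corresponds to the Poisson limit (the formula for `size` above requires `scale > 1`), which is why an estimated scale such as 40 signals very heavy over-dispersion.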

References

McLaughlin, K.R., M.S. Handcock, and L.G. Johnston, 2015. Inference for the visibility distribution for respondent-driven sampling. In JSM Proceedings. Alexandria, VA: American Statistical Association. 2259-2267.

Examples

## Not run: 
library(RDS)
data(fauxmadrona)
# The next line fits the model for the self-reported personal
# network sizes and imputes the personal network sizes.
# It may take up to 60 seconds.
visibility <- impute.visibility_mle(fauxmadrona)
# frequency of estimated personal visibility
table(visibility)

## End(Not run)
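
The four `type.impute` options differ only in how a single value is taken from a person's conditional distribution. The base-R sketch below shows that difference on a hypothetical conditional distribution; the support and probabilities are made up, and how the package computes each summary internally is not documented on this page, so treat this as an illustration of the idea rather than the package's implementation.

```r
# Hypothetical conditional distribution of one person's visibility.
support <- 1:6
prob    <- c(0.05, 0.10, 0.30, 0.25, 0.20, 0.10)   # made-up probabilities, sum to 1

set.seed(42)
imputed <- list(
  distribution = sample(support, size = 1, prob = prob),   # random draw (the default)
  mode         = support[which.max(prob)],                 # most probable value
  median       = support[which(cumsum(prob) >= 0.5)[1]],   # smallest value with CDF >= 0.5
  mean         = sum(support * prob)                       # expected value, may be non-integer
)
```

The `distribution` option varies from run to run unless a seed is set, whereas the other three are deterministic given the fitted model.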

RDS documentation built on Sept. 11, 2024, 8:13 p.m.