RDS_2S_PSE: Estimate population size from two lists where the second list...

View source: R/RDS_2S_PSE.R

Description

Estimate the size of a population from two-source (capture-recapture) data. The first list consists of the individuals who received some mark or identifying object or, equivalently, the roster of individuals who received some service or attended some event. The second list (the recapture data) consists of the individuals who were recruited by respondent-driven sampling; a respondent-driven sampling weight must be available for each individual on that list.

Usage

RDS_2S_PSE(n.list1 = NULL, data = NULL, list2 = NULL,
  weight = NULL, bootreps = 2000, conf.level = 0.95,
  include.naive = FALSE, seed = Sys.time())

Arguments

n.list1

the total number of individuals who received some mark or identifying object, or the total number of individuals listed on some roster.

data

a data frame with at least two columns of data on individuals who were recruited to a respondent-driven sampling survey. Each row represents one unique respondent. One column is a logical (TRUE/FALSE) or binary (1/0) indicator variable for receipt of the mark (unique object). The other required column contains the individual respondent-driven sampling weights. All other columns are ignored.

list2

a character string which identifies the column name in the data for the logical (TRUE/FALSE) or binary (1/0) indicator for whether each respondent-driven survey respondent received the mark (unique object). A value of FALSE or 0 in that data frame column indicates that the respondent did not receive the mark (unique object) or, equivalently, was not listed on a service roster, and a value of TRUE or 1 indicates receipt of the mark or inclusion in the roster.

weight

a character string which identifies the column name in data for the numeric respondent-driven survey weights.

bootreps

a numeric value for the number of bootstrap replications required for estimation of confidence intervals.

conf.level

a numeric value in the open interval (0, 1) for the desired confidence level.

include.naive

a logical value indicating whether a naive (unweighted) estimate of population size is also computed. Unweighted estimates are provided for comparison only and should not be used for estimation from respondent-driven sampling.

seed

a numeric seed for the pseudo-random number generator used for bootstrap replication. Defaults to system time. Set to a particular value to generate repeatable replications.
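To make the roles of bootreps, conf.level and seed concrete, here is a minimal base-R sketch of a seeded percentile bootstrap. It is an illustration under stated assumptions, not the package's implementation: this page does not say which bootstrap scheme or interval type RDS_2S_PSE uses, the helper boot_ci is hypothetical, the toy data are invented, and the point estimator inside it is assumed to be the roster size divided by the weighted proportion marked.

```r
# Hypothetical helper, NOT part of RDSpopsize: percentile bootstrap interval
# for an assumed estimator N = n_list1 / (weighted proportion marked).
boot_ci <- function(n_list1, marked, weight, bootreps = 2000,
                    conf.level = 0.95, seed = 12345) {
  set.seed(seed)                       # a fixed seed gives repeatable replicates
  marked <- as.numeric(marked)         # accept TRUE/FALSE or 1/0
  n <- length(marked)
  est <- replicate(bootreps, {
    i <- sample.int(n, n, replace = TRUE)            # resample respondents
    n_list1 * sum(weight[i]) / sum(weight[i] * marked[i])
  })
  est <- est[is.finite(est)]           # drop replicates with no marked respondent
  alpha <- 1 - conf.level
  quantile(est, c(alpha / 2, 1 - alpha / 2), names = FALSE)
}

# Toy data invented for illustration only
marked <- rep(c(TRUE, FALSE, FALSE, FALSE), 25)
weight <- rep(c(1, 2, 1.5, 0.5), 25)

ci1 <- boot_ci(1000, marked, weight, seed = 42)
ci2 <- boot_ci(1000, marked, weight, seed = 42)
identical(ci1, ci2)                    # same seed, same interval
```

Resampling respondents independently ignores the recruitment-chain structure of respondent-driven samples, so treat this only as a picture of how the three arguments interact.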

Details

This function can also be used for "multiplier-method" sampling, in which the multiplier is the fraction of the individuals included in the second list who are listed on some roster or service list.
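The multiplier idea can be sketched in a few lines of base R. This assumes the usual multiplier form, in which the roster size is divided by the weighted proportion of survey respondents who appear on the roster; the page does not confirm that RDS_2S_PSE computes exactly this, and the helper multiplier_pse and the toy data are hypothetical.

```r
# Hypothetical helper, NOT part of RDSpopsize: multiplier-method estimate.
multiplier_pse <- function(n_list1, marked, weight = rep(1, length(marked))) {
  marked <- as.numeric(marked)                  # accept TRUE/FALSE or 1/0
  p_hat <- sum(weight * marked) / sum(weight)   # (weighted) proportion on roster
  n_list1 / p_hat                               # roster size / multiplier
}

# Toy data invented for illustration only
marked <- c(TRUE, FALSE, FALSE, TRUE, FALSE)
weight <- c(1, 2, 1, 1, 3)

est_weighted <- multiplier_pse(1000, marked, weight)  # 1000 / (2/8) = 4000
est_naive    <- multiplier_pse(1000, marked)          # 1000 / (2/5) = 2500
```

The gap between the two values shows why the weights matter: respondents with large weights who are not on the roster pull the weighted proportion down and the population estimate up.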

Value

A data frame containing seven columns:

Type

RDS-weighted or naive estimate

Estimate

Point estimate of population size

conf.level

Confidence level

lower

Lower confidence limit

upper

Upper confidence limit

CI_type

Confidence interval type

reps

Number of bootstrap replicates

References

Berchenko Y, Frost SDW. Capture-recapture methods and respondent-driven sampling; their potential and limitations. Sexually Transmitted Infections 2011; 87(4):267-268.

Examples

library(RDSpopsize)
data(FSW)
help(FSW)
## Estimate the number of female sex workers
RDS_2S_PSE(1000, data = FSW, list2 = "I.object", weight = "Weight")
## Unweighted estimates are not appropriate for data from respondent-
## driven sampling, but suppose you want to compare to see the effect
## of weighting
RDS_2S_PSE(1000, data = FSW, list2 = "I.object", weight = "Weight",
           include.naive = TRUE)

sgutreuter/RDSpopsize documentation built on Nov. 20, 2019, 4:16 p.m.