matching: Predicts unknown responses by matching
In NonProbEst: Estimation in Nonprobability Sampling


Predicts unknown responses using the matching method introduced by Rivers (2007). A model of the relationship between y_k and x_k is fitted on the convenience sample and then used to predict y_k for the units of the reference sample. The population total can then be estimated from these predictions with the 'total_estimation' function.
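
The idea can be illustrated without the package. The following is a minimal sketch in base R, assuming a numeric response and a linear model; the data are simulated and the names `convenience`, `reference`, `x1`, `x2` and `y` are hypothetical. It is not the internal code of `matching`, which trains its models through 'caret'.

```r
# Minimal sketch of the matching idea with base R (not NonProbEst internals).
# All data below are simulated and purely illustrative.
set.seed(42)

# Convenience (nonprobability) sample: covariates and response are observed.
convenience <- data.frame(x1 = rnorm(200), x2 = rbinom(200, 1, 0.4))
convenience$y <- 1 + 2 * convenience$x1 - convenience$x2 + rnorm(200, sd = 0.5)

# Reference (probability) sample: only the covariates are observed.
reference <- data.frame(x1 = rnorm(100), x2 = rbinom(100, 1, 0.5))

# Model the relationship between y and x using the convenience sample only.
fit <- lm(y ~ x1 + x2, data = convenience)

# Predict the unobserved response for every unit of the reference sample.
y_hat <- predict(fit, newdata = reference)

length(y_hat)  # one prediction per reference unit
```

The vector `y_hat` plays the role of the value returned by `matching`: one estimated response per row of the reference sample.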

matching(
  convenience_sample,
  reference_sample,
  covariates,
  estimated_var,
  positive_label = NULL,
  algorithm = "glm",
  proc = NULL,
  ...
)

`convenience_sample`	Data frame containing the non-probabilistic sample.
`reference_sample`	Data frame containing the probabilistic sample.
`covariates`	String vector specifying the common variables to use for training.
`estimated_var`	String specifying the variable to estimate.
`positive_label`	String specifying the label to be considered positive if the estimated variable is categorical. Leave it as the default NULL otherwise.
`algorithm`	A string specifying which classification or regression model to use (same as caret's method).
`proc`	A string or vector of strings specifying which of the data preprocessing techniques available in the train function of the 'caret' package, if any, should be applied to the data prior to model fitting. By default, its value is NULL and no preprocessing is applied.
`...`	Further parameters to be passed to the train function.

Training of the models is done via the 'caret' package. The value of `algorithm` must match one of the method names supported by 'caret'. If the estimated variable is categorical, probabilities are returned.
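
For a categorical response, the kind of 'caret' call these arguments correspond to might look like the sketch below. It assumes that `algorithm` maps to the `method` argument of `train` and `proc` to `preProcess`, as the argument descriptions suggest; the simulated data and the column names `age`, `income` and `vote` are hypothetical, and the actual internals of `matching` may differ.

```r
library(caret)

set.seed(1)

# Simulated convenience sample with a binary response stored as a factor.
n <- 300
convenience <- data.frame(age = rnorm(n, 45, 12), income = rnorm(n, 30, 8))
p <- plogis(-3 + 0.05 * convenience$age + 0.02 * convenience$income)
convenience$vote <- factor(ifelse(runif(n) < p, "T", "F"), levels = c("F", "T"))

# Simulated reference sample: covariates only.
reference <- data.frame(age = rnorm(100, 50, 15), income = rnorm(100, 28, 8))

covariates <- c("age", "income")

# Fit a single model on the convenience sample, without resampling.
model <- train(
  x = convenience[, covariates],
  y = convenience$vote,
  method = "glm",                     # assumed counterpart of `algorithm`
  preProcess = c("center", "scale"),  # assumed counterpart of `proc`
  trControl = trainControl(method = "none", classProbs = TRUE)
)

# Class probabilities for the reference sample; keep the positive label "T".
probs <- predict(model, newdata = reference[, covariates], type = "prob")
p_positive <- probs[["T"]]
```

Here `p_positive` holds one probability per reference unit, which is the form of output the Details describe for a categorical variable.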

A vector containing the estimated responses for the reference sample.

Rivers, D. (2007). Sampling for Web Surveys. Presented in Joint Statistical Meetings, Salt Lake City, UT.

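# Simple example with default parameters
library(NonProbEst)  # matching() and total_estimation()

N <- 50000  # population size
covariates <- c("education_primaria", "education_secundaria")
# Recode a numeric 0/1 response as a factor so that it is treated as categorical
if (is.numeric(sampleNP$vote_gen))
   sampleNP$vote_gen <- factor(sampleNP$vote_gen, c(0, 1), c('F', 'T'))
# Predictions for each unit of sampleP, with 'T' as the positive label
estimated_votes <- data.frame(
   vote_gen = matching(sampleNP, sampleP, covariates, "vote_gen", 'T')
)
# Every reference unit gets the same weight, N / n
total_estimation(estimated_votes, N / nrow(estimated_votes), c("vote_gen"), N)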

Loading required package: lattice
Loading required package: ggplot2
vote_gen 
 2418869

NonProbEst documentation built on July 1, 2020, 6:08 p.m.
