nonprobsvy News and Updates
- `pop.size`, `controlSel`, `controlOut` and `controlInf` were renamed to `pop_size`, `control_sel`, `control_out` and `control_inf`, respectively.
- `genSimData` removed completely, as it is not used anywhere in the package.
- `maxLik_method` renamed to `maxlik_method` in the `control_sel` function.
- `control_out` function:
  - `predictive_match` renamed to `pmm_match_type` to align with the PMM (Predictive Mean Matching) estimator naming convention, where all related parameters start with `pmm_`.
- `control_sel` function:
  - `method` removed, as it was not used.
  - `est_method_sel` renamed to `est_method`.
  - `h` renamed to `gee_h_fun` to make it more readable to the user.
  - `start_type` now accepts only `zero` and `mle` (for `gee` models only).
- `control_inf` function:
  - `bias_inf` renamed to `vars_combine` and its type changed to `logical`. `TRUE` if variables (their levels) should be combined after the variable selection algorithm for the doubly robust approach.
  - `pi_ij` -- argument removed, as it is not used.
- `nonprobsvy` class renamed to `nonprob` and all related methods adjusted to this change.
- `logit_model_nonprobsvy`, `probit_model_nonprobsvy` and `cloglog_model_nonprobsvy` removed in favour of the more readable `method_ps` function, which specifies the propensity score model.
- `control_inference=control_inf(vars_combine=TRUE)`, which allows the doubly robust estimator to combine variables prior to estimation, i.e. if `selection=~x1+x2` and `y~x1+x3`, then the following models are fitted: `selection=~x1+x2+x3` and `y~x1+x2+x3`. By default we set `control_inference=control_inf(vars_combine=FALSE)`. Note that this behaviour is assumed independently from variable selection.
- `nonprob(weights=NULL)` replaced with `nonprob(case_weights=NULL)` to stress that this refers to case weights, not sampling or other weights, in the non-probability sample.
- `jvs` (Job Vacancy Survey; a probability sample survey) and `admin` (Central Job Offers Database; a non-probability sample survey) datasets. The units and auxiliary variables have been aligned in a way that allows the data to be integrated using the methods implemented in this package.
- `check_balance` function was added to check the balance in the totals of the variables based on the weighted weights between the non-probability and probability samples.
- `na_action` with default `na.omit`.
- `weights` -- returns IPW weights
- `update` -- allows updating the `nonprob` class object
- `method_ps` -- for modelling the propensity score
- `method_glm` -- for modelling y using the `glm` function
- `method_nn` -- for the NN method
- `method_pmm` -- for the PMM method
- `method_npar` -- for the non-parametric method
- `print.nonprob`, `summary.nonprob` and `print.nonprob_summary` methods

    > result_mi
    A nonprob object
     - estimator type: mass imputation
     - method: glm (gaussian)
     - auxiliary variables source: survey
     - vars selection: false
     - variance estimator: analytic
     - population size fixed: false
     - naive (uncorrected) estimators:
       - variable y1: 3.1817
       - variable y2: 1.8087
     - selected estimators:
       - variable y1: 2.9498 (se=0.0420, ci=(2.8674, 3.0322))
       - variable y2: 1.5760 (se=0.0326, ci=(1.5122, 1.6399))

The number of digits can be changed using `print(x, digits)`, as shown below:

    > print(result_mi,2)
    A nonprob object
     - estimator type: mass imputation
     - method: glm (gaussian)
     - auxiliary variables source: survey
     - vars selection: false
     - variance estimator: analytic
     - population size fixed: false
     - naive (uncorrected) estimators:
       - variable y1: 3.18
       - variable y2: 1.81
     - selected estimators:
       - variable y1: 2.95 (se=0.04, ci=(2.87, 3.03))
       - variable y2: 1.58 (se=0.03, ci=(1.51, 1.64))

    > summary(result_mi) |> print(digits=2)
    A nonprob_summary object
     - call: nonprob(data = subset(population, flag_bd1 == 1), outcome = y1 +
        y2 ~ x1 + x2, svydesign = sample_prob)
     - estimator type: mass imputation
     - nonprob sample size: 693011 (69.3%)
     - prob sample size: 1000 (0.1%)
     - population size: 1000000 (fixed: false)
     - detailed information about models are stored in list element(s): "outcome"
    ----------------------------------------------------------------
     - distribution of outcome residuals:
       - y1: min: -4.79; mean: 0.00; median: 0.00; max: 4.54
       - y2: min: -4.96; mean: -0.00; median: -0.07; max: 12.25
     - distribution of outcome predictions (nonprob sample):
       - y1: min: -2.72; mean: 3.18; median: 3.04; max: 16.28
       - y2: min: -1.55; mean: 1.81; median: 1.58; max: 13.92
     - distribution of outcome predictions (prob sample):
       - y1: min: -0.46; mean: 2.95; median: 2.84; max: 10.31
       - y2: min: -0.58; mean: 1.58; median: 1.39; max: 7.87
    ----------------------------------------------------------------

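
The effect of `vars_combine=TRUE` described in this changelog (fitting `selection=~x1+x2+x3` and `y~x1+x2+x3` when the user supplies `selection=~x1+x2` and `y~x1+x3`) can be illustrated with base R alone. The sketch below is not the package's implementation: it only mimics the variable union on two formulas, and the object names (`sel_vars`, `out_vars`, `combined`, `selection_combined`, `outcome_combined`) are made up for the illustration.

```r
# Illustration only: mimic the variable union implied by vars_combine = TRUE.
# This uses base R formula tools, not nonprobsvy internals.
selection <- ~ x1 + x2
outcome   <- y ~ x1 + x3

# Right-hand-side variables of each formula.
sel_vars <- all.vars(selection)        # one-sided formula: RHS only
out_vars <- all.vars(outcome[[3]])     # third element of a two-sided formula is the RHS

# Union of the auxiliary variables used by either model.
combined <- sort(union(sel_vars, out_vars))

# Rebuild both formulas on the combined variable set.
selection_combined <- reformulate(combined)
outcome_combined   <- reformulate(combined, response = deparse(outcome[[2]]))

print(selection_combined)
print(outcome_combined)
```

With the default `vars_combine=FALSE`, each model keeps the variables the user specified, so no such union is taken.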
- `formula.tools`
- `strata` is not supported for the time being.
- Passing the `maxit` argument from the `controlSel` function to the internally used `nleqslv` function.
- `vector` in `model_frame` when predicting `y_hat` in the mass imputation `glm` model when X is based on one auxiliary variable only: fixed by converting it to a `data.frame` object.
- `summary`: information about the quality of estimation, based on the difference between the estimated and known totals of the auxiliary variables.
- `controlOut` function changed by switching the values of the `predictive_match` argument. From now on, `predictive_match = 1` means $\hat{y}-\hat{y}$ matching in predictive mean matching imputation, and `predictive_match = 2` corresponds to $\hat{y}-y$ matching.
- `div` option for variable selection (more in the documentation) in doubly robust estimation.
- `nonprob` output: quantities such as the gradient, Hessian and Jacobian derived from IPW estimation for the `mle` and `gee` methods, when an `IPW` or `DR` model is executed.
- `nonprob` output when an `IPW` or `DR` model is executed.
- `model_frame` matrix data from the probability sample used for mass imputation, added to `nonprob` when an `MI` or `DR` model is executed.
- `logit`, `complementary log-log` and `probit` link functions.
- `generalized linear models`, `nearest neighbours` and `predictive mean matching` methods for Mass Imputation
- `SCAD`, `LASSO` and `MCP` penalization equations
- `analytic` and `bootstrap` (with parallel computation - `doSNOW` package) variance for the described estimators
- Methods for the `nonprob` class, such as:
  - `nobs` for sample sizes
  - `pop.size` for population size estimation
  - `residuals` for residuals of the inverse probability weighting model
  - `cooks.distance` for identifying influential observations that have a significant impact on the parameter estimates
  - `hatvalues` for measuring the leverage of individual observations
  - `logLik` for computing the log-likelihood of the model
  - `AIC` (Akaike Information Criterion) for evaluating the model based on the trade-off between goodness of fit and complexity, helping in model selection
  - `BIC` (Bayesian Information Criterion) for a similar purpose as AIC but with a stronger penalty for model complexity
  - `confint` for calculating confidence intervals around parameter estimates
  - `vcov` for obtaining the variance-covariance matrix of the parameter estimates
  - `deviance` for assessing the goodness of fit of the model
- `R-cmd` check
- `nonprob` function.
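
The generics listed above for the `nonprob` class (`nobs`, `residuals`, `cooks.distance`, `hatvalues`, `logLik`, `AIC`, `BIC`, `confint`, `vcov`, `deviance`) are standard R generics. As a rough sketch of what each one returns, the example below calls them on an ordinary `stats::glm` fit to the built-in `mtcars` data. This is a stand-in only: it is an assumption that calling the same generics on a `nonprob` object follows the same pattern, and the exact output for `nonprob` objects is not shown here.

```r
# Stand-in model: a gaussian glm on the built-in mtcars data, NOT a nonprob object.
fit <- glm(mpg ~ wt, data = mtcars, family = gaussian())

n_used   <- nobs(fit)             # number of observations used in the fit
res      <- residuals(fit)        # model residuals (deviance type by default for glm)
cooks    <- cooks.distance(fit)   # influence of each observation on the estimates
leverage <- hatvalues(fit)        # leverage of each observation
ll       <- logLik(fit)           # log-likelihood of the fitted model
aic      <- AIC(fit)              # goodness of fit vs. complexity
bic      <- BIC(fit)              # same idea, stronger complexity penalty
ci       <- confint(fit)          # confidence intervals for the coefficients
vc       <- vcov(fit)             # variance-covariance matrix of the coefficients
dev      <- deviance(fit)         # residual deviance

print(n_used)
print(round(c(AIC = aic, BIC = bic), 2))
print(ci)
```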