ml_problem: Create an instance of a fitting problem

View source: R/ml_problem.R

Description

The ml_problem() function is the first step for fitting a reference sample to known control totals with mlfit. All algorithms (see ml_fit()) expect an object created by this function (or optionally processed with flatten_ml_fit_problem()).

The special_field_names() function constructs the value for the field_names argument of ml_problem().

Usage

ml_problem(
  ref_sample,
  controls = list(individual = individual_controls, group = group_controls),
  field_names,
  individual_controls,
  group_controls,
  prior_weights = NULL,
  geo_hierarchy = NULL
)

is_ml_problem(x)

## S3 method for class 'ml_problem'
format(x, ...)

## S3 method for class 'ml_problem'
print(x, ...)

special_field_names(
  groupId,
  individualId,
  individualsPerGroup = NULL,
  count = NULL,
  zone = NULL,
  region = NULL
)

Arguments

ref_sample

The reference sample

controls

Control totals, by default initialized from the individual_controls and group_controls arguments

field_names

Names of special fields, construct using special_field_names()

individual_controls, group_controls

Control totals at individual and group level, given as a list of data frames where each data frame defines a control

prior_weights

Prior (or design) weights at group level; by default a vector of ones will be used, which corresponds to random sampling of groups

geo_hierarchy

A table that maps each area of a larger zoning level to the zones of a smaller zoning level that it contains. The column name of the larger level should be specified in field_names as 'region' and that of the smaller level as 'zone'.

x

An object

...

Ignored.

groupId, individualId

Name of the column that defines the ID of the group or the individual

individualsPerGroup

Obsolete.

count

Name of the control total column in the control tables (by default, the first numeric column in each control is used).

region, zone

Name of the column that defines the region of the reference sample or the zone of the controls. Note that region is a larger area that contains more than one zone.
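
As an illustration of the geo_hierarchy and region/zone arguments described above, the following base R sketch builds a mapping table of the expected shape. The column names REGION and ZONE and the area labels are hypothetical, chosen only for this example; the sketch does not call mlfit itself.

```r
# Hypothetical mapping table: each row assigns one zone (smaller level)
# to the region (larger level) that contains it.
geo_hierarchy <- data.frame(
  REGION = c("north", "north", "south", "south", "south"),
  ZONE   = c("n1", "n2", "s1", "s2", "s3"),
  stringsAsFactors = FALSE
)

# Sanity check for this sketch: every zone appears in exactly one region.
stopifnot(!anyDuplicated(geo_hierarchy$ZONE))

# Number of zones contained in each region
zones_per_region <- table(geo_hierarchy$REGION)
zones_per_region
```

With a table like this, the column names would be passed as region = "REGION" and zone = "ZONE" in special_field_names(), and the table itself as geo_hierarchy in ml_problem().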

Value

An object of class ml_problem, or a list of such objects if geo_hierarchy was given. Each object is essentially a named list with the following components:

refSample

The reference sample, a data.frame.

controls

A named list with two components, individual and group. Each contains a list of controls as data.frames.

fieldNames

A named list with the names of special fields.

is_ml_problem() returns a logical.

Examples

# Create example from Ye et al., 2009
library(mlfit)

# Provide reference sample
ye <- tibble::tribble(
  ~HHNR, ~PNR, ~APER, ~HH_VAR, ~P_VAR,
  1, 1, 3, 1, 1,
  1, 2, 3, 1, 2,
  1, 3, 3, 1, 3,
  2, 4, 2, 1, 1,
  2, 5, 2, 1, 3,
  3, 6, 3, 1, 1,
  3, 7, 3, 1, 1,
  3, 8, 3, 1, 2,
  4, 9, 3, 2, 1,
  4, 10, 3, 2, 3,
  4, 11, 3, 2, 3,
  5, 12, 3, 2, 2,
  5, 13, 3, 2, 2,
  5, 14, 3, 2, 3,
  6, 15, 2, 2, 1,
  6, 16, 2, 2, 2,
  7, 17, 5, 2, 1,
  7, 18, 5, 2, 1,
  7, 19, 5, 2, 2,
  7, 20, 5, 2, 3,
  7, 21, 5, 2, 3,
  8, 22, 2, 2, 1,
  8, 23, 2, 2, 2
)
ye

# Specify control at household level
ye_hh <- tibble::tribble(
  ~HH_VAR, ~N,
  1,       35,
  2,       65
)
ye_hh

# Specify control at person level
ye_ind <- tibble::tribble(
  ~P_VAR, ~N,
  1, 91,
  2, 65,
  3, 104
)
ye_ind

# Combine reference sample, special field names, and controls into a fitting problem
ye_problem <- ml_problem(
  ref_sample = ye,
  field_names = special_field_names(
    groupId = "HHNR", individualId = "PNR", count = "N"
  ),
  group_controls = list(ye_hh),
  individual_controls = list(ye_ind)
)
ye_problem

# Fit the problem and inspect the resulting weights
fit <- ml_fit_dss(ye_problem)
fit$weights
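
One way to judge a set of weights is to compare the weighted totals of the reference sample with the controls. The base R sketch below recomputes those totals for the example data. The vector w is a placeholder of ones (one weight per row of the sample), not the output of ml_fit_dss(); that fitted weights can be substituted in the same per-row layout is an assumption, not something this page confirms.

```r
# Reference sample from the example above, reduced to the columns needed here
ye <- data.frame(
  HHNR   = c(1,1,1,2,2,3,3,3,4,4,4,5,5,5,6,6,7,7,7,7,7,8,8),
  HH_VAR = c(1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2),
  P_VAR  = c(1,2,3,1,3,1,1,2,1,3,3,2,2,3,1,2,1,1,2,3,3,1,2)
)

# Placeholder weights: one per row, all equal to 1 (assumption for this sketch)
w <- rep(1, nrow(ye))

# Person-level totals: sum the weights of all persons in each P_VAR category
person_totals <- tapply(w, ye$P_VAR, sum)

# Household-level totals: count each household once (its first row only)
first_row <- !duplicated(ye$HHNR)
household_totals <- tapply(w[first_row], ye$HH_VAR[first_row], sum)

person_totals     # compare with ye_ind$N (91, 65, 104)
household_totals  # compare with ye_hh$N  (35, 65)
```

With unit weights the totals are simply the sample counts, which shows how far the unweighted sample is from the control totals that the fit is meant to reproduce.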

mlfit documentation built on Oct. 8, 2021, 9:09 a.m.