createMRGobject: Create a single object containing all necessary objects for...

View source: R/createMRGobject.R

createMRGobject    R Documentation

Create a single object containing all necessary objects for multiResGrid functions

Description

Create a single object containing all necessary objects for multiResGrid functions

Prints MRG-objects

Usage

createMRGobject(
  ifg,
  ress = c(1, 5, 10, 20, 40) * 1000,
  geovar = c("GEO_LCT", "geometry"),
  lnames = NULL,
  vars = NULL,
  weights = NULL,
  mincount = 10,
  countFeatureOrTotal = "feature",
  nlarge = 2,
  plim = 0.85,
  verbose = FALSE,
  nclus = 1,
  clusType = NULL,
  domEstat = TRUE,
  consistencyCheck = FALSE,
  outfile = NULL,
  splitlim = 5e+07,
  checkDominance = TRUE,
  checkReliability = FALSE,
  userfun = NULL,
  strat = NULL,
  confrules = "individual",
  suppresslim = 0,
  sumsmall = FALSE,
  suppresslimSum = 0,
  reliabilitySplit = TRUE,
  pseudoreg = NULL,
  plotIntermediate = FALSE,
  addIntermediate = FALSE,
  locAdj = "LL",
  postProcess = TRUE,
  rounding = -1,
  remCols = TRUE,
  ...
)

## S3 method for class 'MRG'
print(x, ...)

Arguments

ifg

Either a data.frame or tibble or sf-object with the locations and the data of the survey or census data, or a list of such objects.

ress

A vector with the different resolutions

geovar

Name of the geodata variable in the objects. Must be the same for all of the surveys/censuses if the data sets are not submitted as sf-objects

lnames

Names for the different surveys or censuses if ifg is a list. Typically it could be survey years

vars

Variable(s) of interest that should be aggregated (necessary when ifg is used for individual farm specific anonymization rules)

weights

Extrapolation factor(s) (weights) w_i of unit i in the sample of units n_c falling into a specific cell c. Weights are used for disclosure control measures. A weight of 1 will be used if missing. If only one weight is given, it will be used for all variables. If the length is more than one, the length has to be equal to the number of variables. If the same weight is used for several variables, it must be repeated in the weights-vector
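The length rule for weights can be sketched in base R. The helper expand_weights below is hypothetical and not part of the MRG package; it only mirrors the rule described above. The column names other than "UAA" are illustrative assumptions.

```r
# Hypothetical helper, not part of MRG: mirrors the length rule for `weights`.
expand_weights <- function(vars, weights = NULL) {
  if (is.null(weights)) {
    # Missing weights: the documentation says a weight of 1 is used
    return(NULL)
  }
  if (length(weights) == 1) {
    # A single weight column is reused for every variable
    return(rep(weights, length(vars)))
  }
  if (length(weights) != length(vars)) {
    stop("weights must have length 1 or the same length as vars")
  }
  weights
}

vars <- c("UAA", "UAAXK0000_ORG")          # second name is illustrative
w_single <- expand_weights(vars, "EXT_CORE")
w_pair <- expand_weights(vars, c("EXT_CORE", "EXT_CORE"))
```

Both calls give the same result: the weight name appears once per variable.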

mincount

The minimum number of farms for a grid cell (threshold rule)

countFeatureOrTotal

Should the frequency limit be applied to records with a positive value for a certain feature, or to all records, regardless of the value of the feature?
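The difference between the two ways of counting can be illustrated with a small base-R sketch. The values are made up, and the assumption that a cell passes when its count is at least mincount is an interpretation of the threshold rule, not code taken from the package.

```r
# Made-up values of one feature for the records in a single grid cell
cell_values <- c(12.5, 0, 0, 3.2, 0, 8.1, 0, 0, 1.4, 0, 0, 2.0)
mincount <- 10

n_total <- length(cell_values)      # all records in the cell
n_feature <- sum(cell_values > 0)   # only records with a positive value

# Assumed pass condition: count is at least mincount
passes_total <- n_total >= mincount
passes_feature <- n_feature >= mincount
```

Here the cell holds 12 records but only 5 with a positive value, so it passes the threshold when all records are counted and fails when only positive records are counted.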

nlarge

Number of largest farms considered in the dominance rule: the nlarge largest farms should account for at most the share plim of the total value of the variable in the grid cell (see the details of gridData)

plim

See nlarge
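A simplified sketch of the dominance check described for nlarge and plim, in base R. The helper is_dominated is hypothetical; it ignores weights and treats plim as a share between 0 and 1, which matches the default of 0.85 but is otherwise an assumption. The actual implementation is described in the details of gridData.

```r
# Hypothetical helper, not part of MRG: TRUE when the nlarge largest
# contributors hold more than the share plim of the cell total.
# Weights are ignored in this simplified sketch.
is_dominated <- function(values, nlarge = 2, plim = 0.85) {
  total <- sum(values)
  if (total <= 0) return(FALSE)
  k <- min(nlarge, length(values))
  top <- sum(sort(values, decreasing = TRUE)[seq_len(k)])
  top / total > plim
}

dominated <- is_dominated(c(900, 50, 30, 20))     # top two hold 950 of 1000
balanced <- is_dominated(c(300, 250, 250, 200))   # top two hold 550 of 1000
```

The first cell would be flagged (share 0.95), the second would not (share 0.55).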

verbose

Indicates if some extra output should be printed. Usually TRUE/FALSE, but can also have a value of 2 for multiResGrid for even more output.

nclus

Number of clusters to use for parallel processing. No parallelization is used for nclus = 1.

clusType

The type of cluster; see makeCluster for more details. The default of makeCluster is used if type is missing or NA

domEstat

Should the dominance rule be applied as in the IFS handbook (TRUE), where the weights are rounded before finding the first nlarge contributors, or should it be the first nlarge contributors*weight, where also fractions are considered (FALSE)?

consistencyCheck

logical; whether consistency between the gridded values and the similar values from ifg should be checked. The gridded value is derived from rasterize and the second one from st_join. The two methods can in some cases treat border cases between grid cells differently.

outfile

File to direct the output in case of parallel processing, see makeCluster for more details.

splitlim

For large data sets: split the data set into batches of approximately splitlim size

checkDominance

Logical - should the dominance rule be applied?

checkReliability

Logical - should the prediction variance be checked, and used for the aggregation? This considerably increases computation time

userfun

This gives the possibility to add a user defined function with additional confidentiality rules which the grid cell has to pass

strat

Column name defining the strata for stratified sampling, used if checkReliability is TRUE

confrules

Should the frequency rule (number of holdings) refer to the number of holdings with a value of the individual vars above zero ("individual") or the total number of holdings in the data set ("total")?

suppresslim

Parameter that can be used to avoid that almost empty grid cells are merged with cells with considerably higher number of observations. The value is a minimum share of the total potential new cell for a grid cell to be aggregated. See below for more details.
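The share that suppresslim refers to can be sketched as follows. This is only an illustration of "share of the total potential new cell" with made-up counts; how the function combines this share with the other confidentiality rules is not shown here.

```r
# Made-up counts for four small grid cells that could form one larger cell
counts <- c(1, 3, 40, 56)
suppresslim <- 0.02

# Share of each small cell in the potential new (larger) cell
shares <- counts / sum(counts)

# Assumed reading: a cell is a candidate for aggregation only if its
# share reaches suppresslim
meets_share <- shares >= suppresslim
```

With these counts, the first cell (share 0.01) falls below the limit, while the others reach it. With the default suppresslim = 0, every cell reaches the limit.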

sumsmall

Logical; should the suppresslimSum value be applied on the sum of small grid cells within the lower resolution grid cell? Note that different combinations of suppresslim and suppresslimSum values might not give completely intuitive results. For instance, if both are equal, then a higher value can lead to more grid cells being left unaggregated for smaller grid sizes, leading to aggregation for a large grid cell

suppresslimSum

Parameter similar to suppresslim, but affecting the total of grid cells to be suppressed

reliabilitySplit

Logical or number - parameter to be used in the calculation of the reliability (if checkReliability = TRUE). It can either give the number of groups, or if TRUE, it will create groups of approximately 50,000 records per group. If FALSE, the data set will not be split, regardless of its size.
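The three forms of reliabilitySplit can be sketched with a hypothetical helper. The use of ceiling() to reach "approximately 50,000 records per group" is an assumption; the package may compute the number of groups differently.

```r
# Hypothetical helper, not part of MRG: number of groups implied by
# reliabilitySplit for a data set with n_records records.
n_groups <- function(n_records, reliabilitySplit = TRUE, group_size = 50000) {
  if (isTRUE(reliabilitySplit)) {
    # TRUE: groups of roughly group_size records (ceiling is an assumption)
    return(max(1, ceiling(n_records / group_size)))
  }
  if (isFALSE(reliabilitySplit)) {
    # FALSE: never split, whatever the size
    return(1)
  }
  # A number: taken directly as the number of groups
  as.integer(reliabilitySplit)
}

g_auto <- n_groups(120000)          # 3 groups of about 40,000 records
g_none <- n_groups(120000, FALSE)   # 1 group
g_user <- n_groups(120000, 4)       # 4 groups
```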

pseudoreg

A column with regions to be used to define pseudostrata if checkReliability is TRUE. This is used for the cases when one or more strata have only a single record (and the weight is different from one). This makes variance calculation impossible, so such strata are merged into a pseudostratum. If pseudoreg is given (for example a column with the country name, or NUTS2 region), the pseudostrata will be created separately for each pseudoreg region.
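The idea of merging single-record strata into pseudostrata per region can be sketched in base R. The data and column names are made up, the weight condition mentioned above is ignored, and the package's own merging logic may differ.

```r
# Made-up records with a region column and a stratum column
d <- data.frame(
  region = c("A", "A", "A", "A", "B", "B", "B", "B"),
  strat = c("s1", "s1", "s2", "s3", "s4", "s4", "s5", "s6"),
  stringsAsFactors = FALSE
)

# Strata with a single record do not allow a variance estimate
n_per_strat <- table(d$strat)
singleton <- d$strat %in% names(n_per_strat)[n_per_strat == 1]

# Merge the singletons into one pseudostratum per region
d$pseudostrat <- ifelse(singleton, paste0("pseudo_", d$region), d$strat)
table(d$pseudostrat)
```

After merging, every stratum in this example holds two records, so a within-stratum variance can be computed.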

plotIntermediate

Logical or number - make a simple plot showing which grid cells have already passed the frequency rule. If plotIntermediate = TRUE, the function will wait 5 seconds after plotting before continuing, otherwise it will wait plotIntermediate seconds.

addIntermediate

Logical; will add a list of all intermediate himgs and lohs (overlay of himg and the lower resolution grid) as an attribute to the object to be returned

locAdj

Parameter to adjust the coordinates if they are exactly on the borders between grid cells. The values can either be FALSE, or "jitter" (adding a small random value to the coordinates, essentially spreading them randomly around the real location), "UR", "UL", "LR" or "LL", to describe which corner of the grid cell the location belongs to (upper right, upper left, lower right or lower left).

postProcess

Logical; should the postprocessing be done as part of creation of the multiresolution grid (TRUE), or be done in a separate step afterwards (FALSE). The second option is useful when wanting to check the confidential grid cells of the final map

rounding

Either logical (FALSE) or an integer indicating the number of decimal places to be used. Negative values are allowed (such as the default value, which rounds to the closest 10). See also the details for digits in round.
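The effect of a negative number of decimal places can be checked directly with base R's round(), which the description refers to. The values are made up.

```r
x <- c(1234.56, 87, 4)

# digits = -1 rounds to the closest 10 (the default of `rounding`)
r10 <- round(x, digits = -1)   # 1230 90 0

# digits = 0 rounds to whole numbers, digits = 1 to one decimal place
r0 <- round(x, digits = 0)
r1 <- round(x, digits = 1)
```

Note that small values can round to 0 when digits is negative, as the last element shows.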

remCols

Logical; Should intermediate columns be removed? Can be set to FALSE for further analyses. Temporary columns will not be removed if their names partly match the variable names of vars

...

Other parameters to underlying print functions

x

MRG-object, created by call to createMRGobject

Details

The function creates a single object, containing both the mapped data and the parameters for further processing. This ensures that all processing is done with the same variables.

Value

A list containing the necessary elements for further processing with the MRG-package.
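The idea of bundling data and parameters in one list, with values that can be changed in the object or overridden in a later call, can be sketched in base R. The list obj is only a stand-in for an MRG object (the real object has more elements), and resolve_params is a hypothetical helper; it does not show how multiResGrid handles its arguments internally.

```r
# Stand-in for an MRG object: a plain list holding data and parameters
obj <- list(
  ifg = data.frame(UAA = c(10, 20, 30)),
  mincount = 10,
  suppresslim = 0
)

# A parameter can be updated in the object itself
obj$suppresslim <- 0.02

# Hypothetical helper: arguments given in the call take precedence over
# the values stored in the object
resolve_params <- function(obj, ...) modifyList(obj, list(...))

p <- resolve_params(obj, suppresslim = 0.05)
p$suppresslim      # value from the call
obj$suppresslim    # stored value is unchanged
```

Keeping the parameters next to the data means that repeated calls use the same settings unless a value is overridden explicitly.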

Examples


library(MRG)
library(sf)

# These are SYNTHETIC agricultural FSS data 
data(ifs_dk) # Census data

# Create spatial data
ifg = fssgeo(ifs_dk, locAdj = "LL")

ress = 1000*2^(1:7)
MRGobject = createMRGobject(ifg = ifg, ress = ress, vars = "UAA")
# Run the adaptive grid function only with farm number as con, then plot results
himg1 = multiResGrid(MRGobject)

# Parameters can be updated in the object or in the call to multiResGrid
MRGobject$suppresslim = 0.02
himg2 = multiResGrid(MRGobject)
himg3 = multiResGrid(MRGobject, suppresslim = 0.05)

 



MRG documentation built on Oct. 28, 2024, 5:07 p.m.
