View source: R/createMRGobject.R
createMRGobject | R Documentation |
Create a single object containing all necessary objects for multiResGrid functions
Prints MRG-objects
createMRGobject(
ifg,
ress = c(1, 5, 10, 20, 40) * 1000,
geovar = c("GEO_LCT", "geometry"),
lnames = NULL,
vars = NULL,
weights = NULL,
mincount = 10,
countFeatureOrTotal = "feature",
nlarge = 2,
plim = 0.85,
verbose = FALSE,
nclus = 1,
clusType = NULL,
domEstat = TRUE,
consistencyCheck = FALSE,
outfile = NULL,
splitlim = 5e+07,
checkDominance = TRUE,
checkReliability = FALSE,
userfun = NULL,
strat = NULL,
confrules = "individual",
suppresslim = 0,
sumsmall = FALSE,
suppresslimSum = 0,
reliabilitySplit = TRUE,
pseudoreg = NULL,
plotIntermediate = FALSE,
addIntermediate = FALSE,
locAdj = "LL",
postProcess = TRUE,
rounding = -1,
remCols = TRUE,
...
)
## S3 method for class 'MRG'
print(x, ...)
ifg |
Either a data.frame or tibble or sf-object with the locations and the data of the survey or census data, or a list of such objects. |
ress |
A vector with the different resolutions |
geovar |
Name of the geodata variable in the objects. Must be the same for all of the surveys/censuses if the data sets are not submitted as sf-objects |
lnames |
Names for the different surveys or censuses if ifg is a list. Typically it could be survey years |
vars |
Variable(s) of interest that should be aggregated (necessary when ifg is used for individual farm specific anonymization rules) |
weights |
Extrapolation factor(s) (weights) wi of unit i in the sample of units nc falling into a specific cell c. Weights are used for disclosure control measures. A weight of 1 will be used if missing. If only one weight is given, it will be used for all variables. If the length is more than one, the length has to be equal to the number of variables. If the same weight is used for several variables, it must be repeated in the weights-vector |
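The length rule for weights can be sketched in base R as follows. This is only an illustration, not the package's own code: the helper name match_weights and the column names "VAR2", "w_core" and "w_module" are hypothetical, and the sketch assumes that weights are passed as column names.

```r
# Hypothetical helper (not part of the MRG package): pair weight columns with variables
match_weights <- function(vars, weights = NULL) {
  # In this sketch, NULL stands for "no weights given", i.e. a weight of 1
  if (is.null(weights)) return(NULL)
  # A single weight is reused for every variable
  if (length(weights) == 1) return(rep(weights, length(vars)))
  # Otherwise there must be exactly one weight per variable
  if (length(weights) != length(vars)) {
    stop("weights must have length 1 or the same length as vars")
  }
  weights
}

vars <- c("UAA", "VAR2")                             # "VAR2" is a made-up variable name
one_weight <- match_weights(vars, "w_core")          # reused for both variables
two_weights <- match_weights(vars, c("w_core", "w_module"))
```

If the same weight applies to several but not all variables, it has to be repeated explicitly, for example c("w_core", "w_core", "w_module") for three variables.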
mincount |
The minimum number of farms for a grid cell (threshold rule) |
countFeatureOrTotal |
Should the frequency limit be applied on records with a positive value for a certain feature, or on all records, independent of value of feature |
nlarge |
Parameter to be used if the nlarge(st) farms should count for maximum plim percent of
the total value for the variable in the grid cell (see details of |
plim |
See nlarge |
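As a rough illustration of the dominance rule described by nlarge and plim, the base R sketch below checks whether the nlarge largest contributions in a single grid cell exceed a share plim of the cell total. The function name dominance_fails is hypothetical, weights are ignored, and this is a simplified reading of the rule rather than the package's implementation.

```r
# Hypothetical helper (not part of the MRG package): unweighted dominance check
dominance_fails <- function(values, nlarge = 2, plim = 0.85) {
  total <- sum(values)
  # An empty or zero-valued cell cannot be dominated in this sketch
  if (total <= 0) return(FALSE)
  # Sum of the nlarge largest contributions in the cell
  top <- sum(head(sort(values, decreasing = TRUE), nlarge))
  # The cell fails if those contributors hold more than plim of the total
  top / total > plim
}

cell_a <- c(900, 50, 30, 20)     # two largest hold 950 of 1000
cell_b <- c(300, 250, 250, 200)  # two largest hold 550 of 1000
fails_a <- dominance_fails(cell_a)
fails_b <- dominance_fails(cell_b)
```

With the default values, cell_a fails the check and would need further aggregation or suppression, while cell_b passes.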
verbose |
Indicates if some extra output should be printed. Usually TRUE/FALSE, but can also have
a value of 2 for |
nclus |
Number of clusters to use for parallel processing. No parallelization is used
for |
clusType |
The type of cluster; see |
domEstat |
Should the dominance rule be applied as in the IFS handbook (TRUE), where the weights are rounded before finding the first nlarge contributors, or should it be the first nlarge contributors*weight, where also fractions are considered (FALSE)? |
consistencyCheck |
logical; whether consistency between the gridded values and the similar values from ifg should be checked. The gridded value is derived from rasterize and the second one from st_join. The two methods can in some cases treat border cases between grid cells differently. |
outfile |
File to direct the output in case of parallel processing,
see |
splitlim |
For large data sets: split the data set into batches of approximately splitlim size |
checkDominance |
Logical - should the dominance rule be applied? |
checkReliability |
Logical - should the prediction variance be checked, and used for the aggregation? This considerably increases computation time |
userfun |
This gives the possibility to add a user defined function with additional confidentiality rules which the grid cell has to pass |
strat |
Column name defining the strata for stratified sampling, used if checkReliability is TRUE |
confrules |
Should the frequency rule (number of holdings) refer to the number of holdings with a value of the individual vars above zero ("individual") or the total number of holdings in the data set ("total")? |
suppresslim |
Parameter that can be used to avoid that almost empty grid cells are merged with cells with considerably higher number of observations. The value is a minimum share of the total potential new cell for a grid cell to be aggregated. See below for more details. |
sumsmall |
Logical; should the suppresslimSum value be applied on the sum of small grid cells within the lower resolution grid cell? Note that different combinations of suppresslim and suppresslimSum values might not give completely intuitive results. For instance, if both are equal, then a higher value can lead to more grid cells being left unaggregated for smaller grid sizes, leading to aggregation for a large grid cell |
suppresslimSum |
Parameter similar to suppresslim, but affecting the total of grid cells to be suppressed |
reliabilitySplit |
Logical or number - parameter to be used in calculation of the reliability (if checkReliability = TRUE). It can either give the number of groups, or if TRUE, it will create groups of approximately 50,000 records per group. If FALSE, the data set will not be split, regardless of its size. |
pseudoreg |
A column with regions to be used to define pseudostrata if checkReliability is TRUE. This is used for the cases when one or more strata have only a single record (and the weight is different from one). This makes variance calculation impossible, so such strata are merged into a pseudostratum. If pseudoreg is given (for example a column with the country name, or NUTS2 region), the pseudostrata will be created separately for each pseudoreg region. |
plotIntermediate |
Logical or number - make a simple plot showing which grid cells have already passed the frequency rule. If plotIntermediate = TRUE, the function will wait 5 seconds after plotting before continuing; otherwise it will wait plotIntermediate seconds. |
addIntermediate |
Logical; will add a list of all intermediate himgs and lohs (overlay of himg and the lower resolution grid) as an attribute to the object to be returned |
locAdj |
Parameter to adjust the coordinates if they are exactly on the borders between grid cells. The values can either be FALSE, or "jitter" (adding a small random value to the coordinates, essentially spreading them randomly around the real location), "UR", "UL", "LR" or "LL", to describe which corner of the grid cell the location belongs to (upper right, upper left, lower right or lower left). |
postProcess |
Logical; should the postprocessing be done as part of creation of the multiresolution grid (TRUE), or be done in a separate step afterwards (FALSE). The second option is useful when wanting to check the confidential grid cells of the final map |
rounding |
Either logical (FALSE) or an integer indicating the number of decimal places to be used. Negative values are allowed (such as the default value, which rounds to the closest 10). See also the details
for |
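The effect of a negative number of decimal places can be seen with base R's round(). The sketch assumes that the rounding parameter behaves like the digits argument of round(), which the description above suggests but does not state outright.

```r
# Assumption: rounding is applied like the digits argument of base::round()
x <- c(1234.56, 87, 4)

one_decimal <- round(x, digits = 1)   # positive value: keep one decimal place
nearest_10 <- round(x, digits = -1)   # negative value: 1230, 90, 0
nearest_10
```

Rounding to the nearest 10 hides small differences between cell values, so a small value such as 4 becomes 0.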
remCols |
Logical; Should intermediate columns be removed? Can be set
to FALSE for further analyses. Temporary columns will not be removed if their names
partly match the variable names of |
... |
Other parameters to underlying print functions |
x |
MRG-object, created by call to |
The function creates a single object, containing both the mapped data and the parameters for further processing. This ensures that all processing is done with the same variables.
A list containing the necessary elements for further processing with the MRG-package.
library(sf)
# These are SYNTHETIC agricultural FSS data
data(ifs_dk) # Census data
# Create spatial data
ifg = fssgeo(ifs_dk, locAdj = "LL")
ress = 1000*2^(1:7)
MRGobject = createMRGobject(ifg = ifg, ress = ress, vars = "UAA")
# Run the adaptive grid function only with farm number as con, then plot results
himg1 = multiResGrid(MRGobject)
# Parameters can be updated in the object or in the call to multiResGrid
MRGobject$suppresslim = 0.02
himg2 = multiResGrid(MRGobject)
himg3 = multiResGrid(MRGobject, suppresslim = 0.05)