pop.aggregate | R Documentation |
Aggregation of existing countries' population projections into projections of given regions, and accessing such aggregations.
pop.aggregate(pop.pred, regions,
input.type = c("country", "region"), name = input.type,
inputs = list(e0F.sim.dir = NULL, e0M.sim.dir = "joint_", tfr.sim.dir = NULL),
my.location.file = NULL, verbose = FALSE, ...)
get.pop.aggregation(sim.dir = NULL, pop.pred = NULL, name = NULL,
write.to.cache = TRUE)
pop.aggregate.subnat(pop.pred, regions, locations, ..., verbose = FALSE)
pop.pred |
Object of class |
regions |
Vector of numerical codes of regions. It should correspond to values in the column “country_code” in the |
input.type |
There are two methods for aggregating projections depending on the type of inputs, “country”- and “region”-based, see Details. |
name |
Name of the aggregation. It becomes a part of a directory name where aggregation results are stored. |
inputs |
This argument is only used when the “region”-based method is selected. It is a list of inputs of probabilistic components of the projection:
|
my.location.file |
User-defined location file that can contain other aggregation groups than the default UN location file. It should have the same structure as the |
verbose |
Logical switching log messages on and off. |
sim.dir |
Simulation directory where aggregation is stored. It is the same directory used for creating the |
write.to.cache |
Logical controlling if functions operating on this object are allowed to write into its cache (see Details of |
locations |
Name of a tab-delimited file that contains definitions of the sub-regions. It should be the same file as used for the |
... |
Additional arguments. For a country-type aggregation, it can be logical |
Function pop.aggregate triggers an aggregation over countries, while function pop.aggregate.subnat is used for aggregation over sub-regions of a country. The following details refer to the use of pop.aggregate. For sub-national aggregation, see the example in pop.predict.subnat.
The dataset UNlocations or my.location.file is used to determine the countries to be aggregated, in particular the field “location_type” of the entries whose “country_code” is given in the regions argument. One can aggregate over the following location types: type 0 means aggregating all countries of the world (or in the file), type 2 aggregates over continents, type 3 aggregates over regions within continents, and any other integer (except 4) corresponds to a user-defined aggregation. Type 4 is reserved as the location type of countries; thus, all aggregations are performed over entries of this type. For type 2, countries are matched using the “area_code” column; for type 3, the matching is done using the “reg_code” column of the UNlocations dataset. E.g., if regions=908 (Europe), which has location type 2 in the default UNlocations dataset, all countries are aggregated for which the value 908 is found in the “area_code” column. If the location type is other than 0, 2, 3 and 4, there must be a column in the file called “agcode_x”, with x being the location type. This column is then used to match the countries to be aggregated.
Consider the following example. Say we want to pair four countries (Germany [DE], France [FR], Netherlands [NL], Italy [IT]) in two different ways, so we have two overlapping groupings, each of which has two groups (A,B):
group A = (DE, FR), group B = (NL, IT)
group A = (DE, NL), group B = (FR, IT)
Then, my.location.file should have the following entries:
country_code | name | location_type | agcode_98 | agcode_99 |
1001 | grouping1_groupA | 98 | -1 | -1 |
1002 | grouping1_groupB | 98 | -1 | -1 |
1003 | grouping2_groupA | 99 | -1 | -1 |
1004 | grouping2_groupB | 99 | -1 | -1 |
276 | Germany | 4 | 1001 | 1003 |
250 | France | 4 | 1001 | 1004 |
528 | Netherlands | 4 | 1002 | 1003 |
380 | Italy | 4 | 1002 | 1004 |
1005 | all | 0 | -1 | -1 |
The “country_code” of the groups is user-specific, but it must be unique within the file. Values of “country_code” for countries must match those in the prediction object. To run the aggregation for the four groups above, set regions=1001:1004. Since “location_type” takes the values 98 and 99, the file is expected to have columns “agcode_98” and “agcode_99” containing the assignments of countries to groups in each of the two groupings. Values in these columns in rows corresponding to groups are not used and can thus have any value. To aggregate over all four countries, set regions=1005, which has “location_type” equal to 0; the aggregation is thus performed over all entries with “location_type” equal to 4.
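The matching rule described above can be sketched in base R. This is only an illustration of the lookup, not the package's internal code: the helper name countries.in.group is hypothetical, it covers only type 0 and user-defined types, and writing the table as a tab-delimited file with a header row is an assumption about the expected file layout.

```r
# Location table for the four-country example, built as a data frame.
# 528 is the ISO 3166 numeric code of the Netherlands.
loc <- data.frame(
  country_code  = c(1001, 1002, 1003, 1004, 276, 250, 528, 380, 1005),
  name          = c("grouping1_groupA", "grouping1_groupB",
                    "grouping2_groupA", "grouping2_groupB",
                    "Germany", "France", "Netherlands", "Italy", "all"),
  location_type = c(98, 98, 99, 99, 4, 4, 4, 4, 0),
  agcode_98     = c(-1, -1, -1, -1, 1001, 1001, 1002, 1002, -1),
  agcode_99     = c(-1, -1, -1, -1, 1003, 1004, 1003, 1004, -1)
)

# Hypothetical helper: country codes belonging to one region code.
# Types 2 and 3 (area_code / reg_code matching) are not handled here.
countries.in.group <- function(loc, region) {
  type <- loc$location_type[loc$country_code == region]
  is.country <- loc$location_type == 4
  if (type == 0) return(loc$country_code[is.country])  # all countries in the file
  agcol <- paste0("agcode_", type)                      # e.g. "agcode_98"
  loc$country_code[is.country & loc[[agcol]] == region]
}

countries.in.group(loc, 1001)  # codes of Germany and France
countries.in.group(loc, 1004)  # codes of France and Italy
countries.in.group(loc, 1005)  # codes of all four countries

# Assumption: a tab-delimited file with a header row is a suitable
# format for passing the table via the my.location.file argument.
loc.file <- tempfile(fileext = ".txt")
write.table(loc, loc.file, sep = "\t", row.names = FALSE, quote = FALSE)
```

Checking group membership this way before running the aggregation helps catch a mistyped code in an “agcode_x” column, which would otherwise silently drop a country from its group.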
There are two methods available for generating aggregations of population projections:
Country-based: Aggregations are created by summing trajectories over countries of the given region.
Region-based: The aggregation is generated using the same algorithm as population projections for single countries (function pop.predict), but it operates on aggregated input components. These are created as follows. Here c denotes the countries over which we aggregate into a region R, and s \in \{m, f\}, a, and t denote sex, age category and time, respectively. t=P denotes the present year of the prediction. N_{s,a,t}^c and M_{s,a,t}^c denote the historical population count and the Bayesian predictive median of population, respectively, of sex s in age category a at time t for country c:
N_{s,a,t=P}^R = \sum_c N_{s,a,t=P}^c
mx_{s,a,t}^R = \frac{\sum_c(mx_{s,a,t}^c \cdot N_{s,a,t}^c)}{\sum_c N_{s,a,t}^c}
SRB_t^R = \frac{\sum_c M_{s=m,a=1,t}^c}{\sum_c M_{s=f,a=1,t}^c}
PASFR_{a,t}^R = \frac{\sum_c(PASFR_{a,t}^c \cdot M_{s=f,a,t}^c)}{\sum_c M_{s=f,a,t}^c}
Aggregated migration code is the code of maximum counts over aggregated countries weighted by N_{t=P}^c. Migration start year is the maximum of start years over aggregated countries.
mig_{s,a,t}^R = \sum_c mig_{s,a,t}^c
We assume an aggregation of life expectancy for the given regions was generated prior to this call, using the run.e0.mcmc.extra and e0.predict.extra functions of the bayesLife package.
We assume an aggregation of total fertility for the given regions was generated prior to this call, using the run.tfr.mcmc.extra and tfr.predict.extra functions of the bayesTFR package.
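The arithmetic behind the two methods can be illustrated on toy data in base R. This is a sketch with made-up numbers, not the package's implementation; all object names are hypothetical, and pairing trajectories by index in the country-based sum is an assumption.

```r
# Toy inputs for two countries (columns) and three age groups (rows).
# All numbers are made up for illustration.
N  <- cbind(c1 = c(100, 200, 50), c2 = c(300, 100, 150))   # population counts
mx <- cbind(c1 = c(0.010, 0.002, 0.050),
            c2 = c(0.020, 0.004, 0.030))                   # mortality rates

# Region-based: initial population is a plain sum over countries.
N.R <- rowSums(N)

# Region-based: mortality rate is a population-weighted average over countries.
mx.R <- rowSums(mx * N) / rowSums(N)

# Region-based: sex ratio at birth is the ratio of summed male to summed
# female predictive medians in the first age category.
M.male.a1   <- c(c1 = 52, c2 = 105)
M.female.a1 <- c(c1 = 50, c2 = 100)
SRB.R <- sum(M.male.a1) / sum(M.female.a1)

# Country-based: population trajectories are summed over countries.
# Each matrix is time (rows) by trajectory (columns); trajectories are
# assumed to be paired by column index.
traj.c1 <- matrix(c(10, 11, 12, 10, 12, 14), nrow = 3)
traj.c2 <- matrix(c(20, 21, 22, 20, 19, 18), nrow = 3)
traj.R  <- traj.c1 + traj.c2
```

In this toy case the regional mortality rate for the first age group is 0.0175, closer to the rate of the larger country c2 than a simple mean of the two rates would be, which is the purpose of the population weights.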
Results of the aggregations are stored in the same top directory as the pop.pred object, in a subdirectory called ‘aggregations_name’, where name is the value of the name argument. They can be accessed using the function get.pop.aggregation. Note that multiple runs of this function with the same name will overwrite previous aggregation results of the same name.
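The naming rule for the storage location can be sketched as follows. The helper aggregation.dir is hypothetical and not part of the package; it assumes that the "top directory" is the simulation directory and that the default name resolves to "country", the first value of input.type.

```r
# Hypothetical helper: expected location of an aggregation on disk,
# assuming the top directory is the simulation directory (sim.dir).
aggregation.dir <- function(sim.dir, name = "country") {
  file.path(sim.dir, paste0("aggregations_", name))
}

sim.dir <- file.path(tempdir(), "pop_sim")   # placeholder directory
aggregation.dir(sim.dir)                     # ends in "aggregations_country"
aggregation.dir(sim.dir, name = "region")    # ends in "aggregations_region"

# With bayesPop installed and an aggregation stored under sim.dir,
# it could be reloaded with (not run here):
# aggr <- get.pop.aggregation(sim.dir = sim.dir, name = "country")
```

Because a repeated run with the same name overwrites earlier results, giving each aggregation a distinct name keeps several of them side by side under the same simulation directory.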
Object of class bayesPop.prediction containing the aggregated results. In addition, it contains elements aggregation.method, giving the input.type used, and aggregated.countries, which is a list of countries aggregated for each region.
Hana Sevcikova, Adrian Raftery
H. Sevcikova, A. E. Raftery (2016). bayesPop: Probabilistic Population Projections. Journal of Statistical Software, 75(5), 1-29. doi:10.18637/jss.v075.i05
pop.predict, tfr.predict.extra, e0.predict.extra
## Not run:
sim.dir <- tempfile()
pred <- pop.predict(countries=c(528,218,450), output.dir=sim.dir)
aggr <- pop.aggregate(pred, 900) # aggregating World (i.e. all countries available in pred)
pop.trajectories.plot(aggr, 900, sum.over.ages=TRUE)
# countries over which we aggregated:
subset(UNlocations, country_code %in% aggr$aggregated.countries[["900"]])
unlink(sim.dir, recursive=TRUE)
## End(Not run)