pop.predict.subnat: Subnational Probabilistic Population Projection

Description

Generates trajectories of probabilistic population projection for subregions of a given country.

Usage

pop.predict.subnat(end.year = 2060, start.year = 1950, present.year = 2020, 
        wpp.year = 2019, output.dir = file.path(getwd(), "bayesPop.output"), 
        locations = NULL, default.country = NULL, 
        inputs = list(
            popM = NULL, popF = NULL, 
            mxM = NULL, mxF = NULL, srb = NULL, 
            pasfr = NULL, patterns = NULL, 
            migM = NULL, migF = NULL, 
            e0F.file = NULL, e0M.file = NULL, tfr.file = NULL, 
            e0F.sim.dir = NULL, e0M.sim.dir = NULL, tfr.sim.dir = NULL, 
            migMtraj = NULL, migFtraj = NULL
        ), 
        nr.traj = 1000, keep.vital.events = FALSE, 
        fixed.mx = FALSE, fixed.pasfr = FALSE, 
        replace.output = FALSE, verbose = TRUE)

Arguments

end.year

End year of the projection.

start.year

First year of the historical data on mortality rates. It determines the length of the historical time series used in the Lee-Carter estimation.

present.year

Year for which initial population data is to be used.

wpp.year

Year for which WPP data is used. The function loads a package called wppx where x is the wpp.year and uses its data (corresponding to the default.country) as default datasets if region-specific alternatives are not given (see more details below).

output.dir

Output directory of the projection.

locations

Name of a tab-delimited file that contains definitions of the subregions. It has a structure similar to that of UNlocations, with mandatory columns reg_code (unique identifier of the subregions) and name (name of the subregions). Optionally, a column location_type can be included, in which case it should be set to 4 for the subregions to be processed. Column country_code can be included with the numerical code of the corresponding country. A row with location_type of 0 identifies the country that the subregions belong to and is used for extracting default "national" datasets if the argument default.country is missing. In that case, the code of the default country is taken from the country_code column of that row. This is a mandatory argument.

default.country

Numerical code of the country to which the subregions belong. It is used for extracting default datasets from the wpp package if some region-specific input datasets are missing. Alternatively, it can be included in the locations file, see above. In either case, the code must exist in the UNlocations dataset.

inputs

A list of file names where input data is stored. Unless otherwise noted, these are tab-delimited ASCII files with a mandatory column reg_code giving the numerical identifier of the subregions. If an element of this list is NULL, usually a default dataset corresponding to default.country is extracted from the wpp package. Names of these default datasets are shown in brackets. This list contains the following elements:

popM, popF

Initial male/female age-specific population (at time present.year). Mandatory items, no defaults. Must contain columns reg_code and age and be of the same structure as popM from wpp.

mxM, mxF

Historical data and (optionally) projections of male/female age-specific death rates [mxM, mxF] (see also argument fixed.mx).

srb

Projection of sex ratio at birth. [sexRatio]

pasfr

Historical data and (optionally) projections of percentage age-specific fertility rate [percentASFR] (see also argument fixed.pasfr).

patterns

Region-specific information on migration type, base year of the migration, and mortality and fertility age patterns, as defined in [vwBaseYear]. In addition, it can contain columns defining migration shares between the subregions, see Details below.

migM, migF

Projection of male/female age-specific migration as net counts on the same scale as the initial population. It should have the same format as migrationM. If not available, the migration schedules are constructed from total migration counts of the default.country (derived from the migration dataset), using Rogers-Castro schedules for the age distribution. Migration shares between subregions (including sex-specific shares) can be given in the patterns file, see above and Details below.

e0F.file

Comma-delimited CSV file with projected female life expectancy. It has the same structure as the file “ascii_trajectories.csv” generated using bayesLife::convert.e0.trajectories (which currently works for country-level results only). Required columns are “LocID”, “Year”, “Trajectory”, and “e0”. If e0F.file is NULL, data from the corresponding wpp package (for default.country) is taken, namely the median projections as one trajectory and the low and high variants (if available) as second and third trajectory. Alternatively, this element can be the keyword “median_” in which case only the median is taken.

e0M.file

Comma-delimited CSV file containing projections of male life expectancy of the same format as e0F.file. As in the female case, if e0M.file is NULL, data for default.country from the corresponding wpp package is taken.

tfr.file

Comma-delimited CSV file with results of total fertility rate (generated using bayesTFR, function convert.tfr.trajectories, file “ascii_trajectories.csv”). Required columns are “LocID”, “Year”, “Trajectory”, and “TF”. If this element is not NULL, the argument tfr.sim.dir is ignored. If both tfr.file and tfr.sim.dir are NULL, data for default.country from the corresponding wpp package is taken (median and the low and high variants as three trajectories). Alternatively, this argument can be the keyword “median_” in which case only the wpp median is taken.

e0F.sim.dir

Simulation directory with results of female life expectancy. Since bayesLife does not support subnational projections yet, this element should not be used. Instead, use e0F.file if region-specific e0 projections are available. Alternatively, it can be set to the keyword “median_”, which has the same effect as setting e0F.file to “median_”.

e0M.sim.dir

This is analogous to e0F.sim.dir, here for male life expectancy. Use e0M.file instead of this item.

tfr.sim.dir

Simulation directory with projections of total fertility rate (generated using bayesTFR::tfr.predict.subnat). It is only used if tfr.file is NULL.

migMtraj, migFtraj

Comma-delimited CSV file with male/female age-specific migration trajectories. If present, it replaces deterministic projections given by the migM and migF items. It has a similar format as e.g. e0M.file with columns “LocID”, “Year”, “Trajectory”, “Age” and “Migration”. The “Age” column must have values “0-4”, “5-9”, “10-14”, ..., “95-99”, “100+”.

nr.traj, keep.vital.events, fixed.mx, fixed.pasfr, replace.output, verbose

These arguments have the same meaning as in pop.predict.
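
The trajectory files described above (e0F.file, e0M.file, tfr.file, migMtraj, migFtraj) are plain CSV files in long format. As a minimal sketch, such files could be assembled in base R as shown here. Only the column names and the age labels are taken from the argument descriptions; the region codes (658, 659), the years and all values are hypothetical, so compare the result against a file produced by bayesLife or bayesTFR before relying on this layout.

```r
# Hypothetical region codes, years and number of trajectories (illustration only).
reg.codes <- c(658, 659)
years <- c(2025, 2030)
nr.traj <- 3

# Long-format e0 trajectories: one row per region, year and trajectory.
e0 <- expand.grid(LocID = reg.codes, Year = years, Trajectory = seq_len(nr.traj))
set.seed(1)
e0$e0 <- round(82 + rnorm(nrow(e0)), 2)  # made-up life expectancy values

e0.file <- tempfile(fileext = ".csv")
write.csv(e0, e0.file, row.names = FALSE)

# Age labels required in the "Age" column of migMtraj/migFtraj:
# "0-4", "5-9", ..., "95-99", "100+"
lower <- seq(0, 95, by = 5)
age.labels <- c(paste(lower, lower + 4, sep = "-"), "100+")

# Long-format migration trajectories with placeholder counts.
mig <- expand.grid(LocID = reg.codes, Year = years,
                   Trajectory = seq_len(nr.traj), Age = age.labels,
                   stringsAsFactors = FALSE)
mig$Migration <- 0
mig.file <- tempfile(fileext = ".csv")
write.csv(mig, mig.file, row.names = FALSE)
```

The resulting file names would then be passed as the e0F.file and migMtraj elements of inputs.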

Details

Population projection for subnational units (regions) is performed by applying the cohort component method to subnational datasets on projected fertility (TFR), mortality and net migration, starting from given sex- and age-specific population counts. The only required inputs are the initial sex- and age-specific population counts in each region (popM and popF elements of the inputs argument) and a file with a set of locations (argument locations). Any other input dataset that is not given is replaced by the corresponding "national" values, taken from the corresponding wpp package. The argument default.country determines the country for those default "national" values. The default country can also be included in the locations file as a record with location_type set to 0.
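
To make the two required inputs concrete, the following is a minimal sketch that writes a locations file and a male population file in base R. The region codes, names and counts are hypothetical; the country row uses code 124 (Canada, as in the Examples). The reg_code of the country row and the name of the year column in the population file are assumptions, so check them against the CANlocations.txt and CANpopM.txt templates shipped with the package.

```r
# Locations: one row for the country (location_type 0) and one per
# subregion (location_type 4). Codes and names are hypothetical.
locations <- data.frame(
    reg_code = c(124, 1, 2),          # reg_code of the country row is an assumption
    name = c("Canada", "Region A", "Region B"),
    location_type = c(0, 4, 4),
    country_code = 124
)
loc.file <- tempfile(fileext = ".txt")
write.table(locations, loc.file, sep = "\t", quote = FALSE, row.names = FALSE)

# Initial male population by region and five-year age group.
# The column "2020" is assumed to hold the counts at present.year.
lower <- seq(0, 95, by = 5)
ages <- c(paste(lower, lower + 4, sep = "-"), "100+")
popM <- expand.grid(age = ages, reg_code = c(1, 2), stringsAsFactors = FALSE)
popM <- popM[, c("reg_code", "age")]
popM[["2020"]] <- 100                 # placeholder counts
popM.file <- tempfile(fileext = ".txt")
write.table(popM, popM.file, sep = "\t", quote = FALSE, row.names = FALSE)

# Read both files back to confirm they are valid tab-delimited tables.
loc.check <- read.delim(loc.file)
pop.check <- read.delim(popM.file, check.names = FALSE)
```

A female file (popF) would be built the same way; loc.file and the two population files correspond to the locations argument and the popM and popF elements of inputs.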

The TFR component can be given as a set of trajectories generated using the tfr.predict.subnat function of the bayesTFR package (tfr.sim.dir element). Alternatively, trajectories can be given in an ASCII file (tfr.file). Having a set of subnational TFR trajectories, the cohort component method is applied to each of them to yield a distribution of future subnational population.

Net migration can be given either as disaggregated sex- and age-specific datasets migM and migF, or as shares between regions in columns of the patterns dataset. These columns are: inmigrationM_share, inmigrationF_share, outmigrationM_share, outmigrationF_share. The sex specification and/or direction specification (in/out) can be omitted, e.g. the column can simply be called migration_share. The function extracts the net migration projection at the national level and distributes it to regions according to the given shares. For positive (national) values, it uses the in-migration shares; for negative values, it uses the out-migration shares. If the in/out prefix is omitted in the column names, the given migration shares are used for both positive and negative net migration projections. By default, if neither migM and migF nor region-specific shares are given, the distribution between regions is proportional to population size. The age-specific schedules follow the Rogers-Castro age schedules by default. Note that handling migration using shares as described here only affects the distribution of international migration into regions. It does not take between-region migration into account.
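
The sign rule for distributing national net migration can be illustrated with a few lines of base R. This is only a sketch of the rule as described above, not the package's internal code: the function name distribute.migration, the region labels and all numbers are hypothetical, and the shares are assumed to sum to 1.

```r
# Distribute a national net migration total to regions by shares.
# Positive totals use the in-migration shares, negative totals the
# out-migration shares. If only one set of shares is given, it is
# used for both directions.
distribute.migration <- function(national, in.share, out.share = in.share) {
    share <- if (national >= 0) in.share else out.share
    national * share
}

in.share  <- c(A = 0.5, B = 0.3, C = 0.2)
out.share <- c(A = 0.2, B = 0.2, C = 0.6)

pos <- distribute.migration(100, in.share, out.share)  # uses in.share
neg <- distribute.migration(-50, in.share, out.share)  # uses out.share

# Default when no shares are given: proportional to population size.
pop <- c(A = 4000, B = 1000, C = 5000)
default.share <- pop / sum(pop)
def <- distribute.migration(100, default.share)
```

In each case the regional values add up to the national total, which reflects that the shares only split international migration and do not model moves between regions.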

The package contains example datasets for Canada. Use these as templates for your own data. See Examples below.

Value

Object of class bayesPop.prediction containing the subnational projections. Note that this object can be used in the various bayesPop functions in exactly the same way as an object with national projections. However, the meaning of the argument country in many of these functions (e.g. in pop.trajectories.plot) changes to an identification of the region (either as a numerical code or as a name, as defined in the locations file).

Acknowledgment

We are grateful to Patrice Dion from Statistics Canada for providing us with example data. Note that the example datasets included in the package are not official STATCAN data; they serve only as illustrations and templates. Data for the time period 2015-2020 has been imputed by the author.

Author(s)

Hana Sevcikova

See Also

pop.predict, tfr.predict.subnat, pop.aggregate.subnat

Examples

## Not run: 
# Subnational projections for Canada
#########
data.dir <- file.path(find.package("bayesPop"), "extdata")

# Use national data for tfr and e0
###
sim.dir <- tempfile()
pred <- pop.predict.subnat(output.dir = sim.dir,
            locations = file.path(data.dir, "CANlocations.txt"),
            inputs = list(popM = file.path(data.dir, "CANpopM.txt"),
                          popF = file.path(data.dir, "CANpopF.txt"),
                          tfr.file = "median_"
                        ),
            verbose = TRUE)
pop.trajectories.plot(pred, "Alberta", sum.over.ages = TRUE)
unlink(sim.dir, recursive=TRUE)

# Use subnational TFR simulation
###
# Subnational TFR projections for Canada (from ?tfr.predict.subnat)
my.subtfr.file <- file.path(find.package("bayesTFR"), 'extdata', 'subnational_tfr_template.txt')
tfr.nat.dir <- file.path(find.package("bayesTFR"), "ex-data", "bayesTFR.output")
tfr.reg.dir <- tempfile()
tfr.preds <- tfr.predict.subnat(124, my.tfr.file = my.subtfr.file,
    sim.dir = tfr.nat.dir, output.dir = tfr.reg.dir, start.year = 2013)
 
# Pop projections
sim.dir <- tempfile()
pred <- pop.predict.subnat(output.dir = sim.dir,
            locations = file.path(data.dir, "CANlocations.txt"),
            inputs = list(popM = file.path(data.dir, "CANpopM.txt"),
                          popF = file.path(data.dir, "CANpopF.txt"),
                          patterns = file.path(data.dir, "CANpatterns.txt"),
                          tfr.sim.dir = file.path(tfr.reg.dir, "subnat", "c124")
                        ),
            verbose = TRUE)
pop.trajectories.plot(pred, "Alberta", sum.over.ages = TRUE)
pop.pyramid(pred, "Manitoba", year = 2050)
get.countries.table(pred)

# Aggregate to country level
aggr <- pop.aggregate.subnat(pred, regions = 124, 
            locations = file.path(data.dir, "CANlocations.txt"))
pop.trajectories.plot(aggr, "Canada", sum.over.ages = TRUE)

unlink(sim.dir, recursive = TRUE)
unlink(tfr.reg.dir, recursive = TRUE)

## End(Not run)

bayesPop documentation built on July 2, 2020, 3 a.m.