age.specific.migration: Generate Sex- and Age-specific Migration
In bayesPop: Probabilistic Population Projection

age.specific.migration

R Documentation

Description

Creates sex- and age-specific net migration datasets from total net migration using different methods. The function age.specific.migration is a legacy function that distributes UN 5-year totals into ages using a residual method. The function migration.totals2age distributes given totals using either a Rogers-Castro schedule or the Flow Difference Method (FDM).

Usage

age.specific.migration(wpp.year = 2019, years = seq(1955, 2100, by = 5), 
    countries = NULL, smooth = TRUE, rescale = TRUE, ages.to.zero = 18:21,
    write.to.disk = FALSE, directory = getwd(), file.prefix = "migration", 
    depratio = wpp.year == 2015, verbose = TRUE)
    
migration.totals2age(df, ages = NULL, annual = FALSE, time.periods = NULL, 
    scale = 1, method = "rc", sex = "M",
    id.col = "country_code", mig.is.rate = FALSE, 
    rc.data = NULL, pop = NULL, pop.glob = NULL, ...)
    
rcastro.schedule(annual = FALSE)

Arguments

`wpp.year`	Integer determining which wpp package the necessary data should be taken from. That package is required to have a dataset on total net migration (called `migration`).
`years`	Array of years that the reconstruction should be made for. This should be a subset of years for which the total net migration is available.
`countries`	Numerical country codes to do the reconstruction for. By default it is performed on all countries included in the `migration` dataset where aggregations are excluded.
`smooth`	Logical controlling if smoothing of the reconstructed curves is required. Due to rounding issues the residual method often yields unrealistic zig-zags on migration curves by age. Smoothing usually improves their look.
`rescale`	Logical controlling if the resulting migration should be rescaled to match the total migration.
`ages.to.zero`	Indices of age groups where migration should be set to zero. Default is 85 and older.
`write.to.disk`	If `TRUE` results are written to disk.
`directory`	Directory where to write the results if `write.to.disk` is `TRUE`.
`file.prefix`	If `write.to.disk` is `TRUE` results are written into two text files with this prefix, a letter “M” and “F” determining the sex, and concluded by the “.txt” suffix. By default “migrationM.txt” and “migrationF.txt”.
`depratio`	If `TRUE`, an internal dataset on migration dependency ratios is used to adjust the first three age groups. It can also be the name of a binary file containing such a dataset. The default dataset is only available for 2015.
`verbose`	Logical controlling the amount of output messages.
`df`	data.frame, matrix or data.table containing total migration counts or rates. Columns correspond to time, rows correspond to locations. Column “country_code” (or the column identified by `id.col`) contains identifiers of the locations. Names of the time columns should be either single years if `annual` is `TRUE`, e.g. “2018”, “2019” etc., or five-year time periods if `annual` is `FALSE`, e.g. “2010-2015”, “2015-2020” etc.
`ages`	Labels of age groups into which the total migration is to be disaggregated. If it is missing, default age groups are determined depending on the argument `annual`.
`annual`	Logical determining if the age groups are 5-year age groups (`FALSE`) or 1-year ages (`TRUE`). The choice of the default Rogers-Castro schedule depends on it when no schedule is supplied via `rc.data`. It also determines the expected syntax of the names of time columns in `df`.
`time.periods`	Character vector determining which columns should be considered in the `df` dataset. It should be a subset of column names in `df`. By default, all time columns in `df` are considered.
`scale`	The migration schedule is multiplied by this number. It can be used for example, if total migration needs to be distributed between sexes.
`method`	Method to use for the distribution of totals into age groups. The “rc” method uses either a basic Rogers-Castro disaggregation via the function `rcastro.schedule`, or a schedule given in the `rc.data` argument. The “fdmp” and “fdmnop” methods use the Flow Difference Method, where “fdmp” weights the flows by population.
`sex`	“M” or “F” determining the sex of this schedule. It only impacts the FDM methods.
`id.col`	Name of the unique identifier of the locations.
`mig.is.rate`	Logical indicating if the data in `df` should be interpreted as rates. If `FALSE`, `df` represents counts.
`rc.data`	data.table containing either a family of Rogers-Castro proportions if `method = "rc"`, or various inputs for the FDM methods if `method` is either “fdmp” or “fdmnop”. For the “rc” method, mandatory columns are “age” and “prop”. Optionally, it can have a column “mig_sign” with values “Inmigration” and “Emigration” (distinguishing schedules to be applied for positive and negative migration, respectively) and a column “sex” with values “Female” and “Male”. The format corresponds to the dataset `DemoTools::mig_un_families`, subset to a single family. For the FDM methods, it has columns contained in the `rcFDM` dataset, as well as columns “beta0” (intercept), “beta1” (slope), “min” (minimum rate), “in_sex_factor” (inflow female proportion), and “out_sex_factor” (outflow female proportion), used in the FDM methods. These columns correspond to columns “MigFDMb0”, “MigFDMb1”, “MigFDMmin”, “MigFDMsrin” and “MigFDMsrout”, respectively, in the `vwBaseYear` dataset.
`pop`	data.table with population counts needed for the FDM methods. It should have a location identifier column of the same name as `id.col`, further columns “year”, “age”, and “pop”.
`pop.glob`	data.table with global population needed for the weighted FDM method (“fdmp”). It should have columns “year”, “age”, and “pop”.
`...`	Further arguments passed to the underlying functions.
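
As an illustration of the layout described for `df`, the following minimal sketch builds a table of 5-year totals in base R. The location codes and all values are made up for illustration; only the column naming follows the description above.

```r
# Hypothetical totals for two locations (codes 840 and 124) over three
# five-year periods; all values are made up for illustration.
totals <- data.frame(
    country_code = c(840, 124),
    "2010-2015" = c(4500, 1100),
    "2015-2020" = c(4700, 1200),
    "2020-2025" = c(5000, 1250),
    check.names = FALSE  # keep period labels such as "2010-2015" intact
)

# Time columns are all columns except the identifier column.
time.cols <- setdiff(colnames(totals), "country_code")
print(time.cols)
```

A table of this shape could then be passed as the first argument, e.g. `migration.totals2age(totals, annual = FALSE, method = "rc")`. Without `check.names = FALSE`, R would alter the period labels, and they would no longer match the expected syntax.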

Details

Function `age.specific.migration`

Unlike in wpp2012, for the four WPP releases between 2015 and 2022 (wpp2015, wpp2017, wpp2019, and wpp2022), the UN Population Division did not publish sex- and age-specific net migration counts, only the totals. However, since sex and age schedules are needed for population projections, the age.specific.migration function attempts to reconstruct those missing datasets. It uses the published population projections by age and sex, together with the fertility and mortality projections, from the wpp package. It computes the population projection without migration and takes the difference between the published population projection and this no-migration projection (the residual) as the net migration. By default, these numbers are then scaled so that the sum over sexes and ages corresponds to the total migration count.

If smooth is TRUE, a smoothing procedure is performed over ages where necessary. Also, for simplicity, migration at old ages is set to zero (the default is 85+). Both steps are done before the scaling. To obtain raw residuals without any additional processing, set smooth=FALSE, rescale=FALSE, ages.to.zero=c().
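
The residual idea can be sketched with toy numbers in base R. This is only an illustration under simplifying assumptions, not the package's implementation: it uses a single sex and five age groups, omits the smoothing step, and all variable names and values are hypothetical.

```r
# Toy inputs for one sex and five age groups (all numbers are made up).
pop.published <- c(100, 98, 105, 110, 90)     # published population by age
pop.nomig     <- c(99, 97, 101, 104, 90.5)    # projection without migration
total.mig     <- 10                           # published total net migration

# Step 1: the residual is taken as net migration by age.
mig <- pop.published - pop.nomig

# Step 2: set the oldest age group to zero (analogous to ages.to.zero).
ages.to.zero <- 5
mig[ages.to.zero] <- 0

# Step 3: rescale so that the age-specific values sum to the total
# (analogous to rescale = TRUE; here over ages of one sex only).
mig.scaled <- mig * total.mig / sum(mig)
print(mig.scaled)
```

Zeroing the old ages before rescaling mirrors the order described above, so the rescaled values still sum to the total after the old-age migration has been removed.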

This function works only for 5-year data.

Function `migration.totals2age`

This function should be used when working with annual data or data from wpp2022 and wpp2024. It allows users to disaggregate total migration counts or rates (for multiple time periods and multiple locations) into age-specific ones using either a schedule similar to the one used by the UN in WPP2024 (method = "fdmnop"), a Rogers-Castro schedule (method = "rc"), or the FDM weighted by population (method = "fdmp") as described in Sevcikova et al. (2024). The FDM methods need additional information passed via the arguments rc.data, pop and pop.glob. The default Rogers-Castro schedule can be accessed via the function rcastro.schedule, where the annual argument specifies whether it is for 1-year or 5-year age groups. Alternatively, an external schedule can be given via the rc.data argument, where one can distinguish between schedules for each sex, as well as for positive and negative net migration. It has the same structure as the dataset DemoTools::mig_un_families, but it should be a subset for a single family and converted to data.table.
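
The basic idea of distributing totals by a schedule of proportions can be sketched in base R as follows. The proportions below are hypothetical and are not the Rogers-Castro values used by the package; the sketch also assumes that the same schedule applies to positive and negative totals, whereas rc.data allows separate schedules for each sign. The name `sex.share` is an illustrative stand-in for the role of the `scale` argument.

```r
# Hypothetical schedule of proportions over five age groups (sums to 1).
schedule <- c(0.10, 0.35, 0.30, 0.15, 0.10)
names(schedule) <- paste0("age", 1:5)

# Total net migration for one location and three years (made-up values).
totals <- c("2018" = 30, "2019" = -50, "2020" = -100)

# Share of the total assigned to one sex (plays the role of `scale`).
sex.share <- 0.5

# Each total is split across ages in proportion to the schedule.
mig.by.age <- outer(schedule, totals) * sex.share
print(mig.by.age)

# Column sums recover the scaled totals.
print(colSums(mig.by.age))
```

Rows of the result correspond to age groups and columns to time periods, so each column sums to the corresponding total multiplied by `sex.share`.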

Value

Function age.specific.migration returns a list of two data frames (male and female), each having the same structure as migrationM.

Function migration.totals2age returns a data.table with the disaggregated counts.

Function rcastro.schedule returns a vector of proportions for each age group.

Warning

Due to rounding issues and slight differences in the methodology, the functions do not reproduce the unpublished UN datasets exactly. The results are only an approximation! In particular, the first age groups may deviate more than other ages.

Note

These functions are called automatically from pop.predict if needed, depending on the inputs. Thus, only users who need sex- and age-specific migration for other purposes, or who want to modify the defaults, will need to call these functions explicitly.

Further note that the wpp2024 package does contain the age-specific net migration for projected years (datasets migprojAge1dt, migprojAge5dt). Thus, if running pop.predict with wpp.year = 2024 and the default migration totals, no disaggregation is necessary for the projected time periods. The disaggregation is only triggered for the past time periods, or when user-specified net migration totals are used.

Author(s)

Hana Sevcikova

References

H. Sevcikova, J. Raymer, A. E. Raftery (2024). Forecasting Net Migration By Age: The Flow-Difference Approach. arXiv:2411.09878.

Examples

## Not run: 
asmig <- age.specific.migration()
head(asmig$male)
head(asmig$female)
## End(Not run)

# simple disaggregation for one location
totmig <- c(30, -50, -100)
names(totmig) <- 2018:2020
asmig.simple <- migration.totals2age(totmig, annual = TRUE, method = "rc")
head(asmig.simple)

## Not run: 
# disaggregate WPP 2019 migration for all countries, one sex
data(migration, package = "wpp2019")
# assuming equal sex migration ratio
asmig.all <- migration.totals2age(migration, scale = 0.5, method = "rc") 
# plot result for the US in 2095-2100
mig1sex.us <- subset(asmig.all, country_code == 840)[["2095-2100"]]
plot(ts(mig1sex.us))
# check that the sum is half of the original total
# (all.equal tolerates floating-point rounding, unlike ==)
all.equal(sum(mig1sex.us), subset(migration, country_code == 840)[["2095-2100"]]/2)
## End(Not run)

bayesPop documentation built on April 12, 2025, 1:24 a.m.
