lingpg: Linearization of the gender pay (wage) gap.
In vardpoor: Variance Estimation for Sample Surveys by the Ultimate Cluster Method


Estimation of gender pay (wage) gap and computation of linearized variables for variance estimation.
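The point estimate can be sketched in base R. This sketch assumes the usual unadjusted definition of the gender pay gap, that is, the difference between the weighted mean earnings of men and women expressed as a percentage of the weighted mean earnings of men. The helper name `gpg_sketch` and the toy data are hypothetical and are not part of vardpoor; `lingpg` itself also handles domains, periods and input checks that are not shown here.

```r
# Hypothetical helper, not part of vardpoor:
# unadjusted gender pay gap in percent, assuming gender is coded 1 = male, 2 = female.
gpg_sketch <- function(y, gender, w = rep(1, length(y))) {
  men <- gender == 1
  women <- gender == 2
  mean_men <- weighted.mean(y[men], w[men])        # weighted mean earnings of men
  mean_women <- weighted.mean(y[women], w[women])  # weighted mean earnings of women
  100 * (mean_men - mean_women) / mean_men
}

# Toy data (made up for illustration)
y <- c(20, 30, 10, 20)      # hourly earnings
gender <- c(1, 1, 2, 2)     # 1 = male, 2 = female
w <- c(2, 1, 1, 3)          # survey weights

gpg_toy <- gpg_sketch(y, gender, w)
gpg_toy
```

With these toy values the weighted means are 70/3 for men and 17.5 for women, so the gap is 25 percent.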

lingpg(
  Y,
  gender = NULL,
  id = NULL,
  weight = NULL,
  sort = NULL,
  Dom = NULL,
  period = NULL,
  dataset = NULL,
  var_name = "lin_gpg",
  checking = TRUE
)

`Y`	Study variable (for example, gross hourly earnings). A one-dimensional object convertible to a one-column `data.table`, or a variable name (as character) or column number.
`gender`	Numerical variable for gender, where 1 denotes males and 2 denotes females. A one-dimensional object convertible to a one-column `data.table`, or a variable name (as character) or column number.
`id`	Optional variable for unit ID codes. A one-dimensional object convertible to a one-column `data.table`, or a variable name (as character) or column number.
`weight`	Optional weight variable. A one-dimensional object convertible to a one-column `data.table`, or a variable name (as character) or column number.
`sort`	Optional variable to be used as a tie-breaker for sorting. A one-dimensional object convertible to a one-column `data.table`, or a variable name (as character) or column number.
`Dom`	Optional variables used to define population domains. If supplied, the gender pay (wage) gap is estimated and linearized for each domain. An object convertible to `data.table`, or variable names (as a character vector) or column numbers.
`period`	Optional variable for the survey period. If supplied, the gender pay (wage) gap is estimated and linearized for each time period. An object convertible to `data.table`, or variable names (as character) or column numbers.
`dataset`	Optional survey data object convertible to `data.table`.
`var_name`	A character string specifying the name of the linearized variable.
`checking`	Optional logical. If TRUE (the default), the function checks the input for data preparation errors; otherwise no checks are performed.

A list with two objects is returned:

value - a data.table containing the estimated gender pay (wage) gap (in percentage).
lin - a data.table containing the linearized variables of the gender pay (wage) gap (in percentage) for variance estimation.
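To illustrate what a linearized variable looks like, the following base R sketch applies a first-order Taylor expansion to the gap written as 100 * (1 - R), where R is the ratio of the weighted mean earnings of women to those of men. This is a derivation under that assumed definition, not a copy of the package internals; the helper name `lin_gpg_sketch` and the toy data are hypothetical, and `lingpg` may treat missing values, domains and periods differently.

```r
# Hypothetical helper, not part of vardpoor:
# linearized variable of the gender pay gap (in percent) for each unit,
# assuming gender is coded 1 = male, 2 = female.
lin_gpg_sketch <- function(y, gender, w = rep(1, length(y))) {
  men <- as.numeric(gender == 1)
  women <- as.numeric(gender == 2)

  n_men <- sum(w * men)          # estimated number of men
  n_women <- sum(w * women)      # estimated number of women
  y_men <- sum(w * y * men)      # estimated total earnings of men
  y_women <- sum(w * y * women)  # estimated total earnings of women

  # R = (mean earnings of women) / (mean earnings of men)
  ratio <- (y_women / n_women) / (y_men / n_men)

  # First-order expansion of 100 * (1 - R) with respect to the four totals
  100 * ratio * (women / n_women - men / n_men +
                 y * men / y_men - y * women / y_women)
}

# Toy data (made up for illustration)
y <- c(20, 30, 10, 20)
gender <- c(1, 1, 2, 2)
w <- c(2, 1, 1, 3)

lin_toy <- lin_gpg_sketch(y, gender, w)
lin_toy
```

A quick sanity check on such a sketch is that the weighted total of the linearized variable, `sum(w * lin_toy)`, is zero up to rounding, as expected for an indicator built from ratios of totals.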

Working group on Statistics on Income and Living Conditions (2004) Common cross-sectional EU indicators based on EU-SILC; the gender pay gap. EU-SILC 131-rev/04, Eurostat.
Guillaume Osier (2009). Variance estimation for complex indicators of poverty and inequality. Journal of the European Survey Research Association, Vol.3, No.3, pp. 167-195, ISSN 1864-3361, URL https://ojs.ub.uni-konstanz.de/srm/article/view/369.
Jean-Claude Deville (1999). Variance estimation for complex statistics and estimators: linearization and residual techniques. Survey Methodology, 25, 193-203, URL https://www150.statcan.gc.ca/n1/pub/12-001-x/1999002/article/4882-eng.pdf.

linqsr, lingini, varpoord, vardcrospoor, vardchangespoor

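library("data.table")
library("laeken")
library("vardpoor")

# Structure of Earnings Survey example data from laeken
data("ses")
dataset1 <- data.table(ID = paste0("V", seq_len(nrow(ses))), ses)

# Numeric unit identifier
dataset1[, IDnum := .I]

# Recode gender as expected by lingpg: 1 = male, 2 = female
setnames(dataset1, "sex", "sexf")
dataset1[sexf == "male", sex := 1]
dataset1[sexf == "female", sex := 2]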
  
# Full population
gpgs1 <- lingpg(Y = "earningsHour", gender = "sex",
                id = "IDnum", weight = "weights",
                dataset = dataset1)
gpgs1$value
  
## Not run: 
# Domains by education
gpgs2 <- lingpg(Y = "earningsHour", gender = "sex",
                id = "IDnum", weight = "weights",
                Dom = "education", dataset = dataset1)
gpgs2$value
    
# Sort variable
gpgs3 <- lingpg(Y = "earningsHour", gender = "sex",
                id = "IDnum", weight = "weights",
                sort = "IDnum", Dom = "education",
                dataset = dataset1)
gpgs3$value
    
# Two survey periods
dataset1[, year := 2010]
dataset2 <- copy(dataset1)
dataset2[, year := 2011]
dataset1 <- rbind(dataset1, dataset2)

gpgs4 <- lingpg(Y = "earningsHour", gender = "sex",
                id = "IDnum", weight = "weights", 
                sort = "IDnum", Dom = "education",
                period = "year", dataset = dataset1)
gpgs4$value
names(gpgs4$lin)
## End(Not run)