
User-level access to the internal demeaning algorithm of `fixest`.

```
demean(
  X,
  f,
  slope.vars,
  slope.flag,
  data,
  weights,
  nthreads = getFixest_nthreads(),
  notes = getFixest_notes(),
  iter = 2000,
  tol = 1e-06,
  na.rm = TRUE,
  as.matrix = is.atomic(X),
  im_confident = FALSE
)
```

`X`: A matrix, vector, data.frame or list, or a formula. If it is a formula, then the argument `data` is required.

`f`: A matrix, vector, data.frame or list. The factors used to center the variables in argument `X`.

`slope.vars`: A vector, matrix or list representing the variables with varying slopes. Matrices will be coerced.

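`slope.flag`: An integer vector of the same length as the number of variables in `f`. For each factor, the absolute value gives the number of varying-slope variables associated with it, and a negative sign means the factor itself is not included.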

`data`: A data.frame containing all variables in the argument `X` when `X` is a formula.

`weights`: Vector, can be missing or NULL. If present, it must contain the same number of observations as in `X`.

`nthreads`: Number of threads to be used. By default it is equal to `getFixest_nthreads()`.

`notes`: Logical, whether to display a message when NA values are removed. By default it is equal to `getFixest_notes()`.

`iter`: Number of iterations, default is 2000.

`tol`: Stopping criterion of the algorithm. Default is `1e-06`.

`na.rm`: Logical, default is `TRUE`. If `TRUE`, observations with NA values are removed from the output.

`as.matrix`: Logical, if `TRUE` a matrix is returned. By default it is equal to `is.atomic(X)`.

`im_confident`: Logical, default is `FALSE`.
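To make the roles of `iter` and `tol` concrete, here is a minimal base-R sketch of centering a variable on two factors by repeatedly removing group means. It is an illustration only: `demean_sketch` is a hypothetical helper, the stopping rule (largest change between sweeps below `tol`) is an assumption, and `fixest`'s internal algorithm is more elaborate than this plain alternating scheme.

```r
# Hypothetical helper: center x on two factors by alternating
# group-mean removal. `iter` caps the number of sweeps and `tol`
# is the (assumed) threshold on the largest change between sweeps.
demean_sketch = function(x, f1, f2, iter = 2000, tol = 1e-6) {
  for (i in seq_len(iter)) {
    x_old = x
    x = x - ave(x, f1)  # remove the means within each level of f1
    x = x - ave(x, f2)  # remove the means within each level of f2
    if (max(abs(x - x_old)) < tol) break
  }
  x
}

set.seed(42)
n = 200
f1 = sample(1:10, n, replace = TRUE)
f2 = sample(1:8, n, replace = TRUE)
x = rnorm(n)

x_dm = demean_sketch(x, f1, f2)

# Reference: residuals from regressing x on both sets of dummies
x_ref = unname(resid(lm(x ~ factor(f1) + factor(f2))))
max_gap = max(abs(x_dm - x_ref))
```

A looser `tol` or a smaller `iter` stops the loop earlier and leaves a larger gap to the exact least-squares residuals.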

It returns a data.frame with as many columns as there are variables to be centered.

If `na.rm = TRUE` (the default in the usage above), the number of rows is equal to the number of rows in the input minus the number of observations with NA values (in `X`, `f`, `slope.vars` or `weights`). If `na.rm = FALSE`, the output has the same number of observations as the input (filled with NAs where appropriate).

A matrix can be returned if `as.matrix = TRUE`.
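The effect of `na.rm` and `as.matrix` on the shape of the result can be checked with a small simulated data set. This is a sketch that assumes the `fixest` package is installed; the data, the grouping vector `g`, and the object names are made up for illustration.

```r
library(fixest)

set.seed(1)
n = 20
df = data.frame(x1 = rnorm(n), x2 = rnorm(n))
g = rep(1:4, each = 5)  # hypothetical grouping factor
df$x1[3] = NA           # one observation with a missing value

# na.rm = TRUE: the observation with the NA is dropped
dm_drop = demean(X = df, f = g, na.rm = TRUE, notes = FALSE)

# na.rm = FALSE: same number of rows as the input, NA where needed
dm_keep = demean(X = df, f = g, na.rm = FALSE, notes = FALSE)

# as.matrix = TRUE: request a matrix instead of a data.frame
dm_mat = demean(X = df, f = g, as.matrix = TRUE, notes = FALSE)

c(nrow(dm_drop), nrow(dm_keep))
is.matrix(dm_mat)
```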

You can add variables with varying slopes in the fixed-effect part of the formula. The syntax is `fixef_var[var1, var2]`. Here the variables `var1` and `var2` have varying slopes (one slope per value of `fixef_var`), and the fixed-effect `fixef_var` is also added.

To add only the variables with varying slopes and not the fixed-effect, use double square brackets: `fixef_var[[var1, var2]]`.

In other words:

`fixef_var[var1, var2]` is equivalent to `fixef_var + fixef_var[[var1]] + fixef_var[[var2]]`

`fixef_var[[var1, var2]]` is equivalent to `fixef_var[[var1]] + fixef_var[[var2]]`

In general, for convergence reasons, it is recommended to always add the fixed-effect and avoid using only the variable with varying slope (i.e. use single square brackets).
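The difference between single and double brackets can be pictured as a difference in the set of columns that is projected out. The base-R sketch below builds those columns by hand for the `iris` data and uses an exact least-squares projection rather than `fixest`'s iterative algorithm. The names `D`, `S`, `Z_single`, `Z_double` and `partial_out` are hypothetical and chosen only for this illustration.

```r
base = iris
names(base) = c("y", "x1", "x2", "x3", "species")

# Indicator columns for the fixed-effect, one per species
D = sapply(levels(base$species), function(l) as.numeric(base$species == l))
# Slope columns: x2 multiplied by each indicator (one slope per species)
S = D * base$x2

# Residualize v on the columns of Z (exact least-squares projection)
partial_out = function(v, Z) qr.resid(qr(Z), v)

Z_single = cbind(D, S)  # species[x2]: fixed-effect and slopes
Z_double = S            # species[[x2]]: slopes only

# Single brackets: same x1 coefficient as the model with dummies and slopes
y_s  = partial_out(base$y,  Z_single)
x1_s = partial_out(base$x1, Z_single)
b_single = unname(coef(lm(y_s ~ 0 + x1_s)))
b_single_full = unname(coef(lm(y ~ 0 + x1 + D + S, base))["x1"])

# Double brackets: same x1 coefficient as the model with slopes only
y_d  = partial_out(base$y,  Z_double)
x1_d = partial_out(base$x1, Z_double)
b_double = unname(coef(lm(y_d ~ 0 + x1_d)))
b_double_full = unname(coef(lm(y ~ 0 + x1 + S, base))["x1"])

all.equal(b_single, b_single_full)
all.equal(b_double, b_double_full)
```

Both comparisons are instances of the FWL theorem: regressing the residualized `y` on the residualized `x1` reproduces the `x1` coefficient of the corresponding full regression.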

```
library(fixest)
# Illustration of the FWL theorem
data(trade)
base = trade
base$ln_dist = log(base$dist_km)
base$ln_euros = log(base$Euros)
# We center the two variables ln_dist and ln_euros
# on the factors Origin and Destination
X_demean = demean(X = base[, c("ln_dist", "ln_euros")],
                  f = base[, c("Origin", "Destination")])
base[, c("ln_dist_dm", "ln_euros_dm")] = X_demean
est = feols(ln_euros_dm ~ ln_dist_dm, base)
est_fe = feols(ln_euros ~ ln_dist | Origin + Destination, base)
# The results are the same as if we used the two factors
# as fixed-effects
etable(est, est_fe, se = "st")
#
# Variables with varying slopes
#
# You can center on factors but also on variables with varying slopes
# Let's have an illustration
base = iris
names(base) = c("y", "x1", "x2", "x3", "species")
#
# We center y and x1 on species and x2 * species
# using a formula
base_dm = demean(y + x1 ~ species[x2], data = base)
# using vectors
base_dm_bis = demean(X = base[, c("y", "x1")], f = base$species,
                     slope.vars = base$x2, slope.flag = 1)
# Let's look at the equivalences
res_vs_1 = feols(y ~ x1 + species + x2:species, base)
res_vs_2 = feols(y ~ x1, base_dm)
res_vs_3 = feols(y ~ x1, base_dm_bis)
# only the small sample adj. differ in the SEs
etable(res_vs_1, res_vs_2, res_vs_3, keep = "x1")
#
# center on x2 * species and on another FE
base$fe = rep(1:5, 10)
# using a formula => double square brackets!
base_dm = demean(y + x1 ~ fe + species[[x2]], data = base)
# using vectors => note slope.flag!
base_dm_bis = demean(X = base[, c("y", "x1")], f = base[, c("fe", "species")],
                     slope.vars = base$x2, slope.flag = c(0, -1))
# Explanations slope.flag = c(0, -1):
# - the first 0: the first factor (fe) is associated to no variable
# - the "-1":
# * |-1| = 1: the second factor (species) is associated to ONE variable
# * -1 < 0: the second factor should not be included as such
# Let's look at the equivalences
res_vs_1 = feols(y ~ x1 + i(fe) + x2:species, base)
res_vs_2 = feols(y ~ x1, base_dm)
res_vs_3 = feols(y ~ x1, base_dm_bis)
# only the small sample adj. differ in the SEs
etable(res_vs_1, res_vs_2, res_vs_3, keep = "x1")
```
