estimate_matches: Run lasso estimation on the prepared matrices
In EBukin/lassopmm: Create synthetic panels from cross sectional data


Run lasso estimation on the prepared matrices

estimate_matches(source_x_mat, source_y_mat, source_w_mat, target_x_mat,
                 n_near = 5, n_folds = 10, force_lambda = NULL, reduced = TRUE)

`source_x_mat, source_y_mat, source_w_mat`	matrices of independent variables, dependent variables and weights
`target_x_mat`	matrix of independent variables for the prediction sample
`n_near`	number of nearest observations from which a random match is drawn. If `n_near` is greater than `length(match_vector)`, the smaller of the two is used as the size of the sample from which the random match is selected.
`n_folds`	number of folds for cross-validation
`force_lambda`	allows a single lambda value to be specified. Should be 0 to switch to ordinary linear regression.
`reduced`	if TRUE, returns reduced output without specific regression details.
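The `n_near` rule can be illustrated with a minimal base R sketch. The helper `pick_random_match` is hypothetical and not part of lassopmm; it assumes that "nearest" means the smallest absolute difference between predicted values, which this page does not state explicitly.

```r
# Hypothetical helper, not part of lassopmm: pick one random donor among the
# n_near source predictions closest to a single target prediction.
pick_random_match <- function(target_value, source_y_hat, n_near = 5) {
  # Never ask for more neighbours than there are candidates
  k <- min(n_near, length(source_y_hat))
  # Positions of the k source predictions closest to the target prediction
  nearest <- order(abs(source_y_hat - target_value))[seq_len(k)]
  # Draw one of those positions at random (safe even when k == 1)
  nearest[sample.int(k, 1)]
}

set.seed(1)
source_y_hat <- c(10, 12, 15, 20, 30, 31)

# With n_near = 3 the candidates for a target value of 14 are 15, 12 and 10
chosen <- pick_random_match(14, source_y_hat, n_near = 3)
source_y_hat[chosen]

# n_near larger than the vector falls back to all six candidates
pick_random_match(14, source_y_hat, n_near = 100)
```

With `n_near = 1` the draw is deterministic and the helper always returns the single closest position.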

With both reduced = TRUE and reduced = FALSE, the function returns a list. With reduced = TRUE, only the results of matching are returned:

source_y_hat is the vector of predicted values from the lasso regression, fitted with n_folds-fold cross-validation and the "mse" measure to identify the value of lambda that minimizes the error. It uses source_x_mat, source_y_mat, and source_w_mat to run the regression and predict source_y_hat.
target_y_hat is the vector of predicted values produced by applying the regression fitted on the source data to the target_x_mat matrix of independent variables.
match is the data frame with columns: target_id, the index of each target_y_hat value; target_y_hat, its value; source_id, the position of the nearest match in the source_y_hat vector; and source_y_hat, the value of the nearest match.
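The three elements above could be reproduced roughly as follows. This is a sketch using glmnet directly, not the package's internal code: it assumes predictions are taken at `lambda.min`, and it shows only the deterministic nearest match (the `n_near = 1` case). The names `cv_fit` and `match_tbl` are illustrative.

```r
library(glmnet)

set.seed(42)
x_source <- as.matrix(mtcars[, !names(mtcars) %in% "hp"])
y_source <- mtcars[, "hp"]
w_source <- rep(1, nrow(mtcars))
x_target <- x_source[1:10, , drop = FALSE]

# Cross-validated lasso (alpha = 1 is the glmnet default) scored by MSE
cv_fit <- cv.glmnet(x_source, y_source, weights = w_source,
                    nfolds = 10, type.measure = "mse")

# Predictions at the lambda that minimises the cross-validated MSE
source_y_hat <- as.vector(predict(cv_fit, newx = x_source, s = "lambda.min"))
target_y_hat <- as.vector(predict(cv_fit, newx = x_target, s = "lambda.min"))

# Position of the closest source prediction for every target prediction
source_id <- vapply(target_y_hat,
                    function(v) which.min(abs(source_y_hat - v)),
                    integer(1))

# Same column layout as the match element described above
match_tbl <- data.frame(
  target_id    = seq_along(target_y_hat),
  target_y_hat = target_y_hat,
  source_id    = source_id,
  source_y_hat = source_y_hat[source_id]
)
head(match_tbl)
```

Because the target rows here are a subset of the source rows, every target prediction finds a source prediction with the same value; with a genuinely separate target sample the matched values would differ.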

With reduced = FALSE, the list contains additional elements:

source_x_mat, source_y_mat, source_w_mat and target_x_mat, as passed to the function.
lambda_cv - the result of cv.glmnet.
fit - the result of glmnet.
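How a `cv.glmnet` object and a `glmnet` object might relate can be sketched on simulated data. This is an assumption about the general pattern, not a description of the package internals: cross-validation proposes a penalty, a single fit is made at that penalty, and a forced lambda of 0 removes the penalty so that the fit approaches ordinary least squares. The tighter `thresh` is used here only to make the comparison with `lm` close.

```r
library(glmnet)

set.seed(123)
n <- 200
x <- matrix(rnorm(n * 3), ncol = 3,
            dimnames = list(NULL, c("x1", "x2", "x3")))
y <- 1 + 2 * x[, "x1"] - 0.5 * x[, "x2"] + rnorm(n)
w <- rep(1, n)

# Step 1: cross-validation proposes a penalty (analogous to lambda_cv)
lambda_cv <- cv.glmnet(x, y, weights = w, nfolds = 10, type.measure = "mse")

# Step 2: a single fit at the chosen penalty (analogous to fit);
# a forced lambda would take the place of lambda_cv$lambda.min here
fit_cv <- glmnet(x, y, weights = w, lambda = lambda_cv$lambda.min)

# With lambda = 0 there is no penalty and the fit approaches OLS
fit_ols_like <- glmnet(x, y, weights = w, lambda = 0, thresh = 1e-12)
ols <- lm(y ~ x, weights = w)

# Side-by-side coefficients: intercept, x1, x2, x3
cbind(glmnet_lambda0 = as.matrix(coef(fit_ols_like))[, 1],
      lm             = unname(coef(ols)))
```

The two coefficient columns should agree to several decimal places on well-conditioned data like this; with strongly collinear predictors the agreement can be looser.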

library(dplyr)
library(purrr)
library(glmnet)
library(lassopmm)

# Prepare example data from mtcars: hp is the dependent variable, weights are all 1
XX <- as.matrix(mtcars[, !names(mtcars) %in% "hp"])
YY <- as.matrix(mtcars[, "hp"])
WW <- matrix(rep(1, nrow(mtcars)), ncol = 1, nrow = nrow(mtcars))
XX1 <- as.matrix(mtcars[1:10, !names(mtcars) %in% "hp"])

# Run a simple estimation and return the full (non-reduced) output
a <- estimate_matches(source_x_mat = XX, source_y_mat = YY,
                      source_w_mat = WW, target_x_mat = XX1,
                      reduced = FALSE, n_near = 5)

# Extract regression coefficients
a$fit %>% coef()
str(a, max.level = 1)

# Create example bootstrap index vectors (sampling rows with replacement)
perm_example <- purrr::map(1:5, ~ sample(1:nrow(YY), nrow(YY), TRUE))

# Run the estimation on every bootstrap index vector
a_boot <-
  perm_example %>%
  purrr::map(~ lassopmm::estimate_matches(
    source_x_mat = XX[.x, ],
    source_y_mat = YY[.x, ],
    source_w_mat = WW[.x, ],
    target_x_mat = XX1,
    reduced = FALSE,
    n_near = 5
  ))

# Exploring the structure
a_boot %>%
  str(max.level = 1)

# Access the fit of a specific bootstrap iteration
a_boot[[3]]$fit

# Do the same for every bootstrap iteration
a_boot %>%
  map("fit")

# Extract coefficients from every bootstrap iteration
a_boot %>%
  map("fit") %>%
  map(coefficients)

# Combine all coefficients in one table, one estimate column per iteration
aa <-
  a_boot %>%
  map("fit") %>%
  map(broom::tidy) %>%
  map(~ select(.x, term, estimate))
map(seq_along(aa), .f = function(x) {
  rename_at(aa[[x]], vars(estimate), list(~ paste0(., "_", x)))}) %>%
  reduce(full_join, by = "term")