build_training_data | R Documentation |
These functions build a subset of the dataset data_rangers
and log-transform some of its columns after adding 1, i.e. log(x + 1).
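As a minimal sketch of what a log(x + 1) transformation looks like in base R (this is not the package's own code; the data frame `toy`, its column names, and the use of the natural logarithm are assumptions made for illustration):

```r
# Hypothetical toy data; the column names are illustrative only
toy <- data.frame(staff_rangers = c(0, 9, 99),
                  PA_area = c(0, 1.5, 1000))

# log1p(x) computes log(1 + x) with the natural logarithm,
# so zeros stay finite (log1p(0) is 0) instead of becoming -Inf
toy$staff_rangers_log <- log1p(toy$staff_rangers)
toy$PA_area_log <- log1p(toy$PA_area)

# expm1() inverts the transformation and recovers the original values
back <- expm1(toy$staff_rangers_log)
```

Adding 1 before taking the logarithm is what allows zero counts to be kept in the transformed data.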
build_initial_training_data(data, formula, survey, spatial = FALSE)
build_final_training_data(data, formula, survey, spatial = FALSE)
build_final_pred_data(data, formula, survey, spatial = FALSE, outliers = NULL)
build_data(data, formula, type, spatial = FALSE)
handle_PA_area(data, survey, formula = NULL, keep_details = FALSE)
handle_outliers(data, outliers = NULL)
handle_transform(data)
handle_order(data)
handle_na(data, response, NA_in_resp = NULL, NA_in_preds = NULL)
data |
the complete dataset |
formula |
the formula for the linear mixed model (LMM) or random forest (RF) |
survey |
the criterion used to select rows depending on whether the focal number of personnel is:
- completely unknown ("complete_unknown")
- completely or partially unknown ("partial_unknown")
- completely or partially known ("partial_known")
- completely known ("complete")
The variable PA_area is also adjusted according to this choice. |
spatial |
whether to keep the predictors used for fitting spatial effects (default = FALSE) |
outliers |
a vector with the names of the countries/territories to discard (default = NULL) |
type |
either "prediction" or "training" |
keep_details |
whether or not to keep variables used for construction (default = FALSE) |
response |
the unquoted name of the response variable |
NA_in_resp |
whether to keep only the rows with NA in the response variable (TRUE) or to discard all of them (FALSE); the default, NULL, does nothing |
NA_in_preds |
whether to keep only the rows with NA in the predictor variables (TRUE) or to discard all of them (FALSE); the default, NULL, does nothing |
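The three-state behaviour described for NA_in_resp and NA_in_preds (NULL, TRUE, FALSE) can be sketched in base R as follows. This is only an illustration under stated assumptions: `filter_na_sketch()` and its `keep_na` argument are hypothetical names, not part of the package, and unlike handle_na() the sketch takes the column name as a string and handles a single column.

```r
# Hypothetical helper illustrating the NULL / TRUE / FALSE logic;
# it is NOT the package's handle_na()
filter_na_sketch <- function(data, column, keep_na = NULL) {
  if (is.null(keep_na)) {
    return(data)                       # NULL: do nothing
  }
  is_missing <- is.na(data[[column]])
  if (isTRUE(keep_na)) {
    data[is_missing, , drop = FALSE]   # TRUE: keep only rows with NA
  } else {
    data[!is_missing, , drop = FALSE]  # FALSE: discard rows with NA
  }
}

# Toy data with one missing value in the hypothetical response column
toy <- data.frame(staff = c(10, NA, 30), area = c(1, 2, NA))

kept_all <- filter_na_sketch(toy, "staff")                  # all rows kept
only_na  <- filter_na_sketch(toy, "staff", keep_na = TRUE)  # rows with NA only
no_na    <- filter_na_sketch(toy, "staff", keep_na = FALSE) # complete rows only
```

Keeping only the rows with a missing response is presumably what a prediction dataset needs, while discarding them suits a training dataset; that reading is an inference from the argument descriptions rather than something stated in the source.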
a tibble
build_initial_training_data()
: build the initial training datasets
build_final_training_data()
: build the final training datasets
build_final_pred_data()
: build the final prediction datasets
build_data()
: internal function to build the training and prediction datasets
handle_PA_area()
: internal function to handle PA_area while building the datasets
handle_outliers()
: internal function to handle outliers while building the datasets
handle_transform()
: internal function to handle variable transformation while building the datasets
handle_order()
: internal function to handle order of variables while building the datasets
handle_na()
: internal function to handle missing data while building the datasets
## Not run:
## Here is how we created the data stored in this package:
data_test <- build_initial_training_data(data_rangers,
  formula = staff_rangers ~ pop_density_log + lat + long +
    country_UN_subcontinent + PA_area_log + area_country_log +
    area_forest_pct + GDP_2019_log + GDP_capita_log + GDP_growth +
    unemployment_log + EVI + SPI + EPI_2020 + IUCN_1_4_prop + IUCN_1_2_prop,
  survey = "partial_known")
data_test <- data_test[!is.na(data_test$staff_rangers_log), ]
if (require(usethis)) {
  usethis::use_data(data_test, overwrite = TRUE)
}
## End(Not run)