knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  echo = TRUE#,
  # eval = FALSE
)
Welcome. This document walks you through the process of sharing anonymized data for remote sensing analysis with the HQ team.
To follow this tutorial you will need to make sure that Rstudio is installed (you can download the free tier). The setup steps are:

- Click the clone button and then download zip.
- Open the surveyGEER.rproj file.
- Put the workflow.Rmd file inside the vignettes/ folder.
- Put the decouple_coordinates.R file inside of the R/ directory.

You have finished the initial setup. It is now time to run the code that sets up the data.
- When you opened the surveyGEER.rproj file during the setup above, Rstudio should have opened up.
- Open the workflow.Rmd file you put in the vignettes/ folder during the setup above. You can open that file using your file browser or the file navigation pane in the lower right corner of your Rstudio window.
- This is an .Rmd document. It allows you to write in plain text while also writing R code in designated chunks. These can be knit into reports like the one you are reading.
- Go to the ## Sharing Data section and run the first code chunk (labelled library_setup). This loads the required libraries/tools. If you receive any errors saying that you don't have the packages required, you can uncomment the code below the library calls to install those packages. Press the sideways triangle in the top right corner of the chunk to run the entire chunk. If you need to run line by line, bring the cursor to the line and press ctrl + enter. The first chunk looks like this:

devtools::load_all()
library(tidyverse)
library(here)
# install.packages("tidyverse")
# install.packages("here")
# install.packages("sf")
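If you would rather see which packages are missing before uncommenting the install lines, a minimal base R sketch follows. The helper name missing_packages is hypothetical and not part of surveyGEER; the package list simply mirrors the commented install lines in the chunk above.

```r
# Hypothetical helper (not part of surveyGEER): return the packages from `pkgs`
# that are not installed in the current R library.
missing_packages <- function(pkgs) {
  is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  unname(pkgs[!is_installed])
}

needed <- c("tidyverse", "here", "sf")
to_install <- missing_packages(needed)

if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)  # uncomment to install them
}
```

The install call is left commented out so that nothing is downloaded until you decide to.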
On to the next code chunk. Here is where you must supply some parameters specific to your data set:

- Add your .csv file to the data folder.

path_to_data_with_coords_csv <- file.path(Sys.getenv("NGA_MSNA2022"), "nga_nw_msna.xlsx")
dat <- readxl::read_xlsx(here::here(path_to_data_with_coords_csv))
dat <- read_rds(here::here("data/hsmv/hsmv_bfa_car_drc_ssd_coords.rds"))
# dat <- (here::here(path_to_data_with_coords_csv),locale = readr::locale(encoding = "latin1"))
# dat$
latitude_col_name <- "latitude"
longitude_col_name <- "longitude"
uid_col_name <- "uuid"

# column names
longitude_col_name <- "enter here"
latitude_col_name <- "enter here"
uid_col_name <- "enter_here"
cols_to_keep <- NULL
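Before moving on, it can help to confirm that the column names you entered actually exist in your data, since a typo here would otherwise only surface as an error later. A minimal base R sketch, assuming a hypothetical helper check_columns and a small made-up data frame standing in for dat:

```r
# Hypothetical helper (not part of surveyGEER): stop with a readable message
# if any of the supplied column names is absent from the data frame.
check_columns <- function(df, cols) {
  absent <- setdiff(cols, names(df))
  if (length(absent) > 0) {
    stop("Column(s) not found in data: ", paste(absent, collapse = ", "))
  }
  invisible(TRUE)
}

# Made-up example data standing in for `dat`
example_dat <- data.frame(
  uuid = c("a1", "b2"),
  latitude = c(9.1, 9.3),
  longitude = c(7.4, 7.6)
)

check_columns(example_dat, c("uuid", "latitude", "longitude"))
```

With your own data the call would be check_columns(dat, c(uid_col_name, latitude_col_name, longitude_col_name)).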
Run the chunk below. It will:

- create a data_share folder with anonymized coordinates
- create a vault folder with a lookup table
- you can share the contents of the data_share folder with the RS analyst
- keep the vault folder and do not lose or share it

You are done. In a couple of days or less the RS analyst will contact you to provide you with the RS data. When this happens, jump down to the ## Merging Data section below.
dat <- dat |>
  filter(!is.na(!!sym(latitude_col_name)))

decouple_coordinates2(
  df = dat,
  uuid = uid_col_name,
  lon = longitude_col_name,
  lat = latitude_col_name,
  cols_to_keep = cols_to_keep,
  country_code = "hsmv_compiled"
)
# read_rds("data_share/coords_anonymized.rds")
# read_rds("vault/lookup.rds")
# done
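To make the idea behind the two folders concrete, here is a small base R sketch of decoupling coordinates from identifiers. It is an illustration under stated assumptions, not the implementation of decouple_coordinates2(): the function name decouple_sketch, the new_uid column and the random ID scheme are all hypothetical.

```r
# Illustration only: NOT how decouple_coordinates2() is implemented.
# Splits a data frame into (1) a shareable table with a random ID and
# coordinates and (2) a lookup table linking the random ID to the real uid.
decouple_sketch <- function(df, uuid, lon, lat) {
  n <- nrow(df)
  # Hypothetical ID scheme: shuffled sequence numbers, so the new ID itself
  # reveals nothing about the original uid.
  new_uid <- sprintf("id_%06d", sample.int(n))
  list(
    share = data.frame(new_uid = new_uid, lon = df[[lon]], lat = df[[lat]]),
    vault = data.frame(new_uid = new_uid, uuid = df[[uuid]])
  )
}

# Made-up example data
example_dat <- data.frame(
  uuid = c("a1", "b2", "c3"),
  latitude = c(9.1, 9.3, 9.5),
  longitude = c(7.4, 7.6, 7.8)
)

out <- decouple_sketch(example_dat, "uuid", "longitude", "latitude")
names(out$share)  # the shareable table has no uuid column
```

In this sketch only the share table would go to the RS analyst, while the vault table is what later allows results to be joined back to the original records.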
- Enter the path and uid information into this chunk.
- If you have a newer/cleaner data set, please provide the new path and uid information below. In this case I also recommend saving the new/clean data as a .csv in the data folder.

country_code <- "car"
path_to_clean_data <- "20221019_CAR_clean_data_HSMV.csv"
if (path_to_clean_data != "enter here") {
  clean_data <- read_csv(path_to_clean_data, locale = readr::locale(encoding = "latin1"))
}
clean_data <- read_rds("vault/hsmv_compiled_lookup.rds")
# clean_data <- dat
# clean_data$
clean_data_uid_col_name <- "uuid"
data_share
folder.

country_code <- "hsmv_compiled"
df_with_indicators <- merge_indicators2(
  df = clean_data,
  country_code = country_code,
  df_uuid = clean_data_uid_col_name,
  hq = TRUE
)

if (country_code == "hsmv_compiled") {
  df_w_cc <- df_with_indicators |>
    separate(col = uuid, into = c("country_code", "uuid"), sep = "_")
  df_w_cc |>
    count(country_code)
  split(df_w_cc, df_w_cc$country_code) |>
    purrr::imap(
      ~ write_csv(.x, glue::glue("data_share/{.y}_hsmv_rs_extracted.csv"))
    )
}

name_prefix <- Sys.Date() |>
  str_replace_all("-", "")
file_path <- glue::glue("data_share/{name_prefix}_msna_with_rs_{country_code}")
file_path <- glue::glue("data_share/{name_prefix}_hsmv_with_rs_{country_code}")
write_rds(df_with_indicators, here::here(glue::glue("{file_path}.rds")))
# write_csv(df_with_indicators,here::here(glue::glue("{file_path}.csv")))
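The compiled case above relies on each uuid carrying a country prefix. Here is a small base R sketch of that splitting step, assuming made-up IDs of the form country_id and a hypothetical indicator column, and writing to a temporary directory rather than data_share:

```r
# Made-up data: uuid values prefixed with a country code, as assumed above
rs <- data.frame(
  uuid = c("car_001", "car_002", "bfa_001"),
  ndvi = c(0.41, 0.38, 0.22)  # hypothetical RS indicator
)

# Split at the FIRST underscore only, so any later underscores stay in the id
rs$country_code <- sub("_.*$", "", rs$uuid)
rs$uuid <- sub("^[^_]*_", "", rs$uuid)

# Write one CSV per country into a temporary folder
out_dir <- tempdir()
by_country <- split(rs, rs$country_code)
for (cc in names(by_country)) {
  write.csv(
    by_country[[cc]],
    file.path(out_dir, paste0(cc, "_rs_extracted.csv")),
    row.names = FALSE
  )
}

table(rs$country_code)
```

Splitting at the first underscore only is a design choice of this sketch. tidyr::separate() with sep = "_" splits at every underscore, which matters only if the IDs themselves contain underscores.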
Specific
# for car specifically
df_with_indicators |>
  select(-contains("GPS")) |>
  write_csv("C:\\Users\\zack.arno\\OneDrive - ACTED\\Zack_and_Matt\\MSNA_HSM\\HSM_Validation\\CAR\\20221111_hsmv_with_rs_car.csv")
Let's look at the outputs, starting with the anonymized coordinates. The geodata is anonymous.

- this file can be sent to the RS analyst for RS extraction
- the data keeper should delete this folder after sending and keep only the vault folder
geodat <- read_rds(file = "data_share/coords_anonymized.rds")
The lookup is also anonymous.
lookup <- read_rds("vault/lookup.rds")