In lyh970817/gladfunctions: GLAD Functions

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = ">",
  results = "asis",
  prompt = FALSE,
  cache = FALSE,
  message = FALSE,
  warning = FALSE,
  echo = TRUE
)
library(googlesheets4)
sheets_auth(
  email = gargle::gargle_oauth_email(),
  path = "../../../GLAD-Questionnaires/sheet_scripts/glad-dict.json",
  scopes = "https://www.googleapis.com/auth/spreadsheets",
  cache = gargle::gargle_oauth_cache(),
  use_oob = gargle::gargle_oob_default(),
  token = NULL
)

# googlesheets4 option, see https://gargle.r-lib.org/articles/non-interactive-auth.html
options(gargle_oauth_email = TRUE)

List of functions in the package

GLAD_read: Read in all the Qualtrics exported raw csv files in the specified path.
GLAD_sheet: Read in the googlesheet dictionary sheet for a specified questionnaire. Note that the googlesheet must be read through this function and not by the googlesheets4 package directly.
GLAD_clean: Cleans all questionnaires or one specified questionnaire in 'dat_list' and creates exports.
GLAD_select: Exports selected variables from a data set by specifying their names in a text file.
GLAD_derive: Generates derived variables with names and formulae specified in the GLAD dictionary
GLAD_getdescr: Get Descripton (Title) for Selected Variables by specifying their names.
GLAD_plot: Generates plots for the specified variable in the GLAD data.
GLAD_vartest: Runs several variance tests (Levene's test, Fligner's test and Barlett's test) for comparing variance across gender.
GLAD_missing: Produces various summary statistics and plots for missingness examination of a questionnaire.
GLAD_qplot: Generates quantile plots for a specified variable in the GLAD data.

library(gladfunctions)

Clean Qualtrics exports

Specify the path to the directory containing the raw files exported directly from Qualtrics. This must contain the sign-up data and if an optional questionnaire is to be cleaned, the data of that optional questionnaire.
We have a python script to prepare raw data files (export Qualtrics data and remove participant personal information). Please speak to Henry Rogers in the BioResource office.

Raw data

raw_path <- "~/Data/GLAD/data_raw/"

Read in data files as a list from raw_path

dat_list <- GLAD_read(raw_path)

Clean data

Specify the path to export files to

clean_path <- "~/Data/GLAD/data_clean/"

Clean all the questionnaire and export to clean_path
- Specify limits = FALSE to avoid applying limits to continuous variables, so we could later examine the effects of limits with plotting functions.
- Specify rename = TRUE to rename all variable names to Easy.name (New.variable names if FALSE)
- Specify format which should be one of rds, feather, sav (for SPSS/JASP), sas (for SAS) and dta (for Stata).

GLAD_clean("ALL", dat_list, clean_path, limits = FALSE, rename = TRUE, format = "rds")

Clean a specified questionnaire and export to clean_path
On calling the function you will be prompted with a Google login page, after you agree for the credential file to be created. Please log in with an account that would allow you access to the GLAD dictionary.
Click "Allow" to grant permission to the Tidyverse API Packages.
"Click allow for the Tidyverse API Package to see, edit and delete your spreadsheets in Google Drive"
This will prompt you to enter a code. Please copy the code from the Google sign in and paste this where is says "Enter authorization code:" in your R console.

GLAD_clean("CIDID", dat_list, clean_path, limits = FALSE, rename = TRUE, format = "rds")

Select variables to export with text files containing the Easy.names. Each text file should be named as the acronym of a questionnaire (as in the dictionary tabs). Each variable name within the text file should be in a seperate line.

An empty text file with the questionnaire name will export all variables in that questionnaire.

Specify the data request directory for keeping request records and the person who requests the data

GLAD_select(clean_path,
  export_path = "./Data-Request/",
  person = "Leo",
  c("DEM.txt", "PAD.txt"), format = "sav"
)

Read in data file and dictionary sheet

rds is a fast loading file format that preserves variable type information and can be read like a csv.
I'm reading in the '_Renamed' version with Easy.name, but note that all the following should also work for the unrenamed version ( with New.variable names)

PAD <-
  readRDS(paste(clean_path, "rds_renamed/PAD_Renamed.rds", sep = "/"))

# `GLAD_sheet` is a vectorised function (vectorised functions take a vector
# and operate on all the items in a vector). This means that we can read in
# multiple sheets at once and store them in a list.

# We need to use [[1]] (with double square brackets) to extract the first
# element of an R list list.

sheet <- GLAD_sheet("PAD")[[1]]

Add derived variables to data

Please check sheet 'PTSD' on the dictionary sheet for a simple example and "AGP" for a more complicated one. Instructions are also available here.

PTSD <- readRDS(paste(clean_path, "rds_renamed/PTSD_Renamed.rds", sep = "/"))
sheet2 <- GLAD_sheet("PTSD")[[1]]

AGP <- readRDS(paste(clean_path, "rds_renamed/AGP_Renamed.rds", sep = "/"))
sheet3 <- GLAD_sheet("AGP")[[1]]

PTSD_withderived <- GLAD_derive(PTSD, sheet2)
AGP_withderived <- GLAD_derive(AGP, sheet3)

# These are the last few columns we just added
tail(colnames(PTSD_withderived), 2)
tail(colnames(AGP_withderived), 4)

Get variable descriptions

GLAD_getdescr(colnames(PAD)[5:8], sheet)

Special feature: `as.numeric` factor variables

We normally are unable to recode factor variables as numeric variable.

Applying as.numeric to the R built-in class factor only returns its internal integer representation.

unique(as.numeric(as.factor(PAD[["pad.anx_future_panic_attacks"]])))

However, the lfactor package preserves numeric values when recoding a numeric variable to factor. Therefore, we can use this package to recode our factor variables to numeric variables.

For example, for a Binary variable:

unique(as.numeric(PAD[["pad.anx_future_panic_attacks"]]))

Alternatively, you can use numeric copies of the factor variables. These numeric copies have been copied from the raw data and put into the cleaned data. These copy variables have the same names as the original variables but with "_numeric" at the end.

grep("numeric", colnames(PAD), value = T)

Plots

The GLAD_plot function returns plots for a specified variable, different plots are returned depending on the variable type, the information of which is provided through the googlesheet argument.

```r

Use `fig.width` to avoid the figure being truncated.

GLAD_plot(data = PAD, var = "Sex", googlesheet = sheet)

```r
GLAD_plot(
  data = PAD,
  var = "pad.anx_future_panic_attacks",
  googlesheet = sheet
)

With variables of Numeric/Continuous type, multiple plot objects are returned in a named list.
Note that for these variables, it is possible to supply a logical argument include_outlier. For variables that don't have maximum or minimum in the dictionary hence haven't been cleaned, this is useful for deciding cut-offs. When include_outlier is set to TRUE, it's also possible to specify your own limits for plots through the limit argument.
Set binwidth = "FD" to apply the Freedman-Diaconis rule for deciding optimal binwidth. The default is '1' and should be suitable for most variables in the data sets.

continuous_plots <- GLAD_plot(
  data = PAD,
  var = "pad.frequency_panic_attacks",
  googlesheet = sheet,
  include_outlier = FALSE,
  limits = c(1, 50),
  binwidth = "FD"
)

continuous_plots$point

continuous_plots$hist

continuous_plots$density

continuous_plots$densitybysex

This plot is for categorical variables that allow multiple options to be selected. Put in one of the variables representing the options.
Don't seem to have a general way to extract the title from the dictionary. Specify the title for the plot yourself.

GLAD_plot(
  data = PAD,
  var = "pad.sweating",
  title = "PAD Screening",
  googlesheet = sheet
)

Quantile plot

GLAD_qplot(data = PAD, var = "pad.frequency_panic_attacks", googlesheet = sheet)

Variance tests by sex

These tests allow us to explore whether variances differ across sex.

GLAD_vartest(data = PAD, var = "pad.frequency_panic_attacks")

For quantile plot and variance test, I also have a version that extracts all the continuous variables and loops through them. Would that be more desirable?

Missingness

The GLAD_missng function returns a named list of various missingness summaries and plots.

missingness <- GLAD_missing(data = PAD)
 ```

* Percentage of participants with no variable missing.

```r
missingness$percent
 ```

```r
missingness$descr

missingness$freq

The table below: a value of '1' indicates the percentage of participants missing for the corresponding variable.

print(missingness$table)

missingness$plot

lyh970817/gladfunctions documentation built on Sept. 19, 2021, 2:01 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

lyh970817/gladfunctions
GLAD Functions

In lyh970817/gladfunctions: GLAD Functions

List of functions in the package

Clean Qualtrics exports

Raw data

Clean data

Read in data file and dictionary sheet

Add derived variables to data

Get variable descriptions

Special feature: `as.numeric` factor variables

Plots

Use `fig.width` to avoid the figure being truncated.

Quantile plot

Variance tests by sex

Missingness

R Package Documentation

Browse R Packages

We want your feedback!

lyh970817/gladfunctions GLAD Functions

In lyh970817/gladfunctions: GLAD Functions

List of functions in the package

Clean Qualtrics exports

Raw data

Clean data

Read in data file and dictionary sheet

Add derived variables to data

Get variable descriptions

Special feature: as.numeric factor variables

Plots

Use fig.width to avoid the figure being truncated.

Quantile plot

Variance tests by sex

Missingness

R Package Documentation

Browse R Packages

We want your feedback!

lyh970817/gladfunctions
GLAD Functions

Special feature: `as.numeric` factor variables

Use `fig.width` to avoid the figure being truncated.