shsannualreport is an R package to prepare the results of the Scottish Household Survey(SHS) for publication as an R Shiny app. The package allows users to prepare source data for publication, and to then build an R Shiny app based on the processed data.
You can install the development version of shsannualreport from GitHub with:
# install.packages("devtools")
devtools::install_github("thomascrines/shsannualreport")
If you are working on SCOTS, or if the above does not work, you can install manually:
install.packages("C:/DownloadDirectory/shsannualreport-master/shsannualreport-master", repos = NULL,
type="source", lib = "C:/YourLibraryPath")
Library path can be seen by running .libPaths()
The package depends on having source data which is currently stored in a secure folder on Objective Connect.
If this data has not yet been published, it won’t be available outside the SHS team.
If the data has been published, please contact SHS@gov.scot to request access to the source data.
To prepare source data prior to running the functions in the package:
There are three public functions in the package:
shs_create_names_workbooks
- Creates two Excel workbooks,
column_names.xlsx
and variable_names.xlsx
, with names as they
appear in the source data and values to rename them to as they will
appear in the final app.shs_create_app_data
- Extracts all data and metadata from Excel
sources, processes the data, and saves in Rds format suitable for
use in the SHS Data Explorer app.shs_create_app
- Copies or creates all necessary files for the SHS
Data Explorer app, including the data output by
shs_create_app_data. The resulting app can then be run locally or
deployed.shs_create_names_workbooks
outputs two files, column_names.xlsx
and
variable_names.xlsx
, to the specified destination_directory
. Both
sheets will have the same stucture, with two columns: source_name
and
display_name
. The source_name
column lists all names as they appear
in the source data, the display_name
column can be populated with the
desired names to appear in the app.
In column_names.xlsx
, source_name
will be populated with all column
names in all files in the source_dataset_directory
, minus any names
specified in columns_to_remove
.
In variable_names.xlsx
, source_name
will be populated with all
values in the second column on all files in the
source_dataset_directory
. Here is important that the
columns_to_remove
values are correct, as the source data can contain
unnecessary columns which can result in the output being populated with
values from these columns rather than the desired columns.
In both output files, the display_name
value can be auto-populated
with previous values, if a path to an existing template file is provided
to the argument existing_column_names_path
and/or
existing_variable_names_path
.
In total, shs_create_names_workbooks
accepts five arguments:
destination_directory
: The path of the directory to write the
column names and variables names Excel files to. (Must be an
existing directory.)source_dataset_directory
: The path of the directory containing
source dataset in Excel format. (This should be the sub-directory
named dataset
following the structure described above.)columns_to_remove
: A list of unnecessary columns to remove from
the dataset.existing_column_names_path
(optional): A path to an existing Excel
file with the same structure. Any ‘display_name’ values will be
copied to the new output file.existing_variable_names_path
(optional): A path to an existing
Excel file with the same structure. Any ‘display_name’ values will
be copied to the new output file.Once the output files are created, the display_name
columns of both
output files can be manually edited to the desired values.
NOTE: some special characters can run locally, but break when deployed
due to encoding differences, e.g £
. There is currently no fix in the
package for these characters, so they will need to be changed to
something else when deploying to shinyapps.io.
shs_create_app_data
takes a source dataset and metadata in Excel
format, processes the data to make it suitable for the SHS Data Explorer
app, and saves the output files in Rds format.
The processing done in shs_create_app_data
includes:
source_data_directory
columns_to_remove
from the datasetcolumn_names_workbook_path
and
variable_names_workbook_path
respectivelyThere are also a number of other processing functions which are described in more detail in the vignette (to do), including:
In total shs_create_app_data
accepts five arguments:
destination_directory
: The path of the directory to write the
output dataset to. (Must be a pre-existing directory.)source_data_directory
: The path of the directory containing source
data in Excel format. (This should be the top level directory which
contains sub-directories dataset
and metadata
following the
structure described above.)columns_to_remove
: A list of unnecessary columns to remove from
the dataset.column_names_workbook_path
: A path to an Excel workbook with
details of the source names and display names of columns in the
source dataset. If you don’t have a suitable file,
shs_create_names_workbook
can be used to generate this file.variable_names_workbook_path
: A path to an Excel workbook with
details of the source names and display names of variables in the
source dataset. If you don’t have a suitable file,
shs_create_names_workbook
can be used to generate this file.When the data has been output to the destination_directory
it can then
be used to create a version of the SHS Data Explorer app using
shs_create_app
.
shs_create_app
will take data output by shs_create_app_data
and
generate a Shiny app based on the contained data. The app will be
written to the specified destination_directory
, from where it can be
run locally, deployed etc.
In total shs_create_app_data
accepts five arguments:
destination_directory
: The path of the directory to create the
final app in.data_directory
: The path of the directory containing all data and
metadata created by shs_create_app_data
.reports_start_year
: The first year to allow users to generate
reports for in the ‘Generate Report’ section of the app.reports_end_year
: The final year to allow users to generate
reports for in the ‘Generate Report’ section of the app.The following example is based on usage in SCOTS, with the source data downloaded and set up according to the instructions in ‘prerequisites’ in the ‘Downloads’ folder.
To use the examples on other computers, all that needs changed are the folder paths, which will vary by user.
As the destination_directory
paths for both
shs_create_names_workbooks
and shs_create_app_data
must exist before
running, directories for the example I created directories ‘names’ and
‘appdata’ in ‘Downloads’.
NOTE: Running functions with shsannualreport::
prefix alone seems not
to load dependencies, running library(shsannualreport)
before running
any functions will prevent errors.
library(shsannualreport)
destination_directory <- "C:/Users/[user's U number]/Downloads/names"
source_dataset_directory <- "C:/Users/[user's U number]/Downloads/data/dataset"
columns_to_remove <- c("sort", "_LABEL_", "var", "LABEL")
existing_column_names_path <- "C:/Users/[user's U number]/Downloads/old_column_names.xlsx"
existing_variable_names_path <- "C:/Users/[user's U number]/Downloads/old_variable_names.xlsx"
shs_create_names_workbooks(destination_directory = destination_directory,
source_dataset_directory = source_dataset_directory,
columns_to_remove = columns_to_remove,
existing_column_names_path = existing_column_names_path,
existing_variable_names_path = existing_variable_names_path)
NOTE: Although shs_create_names_workbooks
can be used to generate
column and variable names re-naming spreadsheets, it doesn’t have to be
run if you already have these sheets. The outputs can be saved off and
the paths passed to column_names_workbook_path
and
variable_names_workbook_path
. This example uses the sheets output in
the shs_create_names_workbooks
example above.
library(shsannualreport)
destination_directory <- "C:/Users/[user's U number]/Downloads/appdata"
source_data_directory <- "C:/Users/[user's U number]/Downloads/data"
columns_to_remove <- c("sort", "_LABEL_", "var", "LABEL")
column_names_workbook_path <- "C:/Users/[user's U number]/Downloads/names/column_names.xlsx"
variable_names_workbook_path <- "C:/Users/[user's U number]/Downloads/names/variable_names.xlsx"
shs_create_app_data(destination_directory = destination_directory,
source_data_directory = source_data_directory,
columns_to_remove = columns_to_remove,
column_names_workbook_path = column_names_workbook_path,
variable_names_workbook_path = variable_names_workbook_path)
NOTE: As above, if data has been previously generated by
shs_create_app_data
and the source data hasn’t changed,
shs_create_app
can be run without re-running shs_create_app_data
.
library(shsannualreport)
destination_directory <- "C:/Users/[user's U number]/Downloads"
data_directory <- "C:/Users/[user's U number]/Downloads/appdata/data"
reports_start_year <- 2013
reports_end_year <- 2019
shs_create_app(destination_directory = destination_directory,
data_directory = data_directory,
reports_start_year = reports_start_year,
reports_end_year = reports_end_year)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.