Dan Kary 2018-01-16
The saproj package provides functions and file templates to set up and
maintain R-based Southwick analyses. It aims to enable:
The saproj approach relies on Southwick-specific R installations,
available on Office 365 (R Software > Documents > Installations).
These provide a common set of (consistently versioned) R packages that
each Southwick analyst can access. The package library can be extended
on a project-by-project basis using functions in saproj that provide:
Project Initialization:
new_project() to begin a new project with a corresponding package library
setup_project() to set a project library for an in-progress analysis
update_project() to adapt existing project code for a new project
Project Portability:
snapshot_library() to record packages installed in the project library
restore_library() to restore snapshotted packages to another computer
Familiarity with a few topics is recommended for effective R-based analysis:
R Basics. The Base R Cheatsheet provides a nice reference.
Familiarity with the tidyverse generally, and in particular dplyr (for data manipulation) and ggplot2 (for visualization)
R Markdown, which provides a notebook-based approach that emphasizes documentation
Additional Reading: R for Data Science provides in-depth treatment of many R analysis topics.
Once a project is initialized with saproj, any calls to
install.packages() will place packages in the appropriate project
library. This provides a means to utilize packages not included in the
Southwick R installation. Also, because saproj specifies an R version
and package library for each project, the software dependencies of the
analysis can be automatically checked on R startup. This enables the
analysis to be easily (and reliably) re-run in the future or on another
computer.
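To make the idea of a startup dependency check concrete, here is a minimal base-R sketch. It is not the saproj implementation: the function name check_dependencies() and its arguments are hypothetical, and the check saproj actually performs may differ.

```r
# Hypothetical helper (not part of saproj): compare the running R session
# against an R version and a set of packages recorded for a project.
check_dependencies <- function(r_version, packages) {
  # TRUE when the running R version equals the recorded one
  version_ok <- getRversion() == r_version
  if (!version_ok) {
    warning("Project expects R ", r_version,
            " but R ", getRversion(), " is running.")
  }
  # Recorded packages that are not installed in any library on .libPaths()
  missing_pkgs <- setdiff(packages, rownames(installed.packages()))
  list(version_ok = version_ok, missing = missing_pkgs)
}

# Example: the current R version, one package that ships with R,
# and one name that cannot exist (package names cannot contain hyphens)
res <- check_dependencies(as.character(getRversion()),
                          c("stats", "not-a-real-package"))
res$missing
```

A check along these lines could be run whenever a project is opened, so that a mismatch is reported before any analysis code is executed.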
The saproj package makes it easy to start a project:
Create a new project in RStudio by clicking File > New Project (see RStudio Projects for an introduction).
Open the new-project.Rproj and run library(saproj) followed by
new_project("your-project-name") from the console. You can run
view_projects() to check the availability of project names.
Running new_project() populates the project with 3 folders and 2
files. This approach:
You can use R Markdown templates from saproj when creating a new
script:
Run setup_project("project-name") to create a project library for an
existing project (which saproj accomplishes by making an .Rprofile
file).
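The general mechanism can be sketched in base R: R sources a .Rprofile found in the working directory at startup, so a line in that file can put a project library first on the library search path. The folder names and the file contents below are assumptions for illustration; the .Rprofile that saproj writes may look different.

```r
# Illustration only: a stand-in project in a temporary directory
project_dir <- file.path(tempdir(), "existing-project")
lib_dir <- file.path(project_dir, "library")  # assumed library location
dir.create(lib_dir, recursive = TRUE, showWarnings = FALSE)

# Write a one-line .Rprofile that prepends the project library.
# install.packages() installs into .libPaths()[1] by default.
rprofile <- file.path(project_dir, ".Rprofile")
writeLines(
  sprintf('.libPaths(c("%s", .libPaths()))', normalizePath(lib_dir, "/")),
  rprofile
)

# Simulate R startup in the project by sourcing the file,
# then confirm that the project library now comes first
source(rprofile)
.libPaths()[1]
```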
For repeated projects (or those similar to another project), it can be
useful to copy an older project’s code and adapt it for a new project.
To do this with saproj:
Copy the old project analysis folder to a new location (removing old project-specific files that won’t be needed, such as HTML documentation and results output).
Run update_project("new-project-name") to initialize a new project
library. Then run restore_library() to bring any of the
original project’s packages into the new library.
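The copy step can be done by hand in a file browser, but it can also be scripted. The following base-R sketch copies a project folder while leaving out rendered HTML files; the folder and file names are made up for the example and are not part of saproj.

```r
# Illustration only: stand-in folders in a temporary directory
old_dir <- file.path(tempdir(), "old-project")
new_dir <- file.path(tempdir(), "new-project")
dir.create(file.path(old_dir, "code"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(old_dir, "code", c("1-load.Rmd", "1-load.html")))

# List the old project's files (hidden files are skipped by default)
# and keep everything except rendered HTML documentation
files <- list.files(old_dir, recursive = TRUE)
keep <- files[!grepl("\\.html$", files)]

# Copy the kept files, recreating sub-folders as needed
for (f in keep) {
  target <- file.path(new_dir, f)
  dir.create(dirname(target), recursive = TRUE, showWarnings = FALSE)
  file.copy(file.path(old_dir, f), target)
}

list.files(new_dir, recursive = TRUE)
```

With the folder copied, update_project("new-project-name") and restore_library() would then be run from the new project.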
Project-specific packages can be installed by simply running
install.packages() in a project that was set up using saproj. After
installing packages, you should run snapshot_library(), which creates
(or updates) a snapshot-library.csv that serves as a record of
packages installed into the project library. The library can be restored
on another machine using restore_library().
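As a rough picture of what a snapshot-and-restore cycle involves, here is a base-R sketch. It is not how saproj implements snapshot_library() or restore_library(): the column names are assumptions, and R's own base library stands in for a project library so the example has something to record.

```r
# Record package names and versions from one library
# (.Library, the base library, is used here as a stand-in)
pkgs <- installed.packages(lib.loc = .Library)
snapshot <- data.frame(
  Package = pkgs[, "Package"],
  Version = pkgs[, "Version"],
  row.names = NULL,
  stringsAsFactors = FALSE
)
snapshot_file <- file.path(tempdir(), "snapshot-library.csv")
write.csv(snapshot, snapshot_file, row.names = FALSE)

# On another machine: read the record and find what is not yet installed
recorded <- read.csv(snapshot_file, stringsAsFactors = FALSE)
to_install <- setdiff(recorded$Package, rownames(installed.packages()))

# The missing packages could then be passed to install.packages();
# here nothing is missing because the same machine wrote the snapshot
to_install
```

Recording versions alongside names is what makes it possible to notice when a restored library differs from the one the analysis was originally run with.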
Entire analysis in 3 scripts:
Organizing scripts in sub-folders (i.e., sections):
Note: Making sections can be automated with new_section()