Home

/

GitHub

/

In context-dependent/fsctemplates:

knitr::opts_chunk$set(echo = TRUE, dev = "CairoPNG", warning = FALSE, message = FALSE)
knitr::opts_knit$set(root.dir = here::here())

pacman::p_load(
  "tidyverse"
)

Questions, I know you've got em.

What is this document?

It's a README for this and other analysis projects. It tells you what's in the folder and how best to use it. We're still very early in the process of formalizing our analysis workflow at Blueprint, so watch for changes to the analysis project template.

Why can't I just do things my own way?

Because we all need to be on the same page about how projects are organized so that when analysis changes hands for feedback, review, or collaboration, the person jumping in on the project isn't lost. If you have questions, comments, or thoughts on how we could better organize projects, please share them. For the foreseeable future, this project template will be in the "alpha" stage of development, meaning that it can be flexible and responsive to updates.

Where does this come from?

This is adapted from a number of public resources on R project management, tailored to the specific needs of Blueprint. It is designed to be a version of a research compendium, as described by Marwick et al.

What's the point?

Review: facilitate the review of analysis code for coherence, performance, and quality assurance
Collaboration: allow projects to change hands without the new analyst feeling lost
Reproducibility: develop analysis projects such that they can be run from end-to-end on-demand to reproduce results from raw data.

What's a project?

fs::dir_tree()

Data

The data folder is where all your data should live, unless it lives online and it is preferable to occasionally refresh its contents through an API.

fs::dir_tree("data")

Cache

Holds the cached outputs of functions for conducting intensive data loading / manipulation tasks. Manipulate its contents only within functions.

Raw

Holds all raw data sources that are not accessible through an API.

Clean

Holds final clean datasets used in the analysis work.

Report

The report folder contains several templated reports that break the work of analyzing data down into manageable chunks. The specific report titles are suggestions, and the the overall set / number of reports is highly flexible. Feel free to express yourself here.

fs::dir_tree("report")

Output

The output folder contains two subfolders tab and fig. You should use these subfolders to store any analysis outputs that you're generating to include in word reports, or share as excel spreadsheets.

Reference

If you're using any methods drawn from literature, save that literature here as you go along.

R

The R folder holds all your source code, along with a CONTROL.R file which runs the full data loading, cleaning, and analysis pipeline.

fs::dir_tree("R")

What's the process?

1. build `load_data` function

2. complete report `01-data-cleaning.Rmd`

3. build `clean_data` to reflect decisions described in `01-data-cleaning.Rmd`

4. test data cleaning pipeline by running `CONTROL.R` and interactively examining the results

5. complete other reports using cleaned datasets

context-dependent/fsctemplates documentation built on April 5, 2020, 10:07 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

context-dependent/fsctemplates

In context-dependent/fsctemplates:

Questions, I know you've got em.

What is this document?

Why can't I just do things my own way?

Where does this come from?

What's the point?

What's a project?

Data

Cache

Raw

Clean

Report

Output

Reference

R

What's the process?

1. build `load_data` function

2. complete report `01-data-cleaning.Rmd`

3. build `clean_data` to reflect decisions described in `01-data-cleaning.Rmd`

4. test data cleaning pipeline by running `CONTROL.R` and interactively examining the results

5. complete other reports using cleaned datasets

R Package Documentation

Browse R Packages

We want your feedback!

context-dependent/fsctemplates

In context-dependent/fsctemplates:

Questions, I know you've got em.

What is this document?

Why can't I just do things my own way?

Where does this come from?

What's the point?

What's a project?

Data

Cache

Raw

Clean

Report

Output

Reference

R

What's the process?

1. build load_data function

2. complete report 01-data-cleaning.Rmd

3. build clean_data to reflect decisions described in 01-data-cleaning.Rmd

4. test data cleaning pipeline by running CONTROL.R and interactively examining the results

5. complete other reports using cleaned datasets

R Package Documentation

Browse R Packages

We want your feedback!

1. build `load_data` function

2. complete report `01-data-cleaning.Rmd`

3. build `clean_data` to reflect decisions described in `01-data-cleaning.Rmd`

4. test data cleaning pipeline by running `CONTROL.R` and interactively examining the results