wflow_start: Start a new workflowr project

View source: R/wflow_start.R

wflow_start    R Documentation

Start a new workflowr project

Description

wflow_start creates a directory with the essential files for a workflowr project. The default behavior is to add these files to a new directory, but it is also possible to populate an existing directory. By default, the working directory is changed to the workflowr project directory.

Usage

wflow_start(
  directory,
  name = NULL,
  git = TRUE,
  existing = FALSE,
  overwrite = FALSE,
  change_wd = TRUE,
  disable_remote = FALSE,
  dry_run = FALSE,
  user.name = NULL,
  user.email = NULL
)

Arguments

directory

character. The directory where the workflowr project files will be added, e.g., "~/my-wflow-project". When existing = FALSE, the directory will be created.

name

character (default: NULL). The name of the project, e.g. "My Workflowr Project". When name = NULL, the project name is automatically determined based on directory. For example, if directory = "~/projects/my-wflow-project", then name is set to "my-wflow-project". The project name is displayed on the website's navigation bar and in the README.md file.

git

logical (default: TRUE). Should the workflowr files be committed with Git? If git = TRUE and no existing Git repository is detected, wflow_start will initialize the repository and make an initial commit. If a Git repository already exists in the chosen directory, wflow_start will commit any newly created or modified files to the existing repository (this also requires setting existing = TRUE). If git = FALSE, wflow_start will not perform any Git commands.

existing

logical (default: FALSE). Indicate whether directory already exists. This argument is added to prevent accidental creation of files in an existing directory; setting existing = FALSE prevents files from being created if the specified directory already exists.

overwrite

logical (default: FALSE). Similar to existing, this argument prevents files from accidentally being overwritten when overwrite = FALSE. When overwrite = TRUE, any existing file in directory that has the same name as a workflowr file will be replaced by the workflowr file. When git = TRUE, all the standard workflowr files will be added and committed (regardless of whether they were overwritten or still contain the original content).

change_wd

logical (default: TRUE). Change the working directory to the directory.

disable_remote

logical (default: FALSE). Create a Git pre-push hook that prevents pushing to a remote Git repository (i.e. using wflow_git_push). This is useful for extremely confidential projects that cannot be shared via an online Git hosting service (e.g. GitHub or GitLab). The hook is saved in the file .git/hooks/pre-push. If you change your mind and want to push the repository, you can delete that file. This option is only available if git = TRUE, and it is currently only supported on Linux and macOS.

dry_run

logical (default: FALSE). When dry_run = TRUE, the actions are previewed without executing them.

user.name

character (default: NULL). The user name used by Git to sign commits, e.g., "Ada Lovelace". This setting only applies to the workflowr project being created. To specify the global setting for the Git user name, use wflow_git_config instead. When user.name = NULL, no user name is recorded for the project, and the global setting will be used. This setting can be modified later by running git config --local in the Terminal.

user.email

character (default: NULL). The email address used by Git to sign commits, e.g., "ada.lovelace@ox.ac.uk". This setting only applies to the workflowr project being created. To specify the global setting for the Git email address, use wflow_git_config instead. When user.email = NULL, no email address is recorded for the project, and the global setting will be used. This setting can be modified later by running git config --local in the Terminal.
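The default for name described above can be illustrated with a short base R sketch. The helper default_project_name is hypothetical and is not part of workflowr; it only assumes that the project name falls back to the last component of directory, as in the documented example.

```r
# Hypothetical helper (not a workflowr function): mimics the documented
# behavior in which name = NULL falls back to the last path component
# of directory.
default_project_name <- function(directory, name = NULL) {
  if (!is.null(name)) {
    return(name)
  }
  basename(directory)
}

# No name supplied: derived from the directory path.
default_project_name("~/projects/my-wflow-project")

# An explicit name is used unchanged.
default_project_name("~/projects/my-wflow-project",
                     name = "My Workflowr Project")
```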

Details

This is the recommended function for setting up the file infrastructure for a workflowr project. If you are using RStudio, you can also create a new workflowr project as an "RStudio Project Template". Go to "File" -> "New Project..." then select "workflowr project" from the list of project types. In the future, you can return to your project by choosing the menu option "Open Project..." and selecting the .Rproj file located at the root of the workflowr project directory. In RStudio, opening this file will change the working directory to the appropriate location, set the file navigator to the workflowr project directory, and configure the Git pane.

wflow_start populates the chosen directory with the following files:

|--- .gitignore
|--- .Rprofile
|--- _workflowr.yml
|--- analysis/
|   |--- about.Rmd
|   |--- index.Rmd
|   |--- license.Rmd
|   |--- _site.yml
|--- code/
|   |--- README.md
|--- data/
|   |--- README.md
|--- docs/
|--- <directory>.Rproj
|--- output/
|   |--- README.md
|--- README.md

The two required subdirectories are analysis/ and docs/. These directories should never be removed from the workflowr project.

analysis/ contains all the source R Markdown files that implement the analyses for your project. It contains a special R Markdown file, index.Rmd, that typically does not include R code and is used to generate index.html, the homepage for the project website. Additionally, this directory contains the important configuration file _site.yml. The website theme, navigation bar, and other properties can be controlled through this file (for more details see the documentation on R Markdown websites). Do not delete index.Rmd or _site.yml.

docs/ will contain all the webpages generated from the R Markdown files in analysis/. Any figures generated by rendering the R Markdown files are also stored here. Each figure is saved according to the following convention: docs/figure/<Rmd-filename>/<chunk-name>-#.png, where # corresponds to which of the plots the chunk generated (one chunk can produce several plots).
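The figure naming convention can be sketched in base R as follows. The helper figure_path is hypothetical and is not exported by workflowr; it assumes that <Rmd-filename> is the base name of the R Markdown file exactly as written (including its extension) and that # counts plots starting from 1.

```r
# Hypothetical helper (not a workflowr function): builds the path that the
# convention docs/figure/<Rmd-filename>/<chunk-name>-#.png describes.
# Assumption: <Rmd-filename> is the file's base name, extension included.
figure_path <- function(rmd_file, chunk_name, plot_index) {
  file.path(
    "docs", "figure",
    basename(rmd_file),
    paste0(chunk_name, "-", plot_index, ".png")
  )
}

# One chunk that produced two plots yields two paths.
figure_path("analysis/index.Rmd", "scatter", 1:2)
```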

_workflowr.yml is an additional configuration file used only by workflowr. It is used to apply the workflowr reproducibility checks consistently across all R Markdown files. The most important setting is knit_root_dir which determines the directory where the scripts in analysis/ are executed. The default is to run code from the project root (i.e., "."). To execute the code from analysis/, for example, change the setting to knit_root_dir: "analysis". See wflow_html for more details.

Another required file is the RStudio project file (ending in .Rproj). Do not delete this file even if you do not use RStudio; among other essential tasks, it is used to determine the project root directory.
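As a rough way to confirm that the required pieces described above are still in place, a check along the following lines could be used. The function check_required_files is hypothetical and is not provided by workflowr; it only tests for the items this page names as required.

```r
# Hypothetical helper (not a workflowr function): reports which of the
# required directories and files are present in a project directory.
check_required_files <- function(project_dir) {
  required <- c("analysis", "docs", "analysis/index.Rmd", "analysis/_site.yml")
  present <- file.exists(file.path(project_dir, required))
  # Any file ending in .Rproj at the project root counts as the project file.
  has_rproj <- length(list.files(project_dir, pattern = "\\.Rproj$")) > 0
  stats::setNames(
    c(present, has_rproj),
    c("analysis/", "docs/", "analysis/index.Rmd", "analysis/_site.yml", "*.Rproj")
  )
}

# Example: a named logical vector, FALSE for anything that is missing.
check_required_files(".")
```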

The optional directories are data/, code/, and output/. These directories are suggestions for organizing your workflowr project and can be removed if you do not find them relevant to your project.

data/ should be used to store "raw" (unprocessed) data files.

code/ should be used to store additional code that might not be appropriate to include in R Markdown files (e.g., code to preprocess the data, long-running scripts, or functions that are used in multiple R Markdown files).

output/ should be used to store processed data files and other outputs generated from the code and analyses. For example, scripts in code/ that pre-process raw data files from data/ should save the processed data files in output/.

Each of the optional subdirectories (data/, code/, and output/) includes a README file summarizing its contents. These files can be modified as desired, for example, to document the files stored in each directory.

.Rprofile is an optional file in the root directory of the workflowr project containing R code that is executed whenever the .Rproj file is loaded in RStudio, or whenever R is started up inside the project root directory. This file includes the line of code library("workflowr") to ensure that the workflowr package is loaded.

Finally, .gitignore is an optional file that tells Git which files to ignore, that is, files that are never committed to the repository. Some suggested files to ignore, such as .Rhistory and .RData, are listed in this file.

Value

An object of class wflow_start, which is a list with the following elements:

directory

The input argument directory.

name

The input argument name.

git

The input argument git.

existing

The input argument existing.

overwrite

The input argument overwrite.

change_wd

The input argument change_wd.

disable_remote

The input argument disable_remote.

dry_run

The input argument dry_run.

user.name

The input argument user.name.

user.email

The input argument user.email.

commit

The object returned by git2r::commit, or NULL if git = FALSE.
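The returned list can be inspected like any other R list. The following is a minimal sketch, assuming workflowr is installed; it uses dry_run = TRUE and git = FALSE so that no files are written and no Git commands are run. The temporary path is only for illustration.

```r
# Minimal sketch: preview a project and inspect the returned object.
# Guarded so that it does nothing if workflowr is not installed.
if (requireNamespace("workflowr", quietly = TRUE)) {
  project_dir <- file.path(tempdir(), "my-wflow-project")
  result <- workflowr::wflow_start(
    project_dir,
    git = FALSE,        # skip all Git commands
    change_wd = FALSE,  # leave the working directory unchanged
    dry_run = TRUE      # preview only
  )
  class(result)    # includes "wflow_start"
  names(result)    # the elements listed above
  result$dry_run
}
```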

Note

Do not delete the .Rproj file even if you do not use RStudio; workflowr will not work correctly without it.

See Also

vignette("wflow-01-getting-started")

Examples

## Not run: 

wflow_start("path/to/new-project")

# Provide a custom name for the project.
wflow_start("path/to/new-project", name = "My Project")

# Preview what wflow_start would do
wflow_start("path/to/new-project", dry_run = TRUE)

# Add workflowr files to an existing project.
wflow_start("path/to/current-project", existing = TRUE)

# Add workflowr files to an existing project, but do not automatically
# commit them.
wflow_start("path/to/current-project", git = FALSE, existing = TRUE)

## End(Not run)


jdblischak/workflowr documentation built on Nov. 26, 2024, 5:14 p.m.