knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  error = !identical(Sys.getenv("IN_PKGDOWN"), "true")
)
project_path <- system.file("demo-project", package = "here")
The here package makes file referencing easy by building file paths from the top-level directory of a project.
This is in contrast to using setwd(), which is fragile and depends on how you organize the files on your computer.
Read more about project-oriented workflows:
- What They Forgot to Teach You About R: "Project-oriented workflow" chapter by Jenny Bryan and Jim Hester
- "Project-oriented workflow" blog post by Jenny Bryan
- R for Data Science: "Workflow: projects" chapter by Hadley Wickham
For demonstration, this article uses a data analysis project that lives in `r project_path` on my machine.
This is the project root.
The path will most likely be different on your machine; the here package helps deal with this situation.
The project has the following structure:
fs::dir_tree(project_path)
You can review the project on GitHub and also download a copy.
To start working on this project in RStudio, open the demo-project.Rproj file.
This ensures that the working directory is set to `r project_path`, the project root.
Opening only the .R or the .Rmd file may be insufficient!
Other development environments may have a different notion of a project. Either way, it is important that the working directory is set to the project root or a subdirectory of that path. You can check with:
setwd(project_path)
knitr::opts_knit$set(root.dir = project_path)
getwd()
(See vignette("rmarkdown") for an example where the working directory is set to a subdirectory on start.)
The intended use is to add a call to here::i_am() at the beginning of your script or in the first chunk of your rmarkdown report.[^legacy]
This achieves the following:
[^legacy]: Prior to version 1.0.0, it was recommended to attach the here package via library(here).
This still works, but is no longer the recommended approach.
[^print]: library(here) no longer emits an informative message if here::i_am() has been called before.
The first argument to here::i_am() should be the path to the current file, relative to the project root.
The penguins.R script uses:
here::i_am("prepare/penguins.R")
here::i_am() displays the top-level directory of the current project.
Because this directory has a prepare/ subdirectory that contains penguins.R, it is correctly inferred as the project root.
After here::i_am(), insert library(here) to make the here() function available:[^why-not-first]
[^why-not-first]: library(here) emits a message that may be confusing if followed by the message from here::i_am().
library(here)
The top-level directory is also returned from the here() function:
here()
One important distinction from the working directory is that this remains stable even if the working directory is changed:
setwd("analysis")
getwd()
here()
setwd("..")
(I suggest steering clear of ever changing the working directory. This may not always be feasible, in particular if the working directory is changed by code that you do not control.)
You can pass a path relative to the top-level directory to obtain the full path to a file:
here("data", "penguins.csv")
readr::read_csv(
  here("data", "penguins.csv"),
  col_types = list(.default = readr::col_guess()),
  n_max = 3
)
This works regardless of where the associated source file lives inside your project.
With here(), the path will always be relative to the top-level project directory.
here() works much like file.path() or fs::path(); you can pass individual path components or entire subpaths:
here("data/penguins.csv")
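As a rough illustration of this similarity, the sketch below checks that both calling styles give the same result and that it should agree with joining the project root by hand. It assumes the here package is installed and can be run from any working directory; no file needs to exist at the resulting path.

```r
library(here)

# Separate path components and a single subpath describe the same file.
by_components <- here("data", "penguins.csv")
by_subpath <- here("data/penguins.csv")
identical(by_components, by_subpath)

# The manual equivalent: join the project root with file.path().
manual <- file.path(here(), "data", "penguins.csv")
manual
```

Passing components separately avoids hard-coding a path separator, while a single subpath is convenient when the relative path is already stored in a variable.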
As seen above, here() returns absolute paths (starting with /, <drive letter>:\ or \\).
This makes it safe to pass these paths to other functions, even if the working directory is changed along the way.
As of version 1.0.0, absolute paths passed to here() are returned unchanged.
This means that you can safely use both absolute and project-relative paths in here().
data_path <- here("data")
here(data_path)
here(data_path, "penguins.csv")
The dr_here() function explains the reasoning behind choosing the project root:
dr_here()
The show_reason argument can be set to FALSE to reduce the output to one line:
dr_here(show_reason = FALSE)
The declaration of the active file via here::i_am() also protects against accidentally running the script from a working directory outside of your project.
The example below calls here::i_am() from the temporary directory, which is clearly outside our project:
withr::with_dir(tempdir(), {
  print(getwd())
  here::i_am("prepare/penguins.R")
})
This can also happen when a file has been renamed or moved without updating the here::i_am() call.
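If a script needs to react to this failure rather than stop, the error can be caught like any other R condition. The following is a minimal sketch, assuming the here and withr packages are installed; the `outside` directory and the returned status strings are made up for this illustration.

```r
# A fresh directory that is outside of any project containing
# "prepare/penguins.R".
outside <- tempfile("outside-")
dir.create(outside)

status <- withr::with_dir(outside, {
  tryCatch(
    {
      here::i_am("prepare/penguins.R")
      "project found"
    },
    error = function(e) {
      # here::i_am() signals an error if the file cannot be located
      # in the working directory or any of its parents.
      "project not found"
    }
  )
})
status
```

In most scripts it is preferable to let the error propagate, because it signals that the script is running in the wrong place.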
In the future, a helper function will assist with installing and updating suitably formatted here::i_am() calls in your scripts and reports.
Other packages also export a here() function.
Loading these packages after loading here masks our here() function:
library(plyr)
here()
One way to work around this problem is to use here::here():
here::here()
The conflicted package offers an alternative: it detects that here() is exported from more than one package and allows you to use neither until you indicate a preference.
library(conflicted)
here()
conflicted::conflict_prefer("here", "here")
here()
To eliminate potential confusion, here::i_am() accepts a uuid argument.
The idea is that each script and report calls here::i_am() very early (in the first 100 lines) with a universally unique identifier.
Even if a file location is reused across projects (e.g. two projects contain a "prepare/data.R" file), the files can be identified correctly if the uuid argument in the here::i_am() call is different.
If a uuid argument is passed to here::i_am():
- the here::i_am() call that passes this very uuid is among those 100 lines, and will be matched
- [...] uuid is not found in the text
Use uuid::UUIDgenerate() to create universally unique identifiers:
uuid::UUIDgenerate()
Ensure that the uuid arguments are actually unique across your files!
In the future, a helper function will assist with installing and updating suitably formatted here::i_am() calls in your scripts and reports.
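To make the uuid mechanism concrete, here is a minimal sketch that builds a throwaway project whose only script declares its location together with an identifier. The identifier and the `uuid_project` directory are made up for this illustration, and the here and withr packages are assumed to be installed. Note that a successful here::i_am() call changes the project root for the rest of the session, so try this in a scratch session.

```r
# A hypothetical identifier; in practice, generate one per file with
# uuid::UUIDgenerate() and paste it into the file as a literal string.
id <- "f2d4a9c0-5b1e-4c3a-9e7d-1a2b3c4d5e6f"

# Throwaway project with a single script at prepare/data.R.
uuid_project <- tempfile("uuid-project-")
dir.create(file.path(uuid_project, "prepare"), recursive = TRUE)
script <- file.path(uuid_project, "prepare", "data.R")
writeLines(
  sprintf('here::i_am("prepare/data.R", uuid = "%s")', id),
  script
)

# Sourcing from inside the project works: the file exists at the
# declared location and its text contains the identifier.
withr::with_dir(uuid_project, source(script))
root_matches <- basename(here::here()) == basename(uuid_project)
root_matches
```

A second project that also contains a "prepare/data.R" file but declares a different identifier would not be mistaken for this one.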
It is advisable to start a fresh R session as often as possible, especially before focusing on another project. There still may be legitimate cases when it is desirable to reset the project root.
To start, let's create a temporary project for demonstration:
temp_project_path <- tempfile()
dir.create(temp_project_path)
scripts_path <- file.path(temp_project_path, "scripts")
dir.create(scripts_path)
script_path <- file.path(scripts_path, "script.R")
writeLines(
  c(
    'here::i_am("scripts/script.R")',
    'print("Hello, world!")'
  ),
  script_path
)
fs::dir_tree(temp_project_path)
writeLines(readLines(script_path))
The script.R file contains a call to here::i_am() to declare its location.
Running it from the current working directory will fail:
source(script_path, echo = TRUE)
To reset the project root mid-session, change the working directory with setwd().
Now, the subsequent call to here::i_am() from within script.R works:
setwd(temp_project_path)
knitr::opts_knit$set(root.dir = temp_project_path)
source(script_path, echo = TRUE)
To reiterate: a fresh session is almost always the better, cleaner, safer, and more robust solution. Use this approach only as a last resort.
The here package has a very simple and restricted interface, by design. The underlying logic is provided by the much more powerful rprojroot package. If the default behavior of here does not suit your workflow for one reason or another, the rprojroot package may be a better alternative. It is also recommended to import rprojroot and not here from other packages.
The following example shows how to find an RStudio project starting from a directory:
library(rprojroot)
find_root(is_rstudio_project, file.path(project_path, "analysis"))
Arbitrary criteria can be defined.
See vignette("rprojroot", package = "rprojroot") for an introduction.
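As a small sketch of what a custom criterion could look like, the example below marks a directory as a project root with a marker file and locates that root from a nested subdirectory. The marker name ".my-project-marker" and the directory layout are invented for this illustration; has_file(), find_root() and combining criteria with `|` are part of rprojroot.

```r
library(rprojroot)

# Throwaway tree: <root>/.my-project-marker and <root>/a/b/
custom_root <- tempfile("custom-root-")
nested <- file.path(custom_root, "a", "b")
dir.create(nested, recursive = TRUE)
file.create(file.path(custom_root, ".my-project-marker"))

# A directory qualifies as root if it contains the (made-up) marker file.
has_marker <- has_file(".my-project-marker")

# Criteria can be combined; here, either the marker file or an RStudio
# project file identifies the root.
criterion <- has_marker | is_rstudio_project

# find_root() walks upwards from `path` until a directory matches.
found <- find_root(criterion, path = nested)
basename(found) == basename(custom_root)
```

Comparing base names rather than full paths sidesteps differences that path normalization may introduce on some platforms.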