knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.path = "man/figures/README-", out.width = "100%" )
The goal of devubesp is to automate analyses setup tasks that are otherwise performed manually. This includes setting up directories, supporting packages and projects.
You can install the development version from GitHub with the following procedure:
# install.packages("devtools") devtools::install_github("CorradoLanera/devubesp")
The package, de facto, provide a single function. The aim of the
function `create_ubesp_analysis()
is to setup all the folders'
structure, with all the needed and sample files to conduct analyses at
UBESP. Does not matter if they are for a thesis, a project, a simple
analyses, the initial structure for the project should be the same.
The aim is to drastically reduce (up to the phisical limit of write a single line of code) the time needed to set up a well-organized and standardized structure for all the project, from the simplest to the most complex.
My first hope is to have provaided a simple and effective instrument to reduce the direct time in the preparation of any project. The second, and more relevant, hope is that all of this will be shared between the colleagues in a way to reduce also the future time to understand each other projects structure and work-flow in conducting analyses with R within the same unit.
The main characteristics of the resulting structure of files and folders created are:
A main folder name which is not ambiguous in time, this is achieved
pre-appending the year to the name of the main project name.^[In the
minutes/
folder there where also an empty .txt
file named with
the current date (yymmdd
) and the project name, to permit to setup
initial official notes (time-trakked).]
presence of a first-level folder to collect and manage data-raw, which can be huge, and sensitive, hence often they cannot be shared or uploaded on the Net (e.g., GitHub). For this reason this folder is outside the one promoted to be VCS trakked by git.GitHub. On the other hand, the folder itself can be copy-pasted to a costumer folder, a supervisor folder, or to an external folder (e.g., SharePoint), if usefull and admitted, because it contains only the raw data and the script to import (merge and) import them in the analyses folder (if necessary).
The folder structure is very simple and it contains only a preconfigured script to import and save the data in the right position.
presence of a folder (analyses/
) that can be safely copy-pasted to
a costumer folder, to supervisor folder, or to an external folder
(e.g., SharePoint). This has to be happen without any worry to
brake some VCS controller (like git) nor to share unwanted
information, comments, notes, data that should remain private to the
analyst. It will contain all the reports, the outputs, and the
scripts useful to find the results and also to reproduce them.
The main folder structure is:
analyses/
|- R/ # R script for analyses and reports (.R)
|- reports/ # Final version of the reports (.docx)
|- outputs/ # Output to share for the analyses (.png)
|- docx-template/ # Template to fine knit the report (.docx)
|- bibtex-files/ # Reference cited in the report (.bib)
This folder, because it is an analyses one, is already preconfigured
as an RStudio .Rproj
.
presence of a package-folder infrastructure which can be use to implement and include any custom or additional function usefull for the analyses. Making able to test, document, trakking their changes, and share with everyone else easily.
This can be VCS with git and GitHub safely, without the risk to overfull the space at the disposal with messy and uncompressed raw data.
The folder analyses/
is stored in the first level inside of the
package-folder prjname/
, so it is easy to find, and use them for the
analyses. On the other hande, analyses/
is .Rbuildignore
d from
the package, this way it that can be git-tracked (by default it is)
but it is not provided in the package boundle.^[If you want to not
git trakking the analyses/
folder too, add it to .gitignore
.]
Hence, the package can be safely shared both through GitHub or CRAN
without risk to share data or analyses, and everyone (future-you
too) can easily use all the function you provide with the package
as simply as devtools::install_gitub("UBESP-DCTV/<project_name>")
.
presence of (standardized) additional common project and utility folders and files. These are all subfolders of the main project folder:
yyyy-prjname/
|- prjname/ # the R-package folder
|- analyses/ # the folder for the analyses
|- essential-bibliography/ # used literature (.pdf)
|- papers/ # maintaining ordered in writings
|- draft/ # from drafts (`.docx`)
|- submitted/ # to submissions (`.docx`, `.png`, `.pdf`)
|- published/ # up to the publishing (`.pdf`, only)
|- minutes/ # for the minutes (example `.txt` inside)
|- proposal.docx # requests (replace example w/ current!)
|- personal-notes.md # personal notes, not to be shared
The simplest example is all you need to view to learn how simple it is
to be used:^[the tilda ("~
"), here, expands to the home directory from
the "r point-of-view", i.e. the classical home/ on Unix systems, but to
the "user/documents" on windows (instead of the simple "user/").]
devubesp::create_ubesp_analysis("~/test") # next, read and follow the instructions appering on the screen :-)
You can also include longer path to aggregate similar kynd of projects:
devubesp::create_ubesp_analysis("~/theses/cl") devubesp::create_ubesp_analysis("~/department-analyses/cardio/xxx") devubesp::create_ubesp_analysis("~/whatever/whatelse")
Including a git project into a synchronized folder (e.g., OneDrive, G Drive or DropBox) can lead to issues because too many agents compete to have the control of what you can see and what is changed (e.g., git and OneDrive).
On the other side, if you develop your code using git, you do not need to have a second agent to track it! Anyway, all the part of your project, which is not under git control should be monitored and stored in the synchronized folder.
From here, the choice seems to be using two distinct folders, e.g., one
under OneDrive/
for all the project's files you do not want into git
(and the data, if they are protected or too big for GitHub) and one
under git version control on ~/Documents/
.
On the other hand, that choice can be annoying; you have to switch
continuously from one folder to another during the development of your
project. More important, you cannot access smoothly to folders in
OneDrive you do not want to git (e.g., data/
) from your R project
directory directly. Anyway, copying everything twice is incredibly
inefficient, at best!
A possible solution is to use symbolic links. They are simple pointers to a file or folder. To the user, they appear like shortcut links, but they are not: the former are only pointers (i.e., they are neither files nor folders), while the latter are files containing the address of their corresponding target files or folders.
Now you can:
put all your project-tree under, e.g., OneDrive/yyyy-prjname/
create an empty folder ~/Documets/yyyy-prjname/
move the projname/
folder containing all the development staff you
are going to develop in R into the ~/Documets/yyyy-prjname/
folder
track it under git
create a symbolic link from all the other files and folders from
OneDrive/yyyy-prjname
to ~/Documets/yyyy-prjname/
.
This way, ~/Documents/yyyy-prjname
/ folder will (apparently) contain
all the objects into OneDrive/yyyy-prjname/
, plus the R project
folder. Moreover, those symlinks do not use any space in your disk, and
they are safely stored and tracked by OneDrive
only. On the other
hand, thank symlinks, you can access them, from the R project folder
smoothly using standard relative paths as they were really there,
e.g., by calling here::here(../data-raw)
from the project working
directory!
To create symlinks you can have different ways depending of your needs and OS. Following what I find usefull to learn how to create them:
Windows 10: command line, powershell (see Example 7), GUI (mouse right click)
Linux: here
If you need some more features, please file an issue on github.
If you encounter a bug, please file a reprex (minimal reproducible example) on github.
Please note that the "devubesp" project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
Most of the underlying function used to create this package comes directly or with slightly modifications from the usethis package, by Hadley Wickham and Jennifer Bryan.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.