README.md
In framework: Structured Data Science Project Scaffolding

Framework

An R package for structured, reproducible data analysis projects.

Status: Active development. APIs may change before version 1.0.

Quick Start

# Install from GitHub
remotes::install_github("table1/framework")

# One-time global setup (author info, preferences)
framework::setup()

# Create projects using your saved defaults
framework::new()
framework::new("my-analysis", "~/projects/my-analysis")
framework::new_presentation("quarterly-review", "~/talks/q4")
framework::new_course("stats-101", "~/teaching/stats")

Project Types

project (default): Full-featured research projects with notebooks, scripts, organized data management, and documentation
project_sensitive: Like project, but with additional privacy protections for sensitive data
course: Teaching materials with slides, assignments, and modules
presentation: Single talks with one Quarto file and minimal setup

Example project structure:

project/
├── notebooks/              # Exploratory analysis
├── scripts/                # Production pipelines
├── inputs/
│   ├── raw/                # Raw data (gitignored)
│   ├── intermediate/       # Cleaned datasets (gitignored)
│   ├── final/              # Curated analytic datasets (gitignored)
│   └── reference/          # External documentation (gitignored)
├── outputs/
│   ├── private/            # Tables, figures, models, cache (gitignored)
│   └── public/             # Share-ready artifacts
├── functions/              # Custom functions
├── docs/                   # Documentation
├── settings.yml            # Project configuration
├── framework.db            # Metadata tracking database
└── .env                    # Secrets (gitignored)

Why Framework?

Framework reduces boilerplate and enforces best practices:

Project scaffolding: Standardized directories, config-driven setup
Data management: Declarative data catalog, integrity tracking, encryption
Auto-loading: Load packages with one command; no more scattered library() calls
Pain-free renv: Reproducible package management without fighting renv
Caching: Cached results for expensive computations, with optional expiry
Database helpers: PostgreSQL, SQLite, DuckDB, MySQL with credential management
File formats: CSV, TSV, RDS, Stata (.dta), SPSS (.sav), SAS (.xpt, .sas7bdat)

Core Workflow

1. Initialize Your Session

library(framework)
scaffold()  # Loads packages, functions, config, standardizes working directory
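
To make the idea of a one-call session setup concrete, here is a minimal base R sketch of the two steps the comment above names: attaching packages and sourcing helper functions. `init_session()` and its arguments are hypothetical and are not Framework's implementation; it skips config loading and working-directory handling entirely.

```r
# Hypothetical session initializer: attach packages, then source helper files.
# This is an illustration only, not what scaffold() actually does internally.
init_session <- function(packages, functions_dir = "functions", envir = globalenv()) {
  for (pkg in packages) {
    if (!requireNamespace(pkg, quietly = TRUE)) {
      stop("Package not installed: ", pkg, call. = FALSE)
    }
    library(pkg, character.only = TRUE)
  }
  # Source every .R file found in the functions directory.
  files <- list.files(functions_dir, pattern = "\\.[Rr]$", full.names = TRUE)
  for (file in files) {
    source(file, local = envir)
  }
  invisible(files)
}

# Demo with a throwaway functions directory and packages that ship with R.
functions_dir <- tempfile("functions_")
dir.create(functions_dir)
writeLines("double_it <- function(x) x * 2", file.path(functions_dir, "helpers.R"))

session_env <- new.env()
init_session(c("stats", "utils"), functions_dir, envir = session_env)
session_env$double_it(21)
```

Sourcing into a dedicated environment keeps the demo from touching the global workspace; a real setup would more likely source into the global environment.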

2. Create Notebooks & Scripts

# Quarto notebook (default)
make_notebook("exploration")    # → notebooks/exploration.qmd
make_qmd("analysis")            # Always Quarto
make_rmd("report")              # RMarkdown

# Presentations
make_revealjs("slides")         # reveal.js presentation

# Scripts
make_script("process-data")     # → scripts/process-data.R

# List available templates
stubs_list()

Custom stubs: Create a stubs/ directory with your own templates.

3. Load Data

Via config (recommended):

# settings.yml
data:
  inputs:
    raw:
      survey:
        path: inputs/raw/survey.csv
        type: csv
        locked: true  # Errors if file changes

df <- data_load("inputs.raw.survey")
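
The dot-notation key appears to mirror the nesting of the data section in settings.yml. A minimal sketch of that lookup in base R, assuming the parsed settings are a nested list; `resolve_key()` is a hypothetical helper, not part of Framework's API:

```r
# Hypothetical helper: walk a nested list using a dot-separated key.
resolve_key <- function(catalog, key) {
  parts <- strsplit(key, ".", fixed = TRUE)[[1]]
  node <- catalog
  for (part in parts) {
    if (!is.list(node) || is.null(node[[part]])) {
      stop("No catalog entry for key: ", key, call. = FALSE)
    }
    node <- node[[part]]
  }
  node
}

# Stand-in for the `data:` section of settings.yml after parsing.
catalog <- list(
  inputs = list(
    raw = list(
      survey = list(path = "inputs/raw/survey.csv", type = "csv", locked = TRUE)
    )
  )
)

entry <- resolve_key(catalog, "inputs.raw.survey")
entry$path
```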

Direct path:

df <- data_load("inputs/raw/my_file.csv")       # CSV
df <- data_load("inputs/raw/stata_file.dta")    # Stata
df <- data_load("inputs/raw/spss_file.sav")     # SPSS

Every read is logged with a SHA-256 hash for integrity tracking.
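
As a rough illustration of hash-based integrity checking, the sketch below records a checksum for a file and raises an error when the file later changes. It swaps in MD5 via `tools::md5sum()`, which ships with R, in place of SHA-256; `file_checksum()` and `assert_unchanged()` are hypothetical names, and Framework's own tracking may work differently.

```r
# Compute a checksum for a file.
# tools::md5sum() is used as a stand-in; Framework reports using SHA-256.
file_checksum <- function(path) {
  unname(tools::md5sum(path))
}

# Hypothetical check: error if a file no longer matches its recorded hash.
assert_unchanged <- function(path, recorded) {
  current <- file_checksum(path)
  if (!identical(current, recorded)) {
    stop("Locked file has changed: ", path, call. = FALSE)
  }
  invisible(TRUE)
}

path <- tempfile(fileext = ".csv")
writeLines(c("id,score", "1,10"), path)
recorded <- file_checksum(path)

assert_unchanged(path, recorded)          # passes silently

writeLines(c("id,score", "1,99"), path)   # simulate an edit to the file
changed <- tryCatch(
  assert_unchanged(path, recorded),
  error = function(e) conditionMessage(e)
)
changed
```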

4. Cache Expensive Operations

model <- get_or_cache("model_v1", {
  expensive_model_fit(df)
}, expire_after = 1440)  # 24 hours
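
The pattern behind this kind of call can be sketched in base R: because R evaluates arguments lazily, the expensive block only runs on a cache miss. `cache_or_compute()` below is a hypothetical stand-in that stores results as RDS files and treats `expire_after` as minutes (matching the 1440 = 24 hours comment above); it is not how get_or_cache() is implemented.

```r
# Hypothetical helper: return a cached value if it is still fresh, otherwise
# evaluate `expr`, store the result on disk, and return it.
cache_or_compute <- function(name, expr, expire_after = Inf,
                             cache_dir = file.path(tempdir(), "cache")) {
  dir.create(cache_dir, recursive = TRUE, showWarnings = FALSE)
  path <- file.path(cache_dir, paste0(name, ".rds"))

  if (file.exists(path)) {
    age_minutes <- as.numeric(difftime(Sys.time(), file.mtime(path), units = "mins"))
    if (age_minutes < expire_after) {
      return(readRDS(path))   # `expr` is never forced on a cache hit
    }
  }

  value <- expr               # forces the lazily evaluated argument
  saveRDS(value, path)
  value
}

cache_dir <- tempfile("cache_")   # fresh directory so the demo starts cold
runs <- 0
first <- cache_or_compute("demo_sum", { runs <- runs + 1; sum(1:10) },
                          expire_after = 1440, cache_dir = cache_dir)
second <- cache_or_compute("demo_sum", { runs <- runs + 1; sum(1:10) },
                           expire_after = 1440, cache_dir = cache_dir)
runs   # counts how many times the block was actually evaluated
```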

5. Save Results

Save data files:

data_save(processed_df, "intermediate.cleaned_data")
# → saves to inputs/intermediate/cleaned_data.rds

data_save(final_df, "final.analysis_ready", type = "csv")
# → saves to inputs/final/analysis_ready.csv
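
Judging only from the two calls above, a dot-notation name seems to map to a path under inputs/, with the file extension taken from the type. The sketch below reproduces that mapping; `save_path()` is hypothetical, and the real rules (for example for other directories or types) may differ.

```r
# Hypothetical mapping from a dot-notation name to a file path under inputs/.
# Inferred from the two examples above; not taken from Framework's source.
save_path <- function(name, type = "rds", root = "inputs") {
  parts <- strsplit(name, ".", fixed = TRUE)[[1]]
  file_name <- paste0(parts[length(parts)], ".", type)
  do.call(file.path, as.list(c(root, parts[-length(parts)], file_name)))
}

rds_path <- save_path("intermediate.cleaned_data")
csv_path <- save_path("final.analysis_ready", type = "csv")
```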

Save analysis outputs:

result_save("regression_model", model, type = "model")
result_save("report", file = "report.html", type = "notebook", blind = TRUE)

6. Query Databases

# settings.yml
connections:
  db:
    driver: postgresql
    host: env("DB_HOST")
    database: env("DB_NAME")
    user: env("DB_USER")
    password: env("DB_PASS")

df <- query_get("SELECT * FROM users WHERE active = true", "db")

Enhanced Data Viewing

view_detail() provides rich, browser-based data exploration:

view_detail(mtcars)                    # Interactive table with search/filter/export
view_detail(config)                    # Tabbed YAML + R structure for lists
view_detail(ggplot(mtcars, aes(mpg, hp)) + geom_point())  # Interactive plots

Configuration

Simple:

default:
  packages:
    - dplyr
    - ggplot2
  data:
    example: data/example.csv

Advanced (split files):

default:
  data: settings/data.yml
  packages: settings/packages.yml
  connections: settings/connections.yml

Secrets in .env:

DB_HOST=localhost
DB_PASS=secret

Reference in config:

connections:
  db:
    host: env("DB_HOST")
    password: env("DB_PASS", "default")
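
One way such env() references could be resolved is sketched below in base R: look up the named environment variable and fall back to the declared default when it is unset. `resolve_env()` is a hypothetical helper written for illustration; how Framework parses these values internally is not shown in this README.

```r
# Hypothetical resolver for values written as env("NAME") or env("NAME", "default").
resolve_env <- function(value) {
  pattern <- '^env\\("([^"]+)"(,[[:space:]]*"([^"]*)")?\\)$'
  match <- regmatches(value, regexec(pattern, value))[[1]]
  if (length(match) == 0) {
    return(value)                 # plain value, not an env() reference
  }
  found <- Sys.getenv(match[2], unset = NA)
  if (!is.na(found)) {
    return(found)
  }
  if (nzchar(match[3])) {
    return(match[4])              # a default was declared, so use it
  }
  stop("Environment variable not set: ", match[2], call. = FALSE)
}

Sys.setenv(DB_HOST = "localhost")
Sys.unsetenv("DB_PASS")

host <- resolve_env('env("DB_HOST")')                 # read from the environment
password <- resolve_env('env("DB_PASS", "default")')  # unset, so the default applies
driver <- resolve_env("postgresql")                   # returned unchanged
```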

AI Assistant Support

Framework creates instruction files for AI coding assistants:

framework::configure_ai_agents()

Supported: Claude Code (CLAUDE.md), GitHub Copilot, AGENTS.md

Key Functions

| Function | Purpose |
|----------|---------|
| scaffold() | Initialize session (load packages, functions, config) |
| data_load() | Load data from path or config |
| data_save() | Save data with integrity tracking |
| view_detail() | Browser-based data viewer with search/export |
| query_get() | Execute SQL query, return data |
| query_execute() | Execute SQL command |
| get_or_cache() | Lazy evaluation with caching |
| result_save() | Save analysis output |
| result_get() | Retrieve saved result |
| scratch_capture() | Quick debug/temp file save |
| renv_enable() | Enable renv for reproducibility |
| packages_snapshot() | Save package versions to renv.lock |
| packages_restore() | Restore packages from renv.lock |
| security_audit() | Scan for data leaks and security issues |

Data Integrity & Security

Hash tracking: All data files tracked with SHA-256 hashes
Locked data: Flag files as locked; Framework raises an error if a locked file changes
Password-based encryption: Ansible Vault-style encryption for sensitive data
Gitignore by default: Private directories auto-ignored
Security audits: security_audit() detects data leaks

Encryption

# Save encrypted data
data_save(sensitive_df, "private.data", encrypted = TRUE)

# Load (auto-detects encryption)
data <- data_load("private.data")

The password is read from the ENCRYPTION_PASSWORD environment variable or entered at an interactive prompt.
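
A minimal sketch of that lookup in base R, assuming the environment variable takes precedence and the prompt is only used when it is unset (the precedence is an assumption). `get_encryption_password()` is a hypothetical helper, not a Framework function.

```r
# Hypothetical helper: prefer the environment variable, prompt only if it is unset.
get_encryption_password <- function(var = "ENCRYPTION_PASSWORD") {
  password <- Sys.getenv(var, unset = "")
  if (nzchar(password)) {
    return(password)
  }
  if (!interactive()) {
    stop("Set ", var, " or run interactively to enter a password.", call. = FALSE)
  }
  readline("Encryption password: ")   # note: readline() echoes what is typed
}

# Non-interactive use, e.g. in a scheduled job: set the variable first.
Sys.setenv(ENCRYPTION_PASSWORD = "example-only")
password <- get_encryption_password()
```

In practice the variable would come from .env or the shell rather than being set in code.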

Security Auditing

audit <- security_audit()              # Full audit
audit <- security_audit(auto_fix = TRUE)  # Auto-fix .gitignore issues
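
To give a sense of what a .gitignore check with an auto-fix step might involve, here is a small base R sketch. `missing_ignores()` and `fix_ignores()` are hypothetical, the list of required paths is an assumption based on the project structure shown earlier, and the comparison is exact-match only (no glob handling), so it is far simpler than a real audit.

```r
# Hypothetical check: which required paths are not listed in .gitignore?
# Compares entries literally, ignoring leading/trailing slashes; no glob matching.
missing_ignores <- function(gitignore_path, required) {
  entries <- if (file.exists(gitignore_path)) readLines(gitignore_path, warn = FALSE) else character()
  entries <- trimws(entries)
  entries <- entries[nzchar(entries) & !startsWith(entries, "#")]
  normalise <- function(x) gsub("^/+|/+$", "", x)
  required[!(normalise(required) %in% normalise(entries))]
}

# Hypothetical fix: append any missing entries to the file.
fix_ignores <- function(gitignore_path, required) {
  missing <- missing_ignores(gitignore_path, required)
  if (length(missing) > 0) {
    existing <- if (file.exists(gitignore_path)) readLines(gitignore_path, warn = FALSE) else character()
    writeLines(c(existing, missing), gitignore_path)
  }
  invisible(missing)
}

gitignore <- tempfile()
writeLines(c("# secrets", ".env", "inputs/raw/"), gitignore)
required <- c(".env", "inputs/raw/", "inputs/intermediate/", "outputs/private/")

before <- missing_ignores(gitignore, required)   # entries not yet ignored
fix_ignores(gitignore, required)
after <- missing_ignores(gitignore, required)    # empty once the fix is applied
```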

Reproducibility with renv

Optional renv integration (off by default):

renv_enable()           # Enable for this project
packages_snapshot()     # Save current versions
packages_restore()      # Restore from renv.lock
renv_disable()          # Disable (keeps renv.lock)

Version pinning in settings.yml:

packages:
  - dplyr                    # Latest from CRAN
  - ggplot2@3.4.0           # Specific version
  - tidyverse/dplyr@main    # GitHub with branch
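
The three entry styles above can be told apart with simple string handling. The sketch below is a hypothetical parser written for illustration; `parse_package_spec()` and the fields it returns are assumptions, not Framework's internal representation.

```r
# Hypothetical parser for entries such as "dplyr", "ggplot2@3.4.0",
# and "tidyverse/dplyr@main".
parse_package_spec <- function(spec) {
  pieces <- strsplit(spec, "@", fixed = TRUE)[[1]]
  target <- pieces[1]
  ref <- if (length(pieces) > 1) pieces[2] else NA_character_
  is_github <- grepl("/", target, fixed = TRUE)
  list(
    name = basename(target),
    source = if (is_github) "github" else "cran",
    repo = if (is_github) target else NA_character_,
    ref = ref   # a version for CRAN packages, a branch for GitHub
  )
}

specs <- c("dplyr", "ggplot2@3.4.0", "tidyverse/dplyr@main")
parsed <- lapply(specs, parse_package_spec)
str(parsed[[3]])
```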

Roadmap

Better database support (DuckDB, MySQL, SQL Server)
Results publishing to S3
Enhanced results tracking with blinding support

framework documentation built on Feb. 18, 2026, 1:07 a.m.
