read_vc: Read a Git2rdata Object from Disk

Description

read_vc() reads git2rdata objects stored by write_vc(). It first reads and verifies the metadata file (.yml), then reads and verifies the raw data. Finally, it reverses any transformation applied by meta() and returns the data.frame as stored by write_vc().
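
The round trip described above can be sketched as follows. This is a minimal illustration, assuming git2rdata is installed; the object name "iris" and the temporary directory are only examples. It checks that a factor column comes back as a factor after reading.

```r
library(git2rdata)

# temporary project root for the illustration
root <- tempfile("git2rdata-")
dir.create(root)

# store a small data.frame that contains a factor column (Species)
original <- iris[1:6, ]
write_vc(original, file = "iris", root = root, sorting = "Sepal.Length")

# read it back; read_vc() verifies metadata and data before returning
restored <- read_vc("iris", root)

# the factor should be restored as a factor, with its levels
is.factor(restored$Species)
levels(restored$Species)
```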

read_vc() is an S3 generic on root which currently handles "character" (a path) and "git-repository" (from git2r). S3 methods for other version control systems could be added.
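
As a rough sketch of how such an additional method might look, the code below defines a method for a made-up class. The class name "my_vcs", its constructor, and its path element are hypothetical and not part of git2rdata or any real package. The method simply delegates to the existing "character" method.

```r
library(git2rdata)

# Hypothetical class wrapping the working directory of some other
# version control system. Not part of git2rdata.
my_vcs <- function(path) {
  structure(list(path = normalizePath(path)), class = "my_vcs")
}

# Sketch of a method: extract the directory and delegate to the
# "character" method that git2rdata already provides.
read_vc.my_vcs <- function(file, root) {
  read_vc(file, root = root$path)
}

root <- tempfile("git2rdata-")
dir.create(root)
write_vc(iris[1:6, ], file = "iris", root = root, sorting = "Sepal.Length")

# dispatch happens on the class of root
x <- read_vc("iris", root = my_vcs(root))
```

A real method would likely also verify that the working copy is in a consistent state; that part is left out here.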

Usage

read_vc(file, root = ".")

Arguments

file

The name of the git2rdata object. Git2rdata objects cannot have dots in their name. The name may include a relative path, which is interpreted relative to root. Note that file must point to a location within root.
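
A short illustration of a file that includes a relative path, assuming git2rdata is installed. The subdirectory name "raw" is only an example; it is created explicitly here so the sketch does not depend on write_vc() creating it.

```r
library(git2rdata)

root <- tempfile("git2rdata-")
dir.create(root)
# create the subdirectory up front (example name)
dir.create(file.path(root, "raw"))

# file is given relative to root and contains no dots
write_vc(iris[1:6, ], file = "raw/iris", root = root, sorting = "Sepal.Length")
stored <- list.files(root, recursive = TRUE)
stored

# the same relative name is used for reading
nested <- read_vc("raw/iris", root)
```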

root

The root of a project. Can be a file path or a git-repository. Defaults to the current working directory (".").

Value

The data.frame with the file names and hashes as attributes.
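
The returned object can be inspected with base R tools. The sketch below, which assumes git2rdata is installed, lists the attribute names instead of assuming what they are called, since the exact names are not given on this page.

```r
library(git2rdata)

root <- tempfile("git2rdata-")
dir.create(root)
write_vc(iris[1:6, ], file = "iris", root = root, sorting = "Sepal.Length")

x <- read_vc("iris", root)

# list the attribute names rather than assuming them
names(attributes(x))

# the result can be used like any other data.frame
is.data.frame(x)
```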

See Also

Other storage: list_data(), prune_meta(), relabel(), rename_variable(), rm_data(), verify_vc(), write_vc()

Examples

# attach the package so the examples can run in a fresh session
library(git2rdata)

## on file system

# create a directory
root <- tempfile("git2rdata-")
dir.create(root)

# write a dataframe to the directory
write_vc(iris[1:6, ], file = "iris", root = root, sorting = "Sepal.Length")
# check that a data file (.tsv) and a metadata file (.yml) exist.
list.files(root, recursive = TRUE)
# read the git2rdata object from the directory
read_vc("iris", root)

# store a new version with different observations but the same metadata
write_vc(iris[1:5, ], "iris", root)
list.files(root, recursive = TRUE)
# Removing a column requires new metadata.
# Add strict = FALSE to override the existing metadata.
write_vc(
  iris[1:6, -2], "iris", root, sorting = "Sepal.Length", strict = FALSE
)
list.files(root, recursive = TRUE)
# storing the original version again requires another update of the metadata
write_vc(iris[1:6, ], "iris", root, sorting = "Sepal.Width", strict = FALSE)
list.files(root, recursive = TRUE)
# optimize = FALSE stores the data in a more verbose format, which results in larger files.
write_vc(
  iris[1:6, ], "iris2", root, sorting = "Sepal.Width", optimize = FALSE
)
list.files(root, recursive = TRUE)



## on git repo using a git2r::git-repository

# initialise a git repo using the git2r package
repo_path <- tempfile("git2rdata-repo-")
dir.create(repo_path)
repo <- git2r::init(repo_path)
git2r::config(repo, user.name = "Alice", user.email = "alice@example.org")

# store a dataframe in git repo.
write_vc(iris[1:6, ], file = "iris", root = repo, sorting = "Sepal.Length")
# This git2rdata object is not staged by default.
status(repo)
# read a dataframe from a git repo
read_vc("iris", repo)

# store a new version in the git repo and stage it in one go
write_vc(iris[1:5, ], "iris", repo, stage = TRUE)
status(repo)

# store a verbose version in a different git2rdata object
write_vc(
  iris[1:6, ], "iris2", repo, sorting = "Sepal.Width", optimize = FALSE
)
status(repo)

git2rdata documentation built on March 18, 2022, 5:06 p.m.