devtools::load_all("~/rrr/usethis")
#> ℹ Loading usethis
library(fs)
usethis has an unexported function tidy_unzip(), which is used under the hood in use_course() and use_zip(). It is a wrapper around utils::unzip() that uses some heuristics to choose a good value for exdir, which is “the directory to extract files to.” Why do we do this? Because it’s really easy to not get the desired result when unpacking a ZIP archive.
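Common aggravations are too little nesting (the files land as “loose parts” in the working directory) and too much nesting (the files sit in a folder that is itself wrapped in another folder). tidy_unzip() tries to get the nesting just right.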
Why doesn’t unzipping “just work”? Because the people who make .zip
files make lots of different choices when they actually create the
archive and these details aren’t baked in, i.e. a successful roundtrip
isn’t automatic. It usually requires some peeking inside the archive and
adjusting the unpack options.
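Here is a minimal sketch of that kind of peeking, in shell. It is not how tidy_unzip() is implemented. top_level_entries is a hypothetical helper that reads an archive listing (one path per line, such as the output of unzip -Z1) on stdin and prints the distinct top-level path components.

```shell
# Hypothetical helper (not part of usethis): print the distinct top-level
# path components of a newline-separated archive listing read from stdin.
top_level_entries() {
  awk -F/ '
    $1 == "" { next }                 # skip entries with an empty first component
    { if (!seen[$1]++) print $1 }     # print each top-level name once, in order
  '
}

# With a real archive (requires unzip):
#   unzip -Z1 some-archive.zip | top_level_entries

# Demonstration on literal listings:
printf 'foo/\nfoo/file.txt\n' | top_level_entries    # prints: foo
printf 'a/x.txt\nb/y.txt\na/z.txt\n' | top_level_entries
```

A single top-level entry suggests the files already travel inside a folder, so unpacking in place is probably fine. Several entries suggest the files would spill into the working directory, in which case unzip's -d option can direct them into a folder of your choosing.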
This README documents specific .zip situations that we anticipate.
Consider the foo folder:
tree foo
#> foo
#> └── file.txt
#>
#> 1 directory, 1 file
Zip it up like so:
zip -r foo-explicit-parent.zip foo/
This is the type of ZIP file that we get from GitHub via links of the forms https://github.com/r-lib/usethis/archive/main.zip and http://github.com/r-lib/usethis/zipball/main/.
Inspect it in the shell:
unzip -Z1 foo-explicit-parent.zip
#> foo/
#> foo/file.txt
Or from R:
foo_files <- unzip("foo-explicit-parent.zip", list = TRUE)
with(
  foo_files,
  data.frame(Name = Name, dirname = path_dir(Name), basename = path_file(Name))
)
#>           Name dirname basename
#> 1         foo/       .      foo
#> 2 foo/file.txt     foo file.txt
Note that the folder foo/ is explicitly included and all of the files are contained in it (in this case, just one file).
Consider the foo folder:
tree foo
#> foo
#> └── file.txt
#>
#> 1 directory, 1 file
Zip it up like so:
zip -r foo-implicit-parent.zip foo/*
Note the use of foo/*, as opposed to foo or foo/. This type of ZIP file was reported in https://github.com/r-lib/usethis/issues/1961. The example given there is https://agdatacommons.nal.usda.gov/ndownloader/files/44576230.
Inspect our small example in the shell:
unzip -Z1 foo-implicit-parent.zip
#> foo/file.txt
Or from R:
foo_files <- unzip("foo-implicit-parent.zip", list = TRUE)
with(
  foo_files,
  data.frame(Name = Name, dirname = path_dir(Name), basename = path_file(Name))
)
#>           Name dirname basename
#> 1 foo/file.txt     foo file.txt
Note that foo/ is not included and its (original) existence is just implicit in the relative path to, e.g., foo/file.txt.
Here’s a similar look at the example from issue #1961:
~/rrr/usethis/tests/testthat/ref % unzip -l ~/Downloads/Species\ v2.3.zip
Archive:  /Users/jenny/Downloads/Species v2.3.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
     1241  04-27-2023 09:16   species_v2/label_encoder.txt
175187560  04-06-2023 15:13   species_v2/model_arch.pt
174953575  04-14-2023 12:28   species_v2/model_weights.pth
---------                     -------
350142376                     3 files
Note also that the implicit parent folder species_v2 is not the base of the ZIP file name Species v2.3.zip.
Consider the foo folder:
tree foo
#> foo
#> └── file.txt
#>
#> 1 directory, 1 file
Zip it up like so:
(cd foo && zip -r ../foo-no-parent.zip .)
Note that we are zipping everything in a folder from inside the folder.
Inspect our small example in the shell:
unzip -Z1 foo-no-parent.zip
#> file.txt
Or from R:
foo_files <- unzip("foo-no-parent.zip", list = TRUE)
with(
  foo_files,
  data.frame(Name = Name, dirname = path_dir(Name), basename = path_file(Name))
)
#>       Name dirname basename
#> 1 file.txt       . file.txt
All the files are packaged in the ZIP archive as “loose parts”, i.e. there is no explicit or implicit top-level directory.
This is the structure of ZIP files yielded by Dropbox via links of this form: https://www.dropbox.com/sh/12345abcde/6789wxyz?dl=1. I can’t figure out how to even do this with zip locally, so I had to create an example on Dropbox and download it. Jim Hester reports it is possible with archive::archive_write_files().
https://www.dropbox.com/sh/5qfvssimxf2ja58/AABz3zrpf-iPYgvQCgyjCVdKa?dl=1
It’s basically like the “no parent” example above, except it includes a spurious top-level directory "/".
Inspect our small example in the shell:
unzip -Z1 foo-loose-dropbox.zip
#> /
#> file.txt
Or from R:
# curl::curl_download(
# "https://www.dropbox.com/sh/5qfvssimxf2ja58/AABz3zrpf-iPYgvQCgyjCVdKa?dl=1",
# destfile = "foo-loose-dropbox.zip"
# )
foo_files <- unzip("foo-loose-dropbox.zip", list = TRUE)
with(
  foo_files,
  data.frame(Name = Name, dirname = path_dir(Name), basename = path_file(Name))
)
#>       Name dirname basename
#> 1        /       /
#> 2 file.txt       . file.txt
Also note that, when unzipping with unzip in the shell, you get this result:
Archive: foo-loose-dropbox.zip
warning: stripped absolute path spec from /
mapname: conversion of failed
inflating: file.txt
which indicates some tripping over the /. Only file.txt is left behind.
This is a pretty odd ZIP packing strategy. But we need to plan for it.
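One way to plan for it, sketched here in shell, is to drop the bare "/" entry from the listing before reasoning about the rest. drop_root_entry is a hypothetical helper for illustration, not something usethis provides, and it only filters a listing; it does not modify the archive.

```shell
# Hypothetical helper (not part of usethis): remove a bare "/" entry from a
# newline-separated archive listing read from stdin.
drop_root_entry() {
  # -x matches whole lines only, -v inverts the match.
  # grep exits non-zero when no lines remain; treat that as success.
  grep -vx '/' || true
}

# With the real archive (requires unzip):
#   unzip -Z1 foo-loose-dropbox.zip | drop_root_entry

# Demonstration on the listing shown above:
printf '/\nfile.txt\n' | drop_root_entry    # prints: file.txt
```

After filtering, the listing looks just like the “no parent” case, so the same handling can apply.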
For testing purposes, we also want an example where all the files are inside subdirectories.
Examples based on the yo directory here:
tree yo
#> yo
#> ├── subdir1
#> │   └── file1.txt
#> └── subdir2
#>     └── file2.txt
#>
#> 3 directories, 2 files
Zip it up, in all the usual ways:
zip -r yo-explicit-parent.zip yo/
zip -r yo-implicit-parent.zip yo/*
(cd yo && zip -r ../yo-no-parent.zip .)
Again, I couldn’t create the Dropbox variant locally, so I downloaded it from Dropbox.
# curl::curl_download(
# "https://www.dropbox.com/sh/afydxe6pkpz8v6m/AADHbMZAaW3IQ8zppH9mjNsga?dl=1",
# destfile = "yo-loose-dropbox.zip"
# )
Inspect each in the shell:
unzip -Z1 yo-explicit-parent.zip
#> yo/
#> yo/subdir2/
#> yo/subdir2/file2.txt
#> yo/subdir1/
#> yo/subdir1/file1.txt
unzip -Z1 yo-implicit-parent.zip
#> yo/subdir1/
#> yo/subdir1/file1.txt
#> yo/subdir2/
#> yo/subdir2/file2.txt
unzip -Z1 yo-no-parent.zip
#> subdir2/
#> subdir2/file2.txt
#> subdir1/
#> subdir1/file1.txt
unzip -Z1 yo-loose-dropbox.zip
#> /
#> subdir1/file1.txt
#> subdir2/file2.txt
#> subdir1/
#> subdir2/
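The four listings above can be told apart mechanically. What follows is a rough sketch in shell, under my own assumptions, and not the logic tidy_unzip() actually uses. classify_listing and its four labels are hypothetical names that mirror the fixture file names.

```shell
# Hypothetical classifier (not part of usethis): read a newline-separated
# archive listing on stdin and print one ad hoc label for its structure.
classify_listing() {
  awk -F/ '
    $0 == "/" { root = 1; next }                  # spurious Dropbox-style root entry
    {
      top = $1
      if (!(top in seen)) { seen[top] = 1; ntop++; first = top }
      if ($0 == (top "/")) explicit[top] = 1      # top-level folder is listed itself
      if (NF == 1) loose = 1                      # a file with no folder in its path
    }
    END {
      if (root)                                        print "loose-dropbox"
      else if (ntop == 1 && !loose && explicit[first]) print "explicit-parent"
      else if (ntop == 1 && !loose)                    print "implicit-parent"
      else                                             print "no-parent"
    }
  '
}

# With a real archive (requires unzip):
#   unzip -Z1 yo-implicit-parent.zip | classify_listing

# Demonstration on two of the listings shown above:
printf 'yo/\nyo/subdir2/\nyo/subdir2/file2.txt\nyo/subdir1/\nyo/subdir1/file1.txt\n' | classify_listing
printf 'subdir2/\nsubdir2/file2.txt\nsubdir1/\nsubdir1/file1.txt\n' | classify_listing
```

A listing alone cannot settle every case. For example, a “no parent” archive whose only content is one subdirectory is indistinguishable from an “explicit parent” archive, so treat the labels as hints.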