validate: Validate the duplicates using the "groups" column to identify...

Description Usage Arguments Value Examples

Description

Groups the files using the "groups" column, then performs extra validation of the groups. In practice, comparing file hashes is probably enough to identify duplicates for most purposes, but performing the extra validation check can privde extra peace of mind. The algorithm works by reading the files in small chunks, hashing the chunks, and comparing them. Successful validation requires that the hash of each chunk be identical.

Usage

1
validate(.data, ...)

Arguments

.data

A fduper object

chunk_size

The size of chunks to read, with default of 1e6 bytes

Value

The validated fduper object, or stops execution if a set doesn't pass validation

Examples

1
f %>% identify(hash) %>% validate(hash)

gmyrland/fduper documentation built on May 28, 2019, 8:53 p.m.