to_simple_names: Convert Long File Paths to Simple Paths


View source: R/function.R

Description

Convert Long File Paths to Simple Paths

Usage

to_simple_names(paths, method = 1L, get_base = NULL, sha1_digits = 4)

Arguments

paths

vector of character containing file paths

method

Integer selecting the naming scheme.

method = 1: the generated file names match the pattern file_<xx>, where <xx> is a two-digit integer.

method = 2: the generated file names match the pattern file_<sha>, where <sha> consists of the first sha1_digits characters of the SHA-1 hash (see e.g. http://www.sha1-online.com/) of the base name of each path. By default, the base name is the file name (without folder path) without its extension. To derive the base names differently, pass a function in get_base.

get_base

function taking a character vector as input and returning a character vector as output. If not NULL, this function is used to determine the base names from the paths when method = 2 is specified.

sha1_digits

number of leading characters of the SHA-1 hash to use when method = 2 is applied

Value

character vector of the same length as paths
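
The two naming patterns described under method can be sketched roughly as follows. This is an illustration only, not the package's implementation: simple_names_sketch is a hypothetical helper, the numbering by input order is an assumption, and the SHA-1 call assumes that the digest package is installed.

```r
# Hypothetical sketch for illustration only; not the code of kwb.file.
simple_names_sketch <- function(
  paths, method = 1L, get_base = NULL, sha1_digits = 4
) {
  if (method == 1L) {
    # Pattern file_<xx>: two-digit running number
    # (assumption: numbers follow the order of the input paths)
    return(sprintf("file_%02d", seq_along(paths)))
  }

  # Default base name: file name without folder path and without extension
  if (is.null(get_base)) {
    get_base <- function(x) tools::file_path_sans_ext(basename(x))
  }

  base_names <- get_base(paths)

  # SHA-1 hash of each base name, hashed as a plain string
  # (assumption: package "digest" is available)
  hashes <- vapply(
    base_names, digest::digest, character(1),
    algo = "sha1", serialize = FALSE, USE.NAMES = FALSE
  )

  # Pattern file_<sha>: keep only the first sha1_digits characters
  paste0("file_", substr(hashes, 1, sha1_digits))
}
```

Because method = 2 hashes only the base name, two paths with the same base name receive the same simple name in this sketch, which is the grouping behaviour that get_base is meant to control.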

Examples

paths <- c("v1_ugly_name_1.doc",  "v1_very_ugly_name.xml",
           "v2_ugly_name_1.docx", "v2_very_ugly_name.xmlx")
           
to_simple_names(paths, method = 1L)
writeLines(sort(to_simple_names(paths, method = 2L)))

# All SHA-1 values differ because all base names (by default, the file name
# without extension) differ. To give the same SHA-1 value to files that
# correspond to each other but have different extensions, provide a function
# that extracts the "base name" of each file:

get_base <- function(x) kwb.utils::removeExtension(gsub("^v\\d+_", "", x))

writeLines(sort(to_simple_names(paths, method = 2L, get_base = get_base)))

# Now the file names that have the same base name (neglecting the prefix 
# v1_ or v2_) get the same sha1 and thus appear as groups in the sorted 
# file list
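
To see why the files end up in groups, it can help to look at the base names themselves. The following sketch uses only base R: tools::file_path_sans_ext stands in for kwb.utils::removeExtension (assumed here to behave equivalently for these file names), and the variable names base_names and groups are illustrative.

```r
paths <- c("v1_ugly_name_1.doc",  "v1_very_ugly_name.xml",
           "v2_ugly_name_1.docx", "v2_very_ugly_name.xmlx")

# Strip the version prefix (v1_, v2_, ...) and then the file extension
base_names <- tools::file_path_sans_ext(gsub("^v\\d+_", "", paths))

# Paths sharing a base name would be hashed identically with method = 2
groups <- split(paths, base_names)
print(groups)
```

Here the four paths collapse to two base names, ugly_name_1 and very_ugly_name, so only two distinct hashes can occur.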

KWB-R/kwb.file documentation built on Dec. 31, 2021, 8:15 p.m.