bdImportData_hdf5: Import data from URL or file to HDF5 format

View source: R/ImportData_hdf5.R

bdImportData_hdf5    R Documentation

Import data from URL or file to HDF5 format

Description

This function imports data from a local file or a remote URL into an HDF5 file. If inFile is a URL, the data is downloaded first; the input is decompressed if needed before it is written to the HDF5 file.

Usage

bdImportData_hdf5(
  inFile,
  destFile,
  destGroup,
  destDataset,
  header = TRUE,
  rownames = FALSE,
  overwrite = FALSE,
  overwriteFile = FALSE,
  sep = NULL,
  paral = NULL,
  threads = NULL
)

Arguments

inFile

Character string specifying the local file path or URL of the data to import.

destFile

Character string specifying the file name and path where the HDF5 file will be stored

destGroup

Character string specifying the group name within the HDF5 file where the dataset will be stored

destDataset

Character string specifying the name for the dataset within the HDF5 file

header

Logical or character vector. If TRUE, the first row of the input contains the column names. If a character vector is supplied, its elements are used as the column names. Default is TRUE.

rownames

Logical or character vector. If TRUE, the first column of the input contains the row names. If a character vector is supplied, its elements are used as the row names. Default is FALSE.

overwrite

Logical indicating if existing datasets should be overwritten. Default is FALSE.

overwriteFile

Logical indicating whether an existing HDF5 file should be replaced entirely. CAUTION: this deletes all data already stored in the file. Default is FALSE.

sep

Character string specifying the field separator in the input file. When left as NULL (as in the usage above), the documented default is "\t" (tab).

paral

Logical indicating whether to use parallel computation. When left as NULL (as in the usage above), the documented default is TRUE.

threads

Integer specifying the number of threads to use for parallel computation. Used only when paral = TRUE. If NULL, the maximum number of available threads is used.
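
The arguments above can be combined as in the following minimal sketch, which imports a tab-separated file and sets sep, overwrite, paral and threads explicitly instead of relying on their defaults. The dataset name matrix_tsv, the column names var1 to var5 and the thread count are illustrative choices, not values taken from the package. The import call is guarded so that the file-preparation steps still run when BigDataStatMeth is not installed.

```r
# Build a small tab-separated input file using base R only.
tsv_file  <- tempfile(fileext = ".tsv")
hdf5_file <- tempfile(fileext = ".h5")

set.seed(1)
m <- matrix(rnorm(20), nrow = 4, ncol = 5,
            dimnames = list(NULL, paste0("var", 1:5)))
write.table(m, tsv_file, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = TRUE)

# Read the raw lines back: one header line followed by four data lines.
input_lines <- readLines(tsv_file)

# The import itself runs only if BigDataStatMeth is available.
if (requireNamespace("BigDataStatMeth", quietly = TRUE)) {
  BigDataStatMeth::bdImportData_hdf5(
    inFile      = tsv_file,
    destFile    = hdf5_file,
    destGroup   = "mydata",
    destDataset = "matrix_tsv",  # illustrative dataset name
    header      = TRUE,          # first line holds var1..var5
    rownames    = FALSE,         # this file has no row-name column
    overwrite   = TRUE,          # replace the dataset if it already exists
    sep         = "\t",          # passed explicitly
    paral       = TRUE,
    threads     = 2L             # arbitrary value chosen for illustration
  )
}

# Remove the temporary files.
unlink(c(tsv_file, hdf5_file))
```

Passing sep explicitly keeps the call unambiguous when the same script handles both comma-separated and tab-separated inputs.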

Value

No return value. The function writes the data directly to the specified HDF5 file.
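
Because nothing is returned, the result has to be checked in the HDF5 file itself. One way to do this, sketched below, is to list the file contents with h5ls() from the rhdf5 package. Using rhdf5 for the check is an assumption made for this illustration, and the dataset name check_me is hypothetical. The whole check is skipped if either package is missing.

```r
# Prepare a tiny comma-separated input with three columns.
csv_file  <- tempfile(fileext = ".csv")
hdf5_file <- tempfile(fileext = ".h5")
write.table(matrix(1:6, nrow = 2), csv_file, sep = ",",
            row.names = FALSE, col.names = TRUE)

# Count the fields in the header line of the input.
header_fields <- strsplit(readLines(csv_file, n = 1), ",")[[1]]

# Both packages are needed for the import and the inspection step.
have_pkgs <- requireNamespace("BigDataStatMeth", quietly = TRUE) &&
  requireNamespace("rhdf5", quietly = TRUE)

if (have_pkgs) {
  BigDataStatMeth::bdImportData_hdf5(
    inFile      = csv_file,
    destFile    = hdf5_file,
    destGroup   = "mydata",
    destDataset = "check_me",   # hypothetical dataset name
    header      = TRUE,
    sep         = ","
  )

  # List the groups and datasets that now exist in the file.
  print(rhdf5::h5ls(hdf5_file))
}

# Remove the temporary files.
unlink(c(csv_file, hdf5_file))
```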

Examples


library(BigDataStatMeth)

# Create temporary paths for the CSV input and the HDF5 output
csv_file <- tempfile(fileext = ".csv")
hdf5_file <- tempfile(fileext = ".h5")

# Write sample data
data <- matrix(rnorm(50), nrow = 10, ncol = 5)
write.table(data, csv_file, sep = ",", row.names = FALSE, col.names = TRUE)

# Import CSV to HDF5
bdImportData_hdf5(
  inFile      = csv_file,
  destFile    = hdf5_file,
  destGroup   = "mydata",
  destDataset = "matrix1",
  header      = TRUE,
  sep         = ","
)

hdf5_close_all()
unlink(c(csv_file, hdf5_file))


BigDataStatMeth documentation built on May 15, 2026, 1:07 a.m.