ParquetBackend: ParquetBackend Class

ParquetBackend (R Documentation)

ParquetBackend Class

Description

Backend implementation using Apache Arrow Parquet files with lazy loading. Matrices are stored as individual Parquet files and loaded on-demand with LRU caching.
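The LRU (least recently used) caching mentioned above can be illustrated with a small base-R sketch. This is not riemtan's implementation; make_lru_cache and loader are hypothetical names used only to show the idea: a cache hit marks the entry as most recently used, and a miss on a full cache evicts the entry that has gone unused the longest.

```r
# Illustrative LRU cache in base R (an assumption-laden sketch, not riemtan code).
make_lru_cache <- function(max_size = 10) {
  stopifnot(max_size >= 1)
  store <- list()          # key -> cached value
  order <- character(0)    # keys, least recently used first

  # Move a key to the "most recently used" end of the order vector.
  touch <- function(key) {
    order <<- c(setdiff(order, key), key)
  }

  list(
    get = function(key, loader) {
      key <- as.character(key)
      if (!is.null(store[[key]])) {
        touch(key)                       # cache hit
        return(store[[key]])
      }
      value <- loader(key)               # cache miss: load on demand
      if (length(order) >= max_size) {   # full: evict least recently used
        evicted <- order[1]
        store[[evicted]] <<- NULL
        order <<- order[-1]
      }
      store[[key]] <<- value
      touch(key)
      value
    },
    size = function() length(order),
    keys = function() order
  )
}

# Usage: a stand-in loader that counts how often "disk" is hit.
loads <- 0
loader <- function(key) {
  loads <<- loads + 1
  diag(2) * as.numeric(key)
}
cache <- make_lru_cache(max_size = 2)
invisible(cache$get(1, loader))
invisible(cache$get(2, loader))
invisible(cache$get(1, loader))   # hit: key "1" becomes most recent
invisible(cache$get(3, loader))   # miss on a full cache: key "2" is evicted
```

In this sketch a larger max_size trades memory for fewer reloads, which is the same trade-off the cache_size argument of the class controls.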

Super class

riemtan::DataBackend -> ParquetBackend

Methods

Public methods


Method new()

Initialize a ParquetBackend. The backend reads metadata from the JSON file, loads each matrix from its Parquet file on demand, and updates the LRU cache as matrices are loaded.

Usage
ParquetBackend$new(data_dir, cache_size = 10)
Arguments
data_dir

Path to directory containing Parquet files and metadata.json

cache_size

Maximum number of matrices to cache (default 10)

Returns

A new ParquetBackend object
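A minimal sketch of opening a backend, assuming the class is exported from the riemtan namespace as ParquetBackend. The wrapper open_parquet_backend is a hypothetical helper, not part of the package; it only checks for the directory and the metadata.json file named in the arguments above before calling the constructor.

```r
# Hypothetical convenience wrapper around ParquetBackend$new().
open_parquet_backend <- function(data_dir, cache_size = 10) {
  if (!dir.exists(data_dir)) {
    stop("data_dir does not exist: ", data_dir)
  }
  if (!file.exists(file.path(data_dir, "metadata.json"))) {
    stop("metadata.json not found in ", data_dir)
  }
  # Assumes riemtan is installed and exports ParquetBackend.
  riemtan::ParquetBackend$new(data_dir, cache_size = cache_size)
}

# Example call (the path is a placeholder):
# backend <- open_parquet_backend("path/to/data_dir", cache_size = 20)
```

Failing early on a missing directory or metadata file gives a clearer message than an error raised deeper in the loading code.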


Method get_matrix()

Get a specific matrix by index

Usage
ParquetBackend$get_matrix(i)
Arguments
i

Integer index

Returns

A dppMatrix object
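The page does not say how get_matrix() treats an out-of-range index, so a caller may want to validate it first. The sketch below assumes 1-based indices (the usual R convention) and uses length() as the upper bound; get_matrix_checked is a hypothetical helper that works with any object exposing those two methods.

```r
# Hypothetical bounds-checked accessor; assumes indices run from 1 to length().
get_matrix_checked <- function(backend, i) {
  n <- backend$length()
  if (!is.numeric(i) || length(i) != 1 || is.na(i) ||
      i != as.integer(i) || i < 1 || i > n) {
    stop("index must be a single integer between 1 and ", n)
  }
  backend$get_matrix(as.integer(i))
}
```

Calling get_matrix_checked(backend, 3) then returns whatever backend$get_matrix(3L) returns, and stops with a descriptive message otherwise.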


Method get_all_matrices()

Get all matrices (loads all from disk if necessary)

Usage
ParquetBackend$get_all_matrices(parallel = NULL, progress = FALSE)
Arguments
parallel

Logical indicating whether to use parallel loading (default: NULL, auto-detect)

progress

Logical indicating whether to show progress (default: FALSE)

Returns

A list of dppMatrix objects


Method get_matrices_parallel()

Load multiple matrices in parallel (batch loading)

Usage
ParquetBackend$get_matrices_parallel(indices, progress = FALSE)
Arguments
indices

Vector of integer indices to load

progress

Logical indicating whether to show progress (default: FALSE)

Returns

A list of dppMatrix objects
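When the full collection is too large to hold in memory, batch loading can be combined with a per-matrix summary. The following sketch assumes that get_matrices_parallel() returns its list in the same order as indices, which this page does not state; map_in_batches and fn are hypothetical names, not part of riemtan.

```r
# Hypothetical helper: apply fn to every matrix, loading batch_size at a time.
map_in_batches <- function(backend, fn, batch_size = 50, progress = FALSE) {
  n <- backend$length()
  if (n == 0) {
    return(list())
  }
  out <- vector("list", n)
  for (start in seq(1, n, by = batch_size)) {
    idx <- start:min(start + batch_size - 1, n)
    # Assumes results come back in the order of idx.
    mats <- backend$get_matrices_parallel(idx, progress = progress)
    out[idx] <- lapply(mats, fn)
  }
  out
}
```

For example, map_in_batches(backend, function(m) sum(diag(as.matrix(m)))) would collect the trace of each matrix while keeping only one batch of matrices referenced at a time.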


Method length()

Get the number of matrices

Usage
ParquetBackend$length()
Returns

Integer count


Method get_dimensions()

Get matrix dimensions

Usage
ParquetBackend$get_dimensions()
Returns

Integer p (matrices are p x p)


Method get_metadata()

Get metadata

Usage
ParquetBackend$get_metadata()
Returns

List containing metadata information


Method clear_cache()

Clear the cache

Usage
ParquetBackend$clear_cache()

Method get_cache_size()

Get current cache size

Usage
ParquetBackend$get_cache_size()
Returns

Integer number of cached matrices


Method clone()

The objects of this class are cloneable with this method.

Usage
ParquetBackend$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.


riemtan documentation built on Nov. 11, 2025, 1:06 a.m.