ParquetFileReader: ParquetFileReader class

ParquetFileReader R Documentation

ParquetFileReader class

Description

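This class reads Parquet files. It exposes the file's schema and dimensions, and reads the whole file, selected row groups, or a single column into Arrow data structures.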

Factory

The ParquetFileReader$create() factory method instantiates the object and takes the following arguments:

  • file: A character file name, raw vector, or Arrow file connection object (e.g. RandomAccessFile).

  • props: Optional ParquetArrowReaderProperties.

  • mmap: Logical: whether to memory-map the file (default TRUE).

  • reader_props: Optional ParquetReaderProperties.

  • ...: Additional arguments, currently ignored.
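The following is a minimal sketch of the factory arguments described above, not a definitive usage pattern. It writes a small temporary Parquet file with write_parquet() so that it is self-contained; the data frame, the temporary path, and the variable names are assumptions made for illustration only.

```r
library(arrow)

# Write a small uncompressed Parquet file to a temporary path (illustrative data).
tmp <- tempfile(fileext = ".parquet")
write_parquet(
  data.frame(x = 1:10, y = letters[1:10]),
  tmp,
  compression = "uncompressed"
)

# Open the file by name, without memory-mapping it.
pq <- ParquetFileReader$create(tmp, mmap = FALSE)
pq$num_rows

# Open the same content from a raw vector holding the file's bytes.
bytes <- readBin(tmp, what = "raw", n = file.size(tmp))
pq_raw <- ParquetFileReader$create(bytes)
pq_raw$num_columns
```

Passing mmap = FALSE is shown only to illustrate the argument; the documented default is TRUE.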

Methods

  • $ReadTable(column_indices): get an arrow::Table from the file. The optional column_indices= argument is a 0-based integer vector indicating which columns to retain.

  • $ReadRowGroup(i, column_indices): get an arrow::Table by reading the ith row group (0-based). The optional column_indices= argument is a 0-based integer vector indicating which columns to retain.

  • $ReadRowGroups(row_groups, column_indices): get an arrow::Table by reading several row groups (0-based integers). The optional column_indices= argument is a 0-based integer vector indicating which columns to retain.

  • $GetSchema(): get the arrow::Schema of the data in the file.

  • $ReadColumn(i): read the ith column (0-based) as a ChunkedArray.
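The methods above can be sketched as follows. This is an illustrative example under stated assumptions: the data frame and variable names are made up, and chunk_size = 2 is passed to write_parquet() so that the temporary file is expected to contain several row groups. Note that all indices are 0-based, unlike ordinary R indexing.

```r
library(arrow)

# Illustrative data written with small row groups (2 rows per group requested).
tmp <- tempfile(fileext = ".parquet")
df <- data.frame(
  id = 1:6,
  value = c(0.5, 1.5, 2.5, 3.5, 4.5, 5.5),
  label = letters[1:6]
)
write_parquet(df, tmp, chunk_size = 2, compression = "uncompressed")

pq <- ParquetFileReader$create(tmp)

# Whole file, keeping only the first and third columns (0-based indices 0 and 2).
tab <- pq$ReadTable(column_indices = c(0L, 2L))
names(tab)

# The second row group only (0-based index 1).
rg <- pq$ReadRowGroup(1L)

# The first and third row groups, restricted to the first column.
rgs <- pq$ReadRowGroups(c(0L, 2L), column_indices = 0L)

# The second column (0-based index 1) as a ChunkedArray.
col <- pq$ReadColumn(1L)

# The schema is available without reading any data.
schema <- pq$GetSchema()
schema$names
```

Reading one row group or one column at a time can be useful when the full table would not fit comfortably in memory.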

Active bindings

  • $num_rows: number of rows.

  • $num_columns: number of columns.

  • $num_row_groups: number of row groups.
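As a rough sketch of how the active bindings combine with the 0-based methods, the example below inspects a file's dimensions and then visits each row group in turn. The temporary file, its contents, and the variable names are assumptions for illustration.

```r
library(arrow)

# Illustrative file: 5 rows written with a requested row-group size of 2.
tmp <- tempfile(fileext = ".parquet")
write_parquet(
  data.frame(x = 1:5, y = c(1.1, 2.2, 3.3, 4.4, 5.5)),
  tmp,
  chunk_size = 2,
  compression = "uncompressed"
)

pq <- ParquetFileReader$create(tmp)

# Dimensions are available from the bindings without reading the data.
dims <- c(
  rows = pq$num_rows,
  columns = pq$num_columns,
  row_groups = pq$num_row_groups
)
dims

# Row-group indices are 0-based, so iterate over 0 .. num_row_groups - 1.
rows_per_group <- vapply(
  seq_len(pq$num_row_groups) - 1L,
  function(i) as.numeric(pq$ReadRowGroup(i)$num_rows),
  numeric(1)
)
rows_per_group
```

The row counts of the individual row groups should add up to $num_rows.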

Examples


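# Locate an example Parquet file shipped with the arrow package
f <- system.file("v0.7.1.parquet", package = "arrow")
pq <- ParquetFileReader$create(f)
# The schema can be inspected without reading the data
pq$GetSchema()
if (codec_is_available("snappy")) {
  # This file has compressed data columns
  tab <- pq$ReadTable()
  tab$schema
}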


arrow documentation built on Nov. 25, 2023, 1:09 a.m.