cash-.hdd: Extracts a single variable from a HDD object

$.hdd    R Documentation

Extracts a single variable from a HDD object

Description

This method extracts a single variable from a hard drive data set (HDD). A built-in safeguard prevents extracting data that is too large to fit into memory. The size limit is set with the function setHdd_extract.cap.

Usage

## S3 method for class 'hdd'
x$name

Arguments

x

A HDD object.

name

The name of the variable to extract. Note that a built-in safeguard prevents importing data that would not fit into memory. The extraction cap is set with the function setHdd_extract.cap.

Details

By default, if the expected size of the variable to extract is greater than the value given by getHdd_extract.cap, an error is raised. For numeric variables, the expected size is exact. For non-numeric data, the expected size is a guess that assumes all non-numeric variables are of the same size, which may lead to an over- or underestimate depending on the case. In any case, if your variable is large and you don't want to change the extraction cap (setHdd_extract.cap), you can still extract the variable with sub-.hdd, for which there is no such protection.
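
As a rough illustration of the size check described above, the following base R sketch estimates the in-memory size of a numeric variable and compares it with a cap. The helper expected_size_mb and the variable cap_mb are hypothetical names that are not part of hdd. The sketch assumes 8 bytes per double value and 1 MB = 10^6 bytes, which may differ from what the package computes internally.

```r
# Hypothetical helper (not part of hdd): expected size, in MB, of a
# double vector with n_rows elements, assuming 8 bytes per value.
expected_size_mb = function(n_rows, bytes_per_value = 8) {
  n_rows * bytes_per_value / 1e6
}

# 11 copies of iris (150 rows each), as in the Examples section
n_rows = 150 * 11
est_mb = expected_size_mb(n_rows)

# Size actually used by such a vector in memory; this also counts
# a small fixed header, so it is slightly larger than the estimate.
actual_mb = as.numeric(object.size(numeric(n_rows))) / 1e6

# Illustrative cap of 5KB, expressed in MB
cap_mb = 0.005
exceeds_cap = est_mb > cap_mb

print(c(estimate = est_mb, actual = actual_mb, cap = cap_mb))
print(exceeds_cap)
```

For non-numeric variables no such simple formula applies, which is why the expected size is only a guess in that case.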

Note that you cannot create variables with $, e.g. base_hdd$x_new <- something. To create variables, use [ instead (see sub-.hdd).

Value

It returns a vector.

Author(s)

Laurent Berge

See Also

See hdd, sub-.hdd and cash-.hdd for the extraction and manipulation of out-of-memory data. To import HDD data sets from text files, see txt2hdd.

See hdd_slice to apply functions to chunks of data (and create HDD objects) and hdd_merge to merge large files.

To create/reshape HDD objects from memory or from other HDD objects, see write_hdd.

To display general information from HDD objects: origin, summary.hdd, print.hdd, dim.hdd and names.hdd.

Examples


# Toy example with iris data
# We first create a hdd dataset with approx. 100KB
hdd_path = tempfile() # => folder where the data will be saved
write_hdd(iris, hdd_path)
for(i in 1:10) write_hdd(iris, hdd_path, add = TRUE)

base_hdd = hdd(hdd_path)
summary(base_hdd) # => 11 files

# we can extract the data from the 11 files with '$':
pl = base_hdd$Sepal.Length

#
# Illustration of the protection mechanism:
#

# By default, when a variable is extracted with '$'
# and its size exceeds the cap (3GB by default),
# a confirmation is needed.
# You can set the cap with setHdd_extract.cap.

# The following asks for confirmation in interactive mode:
setHdd_extract.cap(sizeMB = 0.005) # new cap of 5KB
pl = base_hdd$Sepal.Length

# To extract the variable without changing the cap:
pl = base_hdd[, Sepal.Length] # => no size control is performed

# Resetting the default cap
setHdd_extract.cap()


hdd documentation built on Aug. 25, 2023, 5:19 p.m.