guess_col_types: Guesses the column types of a file


guess_col_types    R Documentation

Guesses the column types of a file

Description

This function guesses the column types of a text file. It returns the column types formatted à la readr.

Usage

guess_col_types(dt_or_path, col_names, n = 10000)

Arguments

dt_or_path

Either a data frame or a path.

col_names

Optional: the vector of names of the columns, if not contained in the file. Must match the number of columns in the file.

n

Number of observations used to make the guess. By default, n = 10000.

Details

The column types are guessed from the first 10,000 rows (this number can be changed with the argument n).

Note that by default, columns found to be integers are imported as double (for want of an integer64 type in readr). Note also that in large data sets, integer-like identifiers can sometimes be longer than 16 digits: in such cases you must import them as character so as not to lose information.
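The loss of information mentioned above can be illustrated with a minimal base R sketch. It does not call guess_col_types; the two identifiers are made-up values chosen only to show why long integer-like identifiers are safer as character than as double.

```r
# Two distinct 17-digit identifiers (hypothetical values), kept as text
ids <- c("12345678901234567", "12345678901234568")

# A double represents every integer exactly only up to 2^53 (about 9.007e15)
limit <- 2^53
lost_increment <- (limit == limit + 1)   # the +1 cannot be stored

# Converted to double, the two identifiers collapse to the same value
ids_num <- as.numeric(ids)
n_distinct_num <- length(unique(ids_num))

# Kept as character, both identifiers survive
n_distinct_chr <- length(unique(ids))
```

Comparing n_distinct_num with n_distinct_chr shows whether distinct identifiers were merged by the conversion, which is the situation where importing the column as character is required.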

Value

It returns a cols object a la readr.

Author(s)

Laurent Berge

See Also

See peek to have a convenient look at the first lines of a text file. See guess_delim to guess the delimiter of a text data set. See guess_col_types to guess the column types of a text data set.

See hdd, sub-.hdd and cash-.hdd for the extraction and manipulation of out of memory data. For importation of HDD data sets from text files: see txt2hdd.

Examples


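# Example with the iris data set
library(hdd)

# write iris to a temporary text file (fwrite comes from data.table)
iris_path = tempfile()
data.table::fwrite(iris, iris_path)

# returns a readr column specification:
guess_col_types(iris_path)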



hdd documentation built on Aug. 25, 2023, 5:19 p.m.