cb_as_col_spec_factors: Construct code to use as 'col_spec' for factors

View source: R/cb.R

cb_as_col_spec_factors    R Documentation

Construct code to use as col_spec for factors

Description

This function examines all character columns in a data.frame (typically one read from a comma-separated text file with a reader function such as readr::read_csv()) and generates a specification suitable for reading those columns from the underlying file as factors.

The key benefit is that it finds all variables that use the same factor levels and groups them together, so editing the col_spec to reorder factor levels or make other changes is straightforward.

The result is returned as a catty vector to provide a more readable output by default.

The output can then be edited before the col_spec is used to read the data in fresh from the CSV file. Assuming the cols() result is assigned to a variable cspec, one might have the following:

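library(readr)
library(zulutils)
# first pass: read the file with default (guessed) column types
d <- read_csv(csv_file)
# print a suggested specification for the character columns
cb_as_col_spec_factors(d)
# edit the output to fit your needs
cspec <- cols(...)
# second pass: re-read the file with the edited specification
d <- read_csv(csv_file, col_types = cspec)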
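As a rough illustration of the editing step, the sketch below shows what a hand-edited specification might look like when two columns share the same factor levels. The column names (q1, q2), the level values, the helper variable lv_rating, and the inline CSV text are all hypothetical, and the exact layout printed by cb_as_col_spec_factors() may differ. Only readr functions are used here.

```r
library(readr)

# Hypothetical CSV content standing in for the real file
csv_text <- "id,q1,q2\n1,low,high\n2,high,medium\n3,medium,low\n"

# Shared levels defined once, in the desired order, and reused for both columns
# (lv_rating is an illustrative name, not something the package defines)
lv_rating <- c("low", "medium", "high")

cspec <- cols(
  id = col_integer(),
  q1 = col_factor(levels = lv_rating),
  q2 = col_factor(levels = lv_rating)
)

# I() marks the string as literal data rather than a file path (readr >= 2.0)
d <- read_csv(I(csv_text), col_types = cspec)

# Both columns now carry the same levels in the chosen order
levels(d$q1)
levels(d$q2)
```

Defining the levels once and reusing them means a reordering only has to be made in one place, which is the kind of edit the grouped output is meant to make easy.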

Usage

cb_as_col_spec_factors(d)

Arguments

d

The data.frame to use when generating the col_spec.

Value

A catty string with column definitions, which is suitable for defining a col_spec, after any needed editing to reorder factors.


torfason/zulutils documentation built on Aug. 21, 2023, 5:46 p.m.