fd_cols: Find all size one candidate functional dependencies of a set...

View source: R/db_tools.R

fd_cols R Documentation

Find all size one candidate functional dependencies of a set of columns.

Description

Returns the names of columns in a data frame that may each be functionally determined by a supplied set of columns (the determinant set). That is, each returned column takes a single value for every unique combination of values in the determinant set.
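The definition above can be illustrated in base R. This is only a minimal sketch of the idea, not the package's implementation: the helper is_determined() and the example data are hypothetical. It relies on the fact that a column is determined by the determinant set exactly when adding it to the determinant columns does not increase the number of unique rows.

```r
# Hypothetical helper (not part of the package): TRUE if `target` takes a
# single value for every unique combination of the `det` columns.
is_determined <- function(df, det, target) {
  nrow(unique(df[c(det, target)])) == nrow(unique(df[det]))
}

# Made-up example data
df <- data.frame(
  id    = c(1, 1, 2, 2, 3),
  name  = c("a", "a", "b", "b", "c"),
  score = c(10, 20, 30, 30, 40)
)

det <- "id"
# Candidate columns are all columns outside the determinant set
candidates <- setdiff(names(df), det)
# Keep only the candidates that pass the check
determined <- Filter(function(col) is_determined(df, det, col), candidates)
determined
```

Here name passes the check because each id has one name, while score fails because id 1 has two different scores.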

Usage

fd_cols(df, ...)

Arguments

df

A data frame.

...

Columns in the determinant set, given either by name (quoted or unquoted) or by integer position.

Details

Any NAs are treated as a distinct value and a warning is given.
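As a rough illustration of what treating NA as a distinct value implies, the base R sketch below uses made-up data and is not the package's own code. Under this reading, a group whose target column contains both a value and an NA counts as having two values, so the column would not be reported as determined.

```r
# Made-up data: within key "x" the value column holds 1 and NA
df_na <- data.frame(
  key   = c("x", "x", "y", "y"),
  value = c(1, NA, 2, 2)
)

# unique() keeps NA as a value of its own, so key "x" has two distinct values
n_values_x <- length(unique(df_na$value[df_na$key == "x"]))

# Same unique-row comparison, applied to the whole data frame:
# three unique (key, value) rows against two unique keys
value_determined <-
  nrow(unique(df_na[c("key", "value")])) == nrow(unique(df_na["key"]))

# anyNA() is one way to detect the situation that calls for a warning
has_na <- anyNA(df_na)
```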

This function is far from optimised and can be slow on large data frames. The run time is approximately of order (number of columns not in the determinant set) x (number of unique rows (groups) in the determinant set). I found that 10,000 groups take about 0.5 seconds per column.
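For a back-of-envelope estimate, the quoted figure can be scaled linearly in both the number of groups and the number of columns. The helper estimate_seconds() below is hypothetical, and the linear scaling is an assumption taken from the stated order of growth; actual timings will depend on the machine and the data.

```r
# Hypothetical helper: scale the quoted timing (0.5 s per column at
# 10,000 groups) linearly in the number of groups and columns.
estimate_seconds <- function(n_groups, n_cols, secs_per_col_at_10k = 0.5) {
  secs_per_col_at_10k * (n_groups / 10000) * n_cols
}

# 20 non-determinant columns
est_10k <- estimate_seconds(10000, 20)  # 0.5 * 1 * 20 = 10 seconds
est_50k <- estimate_seconds(50000, 20)  # 0.5 * 5 * 20 = 50 seconds
```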


jedwards24/edwards documentation built on Sept. 2, 2023, 8:16 a.m.