getCounts: Compute Frequencies for Categorical Variables

View source: R/density_estimates.R

Description

Count, group and arrange categorical values from the most to the least frequent. Optionally, compute the cumulative distribution function (cdf) and suppress entries that occur fewer than min_count times.

Usage

getCounts(tbl, cols, compute_cdf = TRUE, min_count = 1, default = "''",
  con = options("synergetr_con")[[1]],
  sample_max = options("synergetr_sample_max")[[1]])

Arguments

tbl

Table name to inspect

cols

Vector of table fields

compute_cdf

Add cumulative distribution function to result set as a 'cdf' column.

min_count

The minimum number of times a distinct value must appear in the raw data to be included in the frequency count (this parameter excludes rare values from the result entirely).

sample_max

The maximum number of rows to use
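
To illustrate how compute_cdf and min_count might shape the result, here is a minimal base R sketch for a single column of an in-memory data.frame. The helper sketch_counts, its column names (value, count) and the choice to filter rare values before computing the cdf are assumptions made for illustration; they are not taken from the synergetr source.

```r
# Hypothetical helper, NOT part of synergetr: mimics the documented behaviour
# for one column of an in-memory data.frame.
sketch_counts <- function(df, col, compute_cdf = TRUE, min_count = 1) {
  # Frequency of each distinct value (missing values ignored)
  freq <- table(df[[col]], useNA = "no")
  out <- data.frame(
    value = names(freq),
    count = as.integer(freq),
    stringsAsFactors = FALSE
  )
  # Assumption: values seen fewer than min_count times are dropped
  # before the cdf is computed
  out <- out[out$count >= min_count, , drop = FALSE]
  # Most frequent first; ties broken alphabetically
  out <- out[order(-out$count, out$value), , drop = FALSE]
  rownames(out) <- NULL
  if (compute_cdf) {
    # Running share of the retained observations
    out$cdf <- cumsum(out$count) / sum(out$count)
  }
  out
}

df <- data.frame(colour = c("red", "blue", "red", "green", "red", "blue"))
res <- sketch_counts(df, "colour", min_count = 2)
print(res)
```

With min_count = 2 the single "green" entry is excluded, so the cdf in this sketch runs over the five remaining observations only.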

Details

If options("synergetr_con") points to a database connection, the frequencies are computed in the database and tbl should be a character string (e.g. tbl == "schemaname.table_name"). If "synergetr_con" is not set (i.e. equals NULL), the computation is done in R memory using the data.table package, and tbl can be either the actual data.frame or its variable name as a character string.
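
The branching described above can be sketched roughly as follows. The helper resolve_tbl and the example data are hypothetical and only show one way the in-memory case could accept either a data.frame or its variable name; the package's actual logic may differ.

```r
# Read the connection option; NULL means no database connection is configured
con <- getOption("synergetr_con")
mode <- if (is.null(con)) "in-memory" else "database"

# Hypothetical helper, NOT part of synergetr: accepts a data.frame
# or the name of a variable that holds one.
resolve_tbl <- function(tbl, envir = parent.frame()) {
  if (is.character(tbl)) {
    # A character string is treated as a variable name to look up
    tbl <- get(tbl, envir = envir)
  }
  stopifnot(is.data.frame(tbl))
  tbl
}

patients <- data.frame(sex = c("F", "M", "F"))
by_value <- resolve_tbl(patients)
by_name <- resolve_tbl("patients")
same <- identical(by_value, by_name)
```

In the database case, tbl would instead stay a character string such as "schemaname.table_name" and be passed on to the query.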


avirkki/synergetr documentation built on May 18, 2019, 9:16 p.m.