tdQuantile: tdQuantile


Description

Gets column quantiles from a Teradata table. If a JDBC connection object (conn) is provided, it is used. If no JDBC connection is provided, a connection is attempted using the username and password provided. If neither is provided, the function tries to locate a connection object (conn) in the global environment.

If a connection profile (e.g. username and password) is provided, an attempt is made to connect to Teradata, and the connection is closed once the query has run. If a connection object (conn) is provided to the function (or one is found globally), the connection remains open.

Usage

tdQuantile(table = NULL, probs = 0.5, cols = NULL, where = "", ...)

Arguments

table

A string stating the Teradata table name.

probs

Numeric vector of probabilities with values in [0,1]. Defaults to the median (i.e. 0.5).

cols

Columns desired. Defaults to all columns.

where

Condition used to subset the table (e.g. "PERSON_ID mod 2 = 0").

...

Optional connection settings.
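
As a rough illustration of what the probs argument means, the sketch below uses base R's quantile() on a small local vector. It does not call tdQuantile or touch Teradata; the vector x is made up for the example, and the interpolation rule used on the database side may differ from R's default.

```r
# Hypothetical local data standing in for one numeric column.
x <- c(10, 20, 30, 40, 50)

# Each value in probs is a probability in [0, 1]:
# 0 gives the minimum, 0.5 the median, 1 the maximum.
probs <- c(0, 0.5, 1)
q <- quantile(x, probs = probs)
print(q)
```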

Details

This code is CPU intensive, especially for large tables, because it requires the column values to be ordered. Take care when using it, as user limits may prevent the code from running successfully. If CPU or spool limits are reached, a possible workaround is to break the table into smaller subsets and take the percentiles over each subset.

The code is meant for numeric-valued columns. If string columns are provided, the code will still run, but the results will be less interpretable.
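
The subset workaround mentioned above can be sketched locally in base R. This is only an illustration under stated assumptions: the data are simulated, the split by row index modulo 4 stands in for a where condition such as "PERSON_ID mod 4 = 0", and averaging the per-subset quantiles is one possible way to combine them. The combined values approximate the quantiles of the full column; they are not guaranteed to equal them.

```r
# Simulated numeric column (hypothetical data, not read from Teradata).
set.seed(1)
x <- rnorm(10000)
probs <- c(0.25, 0.5, 0.75)

# Assign each row to one of 4 subsets by row index modulo 4.
subset_id <- seq_along(x) %% 4

# Quantiles within each subset: a 3 x 4 matrix (one column per subset).
per_subset <- sapply(split(x, subset_id), quantile, probs = probs)

# Combine by averaging across subsets (an approximation only).
approx_q <- rowMeans(per_subset)

# Exact quantiles over the full vector, for comparison.
exact_q <- quantile(x, probs = probs)
print(rbind(approx = approx_q, exact = exact_q))
```

The approximation works best when each subset is a roughly random slice of the table; subsets that differ systematically (for example, split by date) can give misleading combined values.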

Value

Returns a data.frame containing the specified quantiles of the Teradata table's columns.

See Also

tdConn for connection, tdNames for table names, td for general queries, tdCpu for CPU usage, and tdHead for top rows in table.

Examples

## NOT RUN (will also result in errors due to user restrictions) ##
## Runs a quick query based on connection profile
# tdQuantile("ICDB_PERSON", username=<username>, password=<password>, db="GCA")

## Runs query using a separately established connection. Selects only two columns.
## cols is passed by name, since the second positional argument is probs.
# conn = tdConn(<username>, <password>, db="GCA")
# tdQuantile("ICDB_PERSON", cols=c("PERSON_ID", "INDIV_ID"), conn=conn)

## Uses same connection, but allows code to find it globally. Also subsets on PERSON_ID.
# tdQuantile("ICDB_PERSON", where="PERSON_ID mod 2 = 0")

tranlm/tdR documentation built on May 31, 2019, 7:45 p.m.