dbNBins: Determine number of distinct values of variables in a...

Description

If margin = TRUE, the function returns the number of distinct levels of each variable. For a numeric variable without ties, this number equals the number of rows. If margin = FALSE, the function returns the number of distinct tuples. If binwidth is specified, the number of distinct tuples of the binned data is returned. As the number of variables increases, the number of tuples grows, but it never exceeds the product of the marginal numbers of categories or the number of records in the data table.
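The difference between marginal and joint counts can be sketched in base R on a small in-memory data frame. This is only an illustration under stated assumptions: the data frame `df` and its values are hypothetical, and the code does not reproduce how dbNBins queries the database.

```r
# Toy in-memory stand-in for a database table (hypothetical data).
df <- data.frame(
  G      = c(10, 10, 25, 31, 31, 31),
  SO     = c(5, 5, 40, 40, 62, 62),
  yearID = c(2001, 2001, 2001, 2002, 2002, 2002)
)

# Marginal counts: number of distinct values of each variable.
marginal <- sapply(df, function(x) length(unique(x)))

# Joint count: number of distinct tuples (rows) across all variables.
joint <- nrow(unique(df))

# The joint count is bounded by both the product of the marginal
# counts and the number of records.
upper_bound <- min(prod(marginal), nrow(df))

print(marginal)
print(joint)
print(upper_bound)
```

Here the marginal counts multiply to 18, but only 6 records exist, so the joint count cannot exceed 6.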

Usage

  dbNBins(data, vars = list(), binwidth = -1,
    margin = TRUE)

Arguments

data

dataDB object

vars

list of variable names

binwidth

vector of bin sizes, one for each variable; -1 (the default) uses the minimal binwidth

margin

logical value: if TRUE, the marginal distribution of each variable is considered; if FALSE, the joint distribution
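The effect of binwidth on the counts can be sketched in base R as follows. The helper `bin_values` and its floor-based binning rule are assumptions made for illustration; the page does not state how dbNBins forms its bins, and the -1 (minimal binwidth) setting is not modeled here.

```r
# Hypothetical helper: map each value to the lower edge of its bin.
# Floor-based binning is an assumption, not the package's documented rule.
bin_values <- function(x, width) floor(x / width) * width

# Toy data (hypothetical values).
df <- data.frame(
  G      = c(3, 8, 12, 19, 27),
  yearID = c(2001, 2001, 2002, 2002, 2002)
)

# One bin width per variable, matched to the columns by name.
binwidth <- c(G = 10, yearID = 1)

# Bin every variable with its own width.
binned <- as.data.frame(Map(bin_values, df, binwidth[names(df)]))

# Distinct bins per variable (marginal) and distinct binned tuples (joint).
marginal_bins <- sapply(binned, function(x) length(unique(x)))
joint_bins <- nrow(unique(binned))

print(marginal_bins)
print(joint_bins)
```

Wider bins merge neighbouring values, so both the marginal and the joint counts can only stay the same or shrink compared with the unbinned data.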

Examples

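library(RMySQL)  # assumed: supplies the driver behind dbDriver("MySQL")
# Connect to the baseball database and wrap the Pitching table
connect <- dbConnect(dbDriver("MySQL"), user="2009Expo",
  password="R R0cks", port=3306, dbname="baseball",
  host="headnode.stat.iastate.edu")
pitch <- new("dataDB", co=connect, table="Pitching")
# Distinct values of each variable (margin = TRUE is the default)
dbNBins(pitch, vars=list("G", "SO", "yearID"))
# Same counts after binning: width 10 for G and SO, width 1 for yearID
dbNBins(pitch, vars=list("G", "SO", "yearID"), binwidth=c(10,10,1))
# Distinct tuples of the binned data
dbNBins(pitch, vars=list("G", "SO", "yearID"), binwidth=c(10,10,1), margin=FALSE)
# Distinct (yearID, playerID) tuples at the minimal binwidth
dbNBins(pitch, vars=list("yearID", "playerID"), margin=FALSE)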

heike/dbData documentation built on May 17, 2019, 3:23 p.m.