Description Usage Arguments Value See Also Examples
Compute global correlation between all pairs of numeric columns in table.
Result includes all pairwise combinations of numeric columns in the table, with
optionally limiting columns to those in the parameter include
or/and
excluding columns defined by parameter except
. Limit computation
on the table subset defined with where
. Use output='matrix'
to produce
results in matrix format (compatible with function cor
).
1 2 3 |
channel |
connection object as returned by |
tableName |
database table name |
tableInfo |
pre-built summary of data to use (must have with |
include |
a vector of column names to include. Output never contains attributes other than in the list.
When missing all columns from |
except |
a vector of column names to exclude. Output never contains attributes from the list. |
where |
specifies criteria to satisfy by the table rows before applying
computation. The creteria are expressed in the form of SQL predicates (inside
|
by |
vector of column names to group results by one or more table columns for faceting or alike (optional). |
output |
Default output is a data frame of column pairs with correlation coefficient (melt format).
To return correlation matrix compatible with function |
test |
logical: if TRUE show what would be done, only (similar to parameter |
data frame with columns:
corr pair of 1st and 2d columns "column1:column2"
value computed correlation value
metric1 name of 1st column
metric2 name of 2d column
sign correlation value sign sign(value)
(-1, 0, or 1)
Note that while number of correlations function computes is choose(N, 2)
, where N
is
number of table columns specified, resulting data frame contains twice as many rows by duplicating
each correlation value with swaped column names (1st column to 2d and 2d to 1st positions). This
makes resulting data frame symmetrical with respect to column order in pairs and is necessary to
correctly visualize correlation matrix with createBubblechart
.
createBubblechart
and showData
.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 | if(interactive()){
# initialize connection to Lahman baseball database in Aster
conn = odbcDriverConnect(connection="driver={Aster ODBC Driver};
server=<dbhost>;port=2406;database=<dbname>;uid=<user>;pwd=<pw>")
cormat = computeCorrelations(channel=conn, "pitching_enh", sqlColumns(conn, "pitching_enh"),
include = c('w','l','cg','sho','sv','ipouts','h','er','hr','bb',
'so','baopp','era','whip','ktobb','fip'),
where = "decadeid = 2000", test=FALSE)
# remove duplicate correlation values (no symmetry)
cormat = cormat[cormat$metric1 < cormat$metric2, ]
createBubblechart(cormat, "metric1", "metric2", "value", label=NULL, fill="sign")
# Grouped by columns
cormatByLg = computeCorrelations(channel=conn, "pitching_enh",
include=c('w','sv','h','er','hr','bb','so'),
by=c('lgid','decadeid'),
where = "decadeid >= 1990")
createBubblechart(cormatByLg, "metric1", "metric2", "value",
label=NULL, fill="sign", facet=c('decadeid','lgid'),
title="Correlations by Leagues and Decades",
defaultTheme = theme_wsj(), legendPosition = 'none')
}
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.