View source: R/computeKmeans.R
Random sample of clustered data
computeClusterSample(channel, km, sampleFraction, sampleSize, scaled = FALSE,
  includeId = TRUE, test = FALSE)
channel | connection object as returned by
km | an object of class
sampleFraction | vector with one or more sample fractions to use when sampling the data. Multiple fractions define the sampling fraction for each cluster in kmeans.
sampleSize | vector with sample size (applies only when
scaled | logical: indicates whether original (default) or scaled data is returned.
includeId | logical: indicates whether the sample should include the key attribute identifying each data point.
test | logical: if TRUE, only show what would be done (similar to parameter
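The difference between a single sample fraction and one fraction per cluster can be illustrated with a small in-memory sketch in base R. This is only a local analogy and not the package's implementation, which samples inside the database; the data frame `points`, the helper `sample_by_cluster`, and the rounding rule used to turn a fraction into a row count are all assumptions made for illustration.

```r
# Local, in-memory analogy of per-cluster sampling (hypothetical helper,
# not part of the package and not how the in-database sampling is done).
set.seed(42)

# Hypothetical clustered data: 3 clusters with 100, 200 and 300 points.
points <- data.frame(
  id = 1:600,
  cluster = rep(1:3, times = c(100, 200, 300))
)

sample_by_cluster <- function(data, fractions) {
  clusters <- sort(unique(data$cluster))
  # A single fraction is applied to every cluster.
  if (length(fractions) == 1) {
    fractions <- rep(fractions, length(clusters))
  }
  # Otherwise exactly one fraction per cluster is expected.
  stopifnot(length(fractions) == length(clusters))
  rows <- unlist(lapply(seq_along(clusters), function(i) {
    idx <- which(data$cluster == clusters[i])
    n <- round(length(idx) * fractions[i])  # assumed rounding rule
    idx[sample.int(length(idx), n)]         # sample n row positions without replacement
  }))
  data[rows, , drop = FALSE]
}

uniform <- sample_by_cluster(points, 0.1)               # same fraction everywhere
varied  <- sample_by_cluster(points, c(0.1, 0.2, 0.3))  # one fraction per cluster

print(table(uniform$cluster))
print(table(varied$cluster))
```

With a vector of fractions, larger or more interesting clusters can be sampled more heavily than the rest, which is the use case the per-cluster form of `sampleFraction` appears to address.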
computeClusterSample returns an object of class "toakmeans" (compatible with class "kmeans").
if(interactive()){
# initialize connection to Lahman baseball database in Aster
conn = odbcDriverConnect(connection="driver={Aster ODBC Driver};
server=<dbhost>;port=2406;database=<dbname>;uid=<user>;pwd=<pw>")
km = computeKmeans(conn, "batting", centers=5, iterMax = 25,
                   aggregates = c("COUNT(*) cnt", "AVG(g) avg_g", "AVG(r) avg_r", "AVG(h) avg_h"),
                   id="playerid || '-' || stint || '-' || teamid || '-' || yearid",
                   include=c('g','r','h'), scaledTableName='kmeans_test_scaled',
                   centroidTableName='kmeans_test_centroids',
                   where="yearid > 2000")
km = computeClusterSample(conn, km, 0.01)
km
createClusterPairsPlot(km, title="Batters Clustered by G, H, R", ticks=FALSE)
# per cluster sample fractions
km = computeClusterSample(conn, km, c(0.01, 0.02, 0.03, 0.02, 0.01))
}