DSD_Memory | R Documentation |
This class provides a data stream interface for data stored in memory as matrix-like objects (including data frames). All or a portion of the stored data can be replayed several times.
DSD_Memory( x, n, k = NA, outofpoints = c("warn", "ignore", "stop"), loop = FALSE, description = NULL )
x |
A matrix-like object containing the data. If |
n |
Number of points used if |
k |
Optional: The known number of clusters in the data |
outofpoints |
Action taken if less than
|
loop |
Should the stream start over when it reaches the end? |
description |
character string with a description. |
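The effect of the loop argument can be illustrated with a minimal sketch. It assumes the stream package is installed; the five-row data frame and the variable names are made up for illustration only.

```r
library(stream)

# A tiny in-memory data set with 5 rows (illustrative values only).
df <- data.frame(x = 1:5, y = (1:5) / 10)

# With loop = TRUE the stream starts over at the first row once the end is reached.
looped <- DSD_Memory(df, loop = TRUE)

# Request more points than are stored; the stream wraps around to serve them.
pts <- get_points(looped, n = 8)
nrow(pts)
```

With the default loop = FALSE, the same request would run out of points and the outofpoints action would apply instead.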
In addition to regular data.frames, other matrix-like objects that provide subsetting with the bracket operator can be used. This includes ffdf (large data.frames stored on disk) from package ff and big.matrix from bigmemory.
Reading the whole stream
By using n = -1 in get_points(), all remaining points are returned. Directly after the stream is created or reset with reset_stream(), this is the whole stream.
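As a rough sketch of how n = -1 interacts with the current stream position, the following assumes the stream package is installed and uses a made-up 20-row data frame.

```r
library(stream)

# Illustrative data: 20 random points in two dimensions.
df <- data.frame(x = runif(20), y = runif(20))
stream <- DSD_Memory(df)

# Consume the first 5 points, then ask for everything that is left.
first <- get_points(stream, n = 5)
rest <- get_points(stream, n = -1)
nrow(rest)

# After rewinding, n = -1 returns the complete stored data set.
reset_stream(stream)
all_points <- get_points(stream, n = -1)
nrow(all_points)
```

Calling reset_stream() first is what makes n = -1 return every stored point rather than only the unread tail.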
Returns a DSD_Memory object (subclass of DSD_R, DSD).
Michael Hahsler
Other DSD: DSD_BarsAndGaussians(), DSD_Benchmark(), DSD_Cubes(), DSD_Gaussians(), DSD_MG(), DSD_Mixture(), DSD_NULL(), DSD_ReadDB(), DSD_ReadStream(), DSD_Target(), DSD_UniformNoise(), DSD_mlbenchData(), DSD_mlbenchGenerator(), DSD(), DSF(), animate_data(), close_stream(), get_points(), plot.DSD(), reset_stream()
library(stream)

# Example 1: store 1000 points from a stream
stream <- DSD_Gaussians(k = 3, d = 2)
replayer <- DSD_Memory(stream, k = 3, n = 1000)
replayer
plot(replayer)

# create 2 clusterers that use different algorithms
dsc1 <- DSC_DBSTREAM(r = 0.1)
dsc2 <- DSC_DStream(gridsize = 0.1, Cm = 1.5)

# cluster the same data in both DSC objects
reset_stream(replayer) # reset the replayer to the first position
update(dsc1, replayer, 500)
reset_stream(replayer)
update(dsc2, replayer, 500)

# plot the resulting clusterings
reset_stream(replayer)
plot(dsc1, replayer, main = "DBSTREAM")
reset_stream(replayer)
plot(dsc2, replayer, main = "D-Stream")

# Example 2: use a data.frame to create a stream (3rd col. contains the assignment)
df <- data.frame(x = runif(100), y = runif(100),
  .class = sample(1:3, 100, replace = TRUE))

# add some outliers; outliers get no cluster assignment
out <- runif(100) > .95
df[['.outlier']] <- out
df[['.class']][out] <- NA
head(df)

stream <- DSD_Memory(df)
stream

reset_stream(stream)
get_points(stream, n = 5)

# get the remaining points
rest <- get_points(stream, n = -1)
nrow(rest)

# plot all available points with n = -1
reset_stream(stream)
plot(stream, n = -1)