RC.read.table reads the contents of a column family into a data
frame.
RC.write.table writes the contents of a data frame into a
column family.
connection handle as obtained from
column family name (string)
data frame - it must have both row and column names
Cassandra is a key/value store with dynamic columns, so tables are not
the native format. Row names are used as keys and columns are treated
RC.read.table is really just a wrapper for
RC.get.range.slices(conn, c.family, fixed=TRUE).
RC.write.table uses the same facility as
RC.mutate but without actually creating the mutation
object on the R side.
Note that all updates in Cassandra are "upserts", i.e.,
RC.write.table updates any existing row key/column name
combinations or creates new ones where not present (insert). Additional
columns (or even keys) may still exist in the column family and they
will not be touched.
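The upsert behaviour described above can be illustrated with a small in-memory model. This is only a sketch in plain R, not RCassandra code: the store is assumed to be a named list mapping each row key to a named character vector of columns, and upsert_table is a hypothetical helper introduced for illustration.

```r
# In-memory stand-in for a column family (an assumption of this sketch):
# a named list mapping each row key to a named character vector of columns.
upsert_table <- function(store, df) {
  stopifnot(!is.null(row.names(df)), !is.null(names(df)))
  for (key in row.names(df)) {
    # Start from the existing columns of this key, or from nothing.
    row <- if (is.null(store[[key]])) character(0) else store[[key]]
    for (col in names(df)) {
      # Overwrite an existing key/column cell, or create it if absent.
      row[[col]] <- as.character(df[key, col])
    }
    store[[key]] <- row
  }
  store
}

# Existing contents: key "a" has an extra column, key "z" is not in df.
store <- list(a = c(x = "1", extra = "keep"), z = c(x = "9"))

df <- data.frame(x = c("10", "20"), y = c("p", "q"),
                 row.names = c("a", "b"), stringsAsFactors = FALSE)

store <- upsert_table(store, df)
```

After the call, key "a" has its x value replaced and gains y while keeping extra, key "b" is newly created, and key "z" is left untouched.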
RC.read.table creates a data frame from all columns that are
ever encountered in at least one key. All other values are filled with
RC.read.table returns the resulting data frame
IMPORTANT: Cassandra does NOT preserve order of keys and
columns. Internally, keys are ordered by their hash value and columns
are ordered lexicographically (treated as bytes). However, due to the
fact that columns are dynamic the order of columns will vary if keys
have different columns, because columns are added to the data frame in
the sequence they are encountered as the keys are loaded. You may want
to use df <- df[order(as.integer(row.names(df))),] on the
result of RC.read.table for tables with automatic row names to
obtain the original order of rows.
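The reordering idiom can be tried without a Cassandra server. The sketch below scrambles a local data frame by hand to stand in for the arbitrary key order; the permutation and the variable names are made up for illustration. It adds drop = FALSE, which the one-liner above does not need for multi-column tables but which keeps a single-column result a data frame.

```r
# Data frame with automatic row names ("1", "2", ..., "12").
df <- data.frame(value = letters[1:12], stringsAsFactors = FALSE)

# Simulate rows coming back in an arbitrary (hash-like) order.
scrambled <- df[c(10, 3, 12, 1, 7, 2, 11, 5, 9, 4, 8, 6), , drop = FALSE]

# Sort numerically on the row names to recover the original order.
restored <- scrambled[order(as.integer(row.names(scrambled))), , drop = FALSE]

# Sorting the row names as strings is not enough: "10" sorts before "2".
by_string <- scrambled[order(row.names(scrambled)), , drop = FALSE]
```

The as.integer conversion matters because row names are character strings, so a plain sort would compare them lexicographically.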
RC.read.table is more efficient than
RC.get.range.slices because it can store columns into
vectors and can pre-allocate the whole structure in advance.
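The idea of collecting columns in the order they are encountered and storing values into pre-allocated vectors can be sketched in plain R as follows. This is not the package's actual implementation: the input format (a list of named character vectors, one per key) is hypothetical, and the use of NA as the placeholder for cells a key does not have is an assumption of this sketch.

```r
# Hypothetical input: one named character vector of columns per row key.
rows <- list(
  k1 = c(a = "1", b = "x"),
  k2 = c(b = "y", c = "2.5"),
  k3 = c(a = "3")
)

# Column names in the order they are first encountered across keys.
cols <- unique(unlist(lapply(rows, names), use.names = FALSE))

# Pre-allocate one full-length vector per column. NA is this sketch's
# placeholder for cells that a key does not have (an assumption).
out <- lapply(cols, function(col) rep(NA_character_, length(rows)))
names(out) <- cols

# Fill cells in place, one key at a time.
for (i in seq_along(rows)) {
  for (col in names(rows[[i]])) {
    out[[col]][i] <- rows[[i]][[col]]
  }
}

df <- data.frame(out, row.names = names(rows), stringsAsFactors = FALSE)
```

Because every vector has its final length before any value is stored, no column has to grow while the keys are processed.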
Note that the current implementation of tables (RC.write.table)
supports only string-based representation of columns and values
("UTF8Type", "AsciiType" or similar).