GroupedSolrFrame-class: GroupedSolrFrame

GroupedSolrFrame-class    R Documentation

GroupedSolrFrame

Description

The GroupedSolrFrame is a highly experimental extension of SolrFrame that models each column as a list, formed by splitting the original vector by a common set of grouping factors.

Details

A GroupedSolrFrame should more or less behave analogously to a data frame where every column is split by a common grouping. Unlike SolrFrame, columns are always extracted lazily. Typical usage is to construct a GroupedSolrFrame by calling group on a SolrFrame, and then to extract columns (as promises) and aggregate them (by e.g. calling mean).
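The grouped-column model can be illustrated without a Solr core. The sketch below is only a base R analogy on the built-in mtcars data, not rsolr code: it evaluates eagerly, whereas a GroupedSolrFrame hands back promises, and the choice of cyl and gear as grouping factors is purely illustrative.

```r
# Base R analogy only: no rsolr calls, no Solr core required.
df <- mtcars[, c("mpg", "cyl", "gear")]

# Split one column by a common set of grouping factors (cyl and gear).
# drop = TRUE discards factor combinations that have no rows.
mpg_by_group <- split(df$mpg, list(df$cyl, df$gear), drop = TRUE)

# The "column" is now a list holding one numeric vector per group.
str(mpg_by_group[1:2])

# Aggregating the grouped column yields one value per group.
mpg_means <- vapply(mpg_by_group, mean, numeric(1))
mpg_means
```

Every column split this way shares the same group names, which is what lets the grouped columns line up as rows of a single frame.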

Functions that group the data, such as group and aggregate, simply add to the existing grouping. To clear the grouping, call ungroup or just coerce to a SolrFrame or SolrList.
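As a rough base R analogy (again not rsolr code, with cyl and gear chosen only for illustration), adding a grouping refines the existing one rather than replacing it:

```r
# Base R analogy only: no rsolr calls, no Solr core required.
df <- mtcars[, c("mpg", "cyl", "gear")]

# First grouping: by cyl alone.
g1 <- factor(df$cyl)

# A later grouping by gear is combined with g1, so the effective
# grouping is the set of cyl-by-gear combinations that occur.
g2 <- interaction(g1, df$gear, drop = TRUE)

nlevels(g1)  # number of groups after the first grouping
nlevels(g2)  # number of groups after adding the second

# Clearing the grouping corresponds to discarding the factors
# and working with df directly.
```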

Accessors

GroupedSolrFrame inherits much of its functionality from SolrFrame, so here we outline only the concerns specific to grouped data.

  • ndoc(x): Gets the number of documents per group.

  • rownames(x): Forms unique group identifiers by concatenating the grouping factor values.

  • x[i, j] <- value: Inserts value into the Solr core, where value is a data.frame of lists, or just a list (representing a single column). Preferably, i is a promise, because we need the IDs of the selected documents in order to perform the atomic update, and the promise lets us avoid downloading all of the IDs. Otherwise, if i is atomic, it indexes into the groups. If i is a list, its names are matched to the group names, and its elements index into the matching groups. The list does not need to be named if its elements are character vectors (and thus represent document IDs).

  • x[i, j, drop=FALSE]: Extracts data from x, as usual, but see the entry immediately above this one for the expectations of i. Try to make i a promise, so that we do not need to download the IDs and then serialize them into a query, which is subject to length limits.
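The accessor semantics above can be mimicked in base R. This is a hedged analogy rather than rsolr code: row positions stand in for document IDs, the group names come from split() (which joins factor values with "."; the separator rsolr uses may differ), and the index list i is a made-up example.

```r
# Base R analogy only: no rsolr calls, no Solr core required.
df <- mtcars[, c("mpg", "cyl", "gear")]

# Row positions of df, grouped by cyl and gear.
groups <- split(seq_len(nrow(df)), list(df$cyl, df$gear), drop = TRUE)

# Analogue of ndoc(x): number of rows (documents) in each group.
lengths(groups)

# Analogue of rownames(x): identifiers built from the factor values.
names(groups)

# Analogue of a named list i: the names select groups and the
# elements index within the matching group.
i <- list("4.4" = 1:2, "8.3" = 3L)
selected <- Map(function(rows, k) df[rows[k], , drop = FALSE],
                groups[names(i)], i)
selected
```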

Extended API

Most of the typical data frame accessors and data manipulation functions will work analogously on GroupedSolrFrame (see Details). Below, we list some of the non-standard methods that might be seen as an extension of the data frame API.

  • heads(x, n), tails(x, n), windows(x, start, end): Perform head, tail or window on each group separately, returning a data.frame with grouped (list) columns.

  • ngroup(x): The number of groups, i.e., the number of rows.
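A minimal base R sketch of the per-group head, tail and window idea follows. It is an analogy only: it does not call heads, tails or windows from rsolr, and the result layout (a data.frame with list columns and one row per group) is an assumption modeled on the description above.

```r
# Base R analogy only: no rsolr calls, no Solr core required.
df <- mtcars[, c("mpg", "cyl")]
mpg_by_cyl <- split(df$mpg, df$cyl)

# Apply head, tail and a positional window to each group separately.
first2 <- lapply(mpg_by_cyl, head, n = 2)
last2  <- lapply(mpg_by_cyl, tail, n = 2)
win    <- lapply(mpg_by_cyl, function(v) v[2:3])  # elements 2 through 3

# Collect the results in a data.frame with one row per group and
# list columns holding the per-group vectors.
res <- data.frame(row.names = names(mpg_by_cyl))
res$head   <- first2
res$tail   <- last2
res$window <- win

nrow(res)  # analogue of ngroup(x): one row per group
```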

Author(s)

Michael Lawrence


lawremi/rsolr documentation built on May 28, 2022, 6:17 a.m.