collect: Collect information


View source: R/coding.R

Description

Methods for collecting information from list-like objects into a matrix or data frame or for re-assigning values to columns in a matrix.

Usage

collect(x, what, ...)

## S3 method for class 'list'
collect(x,
  what = c("counts", "occurrences", "values", "elements", "datasets", "rows"),
  min.cov = 1L, keep.unnamed = FALSE, dataframe = FALSE,
  optional = TRUE,
  stringsAsFactors = default.stringsAsFactors(), ...)

## S3 method for class 'matrix'
collect(x, what = c("columns", "rows"),
  empty = "?", ...)

Arguments

x

List or matrix.

what

Character scalar indicating how to collect information. The following values are supported by the list method:

counts

For all non-list elements of x, count their occurrences.

occurrences

Like ‘counts’, but only indicate presence or absence.

values

Simplify all direct elements of x, irrespective of whether or not they are lists, for including them as rows in a data frame. Their names determine the columns. See keep.unnamed for the action in the case of missing names.

elements

Like ‘values’, but collect only the non-list elements of x, i.e. flatten x in the first step.

datasets

Convert all elements to data frames or matrices, then merge them using row and column names. In case of conflict, the last ones win. Here, the behaviour of other arguments is special if all elements of x are atomic. See below.

rows

Like ‘datasets’, but all rows are kept. This is like rbind from the base package, but it also adds missing columns where necessary.

The matrix method currently supports only ‘columns’, which reassigns the values to the columns based on the majority of their occurrences, and ‘rows’, which does the same for the rows. This can be used to clean up messy data.
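The difference between ‘counts’ and ‘occurrences’ in the list method can be illustrated with a minimal base R sketch. It does not call collect and is not the package's implementation; the input list and the variable names are hypothetical and chosen only for illustration.

```r
# Hypothetical input: a named list whose elements are atomic vectors.
x <- list(X = c("a", "b", "a"), Y = c("b", "c"))

# All distinct values become the columns of the result.
keys <- sort(unique(unlist(x)))

# 'counts'-like result: how often each value occurs in each element of x.
counts <- t(vapply(x, function(v) tabulate(match(v, keys), nbins = length(keys)),
  integer(length(keys))))
dimnames(counts) <- list(names(x), keys)

# 'occurrences'-like result: only presence (1) or absence (0).
occurrences <- (counts > 0L) * 1L

counts
occurrences
```

Both results have one row per element of x and one column per distinct value; only the cell contents differ.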

min.cov

Numeric scalar indicating the minimal coverage required in the resulting presence-absence matrix. Columns with fewer non-zero entries are removed.
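As a rough illustration of the coverage filter, the following base R sketch removes columns with too few non-zero entries. It assumes that coverage means the number of non-zero entries per column, as described above; the matrix pa and the variable names are hypothetical, and the code is not taken from the package.

```r
# Hypothetical presence-absence matrix (rows = elements of x, columns = values).
pa <- matrix(c(1L, 1L, 0L,
               1L, 0L, 0L), nrow = 2, byrow = TRUE,
  dimnames = list(c("X", "Y"), c("a", "b", "c")))

min.cov <- 2L

# Keep only the columns whose number of non-zero entries reaches min.cov.
keep <- colSums(pa != 0L) >= min.cov
filtered <- pa[, keep, drop = FALSE]

filtered
```

Using drop = FALSE keeps the result a matrix even if only a single column survives.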

keep.unnamed

Logical scalar indicating whether names should be inserted for elements of x that lack them. If NA, they are skipped, but with a warning; if FALSE, they are skipped silently. This only has an effect in conjunction with the last three values of what. If ‘datasets’ is chosen, it usually has an effect only if all elements of x are atomic.

dataframe

Logical scalar indicating whether a data frame should be produced instead of a matrix.

optional

See as.data.frame from the base package.

stringsAsFactors

See as.data.frame from the base package.

empty

Character scalar used as intermediary placeholder for empty and missing values.

...

Optional arguments passed to and from other methods (if requested, to as.data.frame).

Details

The list method of flatten is based on https://stackoverflow.com/questions/8139677/ with some slight improvements.
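For readers unfamiliar with the idea, flattening a nested list can be sketched recursively in base R. The helper flatten_sketch below is hypothetical; it is neither the pkgutils implementation nor the code from the linked answer, only a minimal illustration of turning a nested list into a non-nested one.

```r
# flatten_sketch() is a hypothetical helper, not the pkgutils implementation.
flatten_sketch <- function(x) {
  # A non-list element is wrapped so that it can be concatenated later.
  if (!is.list(x))
    return(list(x))
  # An empty list stays an empty list.
  if (!length(x))
    return(list())
  # Recurse into each element and concatenate the resulting flat lists.
  do.call(c, lapply(x, flatten_sketch))
}

nested <- list(A = 1:3, B = 7L, C = list("c1", 1:3))
flat <- flatten_sketch(nested)
str(flat)
```

The result contains the same atomic vectors as the input, but none of its elements is itself a list.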

Value

The list method of flatten returns a non-nested list. The collect methods yield a data frame or a matrix.

See Also

base::unlist base::as.data.frame base::rbind

Other coding-functions: L, LL, assert, case, check, contains, flatten, listing, map_names, map_values, must, set, sql, unnest

Examples

## collect()
x <- list(X = list(A = 1:3, B = 7L, C = list('c1', 1:3)),
  Y = list(A = 1:3, 11, B = -1L, D = "?"))

## collect values into a data frame or matrix
(got <- collect(x, "values", dataframe = TRUE))
stopifnot(LETTERS[1:4] == colnames(got))
stopifnot(names(x) == rownames(got))
stopifnot(is.list(got$A), is.integer(got$B), is.list(got$C),
  is.factor(got$D))
stopifnot(!is.na(got$A), !is.na(got$B), anyNA(got$C), anyNA(got$D))
# include the unnamed ones
got <- collect(x, "values", dataframe = TRUE, keep.unnamed = TRUE)
stopifnot(dim(got) == c(2, 5))
# simplify to matrix
(got <- collect(x, "values", dataframe = FALSE))
stopifnot(is.matrix(got), mode(got) == "list")

## collect elements into a data frame or matrix
(got <- collect(x, "elements", dataframe = TRUE))
stopifnot(dim(got) == c(2, 9), names(x) == rownames(got),
  is.data.frame(got))
(got <- collect(x, "elements", dataframe = FALSE))
stopifnot(dim(got) == c(2, 9), names(x) == rownames(got),
  !is.data.frame(got))

## count or just note occurrences
(got <- collect(x, "counts", dataframe = FALSE))
stopifnot(dim(got) == c(2, 8), rownames(got) == names(x),
  setequal(colnames(got), unlist(x)), any(got > 1))
(got <- collect(x, "occurrences", dataframe = FALSE))
stopifnot(dim(got) == c(2, 8), rownames(got) == names(x),
  setequal(colnames(got), unlist(x)), !any(got > 1))

## convert to data frames and insert everything in a single one
(got <- collect(x, "datasets", optional = FALSE, dataframe = TRUE))
stopifnot(dim(got) == c(3, 6), is.data.frame(got))

## a more useful application is to merge matrices
m1 <- matrix(1:4, ncol = 2, dimnames = list(c("A", "B"), c("x", "y")))
m2 <- matrix(1:4, ncol = 2, dimnames = list(c("C", "B"), c("x", "z")))
(got <- collect(list(m1, m2), "datasets"))
# values missing in some matrix yield NA
stopifnot(dim(got) == c(3, 3), anyNA(got))

pkgutils documentation built on May 2, 2019, 5:49 p.m.
