expand-methods: Unlist the list-like columns of a DataFrame object

Description

expand transforms a DataFrame object into a new DataFrame object where the columns specified by the user are unlisted. The transformed DataFrame object has the same colnames as the original but typically more rows.

Usage

## S4 method for signature 'DataFrame'
expand(x, colnames, keepEmptyRows = FALSE, recursive = TRUE)

Arguments

x

A DataFrame object with list-like columns or a Vector object with list-like metadata columns (i.e. with list-like columns in mcols(x)).

colnames

A character or numeric vector containing the names or indices of the list-like columns to unlist. The order in which columns are unlisted is controlled by the column order in this vector. This defaults to all of the recursive (list-like) columns in x.

keepEmptyRows

A logical indicating whether rows containing empty list elements in the columns specified by colnames should be retained or dropped. When TRUE, empty list elements are replaced with NA and all rows are kept. When FALSE, rows with empty list elements in those columns are dropped.
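
As a minimal sketch of the difference, the toy DataFrame below (the names toy, tags and id are made up for illustration) has one empty list element in its second row. It assumes IRanges is installed and attaches S4Vectors, as in the Examples section.

```r
library(IRanges)  # attaches S4Vectors, which provides DataFrame() and expand()

# Hypothetical toy data: the second row holds an empty list element.
toy <- DataFrame(tags = CharacterList("a", character(0), c("b", "c")),
                 id = 1:3)

# keepEmptyRows=TRUE: the empty element becomes NA and row 2 is kept.
kept <- expand(toy, colnames = "tags", keepEmptyRows = TRUE)
nrow(kept)        # 4 rows: 1 + 1 (the NA placeholder) + 2
is.na(kept$tags)  # only the second entry is NA

# keepEmptyRows=FALSE: row 2 disappears from the result.
dropped <- expand(toy, colnames = "tags", keepEmptyRows = FALSE)
nrow(dropped)     # 3 rows: 1 + 0 + 2
dropped$id        # id 2 no longer appears
```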

recursive

If TRUE, expand each column in turn, so that each original row yields the Cartesian product of its list elements across the expanded columns. If FALSE, expand all of the columns in parallel, which requires that they all share the same skeleton (that is, the same number of elements in each row).
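
The row counts implied by the two modes can be checked on a small case. This is a sketch only: toy, x, y and id are hypothetical names, and x and y are built so that they share the same skeleton (element lengths 2 and 1).

```r
library(IRanges)  # attaches S4Vectors, which provides DataFrame() and expand()

# Hypothetical toy data: x and y have the same element lengths (2, 1).
x <- CharacterList(c("a", "b"), "c")
y <- IntegerList(1:2, 3L)
toy <- DataFrame(x = x, y = y, id = 1:2)

# recursive=TRUE: row i contributes length(x[[i]]) * length(y[[i]]) rows.
crossed <- expand(toy, colnames = c("x", "y"), recursive = TRUE)
nrow(crossed)                            # 5 rows: 2*2 + 1*1
sum(elementNROWS(x) * elementNROWS(y))   # same count, computed directly

# recursive=FALSE: x and y are unlisted side by side, one row per element.
paired <- expand(toy, colnames = c("x", "y"), recursive = FALSE)
nrow(paired)                             # 3 rows: 2 + 1
sum(elementNROWS(x))                     # same count, computed directly
```

The parallel form is the natural choice when the columns are element-wise aligned, as with aa and dd in the Examples section, where dd is built from aa with relist().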

Value

A DataFrame object that has been expanded row-wise to match the length of the unlisted columns.

See Also

Examples

library(IRanges)
aa <- CharacterList("a", paste0("d", 1:2), paste0("b", 1:3), c(), "c")
bb <- CharacterList(paste0("sna", 1:2),"foo", paste0("bar",1:3),c(),"hica")
df <- DataFrame(aa=aa, bb=bb, cc=11:15)

## Expand by all list-like columns (aa, bb), dropping rows with empty
## list elements:
expand(df)

## Expand the aa column only:
expand(df, colnames="aa", keepEmptyRows=TRUE)
expand(df, colnames="aa", keepEmptyRows=FALSE)

## Expand the aa and then the bb column:
expand(df, colnames=c("aa","bb"), keepEmptyRows=TRUE)
expand(df, colnames=c("aa","bb"), keepEmptyRows=FALSE)

## Expand the aa and dd columns in parallel:
df$dd <- relist(seq_along(unlist(aa)), aa)
expand(df, colnames=c("aa","dd"), recursive=FALSE)

Example output

Loading required package: stats4
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, sd, var, xtabs

The following objects are masked from 'package:base':

    anyDuplicated, append, as.data.frame, basename, cbind, colnames,
    dirname, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
    grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget,
    order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
    rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply,
    union, unique, unsplit, which.max, which.min


Attaching package: 'S4Vectors'

The following object is masked from 'package:base':

    expand.grid

DataFrame with 14 rows and 3 columns
             aa          bb        cc
    <character> <character> <integer>
1             a        sna1        11
2             a        sna2        11
3            d1         foo        12
4            d2         foo        12
5            b1        bar1        13
...         ...         ...       ...
10           b2        bar3        13
11           b3        bar1        13
12           b3        bar2        13
13           b3        bar3        13
14            c        hica        15
DataFrame with 8 rows and 3 columns
           aa              bb        cc
  <character> <CharacterList> <integer>
1           a       sna1,sna2        11
2          d1             foo        12
3          d2             foo        12
4          b1  bar1,bar2,bar3        13
5          b2  bar1,bar2,bar3        13
6          b3  bar1,bar2,bar3        13
7          NA                        14
8           c            hica        15
DataFrame with 7 rows and 3 columns
           aa              bb        cc
  <character> <CharacterList> <integer>
1           a       sna1,sna2        11
2          d1             foo        12
3          d2             foo        12
4          b1  bar1,bar2,bar3        13
5          b2  bar1,bar2,bar3        13
6          b3  bar1,bar2,bar3        13
7           c            hica        15
DataFrame with 15 rows and 3 columns
             aa          bb        cc
    <character> <character> <integer>
1             a        sna1        11
2             a        sna2        11
3            d1         foo        12
4            d2         foo        12
5            b1        bar1        13
...         ...         ...       ...
11           b3        bar1        13
12           b3        bar2        13
13           b3        bar3        13
14           NA          NA        14
15            c        hica        15
DataFrame with 14 rows and 3 columns
             aa          bb        cc
    <character> <character> <integer>
1             a        sna1        11
2             a        sna2        11
3            d1         foo        12
4            d2         foo        12
5            b1        bar1        13
...         ...         ...       ...
10           b2        bar3        13
11           b3        bar1        13
12           b3        bar2        13
13           b3        bar3        13
14            c        hica        15
DataFrame with 7 rows and 4 columns
           aa              bb        cc        dd
  <character> <CharacterList> <integer> <integer>
1           a       sna1,sna2        11         1
2          d1             foo        12         2
3          d2             foo        12         3
4          b1  bar1,bar2,bar3        13         4
5          b2  bar1,bar2,bar3        13         5
6          b3  bar1,bar2,bar3        13         6
7           c            hica        15         7

S4Vectors documentation built on Dec. 11, 2020, 2:02 a.m.