reduce-data.frame-method: Reduce a data.frame


Description

Reduce a data.frame so that the (primary) key column contains only unique entries. For each key, the values of the other columns are combined into a single row/observation as separated strings (semicolon-separated by default).

Usage

## S4 method for signature 'data.frame'
reduce(x, key, sep = ";")

Arguments

x

A data.frame.

key

The column name (currently only one is supported) to be used as primary key.

sep

The separator used when collapsing values. Default is ";".

Details

An important side effect of reducing a 'data.frame' is that, as soon as one observation is transformed, all columns other than the key are converted to characters when they are collapsed to semicolon-separated values (even if only one value is present).
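The type conversion described above can be illustrated with a minimal base R sketch. The helper `reduce_sketch` is hypothetical and is not the MSnbase implementation; it only assumes that values are collapsed with `paste()` per unique key, and that the input is returned untouched when no key is duplicated (as the last example on this page suggests).

```r
## Hypothetical helper for illustration only, NOT the MSnbase code:
## collapse every non-key column per unique key value with paste().
reduce_sketch <- function(x, key, sep = ";") {
    stopifnot(is.data.frame(x), length(key) == 1, key %in% names(x))
    ## No duplicated key: nothing is collapsed, column classes are kept
    if (!anyDuplicated(x[[key]]))
        return(x)
    keys <- unique(x[[key]])
    ## Keep the first row of each key; the key column keeps its class
    res <- x[match(keys, x[[key]]), key, drop = FALSE]
    for (nm in setdiff(names(x), key)) {
        ## paste() always returns character, hence the class change,
        ## even for keys that have a single value
        res[[nm]] <- vapply(keys, function(k)
            paste(x[[nm]][x[[key]] == k], collapse = sep),
            character(1), USE.NAMES = FALSE)
    }
    rownames(res) <- NULL
    ## Restore the original column order
    res[, names(x), drop = FALSE]
}

dfr <- data.frame(A = c(1, 1, 2),
                  B = c("x", "x", "z"),
                  C = LETTERS[1:3],
                  stringsAsFactors = FALSE)
## A becomes character because it is no longer the key
sapply(reduce_sketch(dfr, key = "B"), class)
```

Because `paste()` is applied to whole columns, a numeric column such as A loses its class for every row once any key is duplicated, which is the behaviour the paragraph above warns about.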

Value

A reduced data.frame.

Author(s)

Laurent Gatto

Examples

dfr <- data.frame(A = c(1, 1, 2),
                  B = c("x", "x", "z"),
                  C = LETTERS[1:3])
dfr
dfr2 <- reduce(dfr, key = "A")
dfr2
## column A used as key is still num
str(dfr2)
dfr3 <- reduce(dfr, key = "B")
dfr3
## A is converted to chr; B, used as key, keeps its original class
str(dfr3)
dfr4 <- data.frame(A = 1:3,
                   B = LETTERS[1:3],
                   C = c(TRUE, FALSE, NA))
## No effect of reducing, column classes are maintained
str(reduce(dfr4, key = "B"))

Example output

Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:parallel':

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
    clusterExport, clusterMap, parApply, parCapply, parLapply,
    parLapplyLB, parRapply, parSapply, parSapplyLB

The following objects are masked from 'package:stats':

    IQR, mad, sd, var, xtabs

The following objects are masked from 'package:base':

    anyDuplicated, append, as.data.frame, basename, cbind, colnames,
    dirname, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
    grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget,
    order, paste, pmax, pmax.int, pmin, pmin.int, Position, rank,
    rbind, Reduce, rownames, sapply, setdiff, sort, table, tapply,
    union, unique, unsplit, which.max, which.min

Loading required package: Biobase
Welcome to Bioconductor

    Vignettes contain introductory material; view with
    'browseVignettes()'. To cite Bioconductor, see
    'citation("Biobase")', and for packages 'citation("pkgname")'.

Loading required package: mzR
Loading required package: Rcpp
Loading required package: S4Vectors
Loading required package: stats4

Attaching package: 'S4Vectors'

The following object is masked from 'package:base':

    expand.grid

Loading required package: ProtGenerics

Attaching package: 'ProtGenerics'

The following object is masked from 'package:stats':

    smooth


This is MSnbase version 2.16.0 
  Visit https://lgatto.github.io/MSnbase/ to get started.


Attaching package: 'MSnbase'

The following object is masked from 'package:base':

    trimws

  A B C
1 1 x A
2 1 x B
3 2 z C
  A   B   C
1 1 x;x A;B
2 2   z   C
'data.frame':	2 obs. of  3 variables:
 $ A: num  1 2
 $ B: chr  "x;x" "z"
 $ C: chr  "A;B" "C"
    A B   C
1 1;1 x A;B
2   2 z   C
'data.frame':	2 obs. of  3 variables:
 $ A: chr  "1;1" "2"
 $ B: chr  "x" "z"
 $ C: chr  "A;B" "C"
'data.frame':	3 obs. of  3 variables:
 $ A: int  1 2 3
 $ B: chr  "A" "B" "C"
 $ C: logi  TRUE FALSE NA

MSnbase documentation built on Jan. 23, 2021, 2 a.m.