unfoldDataFrame: Unfold a data frame

View source: R/reduce.R

unfoldDataFrame    R Documentation

Unfold a data frame

Description

A data frame is said to be folded when some cells contain multiple elements. These are often encoded as a semi-colon separated character string, such as "a;b". This function transforms the data frame so that "a" and "b" are split and recorded across two lines.

The simple example below illustrates a trivial case, where the table

X Y
1 a;b
2 c

is unfolded based on the Y variable and becomes

X Y
1 a
1 b
2 c

where the value 1 of variable X is now duplicated.
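The trivial case above can be reproduced with a few lines of base R. This is only an illustrative sketch on a plain data.frame, assuming the split is done with strsplit() and the rows are repeated with rep(); it is not the package's implementation, and the object names (parts, n, res) are made up for the example.

```r
## Illustrative base-R sketch (not the package's code)
x <- data.frame(X = 1:2, Y = c("a;b", "c"), stringsAsFactors = FALSE)

parts <- strsplit(x$Y, ";")   # list of split elements, one entry per row
n <- lengths(parts)           # number of elements in each cell of Y

## repeat row i as many times as its Y cell has elements
res <- x[rep(seq_len(nrow(x)), n), , drop = FALSE]
res$Y <- unlist(parts)        # one element of Y per line
rownames(res) <- NULL
res
```

Here the first row is repeated twice, which is why the value 1 of X appears on two lines.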

If there is a second variable that follows the same pattern as the one used to unfold the table, it also gets unfolded.

X Y Z
1 a;b x;y
2 c z

becomes

X Y Z
1 a x
1 b y
2 c z

because it is implied that the elements in "a;b" are matched to those in "x;y" by their respective indices. Note that in the above example, unfolding by Y or Z produces the same result.
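The index-wise matching described above can be sketched in base R as follows. The helper unfold_sketch() is hypothetical and written for plain data.frames; it assumes that another character column is unfolded only when its per-row element counts are identical to those of k over the whole column, which is inferred from the examples on this page rather than taken from the package source.

```r
## Hypothetical helper, for illustration only (not the package's code)
unfold_sketch <- function(x, k, split = ";") {
    parts <- strsplit(x[[k]], split)
    n <- lengths(parts)                       # element counts of k, per row
    res <- x[rep(seq_len(nrow(x)), n), , drop = FALSE]
    rownames(res) <- NULL
    res[[k]] <- unlist(parts)
    ## assumption: also unfold character columns with the same element counts
    for (j in setdiff(names(x), k)) {
        if (!is.character(x[[j]])) next
        parts_j <- strsplit(x[[j]], split)
        if (identical(lengths(parts_j), n))
            res[[j]] <- unlist(parts_j)
    }
    res
}

x1 <- data.frame(X = 1:2, Y = c("a;b", "c"), Z = c("x;y", "z"),
                 stringsAsFactors = FALSE)
by_y <- unfold_sketch(x1, "Y")
by_z <- unfold_sketch(x1, "Z")
by_y
```

Under this assumption, by_y and by_z contain the same values, in line with the note that unfolding by Y or Z produces the same result.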

However, the following table unfolded by Y

X Y Z
1 a;b x;y
2 c x;y

produces

X Y Z
1 a x;y
1 b x;y
2 c x;y

because "c" (one element) and "x;y" (two elements) in the second row don't match, so Z is left folded. In this case, unfolding by Z would produce a different result. These examples are also illustrated below.
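One way to see why Z is left untouched here is to compare the number of elements per cell. The snippet below is a sketch of that check using base R only; treating identical element counts as the matching criterion is an assumption drawn from the examples, and the variable names are made up for the illustration.

```r
## Element counts per cell, using the columns from the tables above
y  <- c("a;b", "c")
z1 <- c("x;y", "z")     # Z from the table where both columns unfold
z2 <- c("x;y", "x;y")   # Z from the table where only Y unfolds

n_y  <- lengths(strsplit(y, ";"))    # 2 1
n_z1 <- lengths(strsplit(z1, ";"))   # 2 1
n_z2 <- lengths(strsplit(z2, ";"))   # 2 2

identical(n_y, n_z1)   # TRUE: same pattern as Y
identical(n_y, n_z2)   # FALSE: second row differs
```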

Note that there is no foldDataFrame() function. See reduceDataFrame() and expandDataFrame() to flexibly encode and handle vectors of length > 1 within cells.

Usage

unfoldDataFrame(x, k, split = ";")

Arguments

x

A DataFrame or data.frame to be unfolded.

k

character(1) naming a character variable in x that will be used to unfold x.

split

character(1) passed to strsplit() to split x[[k]].
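Because split is handed to strsplit(), it is worth recalling that strsplit() interprets its split argument as a regular expression by default. The lines below only exercise base R's strsplit(); whether unfoldDataFrame() forwards any further arguments such as fixed is not stated here, so escaping the separator is the conservative choice for regex metacharacters.

```r
## strsplit() treats 'split' as a regular expression unless fixed = TRUE
semi    <- strsplit("a;b", ";")[[1]]                 # ";" has no special meaning
escaped <- strsplit("a|b", "\\|")[[1]]               # "|" must be escaped in a regex
literal <- strsplit("a|b", "|", fixed = TRUE)[[1]]   # or matched literally
semi
escaped
literal
```

All three calls yield two elements; a separator such as "|" or "." would need the escaped form if it is passed as split.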

Value

A new unfolded object of class class(x) with a number of rows >= nrow(x) and columns identical to x.

Author(s)

Laurent Gatto

Examples


library(S4Vectors) ## provides DataFrame()

(x0 <- DataFrame(X = 1:2, Y = c("a;b", "c")))
unfoldDataFrame(x0, "Y")

(x1 <- DataFrame(X = 1:2, Y = c("a;b", "c"), Z = c("x;y", "z")))
unfoldDataFrame(x1, "Y")
unfoldDataFrame(x1, "Z") ## same

(x2 <- DataFrame(X = 1:2, Y = c("a;b", "c"), Z = c("x;y", "x;y")))
unfoldDataFrame(x2, "Y")
unfoldDataFrame(x2, "Z") ## different

rformassspectrometry/Features documentation built on Sept. 25, 2024, 11:30 a.m.