collapse/uncollapse the rows in a data.frame
collapse.rows(x, key.column = 1, cols2collapse = NULL, sep = " // ",
  max.nchar = NULL)
uncollapse.rows(x, cols2uncollapse = NULL, sep = " // ")
x: a data.frame
key.column: the column that must end up having one row per key. Numeric or character allowed.
cols2collapse: the columns that you want to collapse. Often there are columns that contain the same info repeated over and over, and you don't want the same word repeated N times in the result. Leave these columns out, and supply only the names of the columns whose values you want joined.
sep: the separator, such as “, ” or “ // ”
max.nchar: the maximum number of characters in each collapsed cell. if
cols2uncollapse: Which columns need uncollapsing? Must specify at least 1 column (hint: this is the column that contains
collapse.rows: a data.frame with the same number of columns, but only N rows, corresponding to the N different values in the key column, sorted alphanumerically by the key column.
uncollapse.rows: a data.frame with the same number of columns, but with no rows that have duplicate values in the cols2uncollapse columns.
Collapsed data means a data.frame with at least 1 column whose values are sep-delimited. Examples:
* a cell of data with multiple gene symbols "Ankrd11|Galnt2"
* a cell of data with multiple GO terms, eg "GO:00001 /// GO:00002 /// GO:00003"
* a cell of data with multiple attributes, eg "TD, ND, CD"
These types of data are very common when there are multiple values per key.
uncollapse.rows takes collapsed data and increases the number of rows so that each row holds 1 element. For example, 1 row with "Ankrd11|Galnt2" becomes 2 rows, one with "Ankrd11" and one with "Galnt2", thus changing the relationship from n:1 to 1:1.
All columns that are not specified in cols2uncollapse will be repeated.
If you have just 1 column to uncollapse, then only that column will be changed.
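The single-column case can be illustrated with a minimal base-R sketch. This is not the package's implementation of uncollapse.rows; the helper name uncollapse_one and the toy data (ProbeID, Symbol) are assumptions made up for illustration. Note that the separator is matched literally (fixed = TRUE), which matters for separators such as "|" that are special in regular expressions.

```r
# Hypothetical helper, NOT the package's uncollapse.rows(): a base-R sketch
# of uncollapsing a single sep-delimited column.
uncollapse_one <- function(x, col, sep = " // ") {
  # Split each cell of the chosen column on the literal separator
  parts <- strsplit(x[[col]], sep, fixed = TRUE)
  n <- lengths(parts)
  # Repeat each row once per element; all other columns are simply repeated
  out <- x[rep(seq_len(nrow(x)), times = n), , drop = FALSE]
  # Overwrite the collapsed column with one element per row
  out[[col]] <- unlist(parts)
  rownames(out) <- NULL
  out
}

# Toy data (made up for illustration)
genes <- data.frame(
  ProbeID = c("p1", "p2"),
  Symbol  = c("Ankrd11|Galnt2", "Tp53"),
  stringsAsFactors = FALSE
)
res <- uncollapse_one(genes, "Symbol", sep = "|")
res
# res has 3 rows: ProbeID is p1, p1, p2 and Symbol is Ankrd11, Galnt2, Tp53
```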
If you have more than 1 column to expand, then within those rows that need uncollapsing, all specified columns MUST have the same number of elements.
E.g. consider a data.frame with 1 row per gene and 3 GO-term columns: GO.ID, GO.Name, GO.Evidence. For any given gene with 3 GO terms, there should be exactly 3 GO IDs, 3 GO Names and 3 GO evidence codes. If different numbers of elements are found, this will throw an error. @TODO: allow a different number of values per collapsed row.
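Before uncollapsing several columns at once, it can help to check that each row has the same number of elements in every column. The following base-R sketch shows one way to do that; it does not call the package, and the data (gene names, GO term names, evidence codes) are made up for illustration, with one deliberately inconsistent row.

```r
# Toy data (made up): gene g1 has 3 GO IDs and 3 GO names but only 2 evidence codes
go <- data.frame(
  Gene        = c("g1", "g2"),
  GO.ID       = c("GO:00001 /// GO:00002 /// GO:00003", "GO:00004"),
  GO.Name     = c("nameA /// nameB /// nameC", "nameD"),
  GO.Evidence = c("IEA /// IDA", "IEA"),
  stringsAsFactors = FALSE
)
cols <- c("GO.ID", "GO.Name", "GO.Evidence")

# Count the sep-delimited elements in every cell: one matrix column per data column
counts <- do.call(cbind, lapply(go[cols], function(v) {
  lengths(strsplit(v, " /// ", fixed = TRUE))
}))

# A row is consistent when all of its counts are equal
ok <- apply(counts, 1, function(r) length(unique(r)) == 1)

# Genes whose columns disagree, and would therefore trigger the error described above
bad <- go$Gene[!ok]
bad
# "g1"
```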
It is strongly suggested that you use this function to reverse the effects of collapse.rows, using the same arguments that were supplied to collapse.rows itself.
collapse.rows takes a data.frame whose rows contain mostly the same info, but where some columns change. You want one row per unique value of key from x[,key.column], and in the columns that contain non-equal data, these values are collapsed into a single value, separated by “, ” or “ // ” for example.
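The idea can be sketched in base R as follows. This is an illustration only, not the package's implementation of collapse.rows: the helper name collapse_one is hypothetical, it handles a single column, it pastes all values (including repeats), and it takes the non-collapsed columns from the first row of each key.

```r
# Hypothetical helper, NOT the package's collapse.rows(): a base-R sketch
# of collapsing one column so that each key ends up on a single row.
collapse_one <- function(x, key, col, sep = " // ") {
  # One output row per unique key, sorted by key
  keys <- sort(unique(x[[key]]))
  # Non-collapsed columns are taken from the first row of each key
  out <- x[match(keys, x[[key]]), , drop = FALSE]
  # Paste together all values of `col` that share a key
  joined <- vapply(split(x[[col]], x[[key]]), paste, character(1), collapse = sep)
  out[[col]] <- unname(joined[as.character(keys)])
  rownames(out) <- NULL
  out
}

df <- data.frame(
  Name = rep(LETTERS[1:3], each = 3),
  Description = rep(letters[1:3], each = 3),
  Value = LETTERS[11:19],
  stringsAsFactors = FALSE
)
collapsed <- collapse_one(df, key = "Name", col = "Value")
collapsed
# 3 rows; Value is "K // L // M", "N // O // P", "Q // R // S"
```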
Mark Cowley, 2009-01-07
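# Three keys (A, B, C), each with three rows that differ only in Value
df <- data.frame(
  Name=rep(LETTERS[1:3], each=3),
  Description=rep(letters[1:3], each=3),
  Value=LETTERS[11:19],
  stringsAsFactors=FALSE
)
# Collapse the Value column (3) so that each Name (column 1) has one row
a <- collapse.rows(df, key.column=1, cols2collapse=3)
a
# Reverse it: per the usage above, the second argument is cols2uncollapse
uncollapse.rows(a, cols2uncollapse=3)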