The combine generic function handles methods for combining or merging different Bioconductor data structures. It should, given an arbitrary number of arguments of the same class (possibly by inheritance), combine them into a single instance in a sensible way (some methods may only combine 2 objects, ignoring ... in the argument list; because Bioconductor data structures are complicated, check carefully that combine does as you intend).
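The generic accepts more than two arguments by calling itself as combine(x, combine(y, ...)), as its printed definition in the example output shows. That recursion can be sketched in base R. This is an illustration only: fold_combine and its combine2 argument are hypothetical names, not part of BiocGenerics, and union stands in for a real two-argument method.

```r
# Hypothetical helper (not part of BiocGenerics): fold a two-argument
# combiner over any number of inputs, mirroring the recursion
# combine(x, combine(y, ...)).
fold_combine <- function(combine2, x, ...) {
    rest <- list(...)
    if (length(rest) == 0L)
        return(x)                      # nothing left to combine
    # Combine the remaining inputs first, then combine x with that result.
    combine2(x, do.call(fold_combine, c(list(combine2), rest)))
}

# base::union stands in for a two-argument combine method.
merged <- fold_combine(union, c("a", "b"), c("b", "c"), c("c", "d"))
print(merged)
```

A method that ignores ... would silently drop the extra inputs if it were called directly, which is why the result of a multi-argument call is worth checking.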
combine(x, y, ...)
x      One of the values.
y      A second value.
...    Any other objects of the same class as x and y.
There are two basic combine strategies. One is an intersection strategy: the returned value should have only the rows (or columns) that are found in all input data objects. The other is a union strategy: the returned value will have all rows (or columns) found in any one of the input data objects (in which case some indication of what to use for missing values will need to be provided).
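The two strategies can be illustrated with base R's merge on two small data frames. This is a sketch for orientation only, not how any combine method is implemented; the data frames a and b and the key column id are made up for the example.

```r
# Two toy inputs that share the rows "r2" and "r3" (hypothetical data).
a <- data.frame(id = c("r1", "r2", "r3"), x = 1:3)
b <- data.frame(id = c("r2", "r3", "r4"), z = c(20, 30, 40))

# Intersection strategy: keep only rows present in every input.
inter <- merge(a, b, by = "id")

# Union strategy: keep rows present in any input; cells that have no
# source value are filled with NA.
uni <- merge(a, b, by = "id", all = TRUE)

print(inter)
print(uni)
```

In the union result, NA is the "indication of what to use for missing values" mentioned above.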
These functions and methods are currently under construction. Please let us know if there are features that you require.
A single value of the same class as the most specific common ancestor (in class terms) of the input values. This will contain the appropriate combination of the data in the input values.
The following methods are defined in the BiocGenerics package:
combine(x=ANY, missing)
Return the first (x) argument unchanged.
combine(data.frame, data.frame)
Combines two data.frame objects so that the resulting data.frame contains all rows and columns of the original objects. Rows and columns in the returned value are unique, that is, a row or column represented in both arguments is represented only once in the result. To perform this operation, combine makes sure that data in shared rows and columns are identical in the two data.frames. Data differences in shared rows and columns usually cause an error. combine issues a warning when a column is a factor and the levels of the factor in the two data.frames are different.
combine(matrix, matrix)
Combines two matrix objects so that the resulting matrix contains all rows and columns of the original objects. Both matrices must have dimnames. Rows and columns in the returned value are unique, that is, a row or column represented in both arguments is represented only once in the result. To perform this operation, combine makes sure that data in shared rows and columns are all equal in the two matrices.
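The behavior described for matrices (check the shared cells, then fill a matrix indexed by the union of the dimnames) can be sketched in a few lines of base R. The function name combine_matrices is hypothetical, and the sketch skips the mode, type, and missing-dimnames checks that the real method performs.

```r
# Hypothetical base-R sketch of a union-style matrix combine.
combine_matrices <- function(x, y) {
    shared_rows <- intersect(rownames(x), rownames(y))
    shared_cols <- intersect(colnames(x), colnames(y))
    # Cells present in both inputs must agree.
    same <- all.equal(x[shared_rows, shared_cols, drop = FALSE],
                      y[shared_rows, shared_cols, drop = FALSE])
    if (!isTRUE(same))
        stop("shared row and column elements differ: ", same)
    rows <- union(rownames(x), rownames(y))
    cols <- union(colnames(x), colnames(y))
    # Start from an all-NA matrix; cells covered by neither input stay NA.
    out <- matrix(NA, nrow = length(rows), ncol = length(cols),
                  dimnames = list(rows, cols))
    out[rownames(x), colnames(x)] <- x
    out[rownames(y), colnames(y)] <- y
    out
}

m <- matrix(1:20, nrow = 5, dimnames = list(LETTERS[1:5], letters[1:4]))
res <- combine_matrices(m[1:3, 1:3], m[3:5, 3:4])  # overlap at C/c
print(res)
```

Indexing by name rather than by position is what lets the two inputs arrive with different row and column orders.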
Additional combine methods are defined in the Biobase package for AnnotatedDataFrame, AssayData, MIAME, and eSet objects.
Biocore
combine,AnnotatedDataFrame,AnnotatedDataFrame-method, combine,AssayData,AssayData-method, combine,MIAME,MIAME-method, and combine,eSet,eSet-method in the Biobase package for additional combine methods.
merge for merging two data frames (or data-frame-like) objects.
showMethods for displaying a summary of the methods defined for a given generic function.
selectMethod for getting the definition of a specific method.
BiocGenerics for a summary of all the generics defined in the BiocGenerics package.
combine
showMethods("combine")
selectMethod("combine", c("ANY", "missing"))
selectMethod("combine", c("data.frame", "data.frame"))
selectMethod("combine", c("matrix", "matrix"))
## ---------------------------------------------------------------------
## COMBINING TWO DATA FRAMES
## ---------------------------------------------------------------------
x <- data.frame(x=1:5,
                y=factor(letters[1:5], levels=letters[1:8]),
                row.names=letters[1:5])
y <- data.frame(z=3:7,
                y=factor(letters[3:7], levels=letters[1:8]),
                row.names=letters[3:7])
combine(x,y)
w <- data.frame(w=4:8,
                y=factor(letters[4:8], levels=letters[1:8]),
                row.names=letters[4:8])
combine(w, x, y)
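# with stringsAsFactors=TRUE (the default before R 4.0.0), y is converted
# to 'factor' with different levels in each data frame
df1 <- data.frame(x=1:5,y=letters[1:5], row.names=letters[1:5],
                  stringsAsFactors=TRUE)
df2 <- data.frame(z=3:7,y=letters[3:7], row.names=letters[3:7],
                  stringsAsFactors=TRUE)
try(combine(df1, df2)) # warns that the factor levels differ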
# solution 1: ensure identical levels
y1 <- factor(letters[1:5], levels=letters[1:7])
y2 <- factor(letters[3:7], levels=letters[1:7])
df1 <- data.frame(x=1:5,y=y1, row.names=letters[1:5])
df2 <- data.frame(z=3:7,y=y2, row.names=letters[3:7])
combine(df1, df2)
# solution 2: force column to be 'character'
df1 <- data.frame(x=1:5,y=I(letters[1:5]), row.names=letters[1:5])
df2 <- data.frame(z=3:7,y=I(letters[3:7]), row.names=letters[3:7])
combine(df1, df2)
## ---------------------------------------------------------------------
## COMBINING TWO MATRICES
## ---------------------------------------------------------------------
m <- matrix(1:20, nrow=5, dimnames=list(LETTERS[1:5], letters[1:4]))
combine(m[1:3,], m[4:5,])
combine(m[1:3, 1:3], m[3:5, 3:4]) # overlap
nonstandardGenericFunction for "combine" defined from package "BiocGenerics"
function (x, y, ...)
{
    if (length(list(...)) > 0L) {
        combine(x, do.call(combine, list(y, ...)))
    }
    else {
        standardGeneric("combine")
    }
}
<environment: 0x2ed98f8>
Methods may be defined for arguments: x, y
Use showMethods("combine") for currently available ones.
Function: combine (package BiocGenerics)
x="ANY", y="missing"
x="data.frame", y="data.frame"
x="matrix", y="matrix"
Method Definition:
function (x, y, ...)
x
<environment: namespace:BiocGenerics>
Signatures:
        x     y
target  "ANY" "missing"
defined "ANY" "missing"
Method Definition:
function (x, y, ...)
{
    if (all(dim(x) == 0L) && all(dim(y) == 0L))
        return(x)
    else if (all(dim(x) == 0L))
        return(y)
    else if (all(dim(y) == 0L))
        return(x)
    uniqueRows <- unique(c(row.names(x), row.names(y)))
    uniqueCols <- unique(c(names(x), names(y)))
    sharedCols <- intersect(names(x), names(y))
    alleq <- function(x, y) {
        res <- all.equal(x, y, check.attributes = FALSE)
        if (!is.logical(res)) {
            warning(res)
            FALSE
        }
        else TRUE
    }
    sharedRows <- intersect(row.names(x), row.names(y))
    ok <- sapply(sharedCols, function(nm) {
        if (!all(class(x[[nm]]) == class(y[[nm]])))
            return(FALSE)
        switch(class(x[[nm]])[[1L]], factor = {
            if (!alleq(levels(x[[nm]]), levels(y[[nm]]))) {
                warning("data frame column '", nm, "' levels not all.equal",
                  call. = FALSE)
                TRUE
            } else if (!alleq(x[sharedRows, nm, drop = FALSE],
                y[sharedRows, nm, drop = FALSE])) {
                warning("data frame column '", nm, "' shared rows not all equal",
                  call. = FALSE)
                FALSE
            } else TRUE
        }, ordered = , if (!alleq(x[sharedRows, nm, drop = FALSE],
            y[sharedRows, nm, drop = FALSE])) {
            warning("data frame column '", nm, "' shared rows not all equal")
            FALSE
        } else TRUE)
    })
    if (!all(ok))
        stop("data.frames contain conflicting data:", "\n\tnon-conforming colname(s): ",
            paste(sharedCols[!ok], collapse = ", "))
    if (length(uniqueRows) == 0L) {
        x <- x["tmp", , drop = FALSE]
        y <- y["tmp", , drop = FALSE]
    }
    else if (nrow(x) == 0L) {
        x <- x[row.names(y), , drop = FALSE]
        row.names(x) <- row.names(y)
    }
    else if (nrow(y) == 0L) {
        y <- y[row.names(x), , drop = FALSE]
        row.names(y) <- row.names(x)
    }
    if (length(uniqueCols) > 0L)
        extLength <- max(nchar(sub(".*\\.", "", uniqueCols))) +
            1L
    else extLength <- 1L
    extX <- paste(c(".", rep("x", extLength)), collapse = "")
    extY <- paste(c(".", rep("y", extLength)), collapse = "")
    z <- merge(x, y, by = "row.names", all = TRUE, suffixes = c(extX,
        extY))
    for (nm in sharedCols) {
        nmx <- paste(nm, extX, sep = "")
        nmy <- paste(nm, extY, sep = "")
        z[[nm]] <- switch(class(z[[nmx]])[[1]], AsIs = I(ifelse(is.na(z[[nmx]]),
            z[[nmy]], z[[nmx]])), factor = {
            col <- ifelse(is.na(z[[nmx]]), as.character(z[[nmy]]),
                as.character(z[[nmx]]))
            if (!identical(levels(z[[nmx]]), levels(z[[nmy]]))) factor(col) else factor(col,
                levels = levels(z[[nmx]]))
        }, {
            col <- ifelse(is.na(z[[nmx]]), z[[nmy]], z[[nmx]])
            class(col) <- class(z[[nmx]])
            col
        })
    }
    row.names(z) <- if (is.integer(attr(x, "row.names")) && is.integer(attr(y,
        "row.names")))
        as.integer(z$Row.names)
    else z$Row.names
    z$Row.names <- NULL
    z[uniqueRows, uniqueCols, drop = FALSE]
}
<environment: namespace:BiocGenerics>
Signatures:
        x            y
target  "data.frame" "data.frame"
defined "data.frame" "data.frame"
Method Definition:
function (x, y, ...)
{
    if (length(y) == 0L)
        return(x)
    else if (length(x) == 0L)
        return(y)
    if (mode(x) != mode(y))
        stop("matrix modes ", mode(x), ", ", mode(y), " differ")
    if (typeof(x) != typeof(y))
        warning("matrix typeof ", typeof(x), ", ", typeof(y),
            " differ")
    xdim <- dimnames(x)
    ydim <- dimnames(y)
    if (is.null(xdim) || is.null(ydim) || any(sapply(xdim, is.null)) ||
        any(sapply(ydim, is.null)))
        stop("matricies must have dimnames for 'combine'")
    sharedRows <- intersect(xdim[[1L]], ydim[[1L]])
    sharedCols <- intersect(xdim[[2L]], ydim[[2L]])
    ok <- all.equal(x[sharedRows, sharedCols], y[sharedRows,
        sharedCols])
    if (!isTRUE(ok))
        stop("matrix shared row and column elements differ: ",
            ok)
    unionRows <- union(xdim[[1L]], ydim[[1L]])
    unionCols <- union(xdim[[2L]], ydim[[2L]])
    m <- matrix(new(class(as.vector(x))), nrow = length(unionRows),
        ncol = length(unionCols), dimnames = list(unionRows,
            unionCols))
    m[rownames(x), colnames(x)] <- x
    m[rownames(y), colnames(y)] <- y
    m
}
<environment: namespace:BiocGenerics>
Signatures:
        x        y
target  "matrix" "matrix"
defined "matrix" "matrix"
   x y  z
a  1 a NA
b  2 b NA
c  3 c  3
d  4 d  4
e  5 e  5
f NA f  6
g NA g  7
   w y  x  z
d  4 d  4  4
e  5 e  5  5
f  6 f NA  6
g  7 g NA  7
h  8 h NA NA
a NA a  1 NA
b NA b  2 NA
c NA c  3  3
   x y  z
a  1 a NA
b  2 b NA
c  3 c  3
d  4 d  4
e  5 e  5
f NA f  6
g NA g  7
Warning messages:
1: In alleq(levels(x[[nm]]), levels(y[[nm]])) : 5 string mismatches
2: data frame column 'y' levels not all.equal
   x y  z
a  1 a NA
b  2 b NA
c  3 c  3
d  4 d  4
e  5 e  5
f NA f  6
g NA g  7
   x y  z
a  1 a NA
b  2 b NA
c  3 c  3
d  4 d  4
e  5 e  5
f NA f  6
g NA g  7
  a  b  c  d
A 1  6 11 16
B 2  7 12 17
C 3  8 13 18
D 4  9 14 19
E 5 10 15 20
   a  b  c  d
A  1  6 11 NA
B  2  7 12 NA
C  3  8 13 18
D NA NA 14 19
E NA NA 15 20