From a given set of splitting characters select the ones that split a character vector in a regular way, yielding the same number of parts for all vector elements. Then apply these splitting characters to create a matrix. The data frame method applies this to all character vectors (and optionally also all factors) within a data frame.
## S4 method for signature 'character'
separate(object, split = opm_opt("split"),
simplify = FALSE, keep.const = TRUE, list.wise = FALSE,
strip.white = list.wise)
## S4 method for signature 'data.frame'
separate(object, split = opm_opt("split"),
simplify = FALSE, keep.const = TRUE, coerce = TRUE, name.sep = ".", ...)
## S4 method for signature 'factor'
separate(object, split = opm_opt("split"),
simplify = FALSE, keep.const = TRUE, ...)
object |
Character vector to be split, or factor, or data frame whose character vectors (and optionally factors) are to be split. |
split |
Character vector or
|
simplify |
Logical scalar indicating whether a resulting matrix with one column should be simplified to a vector (or such a data frame to a factor). If so, at least one matrix column is kept, even if |
keep.const |
Logical scalar indicating whether constant columns should be kept or removed. |
coerce |
Logical scalar indicating whether factors should be coerced to ‘character’ mode so that splitting is attempted on them, too. The resulting columns are coerced back to factors. |
name.sep |
Character scalar to be inserted in the constructed column names. If more than one column results from splitting, the names will contain (i) the original column name, (ii) |
list.wise |
Logical scalar. Ignored if |
strip.white |
Logical scalar. Remove whitespace from the ends of each resulting character scalar after splitting? Has an effect on the removal of constant columns. Whitespace is always removed if |
... |
Optional arguments passed between the methods. |
This function is useful if information coded in the elements of a character vector is to be converted to a matrix or data frame. For instance, file names created by a batch export from some software are usually more or less regularly structured and carry distinct pieces of information at fixed positions. In such situations, the correct splitting approach can be recognised because it yields the same number of fields from each vector element.
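The selection idea described above can be sketched in base R. This is only an illustration under stated assumptions, not the implementation used by `separate`: the helper `regular_splits` is hypothetical, and it additionally requires more than one field per element, which `separate` itself does not (see the Examples, where a split character that does not occur yields a one-column matrix).

```r
# Hypothetical helper, not part of opm: from a set of candidate split
# characters, keep those that yield the same number (> 1) of fields for
# every element of 'x'.
regular_splits <- function(x, candidates) {
  keep <- vapply(candidates, function(ch) {
    n <- lengths(strsplit(x, ch, fixed = TRUE))
    length(unique(n)) == 1L && n[[1L]] > 1L
  }, logical(1L))
  candidates[keep]
}

x <- c("a-b-cc", "d-ff-g", "h_i-j-k")
ok <- regular_splits(x, c("-", "_", "."))

# Build a character matrix using the first regular split character;
# one row per element of 'x', one column per field.
parts <- do.call(rbind, strsplit(x, ok[[1L]], fixed = TRUE))
```

Here only `"-"` splits all three elements into the same number of fields, so `"_"` (irregular) and `"."` (absent) are discarded before the matrix is built.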
Character matrix, its number of rows being equal to the length of object, or data frame with the same number of rows as object but potentially more columns. May be a character vector or factor with character or factor input and simplify set to TRUE.
base::strsplit utils::read.fwf
Other auxiliary-functions: opm_opt, param_names
# Splitting by characters
x <- c("a-b-cc", "d-ff-g")
(y <- separate(x, ".")) # a split character that does not occur
stopifnot(is.matrix(y), y[, 1L] == x)
(y <- separate(x, "-")) # a split character that does occur
stopifnot(is.matrix(y), dim(y) == c(2, 3))
# Fixed-width splitting
x <- c(" abd efgh", " ABCD EFGH ", " xyz")
(y <- separate(x, TRUE))
stopifnot(is.matrix(y), dim(y) == c(3, 2))
# Applied to factors
xx <- as.factor(x)
(yy <- separate(xx, TRUE))
stopifnot(identical(yy, as.data.frame(y)))
# List-wise splitting
x <- c("a,b", "c,b", "a,c")
(y <- separate(x, ",", list.wise = TRUE))
stopifnot(is.matrix(y), dim(y) == c(3, 3), is.logical(y))
# Data-frame method
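# stringsAsFactors = TRUE makes column 'b' a factor in every R version,
# as the assertions below expect (the default changed in R 4.0.0)
x <- data.frame(a = 1:2, b = c("a-b-cc", "a-ff-g"), stringsAsFactors = TRUE)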
(y <- separate(x, coerce = FALSE))
stopifnot(identical(x, y))
(y <- separate(x)) # only character/factor columns are split
stopifnot(is.data.frame(y), dim(y) == c(2, 4))
stopifnot(sapply(y, class) == c("integer", "factor", "factor", "factor"))
(y <- separate(x, keep.const = FALSE))
stopifnot(is.data.frame(y), dim(y) == c(2, 3))
stopifnot(sapply(y, class) == c("integer", "factor", "factor"))