cross: Cross-tabulation


Description

Converts a data frame in long format (e.g. output from a database) into wide format (a cross table), using the values of a single column as the new column headers.

Usage

cross(x, colsRowHead, colColHead, colValues, prefix = "", suffix = "",
  sortRows = 0, sortCols = 0, sep = "\t")

Arguments

x

Data frame to be transformed.

colsRowHead

Columns in x forming the row headers of the result. These columns keep their position. They usually contain factor data such as strings, integers, or logical values.

colColHead

Single column in x defining the newly created column headers in the result. This column usually contains factor data such as strings, integers, or logical values.

colValues

Column with the actual data used to populate the resulting cross table.

prefix

Character string used as a prefix when constructing the names of the newly created columns. This is usually required to get valid column names if the data in column colColHead are not strings.

suffix

Character string used as a suffix when constructing the names of the newly created columns.

sortRows

Numeric. If zero, no sorting is performed. Values greater than (less than) zero sort the rows of the result in ascending (descending) order based on the combined columns in colsRowHead.

sortCols

Numeric. If zero, no sorting is performed. Values greater than (less than) zero sort the newly created columns in ascending (descending) order based on their names.

sep

A character string used as a separator when collapsing information from different columns. This string must not appear in any of the involved columns; an error is generated if it does.
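
The roles of prefix, suffix, and sep can be illustrated with base R alone. The following is a minimal sketch, not the internals of cross: it assumes that column names are built by pasting prefix, header value, and suffix together, and that row-header columns are collapsed into a single key with sep. The objects headers and rowHead are hypothetical.

```r
# Hypothetical header values from a non-string column (e.g. years)
headers <- c(2000, 2001)

# Without prefix/suffix the names start with a digit, so they are not
# syntactically valid R names
plain <- paste0("", headers, "")
plain_valid <- make.names(plain) == plain

# A prefix (and an optional suffix) yields valid names
named <- paste0("y", headers, "_mm")
named_valid <- make.names(named) == named

# Hypothetical row-header columns collapsed into one key using sep
rowHead <- data.frame(loc = c("Berlin", "Rome"), var = c("rain", "temp"),
  stringsAsFactors = FALSE)
sep <- "\t"
key <- do.call(paste, c(rowHead, sep = sep))

# The key is only unambiguous if sep occurs in none of the columns
sep_found <- any(sapply(rowHead,
  function(col) any(grepl(sep, col, fixed = TRUE))))
```

A tab is a reasonable default separator because it rarely occurs inside data values; if it does occur, a different string has to be passed as sep.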

Value

A data frame with m + n columns where m is the length of colsRowHead and n equals the number of unique values in the column specified as colColHead.
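
The column count can be checked on a small table. The sketch below uses base R's reshape() as an analogue of a cross table, so it shows the expected shape of the result rather than the output of cross itself; the data frame long is hypothetical.

```r
# Small long-format table: 2 times x 2 variables (hypothetical data)
long <- data.frame(
  time = rep(c("t1", "t2"), each = 2),
  loc  = "Berlin",
  var  = rep(c("rain", "temp"), times = 2),
  val  = c(0, 18.5, 1, 21.0),
  stringsAsFactors = FALSE
)

# Base R analogue of a cross table; this is reshape(), not cross()
wide <- reshape(long, idvar = c("time", "loc"), timevar = "var",
  v.names = "val", direction = "wide")

m <- length(c("time", "loc"))     # number of row-header columns
n <- length(unique(long$var))     # number of distinct column headers
ncol(wide) == m + n
```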

Author(s)

david.kneis@tu-dresden.de

Examples

# Database format: meteorological variables at different times and locations
times= ISOdate(2000,6,1:10)
locs= c("Berlin","London","Rome")
vars= c("rain", "temp", "wind")
# One row per combination of time, location, and variable
d= data.frame(stringsAsFactors=FALSE,
  time= rep(times, length(locs)*length(vars)),
  loc= rep(rep(locs, length(vars)), each=length(times)),
  var= rep(vars, each=length(times)*length(locs)),
  val= NA
)
# Fill in values per variable; val ends up numeric (rain is stored as 0/1)
d$val[d$var == "rain"]= runif(sum(d$var == "rain")) > 0.5
d$val[d$var == "temp"]= rnorm(n=sum(d$var == "temp"), mean=20, sd=5)
d$val[d$var == "wind"]= 10^runif(sum(d$var == "wind"), min=-3, max=2)

# Cross table with observed variables in columns
cross(x=d, colsRowHead=c("time","loc"), colColHead="var", colValues="val",
  prefix="", suffix="", sortRows=0, sortCols=0, sep="\t")

# Cross table with locations in columns
cross(x=d, colsRowHead=c("time","var"), colColHead="loc", colValues="val",
  prefix="", suffix="", sortRows=0, sortCols=0, sep="\t")

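# A common mistake: "var" is missing from colsRowHead, so a combination of
# time and loc no longer identifies a single value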
## Not run: 
  cross(x=d, colsRowHead=c("time"), colColHead="loc", colValues="val",
    prefix="", suffix="", sortRows=0, sortCols=0, sep="\t")

## End(Not run)
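
The last call uses only time as the row header, so each combination of time and loc would have to hold one value per variable. A quick way to spot this before building a cross table is to look for duplicated header combinations. The sketch below does so with base R on a reduced, hypothetical version of the example data; it is a diagnostic aid, not part of cross.

```r
# Reduced, hypothetical version of the example data
d <- expand.grid(time = 1:2, loc = c("Berlin", "London"),
  var = c("rain", "temp", "wind"), stringsAsFactors = FALSE)
d$val <- seq_len(nrow(d))

# Row header "time" plus column header "loc": combinations repeat
ambiguous <- anyDuplicated(d[, c("time", "loc")]) > 0

# Row headers "time" and "var" plus column header "loc": all unique
unambiguous <- anyDuplicated(d[, c("time", "var", "loc")]) == 0
```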

dkneis/mcu documentation built on May 15, 2019, 9:12 a.m.
