cross2long: Convert a dataset in wide (crosstab) format to long...
In FME: A Flexible Modelling Environment for Inverse Modelling, Sensitivity, Identifiability and Monte Carlo Analysis

cross2long

R Documentation

Convert a dataset in wide (crosstab) format to long (database) format

Description

Rearranges a data frame in crosstab format by stacking all relevant columns below each other, replicating the independent variable and, if necessary, other specified columns. Optionally, an `err` column is added.

Usage

cross2long( data, x, select = NULL, replicate = NULL, 
            error = FALSE,  na.rm = FALSE)

Arguments

`data`	a data frame (or matrix) with crosstab layout
`x`	name of the independent variable to be replicated
`select`	a vector of column names to be included (see details). All columns are included if not specified.
`replicate`	a vector of names of variables (apart from the independent variable) that have to be replicated for every included column (e.g. experimental treatment specification).
`error`	boolean indicating whether the final dataset in long format should contain an extra column for error values (cf. modCost); here filled with 1's.
`na.rm`	whether or not to remove the `NA`s.

Details

The original data frame is converted from a wide (crosstab) layout (one variable per column) to a long (database) layout (all variable values in one column).

As an example of both formats, consider a data set called Dat, consisting of two observed variables, "Obs1" and "Obs2", each containing two observations, at times 1 and 2:

name	time	val	err
Obs1	1	50	5
Obs1	2	150	15
Obs2	1	1	0.1
Obs2	2	2	0.2

for the long format and

time	Obs1	Obs2
1	50	1
2	150	2

for the crosstab format.
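
The relation between the two layouts can be sketched in base R. This is only an illustration of the stacking idea on the Dat example, not the implementation used by cross2long; the object names `vars` and `long` are made up for this sketch, and the err column is left out.

```r
# Crosstab layout from the example above: one column per observed variable
Dat <- data.frame(time = c(1, 2),
                  Obs1 = c(50, 150),
                  Obs2 = c(1, 2))

# Columns to stack: everything except the independent variable
vars <- setdiff(names(Dat), "time")

# Stack the columns below each other and replicate 'time' for each of them
long <- data.frame(name = rep(vars, each = nrow(Dat)),
                   time = rep(Dat$time, times = length(vars)),
                   val  = unlist(Dat[vars], use.names = FALSE))

print(long)
```

Each variable contributes one block of rows, so the long layout has `nrow(Dat) * length(vars)` rows.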

The parameters x, select, and replicate should be disjoint. Although the independent variable always has to be replicated, it should not be passed via the replicate parameter.
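
A minimal base-R sketch of these roles, assuming a small made-up data set: the column names are held as character strings (unlike the unquoted names used in the Examples section), and the objects `dat`, `roles` and `long` are hypothetical. It mimics the documented output layout by hand and does not call cross2long.

```r
# Hypothetical crosstab data: 'depth' is the independent variable,
# 'zone' is an extra column to replicate, 'mud' and 'silt' are stacked
dat <- data.frame(depth = 0:2,
                  zone  = c("a", "b", "b"),
                  mud   = c(6, 1, 0.5),
                  silt  = c(6, 5, 3))

x         <- "depth"
select    <- c("mud", "silt")
replicate <- "zone"

# The three roles must not share any column name
roles <- c(x, select, replicate)
stopifnot(anyDuplicated(roles) == 0)

# Stack the selected columns; repeat x and the extra column for each of them
n <- nrow(dat)
k <- length(select)
long <- data.frame(name = rep(select, each = n),
                   x    = rep(dat[[x]], times = k),
                   y    = unlist(dat[select], use.names = FALSE),
                   zone = rep(dat[[replicate]], times = k))

print(long)
```

Listing `depth` again in `replicate` would make the check fail, which is the situation the disjointness rule is meant to prevent.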

Value

A data frame with the following columns:

`name`	Column containing the column names of the original crosstab data frame, `data`
`x`	A replication of the independent variable
`y`	The actual data stacked upon each other in one column
`err`	Optional column, filled with NA values (necessary for some other functions)
`...`	all other columns from the original dataset that had to be replicated (indicated by the parameter `replicate`)

Author(s)

Tom Van Engeland <tom.vanengeland@nioz.nl>

References

Soetaert, K. and Petzoldt, T. 2010. Inverse Modelling, Sensitivity and Monte Carlo Analysis in R Using Package FME. Journal of Statistical Software 33(3), 1–28. doi:10.18637/jss.v033.i03

Examples


## =======================================================================
## Suppose we have measured sediment oxygen concentration profiles
## =======================================================================

depth  <- 0:7
O2mud  <- c( 6,   1,   0.5, 0.1, 0.05,0,   0,   0)
O2silt <- c( 6,   5,   3,   2,   1.5, 1,   0.5, 0)
O2sand <- c( 6,   6,   5,   4,   3,   2,   1,   0)
zones  <- c("a", "b", "b", "c", "c", "d", "d", "e")
oxygen <- data.frame(depth = depth,
                     zone  = zones,
                     mud   = O2mud,
                     silt  = O2silt,
                     sand  = O2sand
          )

cross2long(data = oxygen, x = depth,
           select = c(silt, mud), replicate = zone)

cross2long(data = oxygen, x = depth,
           select = c(mud, -silt), replicate = zone)

# twice the same column name: replicates
colnames(oxygen)[4] <- "mud"

cross2long(data = oxygen, x = depth, select = mud)

FME documentation built on July 9, 2023, 5:59 p.m.