to.long: Convert Data from Vector to Long Format
In metafor: Meta-Analysis Package for R

to.long

R Documentation

Description

Function to convert summary data in vector format to the corresponding long format.

Usage

to.long(measure, ai, bi, ci, di, n1i, n2i, x1i, x2i, t1i, t2i,
        m1i, m2i, sd1i, sd2i, xi, mi, ri, ti, sdi, ni, data, slab, subset,
        add=1/2, to="none", drop00=FALSE, vlong=FALSE, append=TRUE, var.names)

Arguments

`measure`	a character string to specify the effect size or outcome measure corresponding to the summary data supplied. See ‘Details’ and the documentation of the `escalc` function for possible options.
`ai`	vector with the 2×2 table frequencies (upper left cell).
`bi`	vector with the 2×2 table frequencies (upper right cell).
`ci`	vector with the 2×2 table frequencies (lower left cell).
`di`	vector with the 2×2 table frequencies (lower right cell).
`n1i`	vector with the group sizes or row totals (first group/row).
`n2i`	vector with the group sizes or row totals (second group/row).
`x1i`	vector with the number of events (first group).
`x2i`	vector with the number of events (second group).
`t1i`	vector with the total person-times (first group).
`t2i`	vector with the total person-times (second group).
`m1i`	vector with the means (first group or time point).
`m2i`	vector with the means (second group or time point).
`sd1i`	vector with the standard deviations (first group or time point).
`sd2i`	vector with the standard deviations (second group or time point).
`xi`	vector with the frequencies of the event of interest.
`mi`	vector with the frequencies of the complement of the event of interest or the group means.
`ri`	vector with the raw correlation coefficients.
`ti`	vector with the total person-times.
`sdi`	vector with the standard deviations.
`ni`	vector with the sample/group sizes.
`data`	optional data frame containing the variables given to the arguments above.
`slab`	optional vector with labels for the studies.
`subset`	optional (logical or numeric) vector to specify the subset of studies that should be included in the data frame returned by the function.
`add`	see the documentation of the `escalc` function.
`to`	see the documentation of the `escalc` function.
`drop00`	see the documentation of the `escalc` function.
`vlong`	optional logical to specify whether a very long format should be used (only relevant for 2×2 or 1×2 table data).
`append`	logical to specify whether the data frame specified via the `data` argument (if one has been specified) should be returned together with the long format data (the default is `TRUE`). Can also be a character or numeric vector to specify which variables from `data` to append.
`var.names`	optional character vector with variable names (the length depends on the data type). If unspecified, the function sets appropriate variable names by default.

Details

The escalc function describes a wide variety of effect sizes or outcome measures that can be computed for a meta-analysis. The summary data used to compute those measures are typically contained in vectors, each element corresponding to a study. The to.long function takes this information and constructs a long format dataset from these data.

For example, in various fields (such as the health and medical sciences), the response variable measured is often dichotomous (binary), so that the data from a study comparing two different groups can be expressed in terms of a 2×2 table, such as:

	outcome 1	outcome 2	total
group 1	`ai`	`bi`	`n1i`
group 2	`ci`	`di`	`n2i`

where ai, bi, ci, and di denote the cell frequencies (i.e., the number of individuals falling into a particular category) and n1i and n2i the row totals (i.e., the group sizes).

The cell frequencies in k such 2×2 tables can be specified via the ai, bi, ci, and di arguments (or alternatively, via the ai, ci, n1i, and n2i arguments). The function then creates the corresponding long format dataset. The measure argument should then be set equal to one of the outcome measures that can be computed based on this type of data, such as "RR", "OR", or "RD" (it is not relevant which specific measure is chosen, as long as it corresponds to the specified summary data). See the documentation of the escalc function for more details on the types of data formats available.
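As a small sketch of the two ways of specifying the same tables, the code below derives group sizes from the cell counts in dat.bcg and then calls to.long once with all four cell frequencies and once with the outcome 1 counts plus the group sizes. The added columns n1i and n2i and the object names long1 and long2 are introduced here for illustration only.

```r
library(metafor)

# derive the group sizes from the cell counts (illustrative helper columns)
dat <- dat.bcg
dat$n1i <- dat$tpos + dat$tneg
dat$n2i <- dat$cpos + dat$cneg

# specification 1: all four cell frequencies
long1 <- to.long(measure="OR", ai=tpos, bi=tneg, ci=cpos, di=cneg,
                 data=dat, append=FALSE)

# specification 2: outcome 1 counts plus the group sizes
long2 <- to.long(measure="OR", ai=tpos, ci=cpos, n1i=n1i, n2i=n2i,
                 data=dat, append=FALSE)

# the two long format datasets are expected to agree
all.equal(long1, long2)
```

Setting append=FALSE keeps only the long format columns, which makes the two results easier to compare.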

The long format for data of this type consists of two rows per study, a factor indicating the study (default name study), a dummy variable indicating the group (default name group, coded as 1 and 2), and two variables indicating the number of individuals experiencing outcome 1 or outcome 2 (default names out1 and out2). Alternatively, if vlong=TRUE, then the long format consists of four rows per study, a factor indicating the study (default name study), a dummy variable indicating the group (default name group, coded as 1 and 2), a dummy variable indicating the outcome (default name outcome, coded as 1 and 2), and a variable indicating the frequency of the respective outcome (default name freq).
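To make the two layouts concrete, the following base R sketch builds them by hand for a hypothetical dataset of two studies. The counts are made up, and the order of the rows within a study is an assumption chosen for readability. This only illustrates the structure described above and is not how to.long is implemented.

```r
# hypothetical toy data: two studies, each summarized by a 2x2 table
ai <- c(4, 6);   bi <- c(119, 300)   # group 1: outcome 1, outcome 2
ci <- c(11, 29); di <- c(128, 274)   # group 2: outcome 1, outcome 2
k  <- length(ai)

# long format: two rows per study (one row per group)
long <- data.frame(
  study = factor(rep(seq_len(k), each = 2)),
  group = rep(1:2, times = k),
  out1  = as.vector(rbind(ai, ci)),   # interleaves group 1 and group 2
  out2  = as.vector(rbind(bi, di))
)

# very long format: four rows per study (one row per group-by-outcome cell)
vlong <- data.frame(
  study   = factor(rep(seq_len(k), each = 4)),
  group   = rep(c(1, 1, 2, 2), times = k),
  outcome = rep(c(1, 2, 1, 2), times = k),
  freq    = as.vector(rbind(ai, bi, ci, di))
)

long
vlong
```

Both data frames carry the same information. The very long layout simply stacks the two outcome columns into a single frequency column indexed by an outcome indicator.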

The default variable names can be changed via the var.names argument (must be of the appropriate length, depending on the data type).
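For 2×2 table data in the regular long format, the four default names are study, group, out1, and out2, so a replacement vector needs four elements. A minimal sketch with dat.bcg follows; the replacement names "id", "arm", "events", and "nonevents" are arbitrary choices made for this illustration.

```r
library(metafor)

# rename the four long format variables for 2x2 table data
dat.long <- to.long(measure="OR", ai=tpos, bi=tneg, ci=cpos, di=cneg,
                    data=dat.bcg, append=FALSE,
                    var.names=c("id", "arm", "events", "nonevents"))

# inspect the resulting variable names and the first few rows
names(dat.long)
head(dat.long)
```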

The examples below illustrate the use of this function.

Value

A data frame with either k, 2×k, or 4×k rows and an appropriate number of columns (depending on the data type) with the data in long format. If append=TRUE and a data frame was specified via the data argument, then the data in long format are appended to the original data frame (with rows repeated an appropriate number of times).

Author(s)

Wolfgang Viechtbauer (wvb@metafor-project.org, https://www.metafor-project.org).

References

Viechtbauer, W. (2010). Conducting meta-analyses in R with the metafor package. Journal of Statistical Software, 36(3), 1–48. https://doi.org/10.18637/jss.v036.i03

Examples

### load the package
library(metafor)

### convert 2x2 table data to long format (two rows per study)
dat.bcg
dat.long <- to.long(measure="OR", ai=tpos, bi=tneg, ci=cpos, di=cneg, data=dat.bcg)
dat.long

### very long format (four rows per study)
dat <- to.long(measure="OR", ai=tpos, bi=tneg, ci=cpos, di=cneg, data=dat.bcg, vlong=TRUE)
dat

### select variables to append
dat.long <- to.long(measure="OR", ai=tpos, bi=tneg, ci=cpos, di=cneg,
                    data=dat.bcg, append=c("author","year"))
dat.long
dat.long <- to.long(measure="OR", ai=tpos, bi=tneg, ci=cpos, di=cneg,
                    data=dat.bcg, append=2:3)
dat.long

### convert event counts and person-times to long format
dat.long <- to.long(measure="IRR", x1i=x1i, x2i=x2i, t1i=t1i, t2i=t2i,
                   data=dat.hart1999, var.names=c("id", "group", "events", "ptime"))
dat.long

### convert means, standard deviations, and group sizes to long format
dat.long <- to.long(measure="MD", m1i=m1i, sd1i=sd1i, n1i=n1i,
                    m2i=m2i, sd2i=sd2i, n2i=n2i, data=dat.normand1999,
                    var.names=c("id", "group", "mean", "sd", "n"))
dat.long

metafor documentation built on April 4, 2025, 3:06 a.m.