# expand.bpairs: Expand binomial-pair data from short to long form

In earth: Multivariate Adaptive Regression Splines

## Description

Expand binomial-pair data from “short” to “long” form.

The short form specifies the response with two columns giving the numbers of successes and failures. Example short form:

```
survived died dose    sex
       3    0   10   male
       2    1   10 female
       1    2   20   male
       1    2   20 female
```

The long form specifies the response as a single column of `TRUE`s and `FALSE`s. For example, the long form of the above data (spaces and comments added):

```
survived dose    sex
    TRUE   10   male    # row 1 of short data: 0 died, 3 survived
    TRUE   10   male
    TRUE   10   male

   FALSE   10 female    # row 2 of short data: 1 died, 2 survived
    TRUE   10 female
    TRUE   10 female

   FALSE   20   male    # row 3 of short data: 2 died, 1 survived
   FALSE   20   male
    TRUE   20   male

   FALSE   20 female    # row 4 of short data: 2 died, 1 survived
   FALSE   20 female
    TRUE   20 female
```

In this example, the total of survived and died is the same (three) for every row of the short data, but in general that need not be true.
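The short-to-long expansion described above can be sketched in base R. This is a minimal illustration, not earth's implementation: the helper `expand_pairs_sketch` and its `ynames` argument are hypothetical names introduced here, and the within-row ordering (failures first, then successes) is chosen only to match the listing above.

```r
# Hypothetical helper, not part of earth: expands (successes, failures)
# counts into one logical response row per trial, using base R only.
expand_pairs_sketch <- function(short, ynames = c("survived", "died")) {
    n.true  <- short[[ynames[1]]]   # successes per short row
    n.false <- short[[ynames[2]]]   # failures per short row
    # repeat each short row's index once per trial
    idx <- rep(seq_len(nrow(short)), times = n.true + n.false)
    # within each short row: failures first, then successes
    y <- unlist(lapply(seq_len(nrow(short)), function(i)
        c(rep(FALSE, n.false[i]), rep(TRUE, n.true[i]))))
    # predictor columns are copied unchanged for every trial
    preds <- short[idx, setdiff(names(short), ynames), drop = FALSE]
    long <- data.frame(y, preds, row.names = NULL)
    names(long)[1] <- ynames[1]     # response takes the first response name
    long
}

short <- data.frame(survived = c(3, 2, 1, 1), died = c(0, 1, 2, 2),
                    dose = c(10, 10, 20, 20),
                    sex = c("male", "female", "male", "female"))
long <- expand_pairs_sketch(short)
nrow(long)  # 12, i.e. sum(survived + died)
```

The row count of the long data is the sum of all successes and failures, which is why rows with different totals expand to different numbers of long rows.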

## Usage

```
## S3 method for class 'formula'
expand.bpairs(formula = stop("no 'formula' argument"), data = NULL, sort = FALSE, ...)

## Default S3 method:
expand.bpairs(data = stop("no 'data' argument"), y = NULL, sort = FALSE, ...)
```

## Arguments

`formula` Model formula such as `survived + died ~ dose + temp`.

`data` Matrix or dataframe containing the data.

`y` Model response. One of:

- Two-column matrix or dataframe of binomial pairs, e.g. `cbind(survived, died=20-survived)`
- Two-element numeric vector specifying the response columns in `data`, e.g. `c(1,2)`
- Two-element character vector specifying the response column names in `data`, e.g. `c("survived", "died")`. The full names must be used (partial matching isn't supported).

`sort` Default `FALSE`. Use `TRUE` to sort the rows of the long data so it is returned in canonical form, independent of the row order of the short data. The long data is sorted on predictor values; predictors on the left take precedence in the sort order.

`...` Unused, but provided for generic/method consistency.
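The precedence rule for `sort` (predictors on the left take precedence) corresponds to an ordinary multi-key sort. The sketch below illustrates that idea with base R's `order` on a small hand-made long dataset; it assumes a plain lexicographic sort on the predictors and is not taken from earth's code, so details such as tie handling may differ.

```r
# Hand-made long data in arbitrary row order (illustration only)
long <- data.frame(survived = c(TRUE, FALSE, TRUE, FALSE),
                   dose = c(20, 10, 10, 20),
                   sex = c("male", "female", "male", "female"))

# order() sorts on its first key and breaks ties with later keys,
# so dose (the leftmost predictor) takes precedence over sex
ord <- do.call(order, unname(as.list(long[, c("dose", "sex")])))
sorted <- long[ord, ]
sorted$dose  # 10 10 20 20
```

Sorting this way gives the same row order for any permutation of the input rows that differ in their predictor values, which is the sense in which the sorted long data is canonical.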

## Value

A dataframe of the data in the long form, with expanded binomial pairs. The first column of the data will be the response column (a column of `TRUE`s and `FALSE`s).

Additionally, the returned value has two attached attributes:

`bpairs.index` A vector of row indices into the returned data. Can be used to reconstruct the short data from the long data (although this package does not yet provide a function to do so).

`ynames` Column names of the original response (a two-element character vector).
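Since the package provides no function to rebuild the short data, the following base R sketch shows one way it could be done. It assumes that the index vector gives the first long row of each short row; that reading of `bpairs.index` is an assumption, so the vector `start` below is written out by hand rather than taken from the attribute.

```r
# Long data from the Description section
long <- data.frame(
    survived = c(TRUE, TRUE, TRUE, FALSE, TRUE, TRUE,
                 FALSE, FALSE, TRUE, FALSE, FALSE, TRUE),
    dose = rep(c(10, 20), each = 6),
    sex = rep(rep(c("male", "female"), each = 3), times = 2))

# Assumed form of the index: first long row of each short row
start <- c(1, 4, 7, 10)

# Label each long row with the short row it came from
group <- findInterval(seq_len(nrow(long)), start)

# Count TRUEs and FALSEs per group and take predictors from each first row
short <- data.frame(
    survived = as.vector(tapply(long$survived, group, sum)),
    died = as.vector(tapply(!long$survived, group, sum)),
    long[start, c("dose", "sex")],
    row.names = NULL)
```

On a value returned by `expand.bpairs`, the attributes themselves can be read with base R's `attr`, for example `attr(long, "bpairs.index")` and `attr(long, "ynames")`.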

## Examples

```
survived <- c(3,2,1,1) # short data for demo (too short to build a real model)
died     <- c(0,1,2,2)
dose     <- c(10,10,20,20)
sex      <- factor(c("male", "female", "male", "female"))

short.data <- data.frame(survived, died, dose, sex)

expand.bpairs(survived + died ~ ., short.data) # returns long form of the data

# expand.bpairs(data=short.data, y=cbind(survived, died)) # equivalent
# expand.bpairs(short.data, c(1,2))                       # equivalent
# expand.bpairs(short.data, c("survived", "died"))        # equivalent

# For example models, see the earth vignette
# section "Short versus long binomial data".
```

### Example output

```
Loading required package: Formula
Loading required package: plotmo
Loading required package: plotrix
Loading required package: TeachingDemos
       survived dose    sex
row1.1     TRUE   10   male
row1.2     TRUE   10   male
row1.3     TRUE   10   male
row2.1    FALSE   10 female
row2.2     TRUE   10 female
row2.3     TRUE   10 female
row3.1    FALSE   20   male
row3.2    FALSE   20   male
row3.3     TRUE   20   male
row4.1    FALSE   20 female
row4.2    FALSE   20 female
row4.3     TRUE   20 female
```

earth documentation built on Oct. 23, 2020, 5:08 p.m.