# Parametric approach to analyze double-bounded dichotomous choice contingent valuation data

### Description

This function analyzes double-bounded dichotomous choice contingent valuation (CV) data on the basis of the utility difference approach.

### Usage


### Arguments

| Argument | Description |
|---|---|
| `formula` | an object of S3 class `"formula"`; see Details. |
| `data` | a data frame containing the variables in the model formula. |
| `subset` | an optional vector specifying a subset of observations. |
| `na.action` | a function which indicates what should happen when the data contains `NA`s. |
| `dist` | a character string setting the error distribution in the model, which takes one of `"logistic"`, `"normal"`, `"log-logistic"`, `"log-normal"`, or `"weibull"`. |
| `par` | a vector of initial parameters over which the optimization is carried out. |
| `x` | an object of class `"dbchoice"`. |
| `digits` | the number of digits to display. |
| `object` | an object of class `"dbchoice"`. |
| `...` | optional arguments. Currently not in use. |

### Details

The function `dbchoice()` implements an analysis of double-bounded dichotomous choice contingent valuation (CV) data on the basis of the utility difference approach (Hanemann, 1984). A generic call to `dbchoice()` is given by

`dbchoice(formula, data, dist = "log-logistic", ...)`

The extractor function `summary()` is available for a `"dbchoice"` class object. See `summary.dbchoice` for details.

There are two functions available for computing confidence intervals for the WTP estimates. `krCI` implements simulations to construct empirical distributions of the WTP, while `bootCI` carries out nonparametric bootstrapping.

The argument `formula` defines the response variables and covariates. The argument `data` is mandatory and specifies the data frame containing the variables in the model. The argument `dist` sets the error distribution. Currently, one of `"logistic"`, `"normal"`, `"log-logistic"`, `"log-normal"`, or `"weibull"` is available. The default value is `dist = "log-logistic"`, so it may be omitted if the user wants to estimate a model with a log-logistic error distribution.

The difference between the normal and log-normal models, or between the logistic and log-logistic ones, is how the bid variable is incorporated into the model to be estimated. For the Weibull model, the bid variable must be entered in natural log form. Therefore, the user must be careful in defining the model formula, which is explained in detail below.
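As a rough illustration of what "how the bid variable is incorporated" can mean, the sketch below compares a logistic-type index with the bid in levels against a log-logistic-type index with the bid in logs, which is the transformation used in the Examples section for the log-logistic model. The coefficients `a`, `b_level`, and `b_log` are hypothetical values chosen only for illustration; they are not estimates from any data set, and this is not the package's internal code.

```r
# Hypothetical coefficients, chosen only for illustration
a       <- 2.0    # constant term
b_level <- -0.02  # bid coefficient when the bid enters in levels
b_log   <- -0.8   # bid coefficient when the bid enters in logs

bid <- c(10, 50, 100, 200)

# Logistic-type index: the bid enters the utility difference in levels
p_level <- plogis(a + b_level * bid)

# Log-logistic-type index: the bid enters in natural logs
p_log <- plogis(a + b_log * log(bid))

# Probability of a "Yes" response under each specification
data.frame(bid, p_level, p_log)
```

In both cases the probability of accepting the bid falls as the bid rises; only the functional form of the bid term differs.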

A typical structure of the formula for `dbchoice()` is defined as follows:

`R1 + R2 ~ (the names of the covariates) | BD1 + BD2`

The formula is an object of class `"formula"` and specifies the model structure. It has to be written as a symbolic expression in **R**. The formula consists of three parts. The first part, the left-hand side of the tilde sign (`~`), must contain the response variables for the suggested prices in the first and second stages of the CV questions. In the formula above, `R1` denotes a binary or two-level factor response variable for the bid in the first stage and `R2` the one for the bid in the second stage. Each of `R1` and `R2` contains either `"Yes"` or `"No"` in response to the bid, or `1` for `"Yes"` and `0` for `"No"`.
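The two response codings described above could be prepared along the following lines. This is a minimal base R sketch; the vectors `ans1` and `ans2` are hypothetical raw answers, and the order of the factor levels is an illustrative choice rather than a documented requirement.

```r
# Hypothetical first- and second-stage answers recorded as text
ans1 <- c("Yes", "Yes", "No")
ans2 <- c("Yes", "No", "No")

# Option 1: two-level factors holding "Yes"/"No"
R1_factor <- factor(ans1, levels = c("No", "Yes"))
R2_factor <- factor(ans2, levels = c("No", "Yes"))

# Option 2: 0/1 indicators (1 for "Yes", 0 for "No")
R1 <- as.integer(ans1 == "Yes")
R2 <- as.integer(ans2 == "Yes")
```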

The covariates are defined in the second part, in place of `(the names of the covariates)`. Covariates are connected with the arithmetic operator `+`, so `(the names of the covariates)` in the above syntax should be replaced with `var1 + var2` and the like. In the symbolic expression, the plus sign has nothing to do with adding the two variables. When the model contains only a constant term, a value of `1` is set as the covariate (that is, `R1 + R2 ~ 1 | BD1 + BD2`).

The last part starts after the vertical bar (`|`). The names of the two variables (`BD1` and `BD2`) containing the suggested prices in the first and second stages of the double-bounded dichotomous choice CV question are specified in this part. The two variables are also connected with the arithmetic operator (`+`).
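The three-part structure can be inspected with base R alone, since a formula containing `|` is still an ordinary formula object. The sketch below only illustrates the structure; it is not meant to show how `dbchoice()` parses the formula internally, and `var1` and `var2` are the placeholder covariate names used in the text.

```r
# A three-part formula: responses ~ covariates | bids
f <- R1 + R2 ~ var1 + var2 | BD1 + BD2

responses  <- all.vars(f[[2]])       # left-hand side of `~`
covariates <- all.vars(f[[3]][[2]])  # between `~` and `|`
bids       <- all.vars(f[[3]][[3]])  # after `|`

# Constant-only model: `1` takes the place of the covariates
f0 <- R1 + R2 ~ 1 | BD1 + BD2
covariates0 <- all.vars(f0[[3]][[2]])  # no covariate names are found
```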

According to the structure of the formula, a data set (data frame) consists of three parts. An example of the data set is as follows (`sex`, `age`, and `income` are respondents' characteristics and are assumed to be covariates):

| `R1` | `R2` | `sex` | `age` | `income` | `BD1` | `BD2` |
|---|---|---|---|---|---|---|
| Yes | Yes | Male | 20 | Low | 100 | 250 |
| Yes | No | Male | 30 | Low | 500 | 1000 |
| ... | | | | | | |

The second bid in the double-bounded dichotomous choice CV question is higher or lower than the first bid according to the response to the first stage: if the response to the first stage is `"Yes"`, the second bid is higher than the first bid; if the response is `"No"`, the second bid is lower than the first bid. In the example above, `BD2` holds the second bid that each respondent actually faced in the second stage. However, the following style of data set is frequently prepared:

| `R1` | `R2` | `sex` | `age` | `income` | `BD1` | `BD2H` | `BD2L` |
|---|---|---|---|---|---|---|---|
| Yes | Yes | Male | 20 | Low | 100 | 250 | 50 |
| Yes | No | Male | 30 | Low | 500 | 1000 | 250 |
| ... | | | | | | | |

`BD2H` is the second (higher) bid when the respondent answers `"Yes"` in the first stage; `BD2L` is the second (lower) bid when the respondent answers `"No"` in the first stage. In this case, users have to convert `BD2H` and `BD2L` into `BD2` (see the section "Examples").
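A minimal sketch of this conversion on a toy data frame follows. The first two rows reproduce the table above; the third row is hypothetical and is added only so that the `"No"` branch is exercised.

```r
# Toy data: rows 1-2 follow the table above, row 3 is hypothetical
cv <- data.frame(
  R1     = c("Yes", "Yes", "No"),
  R2     = c("Yes", "No", "Yes"),
  sex    = c("Male", "Male", "Female"),
  age    = c(20, 30, 40),
  income = c("Low", "Low", "High"),
  BD1    = c(100, 500, 100),
  BD2H   = c(250, 1000, 250),
  BD2L   = c(50, 250, 50)
)

# Pick the higher bid after a first-stage "Yes", the lower bid after a "No"
cv$BD2 <- ifelse(cv$R1 == "Yes", cv$BD2H, cv$BD2L)
cv[, c("R1", "BD1", "BD2H", "BD2L", "BD2")]
```

The same `ifelse()` pattern is used in the Examples section to build `bid2` from `bidh` and `bidl`.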

The function `dbchoice()` analyzes double-bounded dichotomous choice CV data using the function `optim`, starting from initial coefficients that are estimated from a binary logit model of the first-stage CV responses (the binary logit model is estimated internally by the function `glm` with the argument `family = binomial(link = "logit")`).
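The two-step idea (first-stage logit for starting values, then maximization with `optim`) can be sketched in base R as follows. This is an illustrative reconstruction under stated assumptions, not the package's actual code: the data are simulated, the model has a constant and a log bid only, and the helper names `surv` and `negll` as well as the choice of `method = "BFGS"` are assumptions made for this example.

```r
set.seed(1)
n   <- 500
bd1 <- sample(c(10, 20, 40, 80), n, replace = TRUE)  # first bids
wtp <- exp(3 + 0.8 * rlogis(n))                      # simulated log-logistic WTP
r1  <- as.integer(wtp >= bd1)                        # first-stage response
bd2 <- ifelse(r1 == 1, 2 * bd1, bd1 / 2)             # follow-up bid
r2  <- as.integer(wtp >= bd2)                        # second-stage response

# Probability of accepting a bid: P(WTP >= bid)
surv <- function(par, bid) plogis(par[1] + par[2] * log(bid))

# Negative log-likelihood over the four response patterns
negll <- function(par) {
  s1 <- surv(par, bd1)
  s2 <- surv(par, bd2)
  p <- ifelse(r1 == 1 & r2 == 1, s2,          # yes-yes
       ifelse(r1 == 1 & r2 == 0, s1 - s2,     # yes-no
       ifelse(r1 == 0 & r2 == 1, s2 - s1,     # no-yes
              1 - s2)))                       # no-no
  -sum(log(pmax(p, 1e-12)))                   # guard against log(0)
}

# Starting values from a binary logit on the first-stage responses
init <- coef(glm(r1 ~ log(bd1), family = binomial(link = "logit")))

# Maximize the double-bounded likelihood from those starting values
fit <- optim(init, negll, method = "BFGS", hessian = TRUE)
fit$par
```

Using the first-stage estimates as starting values gives the optimizer a point that is already close to the double-bounded solution, since both models share the same acceptance probability for the first bid.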

Nonparametric analysis of double-bounded dichotomous choice data can be done by `turnbull.db`. A single-bounded analogue of `dbchoice` is called `sbchoice`.

### Value

This function returns an S3 object of class `"dbchoice"`, which is a list with the following components.

| Component | Description |
|---|---|
| `f.stage` | a list of components returned from the function `glm`. |
| `dbchoice` | a list of components returned from the function `optim`. |
| `coefficients` | a named vector of estimated coefficients. |
| `call` | the matched call. |
| `formula` | the formula supplied. |
| `Hessian` | an estimate of the Hessian. See also `optim`. |
| `distribution` | a character string showing the error distribution used. |
| `loglik` | the value of the log likelihood at the estimates. |
| `convergence` | a logical code indicating whether the optimization converged. |
| `niter` | a vector of two integers describing the number of calls to the objective function and the numerical gradient, respectively. See also `optim`. |
| `nobs` | the number of observations. |
| `covariates` | a named matrix of the covariates used in the model. |
| `bid` | a named matrix of the bids used in the model. |
| `yn` | a named matrix of the responses to the initial and follow-up CV questions used in the model. |
| `data.name` | the data matrix. |
| `terms` | terms |
| `contrast` | contrasts used for factors |
| `xlevels` | levels used for factors |

### References

Bateman IJ, Carson RT, Day B, Hanemann M, Hanley N, Hett T, Jones-Lee M, Loomes
G, Mourato S, Özdemiroğlu E, Pearce DW, Sugden R, Swanson J (eds.) (2002).
*Economic Valuation with Stated Preference Techniques: A Manual.*
Edward Elgar, Cheltenham, UK.

Carson RT, Hanemann WM (2005).
“Contingent Valuation.”
In KG Mäler, JR Vincent (eds.), *Handbook of Environmental Economics*.
Elsevier, New York.

Croissant Y (2011).
*Ecdat: Data Sets for Econometrics,*
**R** package version 0.1-6.1,
http://CRAN.R-project.org/package=Ecdat.

Hanemann WM (1984).
“Welfare Evaluations in Contingent Valuation Experiments with Discrete Responses.”
*American Journal of Agricultural Economics*,
**66**(2), 332–341.

Hanemann M, Kanninen B (1999).
“The Statistical Analysis of Discrete-Response CV Data.”
In IJ Bateman, KG Willis (eds.),
*Valuing Environmental Preferences: Theory and Practice of the Contingent
Valuation Methods in the US, EU, and Developing Countries*,
302–441.
Oxford University Press, New York.

Hanemann WM, Loomis JB, Kanninen BJ (1991).
“Statistical Efficiency of Double-Bounded Dichotomous Choice
Contingent Valuation.”
*American Journal of Agricultural Economics*, **73**(4), 1255–1263.

### See Also

`summary.dbchoice`, `krCI`, `bootCI`, `sbchoice`, `turnbull.db`, `NaturalPark`, `glm`, `optim`, `formula`

### Examples

```
## Examples are based on a data set NaturalPark in the package
## Ecdat (Croissant 2011): DBDCCV style question for measuring
## willingness to pay for the preservation of the Alentejo Natural
## Park. The data set (dataframe) contains seven variables:
## bid1 (bid in the initial question), bidh (higher bid in the follow-up
## question), bidl (lower bid in the follow-up question), answers
## (response outcomes in a factor format with four levels of "nn",
## "ny", "yn", "yy"), respondents' characteristic variables such
## as age, sex and income (see NaturalPark for details).
library(DCchoice)
data(NaturalPark, package = "Ecdat")
head(NaturalPark)
## The variable answers is converted into a format that is suitable for the
## function dbchoice() as follows:
NaturalPark$R1 <- ifelse(substr(NaturalPark$answers, 1, 1) == "y", 1, 0)
NaturalPark$R2 <- ifelse(substr(NaturalPark$answers, 2, 2) == "y", 1, 0)
## We assume that the error distribution in the model is
## log-logistic; therefore, the bid variable bid1 is converted
## into LBD1 as follows:
NaturalPark$LBD1 <- log(NaturalPark$bid1)
## Further, the variables bidh and bidl are integrated into one
## variable (bid2) and the variable is converted into LBD2 as follows:
NaturalPark$bid2 <- ifelse(NaturalPark$R1 == 1, NaturalPark$bidh, NaturalPark$bidl)
NaturalPark$LBD2 <- log(NaturalPark$bid2)
## The utility difference function is assumed to contain covariates (sex, age, and
## income) as well as two bid variables (LBD1 and LBD2) as follows:
fmdb <- R1 + R2 ~ sex + age + income | LBD1 + LBD2
## Not run:
## The formula may be alternatively defined as
fmdb <- R1 + R2 ~ sex + age + income | log(bid1) + log(bid2)
## End(Not run)
## The function dbchoice() with the formula fmdb and the data frame
## NaturalPark is executed as follows:
NPdb <- dbchoice(fmdb, data = NaturalPark)
NPdb
NPdbs <- summary(NPdb)
NPdbs
## The confidence intervals for these WTPs are calculated using the
## function krCI() or bootCI() as follows:
## Not run:
krCI(NPdb)
bootCI(NPdb)
## End(Not run)
## The WTP of a female with age = 5 and income = 3 is calculated
## using function krCI() or bootCI() as follows:
## Not run:
krCI(NPdb, individual = data.frame(sex = "female", age = 5, income = 3))
bootCI(NPdb, individual = data.frame(sex = "female", age = 5, income = 3))
## End(Not run)
## The variables age and income are removed from the fitted model,
## and the updated model is fitted as follows:
update(NPdb, . ~ . - age - income | .)
## The bid design used in this example is created as follows:
bid.design <- unique(NaturalPark[, c(1:3)])
bid.design <- log(bid.design)
colnames(bid.design) <- c("LBD1", "LBDH", "LBDL")
bid.design
## Respondents' utility and probability of choosing Yes-Yes, Yes-No,
## No-Yes, and No-No under the fitted model and original data are
## predicted as follows:
head(predict(NPdb, type = "utility", bid = bid.design))
head(predict(NPdb, type = "probability", bid = bid.design))
## Utility and probability of choosing Yes for a female with age = 5
## and income = 3 under bid = 10 are predicted as follows:
predict(NPdb, type = "utility",
        newdata = data.frame(sex = "female", age = 5, income = 3, LBD1 = log(10)))
predict(NPdb, type = "probability",
        newdata = data.frame(sex = "female", age = 5, income = 3, LBD1 = log(10)))
## A plot of the probabilities of choosing Yes is drawn as follows:
plot(NPdb)
## The range of bid can be limited (e.g., [log(10), log(20)]):
plot(NPdb, bid = c(log(10), log(20)))
```