Replicating the Hainmueller et al. data quality checks

To show how the estimations performed by the carryTest() function in the conjointdatachecks package correspond to the procedure described in the Hainmueller et al. (2014, Political Analysis) article, this vignette replicates their tests and illustrates how carryTest() provides equivalent results. In doing so, the vignette also provides a peek under the hood of the carryTest() function.

As a first step, we load the relevant packages and the dataset, and perform some data preparation:

library(conjointdatachecks)

# Loading Hainmueller et al. immigrant experiment data (from cregg GitHub)
utils::download.file("https://github.com/leeper/cregg/raw/main/data/immigration.rda",
              "immigration",
              mode = "wb") # binary mode, so the .rda file is not corrupted on Windows
load("immigration")

# Convert ID vars to factors
immigration$contest_no <- factor(immigration$contest_no)
immigration$CaseID <- factor(immigration$CaseID)
immigration$profile <- factor(immigration$profile)


# Select vignette attributes ("dimensions")
vigdims <- c("Gender","JobExperience",
             "JobPlans","PriorEntry","LanguageSkills")
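Because load() restores objects under whatever names they were saved with, it can be worth checking what an .rda file actually contains before relying on an object such as immigration. The following is a minimal sketch that uses a throwaway file and a hypothetical object name (toy_data) in place of the real download:

```r
# Write a small stand-in .rda file (toy_data is a hypothetical object name)
toy_data <- data.frame(CaseID = c(1, 1, 2, 2), profile = c(1, 2, 1, 2))
rda_path <- tempfile(fileext = ".rda")
save(toy_data, file = rda_path)
rm(toy_data)

# Load into a separate environment so nothing in the workspace is overwritten;
# load() invisibly returns the names of the restored objects
scratch <- new.env()
loaded <- load(rda_path, envir = scratch)
print(loaded)

# Inspect the restored object before using it
str(scratch[[loaded]])
```

The same pattern should work with the downloaded file by replacing rda_path with its path.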

Testing for carryover effects: Hainmueller et al. results

Hainmueller et al. (p. 22) report that they test for carryover effects by testing whether the effect of "using an interpreter" (one of the categories of the "Language Skills" vignette attribute) differs across the rating tasks. They state that they estimate a model that includes the "Language Skills" attribute, an indicator for the rating task, and an interaction between the two, and then run an F-test to see whether the interaction terms are jointly significant.

A look at their Stata replication code shows that they also include the other vignette attributes plus some interactions between them (to account for excluded combinations of some attributes):

*test if language effects are significantly different across tasks
reg Chosen_Immigrant i.FeatGender i.FeatEd##i.FeatJob i.FeatLang##i.contest_no ///
                       ib6.FeatCountry##i.FeatReason i.FeatExp ib3.FeatPlans i.FeatTrips ///
                       ,  cl(CaseID)

[...]

*joint test
test 4.FeatLang#2.contest_no 4.FeatLang#3.contest_no 4.FeatLang#3.contest_no ///
     4.FeatLang#4.contest_no 4.FeatLang#5.contest_no

They report that the F-test produces a p-value of around 0.52.

To replicate this in R, we run an equivalent estimation and then use the clubSandwich package to cluster the standard errors and perform the joint hypothesis test. The test result corresponds to the one reported by Hainmueller et al.:

carrymod <- lm(ChosenImmigrant~contest_no*LanguageSkills +
                 Gender + Education*Job + CountryOfOrigin*ReasonForApplication +
                 JobExperience + JobPlans + PriorEntry,
               data = immigration)

carrymod_vc <- clubSandwich::vcovCR(carrymod,
                      cluster = immigration$CaseID,
                      type = "CR1S")

clubSandwich::Wald_test(carrymod,
          constraints = clubSandwich::constrain_zero(":LanguageSkillsUsed Interpreter",
                                       coefs = coef(carrymod),
                                       reg_ex = TRUE),
          vcov = carrymod_vc,
          test = "Naive-F")
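With a regular expression as the constraint, the joint test covers every coefficient whose name matches the pattern. A rough way to see which terms a pattern selects is to apply grep() to the coefficient names directly. The sketch below uses simulated data with made-up variable names (task, lang) and base R only. It illustrates the matching logic under those assumptions and is not the internal code of clubSandwich or conjointdatachecks:

```r
# Simulated stand-in data; variable names and levels are illustrative assumptions
set.seed(1)
n <- 600
lang_levels <- c("Fluent English", "Broken English", "Used Interpreter")
toy <- data.frame(
  y    = rbinom(n, 1, 0.5),
  task = factor(sample(1:5, n, replace = TRUE)),
  lang = factor(sample(lang_levels, n, replace = TRUE), levels = lang_levels)
)
fit <- lm(y ~ task * lang, data = toy)

# The leading ":" restricts the match to interaction terms, so the main
# effect "langUsed Interpreter" is not part of the joint test
picked <- grep(":langUsed Interpreter", names(coef(fit)), value = TRUE)
print(picked)
```

Note that the order of the variables in the formula determines the order within the interaction names, so a pattern written for task * lang would need to be adapted for lang * task.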

Carryover effects: Single attributes

Because vignette attributes are randomized in conjoint experiments, the estimated effect of a single attribute should not depend on whether predictors for other attributes are also included in the model, provided that the attribute in question is not linked to other attributes via excluded combinations.
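This property can be illustrated with a small simulation. The sketch below assumes two independently randomized binary attributes (a and b, both hypothetical) with arbitrary true effects, and compares the estimate for a with and without b in the model:

```r
# Two independently randomized attributes; effect sizes are arbitrary assumptions
set.seed(42)
n <- 20000
a <- rbinom(n, 1, 0.5)          # attribute of interest
b <- rbinom(n, 1, 0.5)          # second attribute, randomized independently of a
y <- 0.2 * a + 0.3 * b + rnorm(n)

short <- coef(lm(y ~ a))[["a"]]       # model without the other attribute
long  <- coef(lm(y ~ a + b))[["a"]]   # model with the other attribute
print(c(short = short, long = long))
```

The two estimates should differ only by sampling noise. If a and b were linked, for example through excluded combinations, the shorter model could be biased.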

Thus, it should also be possible to test for carryover effects using a single attribute in isolation, and to include the effect differences for all of that attribute's levels in the joint test, as shown below:

carrymod <- lm(ChosenImmigrant ~ LanguageSkills*contest_no,
               data = immigration)

carryvcov <- clubSandwich::vcovCR(carrymod,
                                  cluster = immigration$CaseID,
                                  type = "CR1S")

clubSandwich::Wald_test(carrymod,
                        constraints = clubSandwich::constrain_zero(":", 
                                                                   reg_ex = TRUE),
                        vcov = carryvcov,
                        test = "Naive-F")

The F-test p-value here is slightly different, but the conclusion is the same: there are no signs of carryover effects.

The carryTest() function runs, in essence, the code above and thus produces the same result:

carryTest(data = immigration,
          outcome = "ChosenImmigrant",
          attributes = "LanguageSkills",
          task = "contest_no",
          resID = "CaseID")

Testing for profile order effects: Hainmueller et al. results

Analogous to their previous test, Hainmueller et al. test for profile order effects for the "used interpreter" level of the "Language Skills" attribute. They fail to reject the null hypothesis, with a p-value of around 0.48:

*test if language effects are significantly different across profiles
reg Chosen_Immigrant i.FeatGender i.FeatEd##i.FeatJob i.FeatLang##i.profile ///
                       ib6.FeatCountry##i.FeatReason i.FeatExp ib3.FeatPlans i.FeatTrips ///
                       ,  cl(CaseID)            

[...]

*joint test
test 4.FeatLang#2.profile

The replication in R looks as follows:

profmod <- lm(ChosenImmigrant~profile*LanguageSkills +
                Gender + Education*Job + CountryOfOrigin*ReasonForApplication +
                JobExperience + JobPlans + PriorEntry,
              data = immigration)

profmod_vc <- clubSandwich::vcovCR(profmod,
                      cluster = immigration$CaseID,
                      type = "CR1S")


clubSandwich::Wald_test(profmod,
          constraints = clubSandwich::constrain_zero(":LanguageSkillsUsed Interpreter",
                                       coefs = coef(profmod),
                                       reg_ex = TRUE),
          vcov = profmod_vc,
          test = "Naive-F")
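To make the two ingredients of these replications more concrete (a cluster-robust variance matrix and a joint Wald test), the following sketch computes both by hand in base R on simulated data. It rests on two assumptions taken from our reading of the clubSandwich documentation: that "CR1S" multiplies the basic sandwich estimator by m(N - 1) / ((m - 1)(N - p)), and that "Naive-F" refers the statistic Q/q to an F distribution with q and m - 1 degrees of freedom. All variable names are hypothetical:

```r
# Simulated conjoint-style data: 200 respondents, 5 tasks with 2 profiles each
set.seed(123)
n_resp <- 200
toy <- data.frame(
  id      = rep(seq_len(n_resp), each = 10),
  profile = factor(rep(1:2, times = n_resp * 5)),
  lang    = factor(sample(c("fluent", "broken", "interpreter"),
                          n_resp * 10, replace = TRUE))
)
toy$y <- rbinom(nrow(toy), 1, 0.5)  # outcome unrelated to the design

fit <- lm(y ~ lang * profile, data = toy)
X <- model.matrix(fit)
e <- residuals(fit)
N <- nrow(X); p <- ncol(X); m <- length(unique(toy$id))

# Cluster-robust sandwich: sum the score contributions within respondents
bread  <- solve(crossprod(X))
scores <- rowsum(X * e, group = toy$id)
V <- bread %*% crossprod(scores) %*% bread *
  (m / (m - 1)) * ((N - 1) / (N - p))   # assumed Stata-style correction

# Joint Wald test that all interaction coefficients are zero
idx    <- grep(":", names(coef(fit)))
b      <- coef(fit)[idx]
q      <- length(idx)
F_stat <- drop(t(b) %*% solve(V[idx, idx]) %*% b) / q
p_val  <- pf(F_stat, q, m - 1, lower.tail = FALSE)
print(c(F = F_stat, df1 = q, df2 = m - 1, p = p_val))
```

If clubSandwich is installed, comparing this output with vcovCR() and Wald_test() on the same toy model is a quick way to check whether the assumptions above hold.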

Profile order effects: Single attributes

Running this test again with a reduced regression model, but with all levels of the attribute, produces a different test statistic, yet the conclusion remains the same (no profile order effects):

ordmod <- lm(ChosenImmigrant ~ LanguageSkills*profile,
             data = immigration)

ordvcov <- clubSandwich::vcovCR(ordmod,
                                cluster = immigration$CaseID,
                                type = "CR1S")

clubSandwich::Wald_test(ordmod,
                        constraints = clubSandwich::constrain_zero(":", 
                                                                   reg_ex = TRUE),
                        vcov = ordvcov,
                        test = "Naive-F")

The result is again identical when using the carryTest() function:

carryTest(data = immigration,
          outcome = "ChosenImmigrant",
          attributes = "LanguageSkills",
          task = "profile",
          resID = "CaseID")