knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = TRUE
)
library(sqpr)

user <- Sys.getenv("SQP_USER")
pw <- Sys.getenv("SQP_PW")

sqp_login(user, pw)

The sqpr package gives you access to the API of the Survey Quality Prediction website, a data base that contains over 40,000 predictions on the quality of questions.

Registration

This vignette will cover how to log in to the API and download data interactively. First things first, you need to register in the SQP 3.0 website and confirm your registration through your email.

Got that ready? Let's move on.

Loging in

First, install and load the package in R with:

devtools::install_github("sociometricresearch/sqpr")
library(sqpr)

There's currently three ways of loging in to the API. The first one by setting your credentials as enviroment variables. How do I do that?

Sys.setenv(SQP_USER = 'This is where your email or username goes')
Sys.setenv(SQP_PW = 'This is where your password goes')

once you executed the previous lines, run sqp_login() and if you don't get an error, then you signed up just fine.

sqp_login()

That easy! The nice thing is that once you executed the previous lines you can erase them from your script and nobody will ever see your account credentials.

Very similarly, you can also set your credentials as options in your R session.

options(
  SQP_USER = 'This is where your email or username goes'),
  SQP_PW = 'This is where your password goes'
)
sqp_login()

This is very similar to the previous approach in that you can delete the options(...) statement and sqp_login() will always find it. Note that both approaches only work during your R session. If you're interested in setting the SQP_USER and SQP_PW variables permanently in your system environment (that means not having to write them down everytime you open/close your session), then you should check out this chapter on environmental variables.

Finally, there is one last approach, which I don't recommend. You can always sign up by placing your credentials inside sqp_login(), but that would require for your credentials to be visible to anyone you share your code to. That means any documents shared on Google Drive, Github, Bitbucket or just a simple share over gmail. In any case, here's how you'd do it:

sqp_login(
  'This is where your email or username goes',
  'This is where your password goes'
)

Once you've ran sqp_login() once, you're all set to work with the SQP 3.0 API! No need to run it again unless you close the R session.

Exploring the SQP 3.0 API

If you're aware of the study/question/country/language that you're searchin for, please have a look at get_sqp, a function that allows to make one single query to obtain SQP estimates. See for example here.

The SQP 3.0 API has questions nested into studies. Some of these studies/questions might be in many different languages because the SQP 3.0 community is very diverse. For that, we need to access the id's of the study and question of interest. Let's look for all the questions that have tv at the beginning of the word in the European Social Survey round 4. First, we need to figure out the id of that study. find_studies() accepts a string with your study of interest and it searches for similar names in the SQP 3.0 database.

find_studies("ess")

Aha! There it is. Then more specifically..

find_studies("ESS Round 4")

find_studies is very helpful for finding patterns in the SQP 3.0 database. However, if you can't find your study, you should use get_studies() which will return all the studies in the SQP 3.0 database and then perform your search manually.

Ok, so we know our study is there. Which questions are in that study? find_questions will do the work for you.

find_questions("ESS Round 4", "tv")

That might take a while because it's downloading all of the data to your computer.

Ok, that search was unsuccesfull, as we can see there are questions that don't begin with tv. We can also use regular expressions to find more detailed patterns. For example, ^tv will find all questions that start (^) with tv.

tv_qs <- find_questions("ESS Round 4", "^tv")
tv_qs

There it is (you can also supply more than one variable name with a character vector like c("tvtot", "tvpol")). We get all the tv questions in different languages and for different countries. Let's subset the questions for Spanish only.

sp_tv <- tv_qs[tv_qs$language_iso == "spa", ]
sp_tv

The hard part is done now. Once we have the id of your questions of interest, we supply it to get_estimates and it will bring the quality predictions for those questions.

get_estimates(sp_tv$id)

get_estimates will return all question names as lower case for increasing the chances of compatibility with the name in the questionnaire of the study.

It will also return empty fields for the questions that don't have any predictions, like tvpol in the example above. Moreover, get_estimates gives the option of grabbing more columns from the SQP 3.0 API that contain estimations such as standard errors and interquantile ranges of the predictions from above. See ?get_estimates for a list of all available columns. You can do that like this:

get_estimates(sp_tv$id, all_columns = TRUE)

The nice thing about get_estimates is that once you've explored and found your variables, if you save the id's of the questions in your script, once your restart your R session you can avoid most of the steps above and get your estimates with only sqp_login(); get_estimates(question_ids), assuming you placed your login information as environmental variables (or options) and your question ids are in a vector named question_ids.

I hope that gives a general idea of how to explore the SQP 3.0 API interactively. For more examples on how this blends with the other sqpr related functions, check out the case-study vignette of the sqpr package.



asqm/sqpr documentation built on May 28, 2020, 5:13 a.m.