knitr::opts_chunk$set( collapse = TRUE, comment = "#>", eval = TRUE )
library(sqpr) user <- Sys.getenv("SQP_USER") pw <- Sys.getenv("SQP_PW") sqp_login(user, pw)
The sqpr
package gives you access to the API of the Survey Quality Prediction website, a data base that contains over 40,000 predictions on the quality of questions.
This vignette will cover how to log in to the API and download data interactively. First things first, you need to register in the SQP 3.0 website and confirm your registration through your email.
Got that ready? Let's move on.
First, install and load the package in R with:
devtools::install_github("sociometricresearch/sqpr") library(sqpr)
There's currently three ways of loging in to the API. The first one by setting your credentials as enviroment variables. How do I do that?
Sys.setenv(SQP_USER = 'This is where your email or username goes') Sys.setenv(SQP_PW = 'This is where your password goes')
once you executed the previous lines, run sqp_login()
and if you don't get an error, then you signed up just fine.
sqp_login()
That easy! The nice thing is that once you executed the previous lines you can erase them from your script and nobody will ever see your account credentials.
Very similarly, you can also set your credentials as options in your R session.
options( SQP_USER = 'This is where your email or username goes'), SQP_PW = 'This is where your password goes' ) sqp_login()
This is very similar to the previous approach in that you can delete the options(...)
statement and sqp_login()
will always find it. Note that both approaches only work during your R session. If you're interested in setting the SQP_USER
and SQP_PW
variables permanently in your system environment (that means not having to write them down everytime you open/close your session), then you should check out this chapter on environmental variables.
Finally, there is one last approach, which I don't recommend. You can always sign up by placing your credentials inside sqp_login()
, but that would require for your credentials to be visible to anyone you share your code to. That means any documents shared on Google Drive, Github, Bitbucket or just a simple share over gmail. In any case, here's how you'd do it:
sqp_login( 'This is where your email or username goes', 'This is where your password goes' )
Once you've ran sqp_login()
once, you're all set to work with the SQP 3.0 API! No need to run it again unless you close the R session.
If you're aware of the study/question/country/language that you're searchin for, please have a look at get_sqp
, a function that allows to make one single query to obtain SQP estimates. See for example here.
The SQP 3.0 API has questions nested into studies. Some of these studies/questions might be in many different languages because the SQP 3.0 community is very diverse. For that, we need to access the id
's of the study and question of interest. Let's look for all the questions that have tv
at the beginning of the word in the European Social Survey round 4. First, we need to figure out the id of that study. find_studies()
accepts a string with your study of interest and it searches for similar names in the SQP 3.0 database.
find_studies("ess")
Aha! There it is. Then more specifically..
find_studies("ESS Round 4")
find_studies
is very helpful for finding patterns in the SQP 3.0 database. However, if you can't find your study, you should use get_studies()
which will return all the studies in the SQP 3.0 database and then perform your search manually.
Ok, so we know our study is there. Which questions are in that study? find_questions
will do the work for you.
find_questions("ESS Round 4", "tv")
That might take a while because it's downloading all of the data to your computer.
Ok, that search was unsuccesfull, as we can see there are questions that don't begin with tv
. We can also use regular expressions to find more detailed patterns. For example, ^tv
will find all questions that start (^
) with tv
.
tv_qs <- find_questions("ESS Round 4", "^tv") tv_qs
There it is (you can also supply more than one variable name with a character vector like c("tvtot", "tvpol")
). We get all the tv
questions in different languages and for different countries. Let's subset the questions for Spanish only.
sp_tv <- tv_qs[tv_qs$language_iso == "spa", ] sp_tv
The hard part is done now. Once we have the id
of your questions of interest, we supply it to get_estimates
and it will bring the quality predictions for those questions.
get_estimates(sp_tv$id)
get_estimates
will return all question names as lower case for increasing the chances of compatibility with the name in the questionnaire of the study.
It will also return empty fields for the questions that don't have any predictions, like tvpol
in the example above. Moreover, get_estimates
gives the option of grabbing more columns from the SQP 3.0 API that contain estimations such as standard errors and interquantile ranges of the predictions from above. See ?get_estimates
for a list of all available columns. You can do that like this:
get_estimates(sp_tv$id, all_columns = TRUE)
The nice thing about get_estimates
is that once you've explored and found your variables, if you save the id's of the questions in your script, once your restart your R session you can avoid most of the steps above and get your estimates with only sqp_login(); get_estimates(question_ids)
, assuming you placed your login information as environmental variables (or options) and your question ids are in a vector named question_ids
.
I hope that gives a general idea of how to explore the SQP 3.0 API interactively. For more examples on how this blends with the other sqpr
related functions, check out the case-study vignette of the sqpr
package.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.