pisa | R Documentation |
A subset of PISA data, containing scores and other information from the triennial testing of 15-year-olds around the globe. The original data are available from https://www.oecd.org/pisa/data/. This subset is derived from https://github.com/kevinwang09/learningtower.
pisa
A tibble of the following variables
year the year of measurement
country the three-letter country code. This data contains Australia, New Zealand, and Indonesia. The full data from learningtower contains 99 countries.
school_id The unique school identification number
student_id The student identification number
gender recorded gender: 1 (female), 2 (male), or missing
math Simulated score in mathematics
read Simulated score in reading
science Simulated score in science
stu_wgt The final survey weight for the student's scores
A closer look at the PISA data shows that school_id and student_id are not unique across time. This means the longitudinal element is the country within a given year.
We can cast pisa as a tsibble, but we need to aggregate the data to each year and country. In doing so, it is important that we provide some summary statistics of each of the scores: we want to include the mean, minimum, and maximum of the math, reading, and science scores, so that we do not lose the information of the individuals.
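To see why aggregation is needed, the sketch below checks a small student-level table for repeated country and year pairs. The `students` tibble and its values are made up for illustration and are not real PISA records; `is_duplicated()` and `duplicates()` come from the tsibble package.

```r
library(tibble)
library(tsibble)

# Hypothetical student-level rows (made-up values, not real PISA scores)
students <- tibble(
  year       = c(2015, 2015, 2018),
  country    = c("AUS", "AUS", "AUS"),
  student_id = c(1, 2, 1),
  math       = c(500, 520, 510)
)

# TRUE when some key/index combination occurs more than once,
# in which case as_tsibble() would refuse the data as it stands
has_duplicates <- is_duplicated(students, key = country, index = year)
has_duplicates

# Show the rows that share a country and year
duplicates(students, key = country, index = year)
```

Because several students share each country and year, the rows have to be summarised to one row per country and year before the data can be treated as a tsibble keyed on country.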
The aggregation first groups by year and country, then calculates the weighted mean for math, reading, and science. This can be done using the student weight variable stu_wgt, to get the survey weighted mean. The minimum and maximum are then calculated.
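The aggregation described above could be sketched with dplyr roughly as follows. This is a minimal illustration under stated assumptions, not necessarily the code used to build the dataset: the `students` tibble and its values are made up, and the output column names (`math_mean`, `math_min`, `math_max`, and so on) are chosen to match the names used in the plotting example.

```r
library(dplyr)

# Hypothetical student-level rows (made-up values, not real PISA scores)
students <- tibble(
  year    = c(2015, 2015, 2015, 2018, 2018),
  country = c("AUS", "AUS", "NZL", "AUS", "NZL"),
  math    = c(500, 520, 480, 510, 490),
  read    = c(510, 530, 470, 500, 495),
  science = c(505, 515, 485, 520, 500),
  stu_wgt = c(1, 3, 2, 2, 2)
)

pisa_summary <- students %>%
  group_by(year, country) %>%
  summarise(
    across(
      c(math, read, science),
      list(
        # survey weighted mean, using the student weight
        mean = ~ weighted.mean(.x, w = stu_wgt, na.rm = TRUE),
        min  = ~ min(.x, na.rm = TRUE),
        max  = ~ max(.x, na.rm = TRUE)
      )
    ),
    .groups = "drop"
  )

pisa_summary
```

The result has one row per country and year, which is the shape needed for a tsibble with country as the key and year as the index.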
pisa
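library(dplyr)
library(tsibble)  # as_tsibble()
library(brolgar)  # index_regular(), index_summary(); assumed to be the package that provides pisa
# This example assumes pisa holds one row per country and year, with
# summary columns such as math_mean, math_min, and math_max.
# Let's identify
#1. The **key**, the individual, who would have repeated measurements.
#2. The **index**, the time component.
#3. The **regularity** of the time interval (index).
# The student_id is nested within school_id and country, but neither ID is
# unique across time, so the key is country.
# The index is year, so we would write the following
as_tsibble(pisa,
           key = country,
           index = year)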
# We can assess the regularity of the year like so:
index_regular(pisa, year)
index_summary(pisa, year)
# We can now convert this into a `tsibble`:
pisa_ts <- as_tsibble(pisa,
                      key = country,
                      index = year,
                      regular = TRUE)
pisa_ts
# Keep Australia and New Zealand
pisa_ts_au_nz <- pisa_ts %>% filter(country %in% c("AUS", "NZL"))
library(ggplot2)
ggplot(pisa_ts_au_nz,
       aes(x = year,
           y = math_mean,
           group = country,
           colour = country)) +
  geom_ribbon(aes(ymin = math_min,
                  ymax = math_max),
              fill = "grey70") +
  geom_line(size = 1) +
  lims(y = c(0, 1000)) +
  labs(y = "math") +
  facet_wrap(~country)