```r
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 7,
  fig.height = 4,
  fig.align = "center"
)
```
stenR is a package tailored mainly for users and creators of psychological questionnaires, though other social science researchers and survey authors can also benefit from it. It provides tools to help with the processes necessary for conducting such studies: summing item responses into scales and factors, developing score norms, and normalizing raw scores. Furthermore, tools for developing or using norms on a grouped basis are also provided (up to two intertwined grouping conditions are supported).
As stenR supports a few fairly independent and varied processes, they are described separately below. For more details, browse the documentation and other vignettes.
```r
library(stenR)
```
After conducting a study, results are usually available as responses on some scoring scale for each separate item. For further analysis they need to be aggregated into scales and factors (unless the scale consists of a single item). stenR provides functions to make this process straightforward.
We will use one of the datasets provided with the package: SLCS, containing responses to items of the Self-Liking Self-Competence Scale. It consists of 16 items, which can be grouped into two subscales (Self-Liking, Self-Competence) and a General Score.
```r
str(SLCS)
```
To summarize the scores we need to create ScaleSpec objects with the ScaleSpec() constructor function. Such objects tell R how the scales are structured, most importantly: the scale name, the minimum and maximum item values, the names of the constituent items, and which items are reverse-scored.
```r
SL_spec <- ScaleSpec(
  name = "SelfLiking",
  min = 1,
  max = 5,
  item_names = c("SLCS_1", "SLCS_3", "SLCS_5", "SLCS_6", "SLCS_7",
                 "SLCS_9", "SLCS_11", "SLCS_15"),
  reverse = c("SLCS_1", "SLCS_6", "SLCS_7", "SLCS_15")
)

SC_spec <- ScaleSpec(
  name = "SelfCompetence",
  min = 1,
  max = 5,
  item_names = c("SLCS_2", "SLCS_4", "SLCS_8", "SLCS_10", "SLCS_12",
                 "SLCS_13", "SLCS_14", "SLCS_16"),
  reverse = c("SLCS_8", "SLCS_10", "SLCS_13")
)
```
If there are main factors or higher-order factors, the ScaleSpec objects can also be combined into a CombScaleSpec object with the CombScaleSpec() constructor function. In our example the General Score is such a factor.
```r
GS_spec <- CombScaleSpec(
  name = "GeneralScore",
  SL_spec,
  SC_spec
)

# subscales can also be reversed
GS_with_rev <- CombScaleSpec(
  name = "rev_example",
  SL_spec,
  SC_spec,
  reverse = "SelfCompetence"
)
```
When all scale specifications are ready, they can then be used to compute the factor/scale data, summarized in accordance with the instructions in the provided ScaleSpec or CombScaleSpec objects.
```r
summed_SLCS <- sum_items_to_scale(
  SLCS,
  SL_spec,
  SC_spec,
  GS_spec,
  GS_with_rev
)

str(summed_SLCS)
```
When you have a great number of observations and prefer to develop norms yourself (usually when you are the creator of a questionnaire or its adaptation), it is recommended to generate FrequencyTable and ScoreTable objects. The resulting ScoreTable objects can either be used to normalize the scores or to create a ScoringTable object, which can be exported to non-R-specific formats.
There is also support for automatic grouping of observations using GroupedFrequencyTable and GroupedScoreTable objects. They are described in the Grouping section.
We will use one of the datasets provided with the package: HEXACO_60, containing raw scores of scales in the HEXACO 60-item questionnaire.
```r
str(HEXACO_60)
```
Create FrequencyTable objects
First, we need to create a FrequencyTable object for each variable using the FrequencyTable() constructor function.
```r
HEX_C_ft <- FrequencyTable(HEXACO_60$HEX_C)
HEX_E_ft <- FrequencyTable(HEXACO_60$HEX_E)
```
If some raw scores are missing from your data, a helpful message will be displayed. You can check what the frequencies look like using the plot() function.
```r
plot(HEX_E_ft)
```
As we can see, the missing values are gathered near the tails of the distribution. This can happen even with many observations, and with our sample (103 observations) it is very likely.
Create ScoreTable objects
A ScoreTable object is basically a frequency table with an additional standard scale specification attached. We could create our own specification using StandardScale(), but in this example we will use the STEN (Standard TEN) score specification provided with the package.
```r
HEX_C_st <- ScoreTable(
  ft = HEX_C_ft,
  scale = STEN
)

HEX_E_st <- ScoreTable(
  ft = HEX_E_ft,
  scale = STEN
)
```
Normalize and standardize scores
The created ScoreTables can then be used to calculate normalized scores. Normalization can be done on individual vectors with the basic normalize_score() function:
```r
HEX_C_norm <- normalize_score(
  HEXACO_60$HEX_C,
  table = HEX_C_st,
  what = "sten"
)

HEX_E_norm <- normalize_score(
  HEXACO_60$HEX_E,
  table = HEX_E_st,
  what = "sten"
)

summary(HEX_C_norm)
summary(HEX_E_norm)
```
Or with the convenient wrapper for a whole data.frame:
```r
HEX_CE_norm <- normalize_scores_df(
  data = HEXACO_60,
  vars = c("HEX_C", "HEX_E"),
  HEX_C_st,
  HEX_E_st,
  what = "sten",
  # by default no other variables will be retained
  retain = FALSE
)

summary(HEX_CE_norm)
str(HEX_CE_norm)
```
```r
C_ScoringTable <- tempfile(fileext = ".csv")
E_ScoringTable <- tempfile(fileext = ".csv")

export_ScoringTable(
  to_ScoringTable(HEX_C_st, min_raw = 10, max_raw = 50),
  C_ScoringTable,
  "csv"
)

export_ScoringTable(
  to_ScoringTable(HEX_E_st, min_raw = 10, max_raw = 50),
  E_ScoringTable,
  "csv"
)
```
Most users will use norms already developed by the creators of a questionnaire. Scoring tables should be provided in the measure documentation, and the ScoringTable object mirrors their usual representation.
A ScoringTable object can either be created from a ScoreTable or GroupedScoreTable object, or imported from a csv or json file.
For manual creation the csv format is recommended. Such a file should look similar to the one below (created on the basis of the Conscientiousness ScoreTable from the code above):
```csv
"sten","Score"
1,"10-19"
2,"20-25"
3,"26-28"
4,"29-31"
5,"32-35"
6,"36-39"
7,"40-42"
8,"43-46"
9,"47-48"
10,"49-50"
```
Raw score ranges in the Score column take the {min}-{max} form for each standardized score. ScoringTable objects also support different groups of observations; in that case the 2nd to n-th columns reflect the scores for each group. They are described in the Grouping section.
We can import ScoringTables using the import_ScoringTable() function.
```r
HEX_C_Scoring <- import_ScoringTable(
  source = C_ScoringTable,
  method = "csv"
)

HEX_E_Scoring <- import_ScoringTable(
  source = E_ScoringTable,
  method = "csv"
)

summary(HEX_C_Scoring)
summary(HEX_E_Scoring)
```
They can then be used to normalize scores, very similarly to normalize_scores_df():
```r
HEX_CE_norm <- normalize_scores_scoring(
  data = HEXACO_60,
  vars = c("HEX_C", "HEX_E"),
  HEX_C_Scoring,
  HEX_E_Scoring
)

summary(HEX_CE_norm)
str(HEX_CE_norm)
```
Very often norms differ between groups, most often varying by demographic variables such as biological sex or age. stenR supports such groups by introducing Grouped variants of FrequencyTable and ScoreTable (the regular ScoringTable supports them) and the GroupConditions class.
GroupConditions works similarly to the ScaleSpec and CombScaleSpec objects: it provides information about how to assign observations to groups. It needs the name of the conditions category (mainly for informative reasons) and conditions following the formula syntax: name of the group on the LHS, boolean condition on the RHS.
```r
sex_grouping <- GroupConditions(
  conditions_category = "Sex",
  "M" ~ sex == "M",
  "F" ~ sex == "F"
)

age_grouping <- GroupConditions(
  conditions_category = "Age",
  "to 30" ~ age < 30,
  "above 30" ~ age >= 31
)

sex_grouping
age_grouping
```
They can then be used to create a GroupedFrequencyTable, and following that a GroupedScoreTable and, optionally, a ScoringTable; or to create a ScoringTable during import.
For these examples we will use the IPIP_NEO_300 dataset provided with the package. It contains the age and sex variables and summed raw scores of 5 scales from the IPIP-NEO questionnaire (300-item version).
```r
str(IPIP_NEO_300)
```
GroupedFrequencyTable, GroupedScoreTable and ScoringTable export
The workflow is very similar to that of the ungrouped tables.
```r
N_gft <- GroupedFrequencyTable(
  data = IPIP_NEO_300,
  conditions = list(age_grouping, sex_grouping),
  var = "N",
  # By default, norms are also computed for '.all' groups. These are
  # used if for any reason an observation can't be assigned to any group
  # in the corresponding conditions category
  .all = TRUE
)

N_gst <- GroupedScoreTable(N_gft, scale = STEN)

plot(N_gst)
```
A GroupedScoreTable can then be used to normalize scores using normalize_scores_grouped(). By default, other variables are not retained. You can also provide a column name in which to store the assigned group name per observation.
```r
NEO_norm <- normalize_scores_grouped(
  data = IPIP_NEO_300,
  vars = "N",
  N_gst,
  what = "sten",
  group_col = "Group"
)

str(NEO_norm)
table(NEO_norm$Group)
```
A GroupedScoreTable can then be transformed into a ScoringTable and exported to a csv or json file.
```r
ST_csv <- tempfile(fileext = ".csv")
cond_csv <- tempfile(fileext = ".csv")

N_ST <- to_ScoringTable(
  table = N_gst,
  min_raw = 60,
  max_raw = 300
)

summary(N_ST)

export_ScoringTable(
  table = N_ST,
  out_file = ST_csv,
  method = "csv",
  # you can also export GroupConditions to a separate csv file
  cond_file = cond_csv
)
```
ScoringTable import from file
To import a ScoringTable with groups from csv, the file needs to look like this:
```csv
sten,to 30:M,to 30:F,to 30:.all2,above 30:M,above 30:F,above 30:.all2,.all1:M,.all1:F,.all1:.all2
1,60-94,60-111,60-101,60-85,60-98,60-92,60-90,60-104,60-95
2,95-110,112-128,102-117,86-101,99-112,93-106,91-106,105-119,96-111
3,111-126,129-144,118-134,102-117,113-128,107-122,107-122,120-136,112-128
4,127-143,145-162,135-152,118-135,129-146,123-140,123-140,137-154,129-147
5,144-162,163-180,153-171,136-154,147-165,141-160,141-159,155-174,148-166
6,163-181,181-199,172-190,155-175,166-185,161-180,160-179,175-194,167-186
7,182-201,200-218,191-210,176-198,186-208,181-203,180-200,195-214,187-208
8,202-222,219-238,211-232,199-222,209-229,204-226,201-222,215-234,209-229
9,223-244,239-256,233-251,223-245,230-247,227-247,223-245,235-251,230-248
10,245-300,257-300,252-300,246-300,248-300,248-300,246-300,252-300,249-300
```
Usually measure developers don't include norms for observations with unmet conditions (groups with .all names in stenR convention). A ScoringTable constructed without these groups will produce NA during normalize_scores_scoring() when an observation doesn't match any provided condition (that's why GroupedFrequencyTable() generates these groups by default). In that case the csv file would be smaller:
```csv
sten,to 30:M,to 30:F,above 30:M,above 30:F
1,60-94,60-111,60-85,60-98
2,95-110,112-128,86-101,99-112
3,111-126,129-144,102-117,113-128
4,127-143,145-162,118-135,129-146
5,144-162,163-180,136-154,147-165
6,163-181,181-199,155-175,166-185
7,182-201,200-218,176-198,186-208
8,202-222,219-238,199-222,209-229
9,223-244,239-256,223-245,230-247
10,245-300,257-300,246-300,248-300
```
GroupConditions objects need to be provided either from a csv file in the cond_file argument or as R objects in the conditions argument of the import_ScoringTable() function.
```r
imported_ST <- import_ScoringTable(
  source = ST_csv,
  method = "csv",
  conditions = list(age_grouping, sex_grouping)
)

summary(imported_ST)
```
After import, the ScoringTable can be used to generate scores.
```r
NEO_norm <- normalize_scores_scoring(
  data = IPIP_NEO_300,
  vars = "N",
  imported_ST,
  group_col = "Group"
)

str(NEO_norm)
table(NEO_norm$Group)
```
The information above should be enough for basic usage of stenR. It is developed with multiple use cases and general customizability in mind. Some additional possibilities are described below.
In the examples above we used the STEN StandardScale object, which is provided in the package. You can check all available scales in the ?default_scales documentation. You can also define your own StandardScale object using the StandardScale() function.
```r
new_scale <- StandardScale("my_scale", 10, 3, 0, 20)

# let's see if everything is correct
new_scale

# what does its distribution look like?
plot(new_scale)
```
R6 object
In addition to the procedural workflow described above, there is also an R6 class definition prepared to handle the creation of ScoreTables and the generation of normalized scores: CompScoreTable.
One useful feature of this object is the ability to automatically recalculate ScoreTables based on raw score values provided to the standardize method. It can be helpful for inter-session continuity.
Currently there is only one such object, supporting the ungrouped workflow. A grouped version is in the works.
During object initialization you can attach previously calculated FrequencyTables and/or StandardScales. This is fully optional, as it can also be done afterwards.
```r
# attach during initialization
HexCST <- CompScoreTable$new(
  tables = list(HEX_E = HEX_E_ft),
  scales = STEN
)

# attach later
altCST <- CompScoreTable$new()
altCST$attach_FrequencyTable(HEX_E_ft, "HEX_E")
altCST$attach_StandardScale(STEN)

# there are no visible differences in the objects' structure
summary(HexCST)
summary(altCST)
```
After creation, the object can be expanded with more FrequencyTables and StandardScales. All ScoreTables will be recalculated internally.
```r
# add a new FrequencyTable
HexCST$attach_FrequencyTable(FrequencyTable(HEXACO_60$HEX_C), "HEX_C")
summary(HexCST)

# add a new StandardScale
HexCST$attach_StandardScale(STANINE)
summary(HexCST)
```
After the object is ready, the score standardization may begin. Let's feed it some raw scores!
```r
# standardize Extraversion and Conscientiousness
HexCST$standardize(
  data = head(HEXACO_60),
  what = "sten",
  vars = c("HEX_E", "HEX_C")
)

# you can also do this easily with pipes!
# no need to specify 'vars', as the correct columns are already selected
HEXACO_60[1:5, c("HEX_E", "HEX_C")] |>
  HexCST$standardize("sten")
```
During score standardization you can also automatically add the new raw scores to the existing frequencies and recalculate the ScoreTables. This is done before the values are returned, so they will be based on the most recent ScoreTables.
You can actually use standardize() with calc = TRUE just after attaching the scale or scales. ScoreTables will be generated automatically before the data standardization, so you will receive both the data and the computed ScoreTables.
```r
# check the current state of the object
summary(HexCST)

# now, standardize and recalculate!
HEXACO_60[1:5, c("HEX_H", "HEX_C")] |>
  HexCST$standardize("sten", calc = TRUE)

# check the new state
summary(HexCST)
```
There is also an option to export the ScoreTables, either to use them later in the procedural way or to create a new CompScoreTable in another session; for the latter reason there is also an option to export them as FrequencyTables!
```r
# export as ScoreTables
st_list <- HexCST$export_ScoreTable()
summary(st_list)

# export as FrequencyTables
ft_list <- HexCST$export_ScoreTable(strip = TRUE)
summary(ft_list)
```
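Since the exported tables are plain R objects, one way to carry them across sessions is base R serialization. The sketch below is an assumption, not part of the stenR API: it presumes that export_ScoreTable(strip = TRUE) returns a named list of FrequencyTables that CompScoreTable$new() accepts in its tables argument, as the examples above suggest.

```r
# a minimal sketch of inter-session continuity (assumes the HexCST object
# from the examples above; saveRDS/readRDS are base R, not stenR)
rds_path <- tempfile(fileext = ".rds")

# persist the stripped FrequencyTables to disk
ft_list <- HexCST$export_ScoreTable(strip = TRUE)
saveRDS(ft_list, rds_path)

# ...later, in a new session...
restored_fts <- readRDS(rds_path)
restoredCST <- CompScoreTable$new(
  tables = restored_fts,
  scales = STEN
)
```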
The examples above describe the two most common scenarios: either having raw scores to calculate norms yourself, or importing a scoring table from the measure documentation.
There is also a rarer, but possible, scenario: having access only to descriptive statistics from a research article. Using them we can create simulated tables:
```r
sim_ft <- SimFrequencyTable(
  min = 10, max = 50, M = 31.04, SD = 6.7,
  skew = -0.3, kurt = 2.89, seed = 2678
)

class(sim_ft)
plot(sim_ft)
```
The Simulated class will be inherited by a ScoreTable object created on its basis. Simulated tables can be used in every way that regular ones can, with one exception: if used to create a CompScoreTable object, raw scores cannot be appended to this kind of table in the standardize() method.
```r
SimCST <- CompScoreTable$new(
  tables = list("simmed" = sim_ft),
  scales = STEN
)

SimCST$standardize(
  data = data.frame(simmed = round(runif(10, 10, 50), 0)),
  what = "sten",
  calc = TRUE
)
```
There are also GroupAssignment() and intersect_GroupAssignment() functions to assign observations on the basis of one or two GroupConditions objects, described in the Groups section. They are used internally by GroupedFrequencyTable(), normalize_scores_grouped() and normalize_scores_scoring(), but are also exported in case you wish to extract_observations() manually. Check the examples in the documentation for more information.
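As a rough illustration of how these exported helpers might fit together, the sketch below combines the two GroupConditions objects from the Grouping section. The argument names and the extract_observations() interface shown here are assumptions for illustration only; check ?GroupAssignment for the actual signatures.

```r
# a sketch only -- argument names below are assumptions, not verified
# against the package documentation
sex_assignment <- GroupAssignment(IPIP_NEO_300, sex_grouping)
age_assignment <- GroupAssignment(IPIP_NEO_300, age_grouping)

# intersect two assignments, mirroring the two intertwined grouping
# conditions used by GroupedFrequencyTable() internally
both <- intersect_GroupAssignment(age_assignment, sex_assignment)

# pull out the observations of one intersected group; the group naming
# follows the "age:sex" pattern seen in the exported csv headers above
extract_observations(both, group = "to 30:M")
```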