knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 7,
  fig.height = 4,
  fig.align = "center"
)

stenR is a package tailored mainly for users and creators of psychological questionnaires, though other social science researchers and survey authors can benefit greatly from it.

It provides tools for the processes such studies require:

  1. processing data from raw item scores to raw factor/scale scores
  2. standardization of the raw scores into a standard scale of your choosing, either by:
      - normalization of the raw scores using a frequency table (if no norms have been developed before); usually done by authors of questionnaires or their adaptations
      - importing a scoring table developed by the questionnaire authors; for researchers who only use the measure

Furthermore, tools for developing or using norms on a grouped basis are also provided (up to two intertwined grouping conditions are supported).

As stenR supports a few fairly independent and varied processes, they are described separately below. For more details, browse the documentation and other vignettes.

library(stenR)

Processing raw item scores to raw factor/scales

After conducting the study, the results are usually available as responses on some scoring scale for each separate item. For further analysis they need to be gathered into scales and factors (unless a scale consists of a single item).

stenR provides functions to make this process straightforward.

We will use one of the datasets provided with the package: SLCS, containing responses to the items of the Self-Liking Self-Competence Scale. It consists of 16 items, which can be grouped into two subscales (Self-Liking, Self-Competence) and a General Score.

str(SLCS)

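To summarize scores we need to create ScaleSpec objects with the ScaleSpec() constructor function. Such objects tell R how the scales are structured, most importantly: the name of the scale, the minimum and maximum score of a single item (min, max), the names of the items belonging to the scale (item_names) and the names of the items that need to be reversed (reverse).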

SL_spec <- ScaleSpec(
  name = "SelfLiking", min = 1, max = 5,
  item_names = c("SLCS_1", "SLCS_3", "SLCS_5", "SLCS_6", "SLCS_7", 
                 "SLCS_9", "SLCS_11", "SLCS_15"),
  reverse = c("SLCS_1", "SLCS_6", "SLCS_7", "SLCS_15")
)
SC_spec <- ScaleSpec(
  name = "SelfCompetence", min = 1, max = 5,
  item_names = c("SLCS_2", "SLCS_4", "SLCS_8", "SLCS_10", "SLCS_12",
                 "SLCS_13", "SLCS_14", "SLCS_16"),
  reverse = c("SLCS_8", "SLCS_10", "SLCS_13")
)
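
To make the role of min, max and reverse more concrete, here is a minimal base R sketch of reverse scoring and summing. It uses hypothetical toy data and item names, and it assumes the usual convention that a reversed item is scored as (min + max) minus the response; it is not taken from the stenR source.

```r
# Toy responses on a 1-5 scale (hypothetical data, not the SLCS dataset)
items <- data.frame(
  item_1 = c(1, 4, 5),
  item_2 = c(2, 2, 5),
  item_3 = c(5, 3, 1)
)
item_min <- 1
item_max <- 5
reversed <- c("item_1")  # hypothetical reverse-keyed item

# Assumed convention: mirror reversed items around the scale midpoint
scored <- items
scored[reversed] <- (item_min + item_max) - scored[reversed]

# The raw scale score is the row-wise sum of the (partly reversed) items
raw_scale <- rowSums(scored)
raw_scale
```

With min = 1 and max = 5, a response of 1 on a reversed item counts as 5 in the sum, and a response of 4 counts as 2.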

If there are main factors or higher-level factors, the ScaleSpec objects can also be combined into a CombScaleSpec object with the CombScaleSpec() constructor function. In our example the General Score is such a factor.

GS_spec <- CombScaleSpec(
  name = "GeneralScore",
  SL_spec, SC_spec
)

# subscales can be also reversed
GS_with_rev <- CombScaleSpec(
  name = "rev_example",
  SL_spec, SC_spec,
  reverse = "SelfCompetence"
)

When all scale specifications are ready, they can be used to get the factor/scale data, summarized according to the instructions in the provided ScaleSpec or CombScaleSpec objects.

summed_SCLS <- sum_items_to_scale(
  SLCS,
  SL_spec,
  SC_spec,
  GS_spec,
  GS_with_rev
)

str(summed_SCLS)

Normalize scores with FrequencyTable

When you have a large number of observations and prefer to develop norms yourself (usually when you are the creator of a questionnaire or its adaptation), it is recommended to generate FrequencyTable and ScoreTable objects. The resulting ScoreTable objects can either be used to normalize scores directly or be converted into a ScoringTable object, which can be exported to formats that are not specific to R.

There is also support for automatic grouping of the observations using GroupedFrequencyTable and GroupedScoreTable objects. They are covered in the Groupings section.

We will use one of the datasets provided with the package: HEXACO_60, containing raw scores of scales in HEXACO 60-item questionnaire.

str(HEXACO_60)

  1. Create FrequencyTable objects

    First, we need to create a FrequencyTable object for each variable using the FrequencyTable() constructor function.

    HEX_C_ft <- FrequencyTable(HEXACO_60$HEX_C)
    HEX_E_ft <- FrequencyTable(HEXACO_60$HEX_E)

    If some raw scores are missing in your data, a helpful message will be displayed. You can check what the frequencies look like using the plot() function.

    plot(HEX_E_ft)

    As we can see, the missing values are gathered near the tails of the distribution. It can happen even with many observations - and in the case of our sample (103 observations) it is very likely.

  2. Create ScoreTable objects

    A ScoreTable object is basically a frequency table with an additional standard scale specification attached. We can create our own specification using StandardScale(), but in this example we will use the already provided STEN (Standard TEN) score specification.

    HEX_C_st <- ScoreTable(
      ft = HEX_C_ft,
      scale = STEN
    )
    HEX_E_st <- ScoreTable(
      ft = HEX_E_ft,
      scale = STEN
    )

  3. Normalize and standardize scores

    The created ScoreTables can then be used to calculate the normalized scores. Normalization can be done either on individual vectors with the basic normalize_score() function:

    HEX_C_norm <- normalize_score(
      HEXACO_60$HEX_C,
      table = HEX_C_st,
      what = "sten"
    )
    HEX_E_norm <- normalize_score(
      HEXACO_60$HEX_E,
      table = HEX_E_st,
      what = "sten"
    )
    summary(HEX_C_norm)
    summary(HEX_E_norm)

    Or using the convenient wrapper for a whole data.frame:

    HEX_CE_norm <- normalize_scores_df(
      data = HEXACO_60,
      vars = c("HEX_C", "HEX_E"),
      HEX_C_st,
      HEX_E_st,
      what = "sten",
      # by default no other variables will be retained
      retain = FALSE
    )
    summary(HEX_CE_norm)
    str(HEX_CE_norm)
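
The three steps above can be illustrated without the package. The following base R sketch shows one textbook way of turning a frequency table into sten scores: cumulative proportions are converted to normal quantiles and then rescaled to M = 5.5, SD = 2. The raw scores are hypothetical, and the exact computation inside FrequencyTable() and ScoreTable() may differ, so treat this as an illustration of the idea only.

```r
# Hypothetical raw scores (not taken from HEXACO_60)
raw <- c(12, 15, 15, 18, 20, 20, 20, 23, 25, 30)

tab <- table(raw)
values <- as.numeric(names(tab))
freq <- as.numeric(tab)

# Mid-interval cumulative proportion for each distinct raw score
prop <- (cumsum(freq) - freq / 2) / sum(freq)

# Proportion -> normal quantile -> sten (M = 5.5, SD = 2), limited to 1..10
z <- qnorm(prop)
sten <- pmin(10, pmax(1, round(5.5 + 2 * z)))

lookup <- data.frame(raw = values, prop = prop, sten = sten)
lookup
```

Because the proportions increase with the raw score, the resulting sten values never decrease as the raw score grows.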

# export both ScoreTables as ScoringTable csv files (temporary files here)
C_ScoringTable <- tempfile(fileext = ".csv")
E_ScoringTable <- tempfile(fileext = ".csv")

export_ScoringTable(
  to_ScoringTable(HEX_C_st,
                  min_raw = 10,
                  max_raw = 50),
  C_ScoringTable,
  "csv"
)

export_ScoringTable(
  to_ScoringTable(HEX_E_st,
                  min_raw = 10,
                  max_raw = 50),
  E_ScoringTable,
  "csv"
)

Normalize scores using imported ScoringTable

Most users will rely on norms already developed by the creators of a questionnaire. Scoring tables should be provided in the measure's documentation, and the ScoringTable object mirrors their usual representation.

A ScoringTable object can be either created from a ScoreTable or GroupedScoreTable object or imported from a csv or json file.

For manual creation, the csv format is recommended. Such a file should look similar to the one below (created from the Conscientiousness ScoreTable built in the section above):

"sten","Score"
1,"10-19"
2,"20-25"
3,"26-28"
4,"29-31"
5,"32-35"
6,"36-39"
7,"40-42"
8,"43-46"
9,"47-48"
10,"49-50"

ScoringTable objects also support different groups of observations - in that case the 2nd to n-th columns hold the scores for each of the groups. They are covered in the Groupings section.
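
To show how such a range-based table maps raw scores to stens, here is a small base R sketch that parses the csv shown above and looks scores up with findInterval(). The helper raw_to_sten() is hypothetical and only illustrates the table layout; it is not how stenR implements the lookup.

```r
# The scoring table from the csv example above, read from a string
scoring_csv <- '"sten","Score"
1,"10-19"
2,"20-25"
3,"26-28"
4,"29-31"
5,"32-35"
6,"36-39"
7,"40-42"
8,"43-46"
9,"47-48"
10,"49-50"'
scoring <- read.csv(text = scoring_csv, stringsAsFactors = FALSE)

# Split "lower-upper" ranges into numeric bounds
bounds <- do.call(rbind, strsplit(scoring$Score, "-", fixed = TRUE))
lower <- as.numeric(bounds[, 1])
upper <- as.numeric(bounds[, 2])

# Hypothetical helper: returns NA for scores outside the table
raw_to_sten <- function(raw) {
  idx <- findInterval(raw, lower)
  idx[idx == 0] <- NA
  idx[!is.na(idx) & raw > upper[idx]] <- NA
  scoring$sten[idx]
}

raw_to_sten(c(10, 27, 35, 50))
```

Raw scores below 10 or above 50 fall outside every range and therefore yield NA in this sketch.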

We can import ScoringTables using import_ScoringTable() function.

HEX_C_Scoring <- import_ScoringTable(
  source = C_ScoringTable,
  method = "csv"
)
HEX_E_Scoring <- import_ScoringTable(
  source = E_ScoringTable,
  method = "csv"
)
summary(HEX_C_Scoring)
summary(HEX_E_Scoring)

They can then be used to normalize scores, very similarly to normalize_scores_df():

HEX_CE_norm <- normalize_scores_scoring(
  data = HEXACO_60,
  vars = c("HEX_C", "HEX_E"),
  HEX_C_Scoring,
  HEX_E_Scoring
)
summary(HEX_CE_norm)
str(HEX_CE_norm)

Groupings

Norms often differ between groups, most commonly defined by demographic variables such as biological sex or age. stenR supports such groups by introducing Grouped variants of FrequencyTable and ScoreTable (the regular ScoringTable supports groups as well) and the GroupConditions class.

GroupConditions works similarly to ScaleSpec and CombScaleSpec objects: it describes how to assign observations to groups. It needs the name of the category (mainly for informative purposes) and conditions that follow the syntax: name of the group on the LHS and a boolean condition on the RHS.

sex_grouping <- GroupConditions(
  conditions_category = "Sex",
  "M" ~ sex == "M",
  "F" ~ sex == "F"
)
age_grouping <- GroupConditions(
  conditions_category = "Age",
  # note: age == 30 satisfies neither condition below
  "to 30" ~ age < 30,
  "above 30" ~ age >= 31
)
sex_grouping
age_grouping
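
The LHS/RHS convention can be illustrated with plain R formulas. The sketch below evaluates the right-hand side of each formula inside a data frame and assigns the left-hand side as the group label. The data and the helper assign_group() are hypothetical; this is an illustration of the syntax, not the internals of GroupConditions().

```r
# Hypothetical observations
people <- data.frame(
  sex = c("M", "F", "F", "M"),
  age = c(25, 30, 41, 35)
)

# Same conditions as in age_grouping above, as plain formulas
conditions <- list(
  "to 30" ~ age < 30,
  "above 30" ~ age >= 31
)

# Hypothetical helper: first matching condition wins, unmatched rows stay NA
assign_group <- function(data, conditions) {
  group <- rep(NA_character_, nrow(data))
  for (cond in conditions) {
    label <- eval(cond[[2]])                   # LHS: group name
    matches <- eval(cond[[3]], envir = data)   # RHS: boolean condition
    group[matches & is.na(group)] <- label
  }
  group
}

assign_group(people, conditions)
```

The second person (age 30) matches neither condition and stays unassigned, which is the situation the '.all' groups described below are meant to cover.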

They can be then used to create a GroupedFrequencyTable, and following that: GroupedScoreTable and, optionally, ScoringTable - or to create ScoringTable during import.

For these examples we will be using the IPIP_NEO_300 dataset provided with the package. It contains the age and sex variables, and summed raw scores of 5 scales from the IPIP NEO questionnaire (300-item version).

str(IPIP_NEO_300)

  1. GroupedFrequencyTable, GroupedScoreTable and ScoringTable export

    The workflow is very similar to the one for ungrouped tables.

    N_gft <- GroupedFrequencyTable(
      data = IPIP_NEO_300,
      conditions = list(age_grouping, sex_grouping),
      var = "N",
      # By default, norms are also computed for '.all' groups. These are
      # used if for any reason an observation can't be assigned to any group
      # in the corresponding condition category
      .all = TRUE
    )
    N_gst <- GroupedScoreTable(N_gft, scale = STEN)
    plot(N_gst)

    GroupedScoreTable can then be used to normalize scores using normalize_scores_grouped(). By default, other variables are not retained. You can also provide a column name to contain the assigned group names per observation.

    NEO_norm <- normalize_scores_grouped(
      data = IPIP_NEO_300,
      vars = "N",
      N_gst,
      what = "sten",
      group_col = "Group"
    )
    str(NEO_norm)
    table(NEO_norm$Group)

    GroupedScoreTable can be then transformed into ScoringTable and exported to csv or json file.

    ```r
    ST_csv <- tempfile(fileext = ".csv")
    cond_csv <- tempfile(fileext = ".csv")

    N_ST <- to_ScoringTable(
      table = N_gst,
      min_raw = 60,
      max_raw = 300
    )

    summary(N_ST)

    export_ScoringTable(
      table = N_ST,
      out_file = ST_csv,
      method = "csv",
      # you can also export GroupConditions to a separate csv file
      cond_file = cond_csv
    )
    ```

  2. ScoringTable import from file

    To import a ScoringTable with groups from csv, the file needs to have the following layout:

    sten,to 30:M,to 30:F,to 30:.all2,above 30:M,above 30:F,above 30:.all2,.all1:M,.all1:F,.all1:.all2
    1,60-94,60-111,60-101,60-85,60-98,60-92,60-90,60-104,60-95
    2,95-110,112-128,102-117,86-101,99-112,93-106,91-106,105-119,96-111
    3,111-126,129-144,118-134,102-117,113-128,107-122,107-122,120-136,112-128
    4,127-143,145-162,135-152,118-135,129-146,123-140,123-140,137-154,129-147
    5,144-162,163-180,153-171,136-154,147-165,141-160,141-159,155-174,148-166
    6,163-181,181-199,172-190,155-175,166-185,161-180,160-179,175-194,167-186
    7,182-201,200-218,191-210,176-198,186-208,181-203,180-200,195-214,187-208
    8,202-222,219-238,211-232,199-222,209-229,204-226,201-222,215-234,209-229
    9,223-244,239-256,233-251,223-245,230-247,227-247,223-245,235-251,230-248
    10,245-300,257-300,252-300,246-300,248-300,248-300,246-300,252-300,249-300

    Usually, measure developers don't include norms for observations with unmet conditions (groups with .all names in the stenR convention). A ScoringTable constructed without these groups will produce NA in normalize_scores_scoring() when an observation doesn't match any of the provided conditions (that's why GroupedFrequencyTable() generates these groups by default). In that case the csv file would be smaller:

    sten,to 30:M,to 30:F,above 30:M,above 30:F
    1,60-94,60-111,60-85,60-98
    2,95-110,112-128,86-101,99-112
    3,111-126,129-144,102-117,113-128
    4,127-143,145-162,118-135,129-146
    5,144-162,163-180,136-154,147-165
    6,163-181,181-199,155-175,166-185
    7,182-201,200-218,176-198,186-208
    8,202-222,219-238,199-222,209-229
    9,223-244,239-256,223-245,230-247
    10,245-300,257-300,246-300,248-300

    GroupConditions objects need to be provided either from a csv file in the cond_file argument or as R objects in the conditions argument of the import_ScoringTable() function.

    ```r
    imported_ST <- import_ScoringTable(
      source = ST_csv,
      method = "csv",
      conditions = list(age_grouping, sex_grouping)
    )

    summary(imported_ST)
    ```

    After import, the ScoringTable can be used to generate scores.

    NEO_norm <- normalize_scores_scoring(
      data = IPIP_NEO_300,
      vars = "N",
      imported_ST,
      group_col = "Group"
    )
    str(NEO_norm)
    table(NEO_norm$Group)
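
The column naming in the grouped csv (age group and sex group joined with ":") can be explored with a short base R sketch. It reads the smaller table shown above and looks up a sten for a raw score and a group column. The helper lookup_sten() is hypothetical and only illustrates the file layout, including the NA result for a group that is absent from the table.

```r
grouped_csv <- 'sten,to 30:M,to 30:F,above 30:M,above 30:F
1,60-94,60-111,60-85,60-98
2,95-110,112-128,86-101,99-112
3,111-126,129-144,102-117,113-128
4,127-143,145-162,118-135,129-146
5,144-162,163-180,136-154,147-165
6,163-181,181-199,155-175,166-185
7,182-201,200-218,176-198,186-208
8,202-222,219-238,199-222,209-229
9,223-244,239-256,223-245,230-247
10,245-300,257-300,246-300,248-300'

# check.names = FALSE keeps column names such as "to 30:M" intact
tab <- read.csv(text = grouped_csv, check.names = FALSE,
                stringsAsFactors = FALSE)

# Hypothetical helper: sten for one raw score in one group column
lookup_sten <- function(raw, group) {
  if (!group %in% names(tab)) return(NA_integer_)
  bounds <- strsplit(tab[[group]], "-", fixed = TRUE)
  lower <- as.numeric(vapply(bounds, `[`, character(1), 1))
  upper <- as.numeric(vapply(bounds, `[`, character(1), 2))
  hit <- which(raw >= lower & raw <= upper)
  if (length(hit) == 0) NA_integer_ else tab$sten[hit]
}

lookup_sten(150, "to 30:M")
lookup_sten(150, ".all1:F")  # group not present in this table
```

The same raw score can map to different stens depending on the group column, which is the point of developing grouped norms.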

Varia

The information above should be enough for basic usage of stenR. The package is developed with multiple use cases and general customizability in mind. Some additional possibilities are described below.

StandardScale

In the examples above we used STEN StandardScale object, which is provided in the package. You can check all available scales with ?default_scales doc.

You can also define your own StandardScale object using the StandardScale function.

new_scale <- StandardScale("my_scale", 10, 3, 0, 20)

# let's see if everything is correct
new_scale

# what does its distribution look like?
plot(new_scale)
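
As a rough illustration of what such a scale definition implies, the sketch below rescales z-scores to a scale with mean 10 and SD 3, limited to the range 0 to 20. It assumes that the positional arguments of the StandardScale() call above stand for the mean, SD, minimum and maximum, in that order; the helper to_standard_scale() is hypothetical and is not part of stenR.

```r
# Hypothetical helper: z-score -> rounded score on a bounded standard scale
to_standard_scale <- function(z, M, SD, lower, upper) {
  score <- round(M + SD * z)
  pmin(upper, pmax(lower, score))
}

to_standard_scale(c(-4, -1, 0, 1, 4), M = 10, SD = 3, lower = 0, upper = 20)
```

Extreme z-scores are capped at the scale limits, so very low and very high results all land on 0 or 20 respectively.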

CompScoreTable R6 object

In addition to the procedural workflow described above, there is also an R6 class prepared to handle the creation of ScoreTables and the generation of normalized scores: CompScoreTable.

This object has one particularly useful feature, namely the ability to automatically recalculate ScoreTables based on the raw score values passed to the standardize method. It can be helpful for inter-session continuity.

Currently there is only one such object, supporting the ungrouped workflow. A grouped version is in the works.

Initialize the object

During object initialization you can attach some previously calculated FrequencyTables and/or StandardScales. It is fully optional, as it can also be done afterwards.

# attach during initialization
HexCST <- CompScoreTable$new(
  tables = list(HEX_E = HEX_E_ft),
  scales = STEN
)

# attach later
altCST <- CompScoreTable$new()
altCST$attach_FrequencyTable(HEX_E_ft, "HEX_E")
altCST$attach_StandardScale(STEN)

# there are no visible differences in the structure of the objects
summary(HexCST)
summary(altCST)

Expand CompScoreTable

After creation, the object can be expanded with more FrequencyTables and StandardScales. All ScoreTables will be recalculated internally.

# add new FrequencyTable
HexCST$attach_FrequencyTable(FrequencyTable(HEXACO_60$HEX_C), "HEX_C")
summary(HexCST)

# add new StandardScale
HexCST$attach_StandardScale(STANINE)
summary(HexCST)

Standardize scores

After the object is ready, the score standardization may begin. Let's feed it some raw scores!

# standardize Emotionality (HEX_E) and Conscientiousness (HEX_C)
HexCST$standardize(
  data = head(HEXACO_60),
  what = "sten",
  vars = c("HEX_E", "HEX_C")
)

# you can also do this easily with pipes!
HEXACO_60[1:5, c("HEX_E", "HEX_C")] |>
  # no need to specify 'vars', as the correct columns are already selected
  HexCST$standardize("sten")

Automatically recalculate ScoreTables

During score standardization, you can also add new raw scores to the existing frequencies and have the ScoreTables recalculated automatically.

This is done before the values are returned, so they will be based on the most recent ScoreTables.

You can actually use standardize() with calc = TRUE just after attaching the scale or scales. ScoreTables will be generated automatically before the data standardization, so you will receive both the data and the computed ScoreTables.

# check the current state of the object
summary(HexCST)

# now, standardize and recalculate!
HEXACO_60[1:5, c("HEX_H", "HEX_C")] |>
  HexCST$standardize("sten", calc = TRUE)

# check the new state
summary(HexCST)
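
Conceptually, the recalculation starts with appending the new raw scores to the stored frequencies. A minimal base R sketch of that first step, using hypothetical scores and a hypothetical helper add_scores() (this is not the CompScoreTable implementation), could look like this:

```r
# Frequencies from an earlier session (hypothetical raw scores)
old_raw <- c(20, 21, 21, 25)
freq <- table(old_raw)

# Hypothetical helper: rebuild the counts after appending new raw scores
add_scores <- function(freq, new_raw) {
  previous <- rep(as.numeric(names(freq)), times = as.numeric(freq))
  table(c(previous, new_raw))
}

updated <- add_scores(freq, c(21, 30))
updated
```

Any norms derived from the frequencies would then need to be recomputed from the updated counts.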

Export tables

There is also an option to export the ScoreTables, either to use them later in a procedural way or to create a new CompScoreTable in another session. For the latter, they can also be exported as FrequencyTables.

# export as ScoreTables
st_list <- HexCST$export_ScoreTable()
summary(st_list)

# export as FrequencyTables
ft_list <- HexCST$export_ScoreTable(strip = TRUE)
summary(ft_list)

Simulate FrequencyTable using raw score distribution data

The examples above described the two most likely scenarios: either having raw scores to calculate norms yourself, or importing a scoring table from the measure's documentation.

A rarer but still possible scenario is having access only to descriptive statistics reported in a research article. Using them, we can create Simulated tables:

sim_ft <- SimFrequencyTable(min = 10, max = 50, M = 31.04, 
                            SD = 6.7, skew = -0.3, kurt = 2.89, seed = 2678)

class(sim_ft)

plot(sim_ft)

The Simulated class will be inherited by ScoreTable object created on its basis.

Simulated tables can be used in every way that regular ones can be with one exception: if used to create CompScoreTable object, the raw scores cannot be appended to this kind of table in standardize() method.

SimCST <- CompScoreTable$new(
  tables = list("simmed" = sim_ft),
  scales = STEN
)

SimCST$standardize(
  data = data.frame(simmed = round(runif(10, 10, 50), 0)),
  what = "sten",
  calc = TRUE)
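
The general idea of simulating frequencies from descriptive statistics can be sketched in base R. The example below is deliberately simplified: it draws from a normal distribution using only the mean and SD from the call above, ignores skewness and kurtosis, and drops values outside the 10 to 50 range. It is an assumption-laden illustration and not the algorithm used by SimFrequencyTable().

```r
set.seed(2678)

# Draw rounded scores from a normal distribution (skew and kurtosis ignored)
sim_raw <- round(rnorm(10000, mean = 31.04, sd = 6.7))

# Keep only scores within the possible raw score range
sim_raw <- sim_raw[sim_raw >= 10 & sim_raw <= 50]

# Tabulate over the full range so that unobserved scores get a zero count
sim_freq <- table(factor(sim_raw, levels = 10:50))
head(sim_freq)
```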

Extracting observations by group

There are also GroupAssignment() and intersect_GroupAssignment() functions to assign observations on the basis of one or two GroupConditions objects, described in the Groupings section. They are used internally by GroupedFrequencyTable(), normalize_scores_grouped() and normalize_scores_scoring(), but are also exported if you wish to extract_observations() manually. Check the examples in the documentation for more information.
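
For intuition, intersecting two groupings and extracting observations per group can be mimicked in base R with paste() and split(). The data below are hypothetical and the age split is simplified to two exhaustive groups; the stenR functions named above are not used here.

```r
# Hypothetical observations
people <- data.frame(
  id = 1:4,
  sex = c("M", "F", "F", "M"),
  age = c(25, 28, 41, 35)
)

# Simplified age grouping, then intersection with sex as "age:sex"
age_group <- ifelse(people$age < 30, "to 30", "above 30")
combined <- paste(age_group, people$sex, sep = ":")

# Observation ids per intersected group
by_group <- split(people$id, combined)
by_group
```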



StatisMike/stenR documentation built on Aug. 20, 2022, 9:30 a.m.