knitr::opts_chunk$set(
    echo = TRUE,
    message = FALSE,
    warning = FALSE
)
options(knitr.kable.NA = '')

# necessary to render the tutorial correctly
library(learnr) 
library(htmltools)
# tidyverse
library(dplyr)
library(ggplot2)
# non-tidyverse
library(correlation)
library(GPArotation)
library(knitr)
library(parameters)
library(psych)

source("./www/discovr_helpers.R")

# knitr::write_bib(c('forcats'), file = 'packages.bib')


# Read data files needed for the tutorial

raq_tib <- discovr::raq
raq_items_tib <- raq_tib |> dplyr::select(-id)

raq_poly <- psych::polychoric(raq_items_tib)
raq_cor <- raq_poly$rho
raq_fa <- raq_items_tib |> psych::fa(nfactors = 4, scores = "tenBerge", cor = "poly")
# Create bib file for R packages
here::here("inst/tutorials/discovr_18/packages.bib") |>
  knitr::write_bib(c('here', 'tidyverse', 'dplyr', 'readr', 'forcats', 'tibble', 'knitr', 'correlation', 'GPArotation', 'Hmisc', 'parameters', 'psych'), file = _)

discovr: Exploratory Factor Analysis (EFA)

Overview

[Image: discovr package hex sticker, female space pirate with gun; the gunsmoke forms the letter R.]

**Usage:** This tutorial accompanies [Discovering Statistics Using R and RStudio](https://www.discovr.rocks/) [@field_discovering_2023] by [Andy Field](https://en.wikipedia.org/wiki/Andy_Field_(academic)). It contains material from the book so there are some copyright considerations, but I offer it under a [Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License](http://creativecommons.org/licenses/by-nc-nd/4.0/). Tl;dr: you can use this tutorial for teaching and non-profit activities but please don't meddle with it or claim it as your own work.

r cat_space(fill = blu) Welcome to the discovr space pirate academy

Hi, welcome to discovr space pirate academy. Well done on embarking on this brave mission to planet r rproj()s, which is a bit like Mars, but less red and more hostile. That's right, more hostile than a planet without water. Fear not though: the fact you are here means that you can master r rproj(), and before you know it you'll be as brilliant as our pirate leader Mae Jemstone (she's the badass with the gun). I am the space cat, and I will pop up to offer you tips along your journey.

On your way you will face many challenges, but follow Mae's system to keep yourself on track:

It's not just me that's here to help though, you will meet other characters along the way:

Also, use hints and solutions to guide you through the exercises (Figure 1).

Each codebox has a hints or solution button that activates a popup window containing code and text to guide you through each exercise.
Figure 1: In a code exercise click the hints button to guide you through the exercise.

Bye for now and good luck - you'll be amazing!

Workflow

Packages

This tutorial uses the following packages: correlation [@R-correlation], GPArotation [@R-GPArotation], knitr [@R-knitr], parameters [@R-parameters] and psych [@R-psych].

It also uses these tidyverse packages [@R-tidyverse; @tidyverse2019]: dplyr [@R-dplyr] and readr [@R-readr].

Coding style

There are (broadly) two styles of coding:

  1. Explicit: Using this style you declare the package when using a function: package::function(). For example, if I want to use the mutate() function from the package dplyr, I will type dplyr::mutate(). If you adopt an explicit style, you don't need to load packages at the start of your Quarto document (although see below for some exceptions).

  2. Concise: Using this style you load all of the packages at the start of your Quarto document using library(package_name), and then refer to functions without their package. For example, if I want to use the mutate() function from the package dplyr, I will use library(dplyr) in my first code chunk and type the function as mutate() when I use it subsequently.

Coding style is a personal choice. The Google r rproj() style guide and tidyverse style guide recommend an explicit style, and I use it in teaching materials for two reasons: (1) it helps you to remember which functions come from which packages, and (2) it prevents clashes resulting from using functions from different packages that have the same name. However, even with this style it makes sense to load tidyverse because the dplyr and ggplot2 packages contain functions that are often used within other functions, and in these cases explicit code is difficult to read. Also, no-one wants to write ggplot2:: before every function from ggplot2.
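For example, here's the same operation written in both styles (a minimal sketch using the built-in [mtcars]{.alt} data rather than anything from this tutorial):

# Explicit style: declare the package each time; no library() call needed
mtcars |> dplyr::mutate(kpl = 0.425 * mpg)

# Concise style: load the package once, then use bare function names
library(dplyr)
mtcars |> mutate(kpl = 0.425 * mpg)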

You can use either style in this tutorial because all packages are pre-loaded. If working outside of the tutorial, load the tidyverse package (and any others if you're using a concise style) at the beginning of your Quarto document:

library(tidyverse)

Data

To work outside of this tutorial you need to download the following data file:

  * raq.csv

Set up an r rstudio() project in the way that I recommend in this tutorial, and save the data files to the folder within your project called [data]{.alt}. Place this code in the first code chunk in your Quarto document:

raq_tib <- here::here("data/raq.csv") |>
  readr::read_csv()

This code reads the data into an object called raq_tib.

r bmu() An r rproj()-induced anxiety example [(2)]{.alt}

Factor analysis is used frequently to develop questionnaires. A lot of students become very stressed about r rproj(). Imagine that I wanted to design a questionnaire to measure a trait that I termed 'r rproj() anxiety'. I devised a questionnaire to measure various aspects of students' anxiety towards learning r rproj() and statistics: the RAQ (r rproj() Anxiety Questionnaire). I generated questions based on interviews with anxious and non-anxious students and came up with 23 possible questions to include. Each question was a statement followed by a five-point Likert scale: strongly disagree = 1, disagree = 2, neither agree nor disagree = 3, agree = 4 and strongly agree = 5. I wanted to know whether anxiety about r rproj() could be broken down into specific forms of anxiety. In other words, what latent variables contribute to anxiety about learning or using r rproj()?

The questions in the RAQ are as follows (each has responses on the five-point scale from strongly disagree to strongly agree):

  1. raq_01: Statistics make me cry
  2. raq_02: My friends will think I'm stupid for not being able to cope with r rproj()
  3. raq_03: Standard deviations excite me
  4. raq_04: I dream that Pearson is attacking me with correlation coefficients
  5. raq_05: I don't understand statistics
  6. raq_06: I have little experience of computers
  7. raq_07: All computers hate me
  8. raq_08: I have never been good at mathematics
  9. raq_09: My friends are better at statistics than me
  10. raq_10: Computers are useful only for playing games
  11. raq_11: I did badly at mathematics at school
  12. raq_12: People try to tell you that r rproj() makes statistics easier to understand but it doesn't
  13. raq_13: I worry that I will cause irreparable damage because of my incompetence with computers
  14. raq_14: Computers have minds of their own and deliberately go wrong whenever I use them
  15. raq_15: Computers are out to get me
  16. raq_16: I weep openly at the mention of central tendency
  17. raq_17: I slip into a coma whenever I see an equation
  18. raq_18: R always crashes when I try to use it
  19. raq_19: Everybody looks at me when I use r rproj()
  20. raq_20: I can't sleep for thoughts of eigenvectors
  21. raq_21: I wake up under my duvet thinking that I am trapped under a normal distribution
  22. raq_22: My friends are better at r rproj() than I am
  23. raq_23: If I am good at statistics people will think I am a nerd

With a little help from a few lecturer friends I collected 2571 completed questionnaires. The data are in [raq_tib]{.alt}.

r alien() Alien coding challenge

View the data in [raq_tib]{.alt}.


raq_tib

Note that there are 24 variables: the participant id and the 23 RAQ items (raq_01 to raq_23).

r bmu() Pre-analysis [(2)]{.alt}

As we've seen, the data has a variable id containing the participant ID. We won't want this variable in the analyses we do. We can use select() to remove it each time we don't need it, but it will save a lot of repetitious code if we store a version of the data that contains only the scores from the RAQ items.

r alien() Alien coding challenge

Use the code box to create an object called [raq_items_tib]{.alt} that contains only the questionnaire item variables. Inspect this new object and note that the [id]{.alt} variable has gone.


raq_items_tib <- raq_tib |> 
  dplyr::select(-id)
raq_items_tib

We will use this new object, [raq_items_tib]{.alt}, for the remainder of the tutorial.

r bmu() Inspect the correlation matrix [(2)]{.alt}

The first thing to do is to look at the correlations between RAQ items.

r alien() Alien coding challenge

Using what you've learnt in previous tutorials, produce a correlation matrix for the 23 RAQ items.

`r cat_space()` **Tip: Correlation memory jog**

* We can use `correlation::correlation()` to get the correlations, and `summary()` to present them as a matrix/grid.
* You need to de-select the [id]{.alt} variable before computing the correlations; we can do this by using the [raq_items_tib]{.alt} version of the data that we just created.
* You can round values by piping into `kable()` in the usual way.

# feed the tibble with only the RAQ items into the correlation() function
correlation::correlation(raq_items_tib)
# pipe into summary() to get a condensed table of correlations:
correlation::correlation(raq_items_tib) |>
  summary() |> 
  knitr::kable(digits = 2)

That was great practice, but because our items use Likert-style scales it is more appropriate to look at something called the [polychoric correlation]{.alt} rather than the Pearson correlation coefficient. We can obtain polychoric correlations by placing the tibble containing our data into the polychoric() function from the psych package. For example,

raq_poly <- psych::polychoric(raq_items_tib)

This code takes the data from the RAQ (raq_items_tib), puts it through the polychoric() function to get the correlation coefficients and stores them in an object called raq_poly. This object contains information other than the correlation coefficients. The matrix of correlation coefficients is stored in a variable called rho, so we can access it using raq_poly$rho. However, we need to use this matrix at various points so it's helpful to store it as its own object.

raq_cor <- raq_poly$rho

This code stores the polychoric correlation matrix as an object called raq_cor. We view these correlations by executing the name of the object, but it's helpful to round them to 2 decimal places using the kable() function that we have met before.

r alien() Alien coding challenge

Create an object called [raq_cor]{.alt} that contains the polychoric correlations for the 23 RAQ items. View it using kable() to round values to 2 decimal places.


# create the polychoric correlation object
raq_poly <- psych::polychoric(raq_items_tib)
# save the correlations from the object you just created
raq_cor <- raq_poly$rho
# view the correlations rounded to 2dp:
knitr::kable(raq_cor, digits = 2)
# put it all together
raq_poly <- psych::polychoric(raq_items_tib)
raq_cor <- raq_poly$rho
knitr::kable(raq_cor, digits = 2)

Staring at all of those numbers can make your eyes hurt. We can instead produce a heat map of correlations using the cor.plot() function from the psych package. This function plots a grid of the correlations and colours each cell by the strength and direction of the correlation. We can get a sense of the relationships in the data by looking for very lightly shaded cells (correlations close to zero) and very darkly shaded cells (correlations close to 1 or -1). Given that correlation matrices are symmetrical, the information below the diagonal is identical to the information above it. It is, therefore, helpful to use [upper = FALSE]{.alt} to hide the information above the diagonal.

r robot() Code example

To use the cor.plot() function we feed the correlation matrix we just created into it.

psych::cor.plot(raq_cor, upper = FALSE)

r alien() Alien coding challenge

Use the code box to visualise the correlation matrix using cor.plot().


psych::cor.plot(raq_cor, upper = FALSE)

We can visually scan the correlation matrix and look for correlations that fall between about -0.3 and 0.3. You're looking for any variables that have lots of correlations that are close to 0 (no relationship). There's nothing magic about the value 0.3; it's a fairly arbitrary and inclusive heuristic to allow you to narrow your focus! Also note correlations greater than (roughly) ±0.9 because these might indicate collinear or singular items.
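If you'd rather not rely on eyeballing alone, here's an optional sketch (my own addition, using base r rproj() functions) that counts, for each item, how many correlations exceed ±0.3 and flags any pairs beyond ±0.9:

# work on a copy so that raq_cor itself is unchanged
r_check <- raq_cor
diag(r_check) <- NA # ignore each item's correlation with itself

# for each item, how many correlations are larger than +/- 0.3?
sort(rowSums(abs(r_check) > 0.3, na.rm = TRUE))

# any correlations beyond +/- 0.9 (possible collinearity or singularity)?
which(abs(r_check) > 0.9, arr.ind = TRUE)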

All questions in the RAQ correlate reasonably well with all others and none of the correlation coefficients are excessively large; therefore, we won't eliminate any questions based on the size of the correlations.

r bmu() Bartlett's test and the KMO test [(2)]{.alt}

Another way to inspect the correlation matrix is with Bartlett's test of sphericity, which tests whether the correlation matrix is significantly different from an identity matrix (i.e. it tests whether the correlation coefficients are all 0). It tests a pretty extreme scenario and, because significance depends on sample size (and in factor analysis sample sizes tend to be very large), the test will nearly always be significant. So it's not a particularly useful test, but it's worth doing because in the unlikely event that it is non-significant you likely have a big problem on your hands.

r robot() Code example

We can implement Bartlett's test using the psych::cortest.bartlett() function. To use it, we pipe the correlation matrix ([raq_cor]{.alt}) into it and specify the sample size using, for these data, [n = 2571]{.alt}. For other data sets change the value of 2571 to be the sample size.

psych::cortest.bartlett(raq_cor, n = 2571)

r alien() Alien coding challenge

Use the code box to get Bartlett's test.


psych::cortest.bartlett(raq_cor, n = 2571)
bartlett <- psych::cortest.bartlett(raq_cor, n = 2571)

As expected, given the huge sample size, Bartlett's test is highly significant, $\chi^2$(r bartlett$df) = r sprintf("%.2f", bartlett$chisq), p = r sprintf("%.2f", bartlett$p.value). This significant value only really tells us that we don't have a massive problem, which is nice to know, I suppose.

You might also want to check the sampling adequacy (spoiler: with 2571 participants I think we're good). The Kaiser–Meyer–Olkin measure of sampling adequacy (KMO) varies between 0 and 1, with a value of 0 indicating that factor analysis is likely to be inappropriate. A value close to 1 indicates that patterns of correlations are relatively compact and so factor analysis should yield distinct and reliable factors. Kaiser and Rice (1974) provided appealing guidelines, especially if you like the letter M:

  * 0.5 to 0.6: miserable
  * 0.6 to 0.7: mediocre
  * 0.7 to 0.8: middling
  * 0.8 to 0.9: meritorious
  * 0.9 and above: marvellous

Values smaller than 0.5 should lead you either to collect more data or to rethink which variables to include.

r robot() Code example

We can implement the KMO test using the psych::KMO() function. To use it, we again feed the correlation matrix ([raq_cor]{.alt}) into it. For example:

psych::KMO(raq_cor)

r alien() Alien coding challenge

Use the code box to get the KMO test.


psych::KMO(raq_cor)
kmo <- psych::KMO(raq_cor)

The KMO statistic ([Overall MSA]{.alt}) is r sprintf("%.2f", kmo$MSA), which is well above the minimum criterion of 0.5 and falls into the range of marvellous. The KMO values for individual variables range from r sprintf("%.2f", min(kmo$MSAi)) to r sprintf("%.2f", max(kmo$MSAi)). All values are, therefore, well above 0.5, which is good news.

`r info()` **Information** If you find variables with KMO values below 0.5 then consider excluding them from the analysis (or run the analysis with and without that variable and note the difference). Removal of a variable affects the KMO statistics, so if you plan to remove a variable be sure to re-examine the KMO for the correlation matrix that has the variable removed.
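None of the RAQ items fall below 0.5, but if you ever need to act on this advice, a sketch like the following (using the [kmo]{.alt} object and its per-item values in [kmo$MSAi]{.alt}, as above) will identify offending items and re-examine the KMO without them:

# items with individual KMO (MSA) values below 0.5, if any
low_items <- names(which(kmo$MSAi < 0.5))
low_items

# re-examine the KMO with those items removed
keep <- setdiff(colnames(raq_cor), low_items)
psych::KMO(raq_cor[keep, keep])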

r bmu() Parallel analysis [(2)]{.alt}

To determine how many factors we should extract, we can use the psych::fa.parallel() function to run a parallel analysis. This function has a lot of arguments; the ones most likely to be of use to you are [fa]{.alt} (whether to base the eigenvalues on factor analysis, "fa", principal components, "pc", or "both"), [cor]{.alt} (the type of correlation to use, for example "poly" for polychoric), [n.obs]{.alt} (the sample size, which you must supply if you feed in a correlation matrix rather than raw data) and [n.iter]{.alt} (the number of simulated data sets to compare against).

r robot() Code example

You can feed your tibble containing the item data from the questionnaire into the fa.parallel() function. All of the default settings are fine for our purposes except that we need to use polychoric correlations. We do this by including the [cor = "poly"]{.alt} argument.

psych::fa.parallel(raq_items_tib, cor = "poly")

Alternatively, given that we have already computed the polychoric correlations and stored them in [raq_cor]{.alt}, it's faster to apply the function directly to this correlation matrix and specify the sample size with [n.obs = 2571]{.alt}.

psych::fa.parallel(raq_cor, n.obs = 2571)

r alien() Alien coding challenge

Use the code box to run a parallel analysis with the default settings but specifying [fa = "fa"]{.alt}. How many factors should be extracted?


# Choose one of
# Using the raw data
psych::fa.parallel(raq_items_tib, fa = "fa", cor = "poly")

# Using the correlation matrix (faster)
psych::fa.parallel(raq_cor, n.obs = 2571, fa = "fa")

In a parallel analysis each eigenvalue (which represents the size of the factor) is compared against an eigenvalue for the corresponding factor in many randomly generated data sets that have the same characteristics as the data being analysed. In doing so, each eigenvalue is compared to an eigenvalue from a data set that has no underlying factors. This is a bit like asking whether our observed factor is bigger than a non-existent factor. Factors that are bigger than their 'random' counterparts are retained. In the scree plot, each factor is plotted on the x-axis with its corresponding eigenvalue on the y-axis. The observed eigenvalues are shown as blue triangles connected by a blue line to form the characteristic 'scree' shape. The red dotted line shows the corresponding line from the randomly generated data sets. Given that we want to keep factors with eigenvalues larger than we'd expect in data that has no underlying factors, we keep factors whose triangles (eigenvalues) fall above the red line. In this case, the first four factors have eigenvalues larger than their counterparts from random data. The output also contains a handy message confirming that we should extract 4 factors.
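The function also returns its results, so you can store them and query the suggested number of factors directly rather than reading it from the text output (a sketch; [nfact]{.alt} and [fa.values]{.alt} are components of the object that fa.parallel() returns):

pa <- psych::fa.parallel(raq_cor, n.obs = 2571, fa = "fa")
pa$nfact # suggested number of factors
pa$fa.values # the observed eigenvalues plotted as blue triangles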

r alien() Alien coding challenge

Repeat the parallel analysis but this time using principal components to compute the eigenvalues.


# Choose one of
# Using the raw data
psych::fa.parallel(raq_items_tib, fa = "pc", cor = "poly")

# Using the correlation matrix (faster)
psych::fa.parallel(raq_cor, n.obs = 2571, fa = "pc")
quiz(caption = "Parallel analysis quiz",
     question("Based on the parallel analysis that used principal components to compute the eigenvalues, how many factors should be extracted?",
    answer("4", correct = TRUE, message = "Yes, fortunately this analysis agrees with the parallel analysis based on eigenvalues from factor analysis."),
    answer("5", message = "Look at how many components have eigenvalues above the red line."),
    answer("3", message = "Look at how many components have eigenvalues above the red line."),
    answer("2", message = "Look at how many components have eigenvalues above the red line."),
    random_answer_order = TRUE,
    allow_retry = TRUE
    )
)

r bmu() Factor analysis [(2)]{.alt}

We can now run the factor analysis, extracting 4 factors, using the psych::fa() function. This function has this general format (I'm including only the arguments you're most likely to use):

my_fa_object <- psych::fa(r,
                          nfactors = 1,
                          fm = "minres",
                          rotate = "oblimin",
                          scores = "regression",
                          max.iter = 50,
                          use = "pairwise",
                          cor = "cor"
                          )

In which you replace [my_fa_object]{.alt} with the name you want to give your factor analysis object.

`r cat_space()` **Tip: factor rotation**

To do factor rotation we need to have the `GPArotation` package loaded. Make sure that you have it installed by executing the following in the console:

install.packages("GPArotation")

Once installed, make sure that your Quarto document includes (if you follow my advice you'll do this in the code chunk where you set up your Quarto document):

library(GPArotation)

r robot() Code example

As with parallel analysis we can feed the raw data into the function (remembering to set [cor = "poly"]{.alt} so that the analysis is based on the polychoric correlations), or we can feed in the correlation matrix and sample size ([n.obs = 2571]{.alt}).

For the RAQ data, we could use the following code, which leaves a lot of the default settings alone, to create an object [raq_fa]{.alt}:

raq_fa <- psych::fa(raq_items_tib,
                    nfactors = 4,
                    scores = "tenBerge",
                    cor = "poly"
                    )

Using the correlation matrix instead of the raw data we'd get:

raq_fa <- psych::fa(raq_cor,
                    n.obs = 2571,
                    nfactors = 4,
                    scores = "tenBerge"
                    ) 

r alien() Alien coding challenge

Use the code box to create the object [raq_fa]{.alt} using the code above and then inspect it.


# Use either (but not both)
# From the raw data
raq_fa <- psych::fa(raq_items_tib, nfactors = 4, scores = "tenBerge", cor = "poly")
raq_fa
# From the correlation matrix
raq_fa <- psych::fa(raq_cor, n.obs = 2571, nfactors = 4, scores = "tenBerge") 
raq_fa
raq_fa <- psych::fa(raq_cor, n.obs = 2571, nfactors = 4, scores = "tenBerge") 

There's a lot of output. First, we are shown the pattern matrix, but we'll return to this. Note that the factors are labelled [MR1]{.alt}, [MR2]{.alt}, [MR3]{.alt} and [MR4]{.alt} (MR because the default estimation method is minimum residual). We are also given some information about how much variance each factor accounts for.

raq_fa$Vaccounted |>
  knitr::kable(digits = 2)
va <- raq_fa$Vaccounted

prop_var <- sprintf("%.2f", va[2, ])
perc_var <- sprintf("%.0f", 100*va[2, ])
cum_var <- sprintf("%.2f", va[3, ])
cum_perc_var <- sprintf("%.0f", 100*va[3, ])
pe <- sprintf("%.2f", va[4, ])
pe_perc <- sprintf("%.0f", 100*va[4, ])

We see, for example, from [Proportion Var]{.alt} that [MR1]{.alt} accounts for r prop_var[1] of the overall variance (r perc_var[1]%), [MR2]{.alt} accounts for r prop_var[2] of the variance (r perc_var[2]%) and so on. The [Cumulative Var]{.alt} is the proportion of variance explained cumulatively by the factors. So, cumulatively, [MR1]{.alt} accounts for r prop_var[1] of the overall variance (r perc_var[1]%) and [MR1]{.alt} and [MR2]{.alt} together account for r prop_var[1] + r prop_var[2] = r cum_var[2] of the variance (r cum_perc_var[2]%). Importantly, we can see that all four factors in combination explain r cum_var[4] of the overall variance (r cum_perc_var[4]%).

The [Proportion Explained]{.alt} is the proportion of the explained variance that is attributable to each factor. So, of the r cum_perc_var[4]% of variance accounted for, r pe[1] (r pe_perc[1]%) is attributable to [MR1]{.alt}, r pe[2] (r pe_perc[2]%) to [MR2]{.alt}, r pe[3] (r pe_perc[3]%) to [MR4]{.alt} and r pe[4] (r pe_perc[4]%) to [MR3]{.alt}.

The correlations between factors are also displayed:

raq_fa$Phi |>
  knitr::kable(digits = 2)

These are all non-zero, indicating that the factors are correlated (and that oblique rotation was appropriate). It also tells us the degree to which factors are correlated. All of the factors positively, and fairly strongly, correlate with each other. In other words, the latent constructs represented by the factors are related.

There are several fit indices that tell us how well the model fits the data.

`r info()` **Fit indices**

Good fit is (probably) indicated by:

* A combination of TLI > 0.96 and SRMR (RMSR in the output) < 0.06
* A combination of RMSEA < 0.05 and SRMR < 0.09

The TLI is `r sprintf("%.2f", raq_fa$TLI)`, which is greater than 0.96, and RMSR is `r sprintf("%.2f", raq_fa$rms)`, which is smaller than both 0.09 and 0.06. Furthermore, RMSEA is `r sprintf("%.3f", raq_fa$RMSEA[1])`, which is less than 0.05. With the caveat that universal cut-offs need to be taken with a pinch of salt, it's reasonable to conclude that the model has excellent fit.

To interpret the factor analysis we look at the factor loadings for each question on each factor to see which items load most heavily onto which factors. The factor loadings are shown in the pattern matrix that forms part of the main output, but this is quite difficult to interpret in its raw form. To make life easier we can use the parameters::model_parameters() function, which provides an easy way to sort the items by their factor loadings and suppress factor loadings below a certain value (which can help you to see the wood for the trees). In general it takes the form:

parameters::model_parameters(my_fa_object, sort = TRUE, threshold = "max")

In which [my_fa_object]{.alt} is the factor analysis object containing the factor loadings. Setting [sort = TRUE]{.alt} sorts the items by their factor loadings, and [threshold]{.alt} controls which loadings are displayed. By default the threshold is the maximum loading (i.e. for each item only its largest loading is shown). To see all of the factor loadings set [threshold = NULL]{.alt}; to see only loadings above a particular value (between 0 and 1) set [threshold]{.alt} to that value. For example, [threshold = 0.2]{.alt} displays only factor loadings above 0.2.
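For example, to compare the default with the full set of loadings (using the [raq_fa]{.alt} object we created earlier):

# default: show only each item's largest loading
parameters::model_parameters(raq_fa, sort = TRUE, threshold = "max")

# show every loading
parameters::model_parameters(raq_fa, sort = TRUE, threshold = NULL)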

r alien() Alien coding challenge

Use the code box to sort the factor loadings and display only those greater than 0.2. Round to 2 decimal places using kable(). (Remember that earlier we saved the factor analysis object as [raq_fa]{.alt}.)


parameters::model_parameters(raq_fa, sort = TRUE, threshold = 0.2) |> 
  knitr::kable(digits = 2)

This makes it much easier to see the patterns in the questions that load onto the same factors. The questions that load highly on [MR1]{.alt} seem to be items that relate to Fear of computers:

  * raq_05: I don't understand statistics
  * raq_06: I have little experience of computers
  * raq_07: All computers hate me
  * raq_10: Computers are useful only for playing games
  * raq_13: I worry that I will cause irreparable damage because of my incompetence with computers
  * raq_14: Computers have minds of their own and deliberately go wrong whenever I use them
  * raq_15: Computers are out to get me
  * raq_18: R always crashes when I try to use it

Note that item 5 also loads highly onto [MR4]{.alt}.

The questions that load highly on [MR2]{.alt} seem to be items that relate to Fear of peer/social evaluation:

  * raq_02: My friends will think I'm stupid for not being able to cope with r rproj()
  * raq_09: My friends are better at statistics than me
  * raq_19: Everybody looks at me when I use r rproj()
  * raq_22: My friends are better at r rproj() than I am
  * raq_23: If I am good at statistics people will think I am a nerd

The questions that load highly on [MR4]{.alt} seem to be items that relate to Fear of statistics:

  * raq_01: Statistics make me cry
  * raq_03: Standard deviations excite me
  * raq_04: I dream that Pearson is attacking me with correlation coefficients
  * raq_12: People try to tell you that r rproj() makes statistics easier to understand but it doesn't
  * raq_16: I weep openly at the mention of central tendency
  * raq_20: I can't sleep for thoughts of eigenvectors
  * raq_21: I wake up under my duvet thinking that I am trapped under a normal distribution

The questions that load highly on [MR3]{.alt} seem to be items that relate to Fear of mathematics:

  * raq_08: I have never been good at mathematics
  * raq_11: I did badly at mathematics at school
  * raq_17: I slip into a coma whenever I see an equation

This analysis seems to reveal that the questionnaire is composed of four subscales: fear of statistics, fear of computers, fear of maths and fear of negative peer evaluation. There are two possibilities here. The first is that the RAQ failed to measure what it set out to (namely, r rproj() anxiety) but instead measures related constructs. The second is that these four constructs are sub-components of r rproj() anxiety; however, the factor analysis does not indicate which of these possibilities is true.

r bmu() Reliability analysis [(2)]{.alt}

r bmu() McDonald's $\omega_t$ and $\omega_h$ [(2)]{.alt}

If the items are all scored in the same direction then we can select the variables on a particular subscale and pipe them into the omega() function from the psych package. This function has the following general format:

my_omg <- psych::omega(
    my_tibble,
    nfactors = 1,
    fm = "minres",
    key = c(1, 1, -1, 1, 1),
    rotate = "oblimin",
    poly = FALSE
  )

In which [my_omg]{.alt} is the name you want to give to the object that stores the results, and [my_tibble]{.alt} is the name of your data.

r robot() Code example

To compute omega you need to reconstruct the original factor analysis, so many of the arguments will be familiar and you set them to however they were set for your factor analysis. For example, we extracted 4 factors so we'd use [nfactors = 4]{.alt}; we used the minres method with oblimin rotation so we'd set [fm = "minres"]{.alt} and [rotate = "oblimin"]{.alt}; and because we used polychoric correlations we need to include [poly = TRUE]{.alt}.

The key argument allows us to reverse item scoring on the fly. We supply the key argument with a vector of 1s and −1s that is the same length as the number of variables being fed into omega(). For the RAQ we have 23 items, and one of them (raq_03) is reverse scored. Assuming we enter the items into omega() in order, we'd use a vector of 23 1s and assign the third one a minus sign:

key = c(1, 1, -1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)
`r cat_space()` **Tip: using `rep()`**

The `rep()` function, which we've met before, can help us to specify the key argument. It takes the form:

rep(x, times)

On the RAQ, after item 3 all items are scored normally, meaning that we have 20 repetitions of 1. We can specify these with `rep(1, 20)` within our vector:

key = c(1, 1, -1, rep(1, 20))

r robot() Code example

Putting this all together we can get omega for the RAQ by executing

raq_omg <- psych::omega(raq_items_tib,
                        nfactors = 4,
                        fm = "minres",
                        key = c(1, 1, -1, rep(1, 20)),
                        poly = TRUE
                        )

This code puts the raw data (in [raq_items_tib]{.alt}) into the omega() function, replicates the arguments used for the initial factor analysis, reverse scores raq_03, and tells the function to compute polychoric correlations. The results are stored in an object that I've called [raq_omg]{.alt}. To view this object execute its name.

r alien() Alien coding challenge

Use the code box to run a reliability analysis on the RAQ.


raq_omg <- psych::omega(raq_items_tib,
                        nfactors = 4,
                        fm = "minres",
                        key = c(1, 1, -1, rep(1, 20)),
                        poly = TRUE
                        )
raq_omg 
raq_omg <- psych::omega(raq_items_tib,
                        nfactors = 4,
                        fm = "minres",
                        key = c(1, 1, -1, rep(1, 20)),
                        poly = TRUE, plot = F
                        )

The first thing to look at is the table of factor loadings (reproduced below). Note that raq_03 is helpfully labelled as [raq_03-]{.alt} (note the minus sign suffix) to indicate that it is reverse scored. The column labelled [g]{.alt} shows the loading of each item on the general factor. This column should be inspected before anything else because both $\omega_h$ and $\omega_t$ are based on a model with a common factor. If any items load poorly onto this general factor then a common factor model isn't appropriate. Also, remember that $\omega_h$ uses the communalities for each item with only the general factor, so it is based on these loadings. All items here have loadings with the general factor far enough away from zero that we might consider a general factor model to be appropriate. The columns labelled F1 to F4 show the factor loadings for each item on the four factors that we asked to be extracted. Note that the values don't match those from the main factor analysis (because this model also includes a general factor) but the patterns of item loadings to factors do.

raq_omg$schmid$sl |>
  as.data.frame() |> 
  dplyr::mutate(
    across(`F1*`:`F4*`, \(x) ifelse(abs(x) > 0.2, x, NA))
  ) |> 
  knitr::kable(digits = 2)

At the top of the text output are the reliability estimates for the general factor, which include Cronbach's $\alpha$ = r sprintf("%.2f", raq_omg$alpha), $\omega_h$ = r sprintf("%.2f", raq_omg$omega_h) and $\omega_t$ = r sprintf("%.2f", raq_omg$omega.tot). If you're into $\alpha$ then there's a section for you later on. $\omega_h$ is a gauge of the extent to which items reflect a single construct and this value (r sprintf("%.2f", raq_omg$omega_h)) suggests that to some extent they do, but there is still a lot of variance in the general factor that is unexplained. In fact, further down we're told that only about half (r sprintf("%.2f", raq_omg$ECV)) of the variance in the general factor is explained by the items. $\omega_t$ is the total reliability and r sprintf("%.2f", raq_omg$omega.tot) is a high value suggesting the scores are reliable (i.e., the scale is reliable in this sample).

Next we get two lots of model fit statistics. The first set are for the model that has four factors and repeat the information from the original EFA. The chi-square statistic is significant, $\chi^2$ = r sprintf("%.2f", raq_omg$schmid$STATISTIC), p < 0.001, which is a bad thing but unsurprising given the sample size; the RMSR = r sprintf("%.2f", raq_omg$schmid$rms) and the RMSEA = r sprintf("%.2f", raq_omg$schmid$RMSEA[1]) 90% CI [r sprintf("%.2f", raq_omg$schmid$RMSEA[2]), r sprintf("%.2f", raq_omg$schmid$RMSEA[3])]. Next, we get the same information but for a model that contains only the general factor (and not the four sub-factors). The fit gets worse, as shown by (1) a larger and more significant chi-square, $\chi^2$ = r sprintf("%.2f", raq_omg$gstats$STATISTIC), p < 0.001; (2) a larger RMSR = r sprintf("%.2f", raq_omg$gstats$rms); and (3) a larger RMSEA = r sprintf("%.2f", raq_omg$gstats$RMSEA[1]) 90% CI [r sprintf("%.2f", raq_omg$gstats$RMSEA[2]), r sprintf("%.2f", raq_omg$gstats$RMSEA[3])]. This tells us that the model that characterises the RAQ in terms of four factors is a better fit to the data than a model that characterises it as a single factor.

Finally, under Total, General and Subset omega for each subset we get the $\omega_t$ (labelled omega total) and $\omega_h$ (labelled Omega general) for the general factor (column [g]{.alt}) and for the subfactors. The values for the general factor repeat the information at the start of the output. The values for the sub-factors are particularly relevant for $\omega_t$ because it represents the total reliability of the scores, so these values tell us the total reliability of the scores from the underlying sub-scales. For anxiety related to computers (F1) we have $\omega_t$ = r sprintf("%.2f", raq_omg$omega.group$total[2]), for anxiety around peer or social evaluation (F2) $\omega_t$ = r sprintf("%.2f", raq_omg$omega.group$total[3]), for anxiety around statistics (F3) $\omega_t$ = r sprintf("%.2f", raq_omg$omega.group$total[4]), and for anxiety around maths (F4) $\omega_t$ = r sprintf("%.2f", raq_omg$omega.group$total[5]). Basically scores from each subscale are reliable but somewhat less so for anxiety around statistics.
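If you want this subscale table on its own, you can extract it from the omega object and tidy it with kable() (a sketch using the [omega.group]{.alt} component that the inline values above come from):

raq_omg$omega.group |> 
  knitr::kable(digits = 2)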

r bmu() Cronbach's $\alpha$ [(2)]{.alt}

There are lots of reasons not to use Cronbach's alpha (see the book for details). If you really must compute alpha then you need to compute it on the individual subscales. As a quick reminder, we had four subscales:

  * Fear of computers (raq_06, raq_07, raq_10, raq_13, raq_14, raq_15 and raq_18)
  * Fear of peer/social evaluation (raq_02, raq_09, raq_19, raq_22 and raq_23)
  * Fear of mathematics (raq_08, raq_11 and raq_17)
  * Fear of statistics (raq_01, raq_03, raq_04, raq_05, raq_12, raq_16, raq_20 and raq_21)

We pipe the variables for each subscale into the alpha() function from psych to get the conventional statistics that people who use alpha like to have.

r robot() Code example

For the fear of computers subscale we could execute (I'm excluding raq_05 because it makes more sense conceptually on the fear of statistics factor):

raq_tib |> 
  dplyr::select(raq_06, raq_07, raq_10, raq_13, raq_14, raq_15, raq_18) |> 
  psych::alpha()

r alien() Alien coding challenge

Use the code box to run a reliability analysis on the fear of computers subscale:


raq_tib |> 
  dplyr::select(raq_06, raq_07, raq_10, raq_13, raq_14, raq_15, raq_18) |> 
  psych::alpha()
r1 <- raq_tib |> 
  dplyr::select(raq_06, raq_07, raq_10, raq_13, raq_14, raq_15, raq_18) |> 
  psych::alpha()

get_alpha <- function(x, digits = 2){
  dp <- paste0("%.", digits, "f")
  a <- x$total$raw_alpha
  se <- x$total$ase

  ci_up <- sprintf(dp, a + 1.96*se)
  ci_lo <- sprintf(dp, a - 1.96*se)


  paste0("$\\alpha$", " = ", sprintf(dp, a), " [", ci_lo, ", ", ci_up, "]")  
}

First, and perhaps most important, the value of Alpha at the very top is Cronbach's $\alpha$, and we are given its 95% confidence interval below: we're looking for values in the range of .7 to .8 (or thereabouts). In this case, r get_alpha(r1), which probably indicates good reliability.

Next, we get a table giving the statistics for the scale if we deleted each item in turn. The values in the column labelled [raw_alpha]{.alt} are the values of the overall $\alpha$ if that item is not included in the calculation. Basically, each value shows what the overall $\alpha$ would be if that particular item were deleted. The overall $\alpha$ is r sprintf("%.2f", r1$total$raw_alpha), and so any value greater than this indicates that the overall $\alpha$ would increase (i.e. reliability would improve) if the item were removed. None of the items here would improve reliability if they were deleted.
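If you want to work with that table directly, for example to round it, you can extract it from the object we stored earlier (a sketch using the [alpha.drop]{.alt} component of the psych::alpha() output):

r1$alpha.drop |> 
  knitr::kable(digits = 2)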

The table labelled [item statistics]{.alt} shows, in the column labelled [raw.r]{.alt}, the correlations between each item and the total score from the scale, sometimes called item-total correlations. There's a problem with this statistic in that the item is included in the scale total, which inflates the overall correlation. Ideally we want these correlations to be computed without the item in question, and these are the values in [r.drop]{.alt}. In a reliable scale all items should correlate with the total. So, we're looking for items that don't correlate with the overall score from the sub-scale: if any of these values of [r.drop]{.alt} are less than about .3 then we've got problems, because it means that a particular item does not correlate very well with the subscale overall. (The .3 is an arbitrary value that represents a reasonable-sized relationship, but there's nothing magic about it, so use your own judgement.) For the fear of computers subscale, all items have corrected item-total correlations above 0.3, which is encouraging.

The final table tells us what percentage of people gave each response to each of the items. This is useful to make sure that everyone in your sample is not giving the same response: an item where everyone (or almost everyone) gives the same response will almost certainly have poor reliability statistics. For this subscale few people responded with a 1 on any of the items, suggesting that no-one is really feeling the love for computers, or that the items are doing a poor job of eliciting those extreme responses.
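Incidentally, you can get the same frequencies without running the full reliability analysis by piping the items into psych::response.frequencies() (a quick sketch):

raq_tib |> 
  dplyr::select(raq_06, raq_07, raq_10, raq_13, raq_14, raq_15, raq_18) |> 
  psych::response.frequencies() |> 
  knitr::kable(digits = 2)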

r alien() Alien coding challenge

Use the code box to run a reliability analysis on the Fear of peer/social evaluation subscale:


raq_tib |> 
  dplyr::select(raq_02, raq_09, raq_19, raq_22, raq_23) |> 
  psych::alpha()
r2 <- raq_tib |> 
  dplyr::select(raq_02, raq_09, raq_19, raq_22, raq_23) |> 
  psych::alpha()

Again we have good overall reliability (r get_alpha(r2)), no items improve this value if dropped, item correlations with the scale total are all really good and again we have issues with the items not eliciting extremely low responses (for all items only 0.02, or 2%, of responders responded with a 1).

r alien() Alien coding challenge

Use the code box to run a reliability analysis on the Fear of mathematics subscale:


raq_tib |> 
  dplyr::select(raq_08, raq_11, raq_17) |> 
  psych::alpha()
r3 <- raq_tib |> 
  dplyr::select(raq_08, raq_11, raq_17) |> 
  psych::alpha()

Overall reliability is still fairly high (r get_alpha(r3)), no items improve this value if dropped, item correlations with the scale total are all really good, and again we have issues with the items not eliciting extremely low responses (again, for each item only 2% of responders responded with a 1).

The fear of statistics subscale has a reverse scored item: Standard deviations excite me (raq_03), so we need to indicate that this item is reverse scored. As with the omega() function, we need to tell the alpha() function which items are reversed, this time by using the [keys]{.alt} argument. Just to confuse you, note that this argument is [keys]{.alt} (with an s) whereas for omega() it was [key]{.alt} (without an s). We supply this argument with a vector of 1s and -1s as we did before. If we selected the items for the fear of statistics subscale in order by using

raq_tib |> 
  dplyr::select(raq_01, raq_03, raq_04, raq_05, raq_12, raq_16, raq_20, raq_21)

Then we'd need to indicate that the second item in the list of variables is reverse scored, which we'd do using:

psych::alpha(keys = c(1, -1, 1, 1, 1, 1, 1, 1))

Note that the -1 is in the same position (second) as [raq_03]{.alt} is in the variable list.

r robot() Code example

For the fear of statistics subscale we would execute:

raq_tib |> 
  dplyr::select(raq_01, raq_03, raq_04, raq_05, raq_12, raq_16, raq_20, raq_21) |> 
  psych::alpha(keys = c(1, -1, 1, 1, 1, 1, 1, 1))

r alien() Alien coding challenge

Use the code box to run a reliability analysis on the fear of statistics subscale:


raq_tib |> 
  dplyr::select(raq_01, raq_03, raq_04, raq_05, raq_12, raq_16, raq_20, raq_21) |> 
  psych::alpha(keys = c(1, -1, 1, 1, 1, 1, 1, 1))
r4 <- raq_tib |> 
  dplyr::select(raq_01, raq_03, raq_04, raq_05, raq_12, raq_16, raq_20, raq_21) |> 
  psych::alpha(keys = c(1, -1, 1, 1, 1, 1, 1, 1))

Note in the output that [raq_03]{.alt} is helpfully labelled as [raq_03-]{.alt} to indicate that it is reverse scored. Again we have acceptable, but not staggering, overall reliability (r get_alpha(r4)), no item improves this value if it is dropped, item correlations with the scale total are not too bad (they range from 0.27 to 0.49) and again we have issues with the items not eliciting extreme responses.

[Image: discovr package hex sticker, female space pirate with gun; the gunsmoke forms the letter R.]

**A message from Mae Jemstone:** Mae was burnt out. She thought the tutorial writing was over. 'Just one term, then you can rest' they had said. Not for the first time, they had lied. She raised her weary body and stared at her screen. It made her feel anxious. She couldn't do it anymore; she hadn't the strength. 'Who will believe me?' she thought. People depend on me, I'm not allowed to show weakness. But how can I find the will? She remembered the box, the box from the Senza, and very slowly her fingers started to type.

Resources {data-progressive=FALSE}

Statistics

r rproj()

Acknowledgement

I'm extremely grateful to Allison Horst for her very informative blog post on styling learnr tutorials with CSS and also for sending me a CSS template file and allowing me to adapt it. Without Allison, these tutorials would look a lot worse (but she can't be blamed for my colour scheme).

References


