knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  out.width = "100%", 
  fig.asp = 0.7,
  fig.width = 12,
  fig.align = "center",
  cache = FALSE,
  external = FALSE
)
library("ALASCA")
library("data.table")
library("ggplot2")
theme_set(
  theme_bw() + theme(legend.position = "bottom")
  )

Using ALASCA for classification or prediction

We will use the same code to simulate data sets as in the Regression vignette. In brief, we generate a training and a test data set, and use ALASCA together with PLS-DA to classify participants by group.

Generate a data set

We will start by creating an artificial data set with 100 participants, 5 time points, 2 groups, and 20 variables. The variables follow four patterns.

The two groups differ at baseline, and one of the groups has larger changes throughout the study.

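A minimal sketch of such a simulation is shown below. The seed, effect sizes, and exact pattern shapes are assumptions for illustration, not the vignette's actual code.

```r
library(data.table)

set.seed(1)  # hypothetical seed
n_id <- 100; n_time <- 5; n_var <- 20

df <- CJ(id = seq_len(n_id), time = seq_len(n_time), variable = seq_len(n_var))
df[, group := ifelse(id <= n_id / 2, "Group 1", "Group 2")]

# Four repeating patterns across the 20 variables: linear increase,
# linear decrease, a mid-study peak, and no time effect
df[, pattern := (variable - 1) %% 4 + 1]
df[, trend := fcase(
  pattern == 1, as.numeric(time),
  pattern == 2, -as.numeric(time),
  pattern == 3, -(time - 3)^2,
  pattern == 4, 0
)]

# Group 2 differs at baseline and changes more over time
df[, value := trend * fifelse(group == "Group 2", 1.5, 1) +
     (group == "Group 2") + rnorm(.N, sd = 0.5)]
# Simple random intercept per participant
df[, value := value + rnorm(1), by = id]
df[, variable := paste0("variable_", variable)]
df[, c("pattern", "trend") := NULL]
```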

Overall (ignoring the random effects), the four patterns look like this:

ggplot(df[variable %in% c("variable_1", "variable_2", "variable_3", "variable_4"),],
       aes(time, value, color = group)) +
  geom_smooth() +
  facet_wrap(~variable, scales = "free_y") +
  scale_color_viridis_d(end = 0.8)

We want time to be a categorical variable:

df[, time := paste0("t_", time)]

Generate a second data set

We now generate a second data set using the same code as above. We will do classification on these data.


Subtract baseline

Later, we will do classification on the test data set. Since we would like to take individual differences into account, we create copies of the data sets and subtract each participant's baseline values.

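As a minimal, self-contained sketch of the baseline subtraction (using a toy table as a stand-in for the simulated data; the baseline label `t_1` is an assumption):

```r
library(data.table)

# Toy stand-in for the simulated data
dt <- data.table(
  id = rep(c("id_1", "id_2"), each = 3),
  variable = "variable_1",
  time = rep(c("t_1", "t_2", "t_3"), 2),
  value = c(1, 2, 4, 2, 2, 5)
)

# Copy the data and subtract each participant's baseline value
# (here assumed to be time point "t_1") per variable
dt_baseline <- copy(dt)
dt_baseline[, value := value - value[time == "t_1"], by = .(id, variable)]
```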

Run ALASCA and calculate scores

We now use the first data set to create an ALASCA model.

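A sketch of the model call is given below; the formula — time, group, their interaction, and a random intercept per participant — is an assumption about what the vignette fits, not its actual code.

```r
library(ALASCA)

# Hypothetical model specification for a repeated-measures design
mod <- ALASCA(
  df,
  value ~ time * group + (1 | id)
)
```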

Next, we use the ALASCA::predict_scores() function introduced in version 1.0.14 to get a score for each data point. Note that the number of ASCA components can be specified. For simplicity, we only use three here, but increasing the number of components may improve the classification model.

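As a sketch of this step — the argument names (`newdata`, `n_components`) and the object names are hypothetical; see `?predict_scores` for the actual interface:

```r
# Hypothetical calls: scores for the training data and for the
# baseline-subtracted test data, using three ASCA components
scores_train <- predict_scores(mod, newdata = df, n_components = 3)
scores_test  <- predict_scores(mod, newdata = df_test_baseline, n_components = 3)
```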

Just for illustration, we can plot the first three PC scores of the training set (on which we built the ALASCA model, without removing the baseline).

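Such a plot could be sketched as below; the structure of the score table (long format with columns `component`, `score`, `time`, and `group`) is an assumption about what `predict_scores()` returns.

```r
library(ggplot2)

# Hypothetical: `scores_train` as a long-format data.table of ASCA scores
ggplot(scores_train[component %in% 1:3],
       aes(time, score, color = group)) +
  geom_point(alpha = 0.5) +
  facet_wrap(~component, scales = "free_y") +
  scale_color_viridis_d(end = 0.8)
```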

The test set scores can be visualized in the same way.

Using PLS-DA for classification

Since ASCA is not intended for classification, we will construct a PLS-DA model using the ASCA scores. Note that the number of components must be specified; in this example, we use four components as an illustration.


Next, we do prediction on the test data set using the PLS-DA model above.

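The two steps — fitting a PLS-DA model on the ASCA scores and predicting the test set — could be sketched as follows. The use of mixOmics is an assumption (the vignette may use a different PLS-DA implementation), and the score matrices here are toy stand-ins.

```r
library(mixOmics)

# Toy stand-ins: in the vignette, X would be the matrix of ASCA scores
# for the training observations and y the corresponding group labels
set.seed(1)
X <- matrix(rnorm(60 * 6), nrow = 60,
            dimnames = list(NULL, paste0("score_", 1:6)))
y <- factor(rep(c("Group 1", "Group 2"), each = 30))
X[y == "Group 2", 1] <- X[y == "Group 2", 1] + 2  # make groups separable

# Fit a PLS-DA model with four components
plsda_model <- plsda(X, y, ncomp = 4)

# Predict group membership for new (test) score vectors
X_test <- matrix(rnorm(10 * 6), nrow = 10,
                 dimnames = list(NULL, paste0("score_", 1:6)))
pred <- predict(plsda_model, newdata = X_test)
pred_class <- pred$class$max.dist[, 4]  # class calls using all 4 components
```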

And, as we can see, the model does quite well:

caret::confusionMatrix(table(kkk[, .(pred, group)]))


andjar/ALASCA documentation built on March 2, 2024, 12:55 p.m.