knitr::opts_chunk$set(
  # code or die
  echo = TRUE,
  # minimize verbosity
  warning = FALSE, message = FALSE,
  # dpi = 150, # for hires images
  comment = "#>")
set.seed(0xFEED)
Under Construction
It is easy to run a PCA over the rnaseq data, or over whatever other assay you have available (CNV, proteomics, take your pick):
library(FacileData)
library(FacileAnalysis)
library(dplyr)

pca.crc <- exampleFacileDataSet() |>
  filter_samples(indication == "CRC", sample_type == "tumor") |>
  fpca(assay_name = "rnaseq")
Visualize the results:
viz(pca.crc, color_aes = "subtype_crc_cms", shape_aes = "sex")
Need to fix up report()
Not sure how to compare() two PCA results ...
The principal components are a new basis of orthogonal axes that are linear combinations of the original axes, which in this example are genes. By exploring which genes (and gene sets) are most highly loaded along each principal component, analysts can begin to understand which biological processes might be driving the variability in their data.
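The "orthogonal linear combinations" idea can be checked on a toy matrix. The sketch below uses base R's prcomp() on simulated data rather than fpca(); the matrix, its dimensions, and the variable names are made up for illustration and say nothing about how FacileAnalysis computes its result.

```r
# Simulated data: 20 samples (rows) x 5 "genes" (columns). Values are made up.
set.seed(42)
x <- matrix(rnorm(20 * 5), nrow = 20,
            dimnames = list(paste0("sample", 1:20), paste0("gene", 1:5)))

pca <- prcomp(x, center = TRUE, scale. = FALSE)

# Columns of the rotation matrix are the loadings: one weight per gene per PC.
loadings <- pca$rotation

# The PCs form an orthonormal basis, so t(loadings) %*% loadings is
# (numerically) the identity matrix.
ortho.check <- crossprod(loadings)

# Each sample's PC coordinates are linear combinations of its centered gene
# values, weighted by the loadings.
centered <- scale(x, center = TRUE, scale = FALSE)
scores <- centered %*% loadings

# Genes ranked by the magnitude of their loading on PC1.
top.pc1 <- sort(abs(loadings[, "PC1"]), decreasing = TRUE)
```

Here scores reproduces pca$x, which is the sense in which each principal component is a weighted sum of the original gene axes.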
The ranks(fpca()) result extracts the feature (gene) loadings onto the principal components. signature(fpca()) extracts the top N (ntop) features from each principal component.
The code below returns the top 10 features over the first three PCs. The score column corresponds to the loading:
pca.sig <- signature(pca.crc, dims = 1:3, ntop = 10)
pca.sig |>
  tidy() |>
  select(feature_id, symbol, dimension, score)
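To make "top N features per principal component" concrete, here is a rough base R sketch of the same idea on simulated data. The top_loadings() helper is hypothetical and is not part of FacileAnalysis; it assumes "top" means largest absolute loading, which may differ from what signature() does internally.

```r
# Hypothetical helper (not part of FacileAnalysis): return the `ntop` features
# with the largest absolute loading on each requested PC as a long data.frame.
top_loadings <- function(rotation, dims = 1:3, ntop = 10) {
  per.dim <- lapply(dims, function(d) {
    w <- rotation[, d]
    idx <- order(abs(w), decreasing = TRUE)[seq_len(min(ntop, length(w)))]
    data.frame(
      feature_id = rownames(rotation)[idx],
      dimension = colnames(rotation)[d],
      score = unname(w[idx]),
      stringsAsFactors = FALSE)
  })
  do.call(rbind, per.dim)
}

# Simulated data: 30 samples x 50 "genes". Values are made up.
set.seed(1)
x <- matrix(rnorm(30 * 50), nrow = 30,
            dimnames = list(NULL, paste0("gene", 1:50)))

toy.sig <- top_loadings(prcomp(x)$rotation, dims = 1:3, ntop = 10)
head(toy.sig)
```

The score column keeps the signed loading, so features pulling in opposite directions along a PC remain distinguishable after ranking by magnitude.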
When we map the expression values of these genes onto the PCA viz, we can see that the expression of a gene with a high abs(score) value tracks with the samples' positions along its principal component.
:::note
# We need to hack the expression values for these genes into the pca result for
# now, but we are working to upgrade the data retrieval abilities around
# FacileAnalysisResult objects so that this can be made effortless.
pcgenes <- c(IL8 = "3576", ARFGEF2 = "10564", BRIP1 = "83990")
pca.crc$result <- pca.crc$result |>
  with_assay_data(pcgenes)

# The following issue tracks our intent to enable easier query/retrieval of
# data from different places to make interactive exploration more facile:
#
# https://github.com/facileverse/FacileData/issues/8
#
# Instead of hacking the gene expression data back into the pca.crc$result
# tibble, we should be able to do something like this:
#
# viz(pca.crc, color_aes = "feature:IL8|name")
:::
For example, we can see that IL8 is highly loaded on PC1:
viz(pca.crc, color_aes = "IL8", title = "IL8")
ARFGEF2 is also highly loaded on PC1, but with a flipped sign:
viz(pca.crc, color_aes = "ARFGEF2", title = "ARFGEF2")
and BRIP1 is a highly loaded gene on PC2:
viz(pca.crc, color_aes = "BRIP1", title = "BRIP1")
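The link between a loading's sign and the direction in which expression tracks along a PC can be checked on simulated data. This is a minimal base R sketch using prcomp(), not the FacileAnalysis objects above; the planted signal and the gene names are invented for illustration.

```r
# Simulated data: 40 samples x 6 "genes". Values are made up.
set.seed(7)
x <- matrix(rnorm(40 * 6), nrow = 40,
            dimnames = list(NULL, paste0("gene", 1:6)))

# Plant a shared signal: gene1 rises with it, gene2 falls with it.
signal <- rnorm(40, sd = 3)
x[, "gene1"] <- x[, "gene1"] + signal
x[, "gene2"] <- x[, "gene2"] - signal

pca <- prcomp(x)
pc1 <- pca$x[, "PC1"]          # sample positions along PC1
load1 <- pca$rotation[, "PC1"] # gene loadings on PC1

# Correlation of each gene's expression with the samples' PC1 positions.
cor1 <- cor(x, pc1)[, 1]

data.frame(loading = load1, cor_with_pc1 = cor1)
```

In this setup gene1 and gene2 carry the largest loadings on PC1 with opposite signs, and each gene's correlation with the PC1 coordinates has the same sign as its loading. The overall orientation of a PC is arbitrary, so only the relative signs are meaningful.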