knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

PlantPhysioSpace is an R data package that provides plant stress spaces for use with the PhysioSpaceMethods package, enabling in-depth analysis of plant responses to different types of stress.

Table of Contents

Installation Instructions
Usage Instructions

Installation Instructions

We recommend installing PhysioSpaceMethods before PlantPhysioSpace. Installation instructions for PhysioSpaceMethods are available at https://github.com/JRC-COMBINE/PhysioSpaceMethods.

Installing via devtools (Recommended method):

The easiest way to install PlantPhysioSpace is via devtools. After installing devtools from CRAN, you can install PlantPhysioSpace with:

devtools::install_github(repo = "JRC-COMBINE/PlantPhysioSpace", build_vignettes = TRUE)
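If devtools may or may not be present on the machine, a small guard can avoid reinstalling it. The following is a minimal sketch, not part of the package: `install_if_missing()` is a hypothetical helper name, and it relies only on base R's `requireNamespace()` and `install.packages()`.

```r
# Hypothetical helper (not part of PlantPhysioSpace or devtools):
# installs a package only when it cannot be loaded already.
install_if_missing <- function(pkg, install = utils::install.packages) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    return(invisible(FALSE))  # already available, nothing to do
  }
  install(pkg)
  invisible(TRUE)             # an installation was attempted
}

# Usage (left commented out so nothing is installed by accident):
# install_if_missing("devtools")
# devtools::install_github(repo = "JRC-COMBINE/PlantPhysioSpace", build_vignettes = TRUE)
```

The `install` argument exists only so that the installer can be swapped out, for example when trying the helper without network access.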

Alternative installation methods (Manual download):

If you encounter any problems while installing PlantPhysioSpace, you can download the repository first and install the package from the local files. In your terminal, first clone the repository into your desired directory:

cd [Your desired directory]
git clone https://github.com/JRC-COMBINE/PlantPhysioSpace.git

Then install the downloaded package using devtools:

R -e "devtools::install_local('./PlantPhysioSpace/', build_vignettes = TRUE)"

Usage Instructions

PlantPhysioSpace can map user samples into a physiological space that was calculated beforehand from a compendium of known samples. Here we demonstrate the power of the method with a few examples.

Example One: GSE106635

The first dataset we will analyse is GSE106635 from NCBI's Gene Expression Omnibus (GEO).

There are numerous ways to acquire a dataset from GEO, for example by using GEOquery. Here we download the CEL files directly and normalise them with the affy package:

# Download and untar (here into the working directory; any other directory works as well):
download.file(url = "https://www.ncbi.nlm.nih.gov/geo/download/?acc=GSE106635&format=file",
              destfile = "GSE106635_RAW.tar",
              mode = "wb") # binary mode keeps the archive intact on Windows
untar(tarfile = "GSE106635_RAW.tar", exdir = "GSE106635_RAW")

# Normalising:
library(affy)
GSE106635ESet <-
  justRMA(
    filenames = list.files(path = "GSE106635_RAW", full.names = TRUE)
  )
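Before normalising, it can be useful to confirm that the extracted folder contains the expected CEL files. The sketch below is an optional check and not part of the original workflow: `list_cel_files()` is a hypothetical helper, and the file-name pattern (`.CEL`, optionally followed by `.gz`) is an assumption about how the archive names its files.

```r
# Hypothetical helper: list CEL files (optionally gzip-compressed) in a folder.
# The pattern is an assumption about the file names inside the archive.
list_cel_files <- function(dir) {
  list.files(path = dir, pattern = "\\.cel(\\.gz)?$",
             ignore.case = TRUE, full.names = TRUE)
}

# Usage, assuming the archive was extracted to "GSE106635_RAW":
# celFiles <- list_cel_files("GSE106635_RAW")
# The fold-change step in this example indexes columns 1 to 8,
# so at least eight arrays are expected:
# stopifnot(length(celFiles) >= 8)
```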

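After downloading and normalising the data, we need to take four steps before using the data as input for PlantPhysioSpace: converting the expression set to a matrix, converting Affymetrix probe IDs to Entrez IDs, calculating fold changes, and naming the samples.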

# Converting expression set to matrix:
GSE106635ESetExp <- exprs(GSE106635ESet)
# Converting AffyID to EntrezID:
ATH1ChipInfo <- read.delim(file = "data-raw/Arabidopsis/AT-raw/ArabidopsisATH1GenomeArray.txt",
                           header = TRUE, skip = 18)
rownames(GSE106635ESetExp) <- ATH1ChipInfo$Entrez.Gene[match(x = rownames(GSE106635ESetExp),
                                                          table = ATH1ChipInfo$Probe.Set.ID)]
# Dropping probes without an Entrez ID ("---" in the annotation, or no match at all):
GSE106635ESetExp <- GSE106635ESetExp[!is.na(rownames(GSE106635ESetExp)) &
                                       rownames(GSE106635ESetExp) != "---", ]
# Calculating fold changes as differences of log2 values (columns 5:8 minus columns 1:4):
GSE106635FC <- GSE106635ESetExp[, 5:8] - GSE106635ESetExp[, 1:4]

(There are more sophisticated ways to do this calculation, for example computing the signed p-value on a logarithmic scale, which will come later in this vignette.)
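As a rough illustration of what a signed p-value on a logarithmic scale could look like, the sketch below multiplies the sign of the mean difference by -log10 of a Welch t-test p-value, gene by gene. This is only one possible reading of the idea and not necessarily the calculation the authors use. The function name `signed_log_p()`, the toy data, and the choice of `t.test()` are all assumptions made for illustration.

```r
# Toy log2 expression matrix: 3 genes x 6 samples (3 control, 3 treated).
# All values are simulated; nothing here comes from GSE106635.
set.seed(1)
expr <- matrix(rnorm(18, mean = 8, sd = 0.2), nrow = 3,
               dimnames = list(c("g1", "g2", "g3"), NULL))
expr["g1", 4:6] <- expr["g1", 4:6] + 2   # g1 simulated as up-regulated
expr["g2", 4:6] <- expr["g2", 4:6] - 2   # g2 simulated as down-regulated

# Hypothetical helper: sign of (treated mean - control mean) times -log10(p),
# with p taken from a Welch two-sample t-test on each row.
signed_log_p <- function(x, control, treated) {
  apply(x, 1, function(row) {
    tt <- t.test(row[treated], row[control])
    direction <- sign(unname(tt$estimate[1] - tt$estimate[2]))
    direction * -log10(tt$p.value)
  })
}

scores <- signed_log_p(expr, control = 1:3, treated = 4:6)
scores  # positive for up-regulated genes, negative for down-regulated ones
```

The resulting vector has one entry per gene; stacking such vectors column by column for several comparisons would give a matrix with the same layout as a fold-change matrix.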

# Writing sample names into colnames:
colnames(GSE106635FC) <- c("WTrep1","WTrep2","MUTrep1","MUTrep2")
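The preparation steps above can be tried on a toy matrix to see what each one does. In this sketch the probe IDs, Entrez IDs, and expression values are invented for illustration; only the column names `Probe.Set.ID` and `Entrez.Gene` and the "---" placeholder are taken from the code above.

```r
# Toy log2 expression matrix: 4 probes x 4 samples
# (columns 1:2 are controls, columns 3:4 the matching treated samples).
toyExp <- matrix(c(5, 6, 7, 8,
                   5, 6, 7, 8,
                   6, 6, 9, 7,
                   7, 5, 9, 8),
                 nrow = 4,
                 dimnames = list(c("p1_at", "p2_at", "p3_at", "p4_at"),
                                 c("ctrl1", "ctrl2", "cold1", "cold2")))

# Invented annotation table; "---" marks a probe without an Entrez ID.
toyAnno <- data.frame(Probe.Set.ID = c("p4_at", "p3_at", "p2_at", "p1_at"),
                      Entrez.Gene  = c("844", "---", "822", "811"),
                      stringsAsFactors = FALSE)

# Map probe IDs to Entrez IDs, then drop unannotated probes.
ids <- toyAnno$Entrez.Gene[match(rownames(toyExp), toyAnno$Probe.Set.ID)]
rownames(toyExp) <- ids
toyExp <- toyExp[!is.na(ids) & ids != "---", , drop = FALSE]

# Fold changes on the log2 scale: treated minus control, pair by pair.
toyFC <- toyExp[, 3:4, drop = FALSE] - toyExp[, 1:2, drop = FALSE]
colnames(toyFC) <- c("rep1", "rep2")
toyFC
```

The result keeps three of the four probes, now labelled by Entrez ID, with one fold-change column per control/treated pair.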

Now that we have prepared the proper input for PlantPhysioSpace, the main calculation can be done with the function calculatePhysioMap():

#Main calculation:
library(PhysioSpaceMethods)
library(PlantPhysioSpace)
RESULTS <- calculatePhysioMap(InputData = GSE106635FC, Space = AT_Stress_Space)

Note that calculatePhysioMap() needs at least two inputs: 'InputData', which in our case is the fold-change matrix, and 'Space', which is the physiological space we want InputData to be mapped into. Here we used AT_Stress_Space, which is included in the PlantPhysioSpace package. For more information about the Spaces available in the package, a detailed explanation of AT_Stress_Space, and the other input options of calculatePhysioMap(), we recommend checking the package documentation.

The output of calculatePhysioMap(), which we called 'RESULTS' here, is a matrix with one column per sample (column) in 'InputData' and one row per dimension (column) in 'Space'. The value in row M and column N of RESULTS is the mapped value of the Nth sample on the Mth dimension of the Space.
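The layout described above can be checked with a few lines of base R. The sketch below uses mock matrices with arbitrary values rather than real output, and `check_map_shape()` is a hypothetical helper. Of the dimension names, only 'Cold' appears in this vignette; 'Heat' and 'Drought' are placeholders, and the row and column names on the mock result are an assumption.

```r
# Hypothetical helper: verify that a result matrix matches the layout
# described above (one column per input sample, one row per space dimension).
check_map_shape <- function(results, input, space) {
  stopifnot(is.matrix(results),
            ncol(results) == ncol(input),
            nrow(results) == ncol(space))
  invisible(TRUE)
}

# Mock objects with arbitrary values, for illustration only.
input <- matrix(0, nrow = 5, ncol = 4,
                dimnames = list(NULL, c("WTrep1", "WTrep2", "MUTrep1", "MUTrep2")))
space <- matrix(0, nrow = 5, ncol = 3,
                dimnames = list(NULL, c("Cold", "Heat", "Drought")))
results <- matrix(seq_len(12), nrow = 3,
                  dimnames = list(colnames(space), colnames(input)))

check_map_shape(results, input, space)

# Row M, column N: the value of sample N on dimension M.
results["Cold", "MUTrep1"]
```

With the real objects, the same call would be `check_map_shape(RESULTS, GSE106635FC, AT_Stress_Space)`.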

In our case the samples were under cold stress, so we expect to see high values (similarities) on the 'Cold' dimension of AT_Stress_Space:

# Plotting the results:
PhysioHeatmap(PhysioResults = RESULTS, main = "Stress Analysis of GSE106635",
              SymmetricColoring = TRUE, SpaceClustering = TRUE, Space = AT_Stress_Space)

Figure: Heatmap of RESULTS (similarities)

In the output you can clearly see that all samples have their highest value on the Cold stress dimension. For more information about PhysioHeatmap(), check the PhysioSpaceMethods package documentation.
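The same observation can be made numerically by asking, for each sample, which dimension carries the largest value. The sketch below uses a mock matrix with invented numbers and placeholder dimension names (only 'Cold' comes from this vignette), and it assumes the result matrix has dimension names as row names.

```r
# Mock result matrix (invented values): rows are dimensions, columns are samples.
mock <- matrix(c(0.9,  0.1, -0.2,
                 0.8,  0.3,  0.0,
                 0.7, -0.1,  0.2),
               nrow = 3,
               dimnames = list(c("Cold", "Heat", "Drought"),
                               c("s1", "s2", "s3")))

# For each sample (column), find the dimension (row) with the highest value.
top_dim <- rownames(mock)[apply(mock, 2, which.max)]
names(top_dim) <- colnames(mock)
top_dim
```

Applied to the real output, replacing `mock` with `RESULTS` would list the top-scoring dimension for each of the four samples.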

Example Two: GSE93420

Example Three:



JRC-COMBINE/PlantPhysioSpace documentation built on Nov. 25, 2020, 8:01 a.m.