baduel_5gs: Small portion of RNA-seq data from plant physiology study.

baduel_5gs    R Documentation

Small portion of RNA-seq data from plant physiology study.

Description

A subsample of the RNA-seq data from Baduel et al. (2016) studying Arabidopsis arenosa physiology.

Usage

data(baduel_5gs)

Format

3 objects

  • design: a design matrix for the 48 measured samples, containing the following variables:

    • SampleName the corresponding column names in expr_norm_corr

    • Intercept an intercept variable

    • Population a factor identifying the plant population

    • Age_weeks numeric age of the plant at sampling time (in weeks)

    • Replicate a purely technical variable, since replicates do not come from the same individual across weeks. It should not be used in analysis.

    • Vernalized a logical variable indicating whether the plant had undergone vernalization (exposure to cold and short-day photoperiods)

    • PopulationKA a binary variable indicating whether the plant belonged to the KA population

    • AgeWeeks_Population interaction variable between the Age_weeks and Population variables

    • AgeWeeks_Vernalized interaction variable between the Age_weeks and Vernalized variables

    • Vernalized_Population interaction variable between the Vernalized and Population variables

    • AgeWeeks_Vernalized_Population interaction variable between the Age_weeks, Vernalized and Population variables

  • baduel_gmt: a gmt object containing 5 gene sets of interest (see GSA.read.gmt)

  • expr_norm_corr: a numeric matrix containing the normalized, batch-corrected expression for the 2454 genes included in any of the 5 gene sets of interest
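
The layout of the three objects can be sketched on toy data. The block below is an illustration only: every object name ending in _toy and every value is made up, and it assumes that each interaction column is the elementwise product of its parent variables, which the documentation above does not state explicitly. The gmt-like list mimics the genesets and geneset.names elements of the structure returned by GSA.read.gmt.

```r
# Toy stand-in for the three objects described above (all values are made up).
set.seed(1)
n <- 8  # the real data have 48 samples

design_toy <- data.frame(
  SampleName   = paste0("S", seq_len(n)),
  Intercept    = 1,
  PopulationKA = rep(c(0, 1), each = n / 2),
  Age_weeks    = rep(c(4, 6, 8, 10), times = 2),
  Vernalized   = rep(c(FALSE, TRUE), times = n / 2)
)

# Assumption: each interaction column is the elementwise product of its parents.
design_toy$AgeWeeks_Population   <- design_toy$Age_weeks * design_toy$PopulationKA
design_toy$AgeWeeks_Vernalized   <- design_toy$Age_weeks * design_toy$Vernalized
design_toy$Vernalized_Population <- design_toy$Vernalized * design_toy$PopulationKA

# Genes in rows, samples in columns; column names match SampleName.
genes <- paste0("gene", 1:6)
expr_toy <- matrix(rpois(length(genes) * n, lambda = 50),
                   nrow = length(genes),
                   dimnames = list(genes, design_toy$SampleName))

# A gmt-like list: 'genesets' holds one character vector of gene names per set.
gmt_toy <- list(
  genesets      = list(c("gene1", "gene2", "gene3"), c("gene4", "gene6")),
  geneset.names = c("setA", "setB")
)

# Expression rows belonging to the second gene set.
expr_setB <- expr_toy[gmt_toy$genesets[[2]], , drop = FALSE]
dim(expr_setB)
```

With the real data, the same indexing pattern would presumably apply to expr_norm_corr and baduel_gmt$genesets, provided the row names of the matrix are the gene identifiers used in the gene sets.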

Source

https://www.ncbi.nlm.nih.gov/bioproject/PRJNA312410/

References

Baduel P, Arnold B, Weisman CM, Hunter B & Bomblies K (2016). Habitat-Associated Life History and Stress-Tolerance Variation in Arabidopsis arenosa. Plant Physiology, 171(1):437-451. doi: 10.1104/pp.15.01875.

Agniel D & Hejblum BP (2017). Variance component score test for time-course gene set analysis of longitudinal RNA-seq data. Biostatistics, 18(4):589-604. doi: 10.1093/biostatistics/kxx005. arXiv:1605.02351.

Examples

## Not run: 
library(tcgsaseq)
rm(list = ls())
data("baduel_5gs")

# Test the population effect (phi = PopulationKA),
# adjusting for the covariates supplied in x
set.seed(54321)
KAvsTBG <- tcgsa_seq(
  y = log2(expr_norm_corr + 1),
  x = apply(as.matrix(design[, c("Intercept", "Vernalized", "Age_weeks",
                                 "Vernalized_Population", "AgeWeeks_Population"),
                             drop = FALSE]),
            2, as.numeric),
  phi = as.matrix(design[, c("PopulationKA"), drop = FALSE]),
  genesets = baduel_gmt$genesets[c(3, 5)],
  which_test = "permutation", which_weights = "loclin",
  n_perm = 1000, preprocessed = TRUE, doPlot = TRUE
)

# Test the vernalization effect (phi = Vernalized and its interaction
# with population), adjusting for the covariates supplied in x
set.seed(54321)
Cold <- tcgsa_seq(
  y = log2(expr_norm_corr + 1),
  x = apply(as.matrix(design[, c("Intercept", "Age_weeks", "PopulationKA",
                                 "AgeWeeks_Population"),
                             drop = FALSE]),
            2, as.numeric),
  phi = as.matrix(design[, c("Vernalized", "Vernalized_Population")]),
  genesets = baduel_gmt$genesets[c(3, 5)],
  which_test = "permutation", which_weights = "loclin",
  n_perm = 1000, preprocessed = TRUE, doPlot = TRUE
)

## End(Not run)
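
The examples above build x with apply(as.matrix(...), 2, as.numeric). A minimal sketch of what that idiom does, on a hypothetical design fragment (design_chr and its values are invented for illustration, and storing the columns as character strings is an assumption made only to show the conversion):

```r
# Hypothetical design fragment whose columns are stored as character strings.
design_chr <- data.frame(
  Intercept  = c("1", "1", "1", "1"),
  Age_weeks  = c("4", "6", "8", "10"),
  Vernalized = c("0", "1", "0", "1"),
  stringsAsFactors = FALSE
)

# as.matrix() on character columns yields a character matrix ...
m_chr <- as.matrix(design_chr[, c("Intercept", "Age_weeks"), drop = FALSE])
is.character(m_chr)

# ... and apply(..., 2, as.numeric) converts it column by column,
# keeping the column names.
x_toy <- apply(m_chr, 2, as.numeric)
is.numeric(x_toy)
colnames(x_toy)

# drop = FALSE keeps a single selected column as a one-column matrix,
# which is the shape used for phi in the first example.
phi_toy <- apply(as.matrix(design_chr[, "Vernalized", drop = FALSE]), 2, as.numeric)
dim(phi_toy)
```

Selecting with drop = FALSE matters mainly in the single-column case: without it, the data frame subset collapses to a plain vector before as.matrix() is applied.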



denisagniel/tcgsaseq documentation built on May 7, 2022, 1:22 a.m.