emtdata

The emtdata package is an ExperimentHub package that provides three data sets relating to the epithelial to mesenchymal transition (EMT). It contains pre-processed RNA-seq data from experiments in which EMT was induced in cell lines. The data come from three publications: Cursons et al. (2015), Cursons et al. (2018) and Foroutan et al. (2017). In each of these publications, EMT was induced across multiple cell lines following treatment with TGFb, among other stimulants. These data will be useful for determining the regulatory programs that are modified to achieve an EMT. Data were processed by the Davis laboratory in the Bioinformatics division at WEHI.

This package can be installed using the code below:

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("emtdata")

Download data from the emtdata R package

Data in this package can be downloaded through the ExperimentHub interface as shown below. To download the data, we first need to list the records available in the emtdata package and find the unique identifier of each data set. The query() function returns this list.

library(emtdata)
library(ExperimentHub)
library(SummarizedExperiment)
#connect to ExperimentHub and list the records provided by emtdata
eh <- ExperimentHub()
query(eh, 'emtdata')

Data can then be downloaded using the unique identifier.

eh[['EH5440']]

Alternatively, data can be downloaded using object name accessors in the emtdata package as below:

#metadata are displayed
cursons2018_se(metadata = TRUE)
#data are loaded
cursons2018_se()

Accessing SummarizedExperiment object

cursons2018_se = eh[['EH5440']]

#read counts
assay(cursons2018_se)[1:5, 1:5]

#genes
rowData(cursons2018_se)

#sample information
colData(cursons2018_se)
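
The three accessors above read from a single container that keeps the expression matrix, the gene annotation and the sample annotation aligned. The following is a minimal sketch of that structure using a toy object rather than the emtdata records; the gene names, sample names, and the symbol and treatment columns are all hypothetical.

```r
library(SummarizedExperiment)

# Toy counts: 4 hypothetical genes (rows) x 3 hypothetical samples (columns)
counts <- matrix(
  c(10L, 0L, 5L, 20L,
    12L, 1L, 7L, 18L,
    30L, 2L, 0L, 25L),
  nrow = 4,
  dimnames = list(paste0("gene", 1:4), paste0("sample", 1:3))
)

# Hypothetical gene and sample annotation, one row per gene / per sample
gene_info <- DataFrame(symbol = c("A", "B", "C", "D"),
                       row.names = rownames(counts))
sample_info <- DataFrame(treatment = c("control", "control", "TGFb"),
                         row.names = colnames(counts))

toy_se <- SummarizedExperiment(
  assays = list(counts = counts),
  rowData = gene_info,
  colData = sample_info
)

# Subsetting by a colData column keeps assay and annotation in sync
tgfb_se <- toy_se[, toy_se$treatment == "TGFb"]
dim(tgfb_se)
assay(tgfb_se, "counts")
```

The same subsetting pattern should apply to the objects returned by ExperimentHub, although the column names in their colData will differ from the toy ones used here.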

Exploratory analysis and visualization

Below we demonstrate how to interact with the SummarizedExperiment object. A simple MDS analysis is shown for each of the data sets in this package. These transcriptomic data can be used for differential expression (DE) analysis and co-expression analysis to better understand the processes underlying EMT or the reverse mesenchymal to epithelial transition (MET).

cursons2018

This gene expression data comes from the human mammary epithelial (HMLE) cell line. A mesenchymal HMLE (mesHMLE) phenotype was induced following treatment with TGFb. The mesHMLE subline was then treated with miR-200c to reinduce an epithelial phenotype.

See help page ?cursons2018_se for further reference.

library(edgeR)
library(RColorBrewer)
cursons2018_dge <- asDGEList(cursons2018_se)
cursons2018_dge <- calcNormFactors(cursons2018_dge)
plotMDS(cursons2018_dge)
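
To give a feel for what an MDS plot represents, here is a minimal base R sketch on simulated data. It uses classical MDS (cmdscale) on Euclidean distances, which is a simplification: plotMDS computes its distances from the largest log fold changes between each pair of samples, so the two approaches are related but not identical. The matrix, sample names and group structure below are invented for illustration.

```r
set.seed(1)

# Hypothetical log-expression matrix: 50 genes x 6 samples, two groups of three
log_expr <- matrix(
  rnorm(50 * 6),
  nrow = 50,
  dimnames = list(paste0("gene", 1:50), paste0("s", 1:6))
)

# Shift the first 10 genes in the last three samples to mimic a treatment effect
log_expr[1:10, 4:6] <- log_expr[1:10, 4:6] + 3

# Samples are columns, so transpose before computing pairwise distances
sample_dist <- dist(t(log_expr))

# Classical MDS: find 2 dimensions that best preserve the sample distances
mds_coords <- cmdscale(sample_dist, k = 2)

# Plot samples by name, coloured by the simulated group
plot(mds_coords, type = "n", xlab = "Dimension 1", ylab = "Dimension 2")
text(mds_coords, labels = rownames(mds_coords),
     col = rep(c("black", "red"), each = 3))
```

Samples that sit close together in the plot have similar expression profiles, which is the property used when reading the MDS plots in this section.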

cursons2015

This gene expression data comes from the PMC42-ET, PMC42-LA and MDA-MB-468 cell lines. A mesenchymal phenotype was induced in the PMC42 cell lines with EGF treatment, and in MDA-MB-468 either with EGF treatment or by culturing the cells under hypoxia.

See help page ?cursons2015_se for further reference.

cursons2015_se = eh[['EH5441']]
cursons2015_dge <- asDGEList(cursons2015_se)
cursons2015_dge <- calcNormFactors(cursons2015_dge)
colours <- brewer.pal(7, name = "Paired")
plotMDS(cursons2015_dge, dim.plot = c(2, 3), col = rep(colours, each = 3))
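
The call above assigns colours with rep(colours, each = 3), which relies on the samples being ordered in consecutive groups of three. When that ordering is not guaranteed, one alternative is to derive the colours from a grouping variable. The sketch below uses base R only; the sample_group vector is hypothetical and stands in for whichever sample annotation column defines the groups.

```r
# Hypothetical group labels, deliberately not sorted by group
sample_group <- c("ctrl", "ctrl", "EGF", "EGF", "hypoxia", "ctrl")

# One colour per group level, taken from a built-in qualitative palette
group_factor <- factor(sample_group)
palette_cols <- hcl.colors(nlevels(group_factor), palette = "Dark 3")

# Index the palette by the integer codes of the factor: one colour per sample
sample_cols <- palette_cols[as.integer(group_factor)]
sample_cols
```

A vector built this way can be passed as the col argument of plotMDS, and it stays correct if the samples are reordered or if groups have unequal sizes.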

foroutan2017

This gene expression data comes from multiple different studies (microarray and RNA-seq), with cell lines treated using TGFb to induce a mesenchymal shift. Data were combined using SVA and ComBat to remove batch effects.

See help page ?foroutan2017_se for further reference.

foroutan2017_se = eh[['EH5439']]
foroutan2017_dge <- asDGEList(foroutan2017_se, assay_name = "logExpr")
foroutan2017_dge <- calcNormFactors(foroutan2017_dge)
tgfb_col <- as.numeric(foroutan2017_dge$samples$Treatment %in% 'TGFb') + 1
plotMDS(foroutan2017_dge, labels = foroutan2017_dge$samples$Treatment, col = tgfb_col)
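
The tgfb_col expression above turns a logical test into colour indices. A small standalone illustration, using a made-up treatment vector rather than the real sample annotation:

```r
# Hypothetical treatment labels for four samples
treatment <- c("Control", "TGFb", "TGFb", "Control")

# %in% gives TRUE/FALSE, as.numeric() maps these to 1/0, and adding 1
# yields index 2 for TGFb-treated samples and index 1 for everything else
tgfb_col <- as.numeric(treatment %in% "TGFb") + 1
tgfb_col

# Numeric colours index into the active palette
palette()[tgfb_col]
```

Here tgfb_col is 1, 2, 2, 1, so treated and untreated samples are drawn in the second and first colours of the active palette respectively.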

Session information

sessionInfo()


DavisLaboratory/emtdata documentation built on Dec. 17, 2021, 4:09 p.m.