Home

/

GitHub

/

xuanxuanyu-bios/modDream

/

README.md

README.md
In xuanxuanyu-bios/modDream: Single Cell rna-seqUencing Time series data Analysis

SCUTA

linear mixed modelling for single cell RNAseq time series data with multilevel design

library(devtools)
install_github("xuanxuanyu-bios/SCUTA")

pipelineflowchart

SCUTA is a tools that designed for fitting linear mixed models for single cell RNAseq datasets, especially with longitudinal multi-level design. The algorithm is based on Dream in VariancePartition package.

Load library and data
library("SCUTA")

The imput matrix is the count matrix and meta data. counts is expression matrix whhere columns represent cells, rows represent genes. coldata is the meta data including condition, individual and time information of each cell.

data(example.data)

Load libraries

library("zinbwave")
library("DESeq2")
library("edgeR")
library("variancePartition")

prepare DESeqDataSet file and specify the variables that are taken into acount handling dropout events.

counts  <- example.data$counts
coldata <- example.data$coldata

fluidigm <- DESeqDataSetFromMatrix(countData = counts,
                                   colData = coldata,
                                   design = ~ condition + time)

The zinbwave function is used to compute observational weights which unlock bulk RNA-seq tools for single-cell applications, as illustrated in Van den Berge et al. 2018.

zinb <- zinbFit(fluidigm, K=2, epsilon=1000)
fluidigm_zinb <- zinbwave(fluidigm, fitted_model = zinb, K = 2, epsilon=1000, observationalWeights = TRUE)
weights <- assay(fluidigm_zinb, "weights")

voomWithDreamWeights function is used to transform count data to log2-counts per million (logCPM), estimate the mean-variance relationship and compute observation-level weights. Then, the zinb weights and mean-variance weights are combined together as the overall weights. At last, a three-level linear mixed model is fitted for the data. The model can be expressed as $$ E(log_2(Y_ig)) = \beta_{0g} + \beta_{1g} * condition_{i} + \beta_{2g} * time_{i} + \alpha_{0ig} + \alpha_{1ig} * time_{i} $$ Where $i$ denotes cell, $g$ represents gene, $\alpha_{0ig}$ is the random sample effect, and $\alpha_{1ig}$ is the random time effect.

d0 <- DGEList(counts)
d0 <- calcNormFactors(d0)
form <- ~ condition + time + (1 + time |individual)
vobjDream = voomWithDreamWeights( d0, form, coldata )
vobjDream.weight<-vobjDream
vobjDream.weight$weights <- weights*vobjDream.weight$weights

getContrast is used to specify the contrast matrix for linear mixed model. The three-level linear mixed model is fitted by suing SCUTA function.

fit.SCUTA     <- SCUTA( vobjDream.weight, form, coldata)

The SCUTA() function is modified to replace the 'dream()' function in variancePartition, so that any function in variance partition that used combined with dream() function can be used in conjuction with 'SCUTA()' function. For example, the top 6 differentiall expressed genes between two conditions are

fit.SCUTA.res <- topTable(fit.SCUTA, coef="condition2", number=nrow(counts) )
head(fit.SCUTA.res)

            logFC  AveExpr         t      P.Value    adj.P.Val     z.std
Gene558 -1.451388 4.991292 -9.796536 1.481199e-09 1.478237e-06 -6.046398
Gene569 -1.479936 4.852454 -8.988242 6.702487e-08 3.195357e-05 -5.398969
Gene740 -1.337853 5.140980 -8.551072 9.605281e-08 3.195357e-05 -5.334037
Gene379 -1.452054 5.376776 -7.665623 4.786163e-07 9.380289e-05 -5.034693
Gene98  -1.358545 5.261751 -7.681800 5.068412e-07 9.380289e-05 -5.023705
Gene213 -1.286129 5.877459 -7.457352 5.639453e-07 9.380289e-05 -5.003172

Where logFC is the log fold change comparing condition 2 with condition 1, P.Value and adj.P.Val are the unadjusted and FDR adjusted P values, respectively.

For the real data application in the manuscript, we fit 3 linear mixed models with respect to different aims. The scRNA-seq study consists of 1529 cells from 88 human embryos across 5 days (Day 3 - Day 7). Three lineages were identified after Day 5, including trophectoderm (TE), primitive endoderm (PE), and epiblast (EPI). Cells from each embryo in each time point were sequenced. Model 1 was fitted for cells in each lineage, separately, by also including cells in Day 3 and Day 4 in the model. The aim was to identify genes which show temporal trend in each lineage when considering correlations among cells within the same embryo.

$$Model 1: E(log_2(Y_ig)) = \beta_{0g} + \beta_{1g} * time_{i} + \alpha_{ig} $$

Where g represents gene while i denotes embryo. Model 2 was fitted for cells in each time point from Day 5 to Day 7, respectively, to detect genes that show differential expression between lineages.

$$Model 2: E(log_2(Y_ig)) = \beta_{0g} + \beta_{2g} * lineage_{i} + \alpha_{ig} $$

Model 3 was fitted for cells from Day 5 to Day 7 with fixed effect of time and lineage, random effect of lineage and embryo according to the hierarchy structure.

$$Model 3: E(log_2(Y_ig)) = \beta_{0g} + \beta_{1g} * time_{i} + \beta_{2g} * lineage_{i} + \alpha_{1ig} + \alpha_{2ig} $$

Where $\alpha_{1ig}$ represents the random effect for embryo, and $\alpha_{2ig}$ capture the within lineage variation with k denoting the lineage.

xuanxuanyu-bios/modDream documentation built on April 10, 2023, 8:37 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

xuanxuanyu-bios/modDream
Single Cell rna-seqUencing Time series data Analysis

README.md
In xuanxuanyu-bios/modDream: Single Cell rna-seqUencing Time series data Analysis

SCUTA

install SCUTA

Analysis flowchart

Model fitting tutorial

Example of linear mixed models

R Package Documentation

Browse R Packages

We want your feedback!

xuanxuanyu-bios/modDream Single Cell rna-seqUencing Time series data Analysis

README.md In xuanxuanyu-bios/modDream: Single Cell rna-seqUencing Time series data Analysis

SCUTA

install SCUTA

Analysis flowchart

Model fitting tutorial

Example of linear mixed models

R Package Documentation

Browse R Packages

We want your feedback!

xuanxuanyu-bios/modDream
Single Cell rna-seqUencing Time series data Analysis

README.md
In xuanxuanyu-bios/modDream: Single Cell rna-seqUencing Time series data Analysis