Examples/Lightbulb_example_super_cell_creation_V1.md

require(Lightbulb)
Loading required package: Lightbulb
Loading required package: data.table
Loading required package: matrixStats
Loading required package: Matrix
Loading required package: gplots

Attaching package: ‘gplots’

The following object is masked from ‘package:stats’:

    lowess

Loading required package: ggplot2
Loading required package: Seurat
Loading required package: cowplot

Attaching package: ‘cowplot’

The following object is masked from ‘package:ggplot2’:

    ggsave


Attaching package: ‘Lightbulb’

The following objects are masked from ‘package:matrixStats’:

    colMaxs, colMins, colSds, rowMaxs, rowMins, rowSds

Loading data

Exp_Seurat=readRDS("Lightbulb_example_ExpSeurat.rds")

Super cell creation

Exp_Seurat is a Seurat object that stores the log2(TPM + 1) matrix in its @data slot. The super-cell function accepts either a Seurat object or a cell-gene matrix as input. For details on generating your own Seurat object, please refer to the Seurat manual (https://satijalab.org/seurat/get_started.html).

Exp_Seurat@data[1:5,1:5]
5 x 5 sparse Matrix of class "dgCMatrix"
        CAGTCCTTCCAAGTAC-1 TAAGTGCTCTCTGAGA-1 CGCTTCATCTTACCTA-1
Mrpl15            .                  8.470585                  .
Lypla1            .                  8.470585                  .
Tcea1             8.633139           8.470585                  .
Atp6v1h           .                  .                         .
Rb1cc1            .                  8.470585                  .
        TCTTCGGAGTTTAGGA-1 AATCCAGCATAGACTC-1
Mrpl15                   .                  .
Lypla1                   .                  .
Tcea1                    .                  .
Atp6v1h                  .                  .
Rb1cc1                   .                  .
Exp_Seurat@meta.data[1:10,]
                   nGene nUMI    orig.ident batch res.1 tissue timepoint test res.3
CAGTCCTTCCAAGTAC-1   945 2525 SeuratProject     1     8 Spleen        d0    0     0
TAAGTGCTCTCTGAGA-1   891 2827 SeuratProject     1     8 Spleen        d0    0     0
CGCTTCATCTTACCTA-1   981 3012 SeuratProject     1     8 Spleen        d0    0     0
TCTTCGGAGTTTAGGA-1   744 2212 SeuratProject     1     8 Spleen        d0    0     0
AATCCAGCATAGACTC-1   890 2455 SeuratProject     1     8 Spleen        d0    0     0
CTTAGGAGTGTTAAGA-1   680 1628 SeuratProject     1     8 Spleen        d0    0     0
TGACTAGTCCTTTCGG-1   996 3462 SeuratProject     1     8 Spleen        d0    0     0
GACACGCGTACCGTAT-1   717 2552 SeuratProject     1     8 Spleen        d0    0     0
CTAATGGCACTGTGTA-1   716 1631 SeuratProject     1     8 Spleen        d0    0     0
TTAGGCATCCTGCCAT-1   856 2483 SeuratProject     1     8 Spleen        d0    0     0

Using a Seurat object as input

When a Seurat object is given as input, the calculation is based on the @scale.data slot. The values in @scale.data must be additive, that is, on a linear scale where summing across cells is meaningful. In this example, we simply put the TPM matrix in the @scale.data slot.
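To illustrate why the values need to be on a scale that can be meaningfully summed, here is a minimal base-R sketch with made-up numbers. It assumes that merging amounts to summing cells, which is an illustration only and not necessarily what Super_cell_creation does internally.

```r
# Toy log2(TPM + 1) values for one gene in three cells (made-up numbers).
log_expr <- c(3, 5, 0)

# Back-transform to the linear TPM scale, where summing is meaningful.
tpm <- 2^log_expr - 1          # 7, 31, 0

# Merge on the linear scale (assumed here to be a sum), then log-transform.
merged_linear <- log2(sum(tpm) + 1)

# Summing the log values directly gives a different, inflated number.
merged_log <- sum(log_expr)

print(c(linear = merged_linear, log = merged_log))
```

The two results differ, which is why the example converts the log-scale matrix back to TPM before merging and applies log2(x + 1) only afterwards.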

k_merge: an integer giving the number of cells to merge into each super-cell.

n: an integer giving the number of super-cells to create.

sampling_ref: a vector holding the group_id of each cell. The sampling process tries to keep the group_id distribution the same after sampling.
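The idea behind sampling_ref can be sketched as proportional (stratified) sampling. The base-R snippet below is only an illustration under that assumption; the names group_id, quota, and picked are hypothetical, and Lightbulb's actual sampling procedure may differ.

```r
set.seed(1)  # reproducible toy example

# Hypothetical group labels for 1000 cells: 70% in batch "1", 30% in batch "2".
group_id <- rep(c("1", "2"), times = c(700, 300))
n <- 100     # number of cells to sample

# Allocate the n draws across groups in proportion to group size.
group_sizes <- table(group_id)
quota <- round(n * as.vector(group_sizes) / length(group_id))
names(quota) <- names(group_sizes)

# Draw the allotted number of cell indices within each group.
picked <- unlist(lapply(names(quota), function(g) {
  sample(which(group_id == g), quota[[g]])
}))

# The sampled cells keep the original 70/30 split.
print(table(group_id[picked]))
```

In the example data, passing Exp_Seurat@meta.data$batch as sampling_ref plays the role of group_id here.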

Exp_Seurat@scale.data=2^Exp_Seurat@data-1

#calculate super cell
supercell_mat=Super_cell_creation(Exp_Seurat,k_merge = 50,n=6000,sampling_ref = Exp_Seurat@meta.data$batch)
supercell_mat=log2(supercell_mat+1)
Seurat object detected as input
MNN done: time consumed: 0 hr 1 min 51.37 s
merging finished: time consumed: 0 hr 4 min 54.86 s

Using a cell-gene matrix as input

The cell-gene matrix is a matrix in which each row represents a cell and each column represents a gene. The values in the cell-gene matrix must be additive. In this example, we simply use the TPM matrix.
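Because Seurat stores genes in rows and cells in columns, the matrix has to be transposed to get one row per cell. A small base-R sketch with a made-up matrix (the names log_mat and tpm_mat are hypothetical) shows the conversion together with two quick sanity checks:

```r
# Toy gene-by-cell matrix on the log2(TPM + 1) scale (made-up values).
log_mat <- matrix(c(0, 3, 1, 0, 2, 4), nrow = 2,
                  dimnames = list(c("GeneA", "GeneB"),
                                  c("cell1", "cell2", "cell3")))

# Back-transform to TPM and transpose so that rows are cells.
tpm_mat <- t(2^log_mat - 1)

# Sanity checks before passing the matrix on.
stopifnot(nrow(tpm_mat) == ncol(log_mat))   # one row per cell
stopifnot(all(tpm_mat >= 0))                # TPM values cannot be negative

print(dim(tpm_mat))
```

If the dimensions come out as genes by cells instead, the transpose was most likely skipped.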

TPM_matrix=as.matrix(t(2^Exp_Seurat@data-1))

#calculate super cell
supercell_mat=Super_cell_creation(TPM_matrix,k_merge = 50,n=6000,sampling_ref = Exp_Seurat@meta.data$batch)
supercell_mat=log2(supercell_mat+1)
Assuming input is cell-gene matrix
MNN done: time consumed: 0 hr 1 min 34.67 s
merging finished: time consumed: 0 hr 5 min 26.77 s

The row names of the super-cell matrix give the position of each super-cell's center cell in the original data. For example, in the following super-cell matrix, the first super-cell is centered on cell 193 of the original data (row 193 in TPM_matrix, or column 193 in Exp_Seurat@scale.data).

supercell_mat[1:5,1:5]
       Mrpl15   Lypla1    Tcea1  Atp6v1h   Rb1cc1
193  5.458230 5.265078 5.926450 4.828982 5.057072
276  5.754755 5.342182 4.219048 5.058700 4.709162
1234 6.467162 5.142626 5.424017 5.212062 4.988101
651  6.353871 4.017780 5.602713 4.489083 3.023346
495  5.606721 5.507785 5.874736 4.482352 3.114476
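Since the row names are positions in the original data, they can be used to look up the metadata of each center cell. The sketch below uses small made-up stand-ins (meta and supercell_toy are hypothetical names) rather than the real objects:

```r
# Toy stand-ins (made-up): metadata for 5 cells and a 2-row super-cell matrix.
meta <- data.frame(batch = c(1, 1, 2, 2, 1),
                   row.names = paste0("cell", 1:5))
supercell_toy <- matrix(c(5.1, 4.8, 6.0, 5.5), nrow = 2,
                        dimnames = list(c("4", "2"), c("GeneA", "GeneB")))

# Row names are the positions of the center cells in the original data.
center_idx <- as.integer(rownames(supercell_toy))

# Use the positions to pull the matching metadata rows.
center_meta <- meta[center_idx, , drop = FALSE]
print(center_meta)
```

With the real objects, the same pattern would presumably be Exp_Seurat@meta.data[as.integer(rownames(supercell_mat)), ], assuming the cell order of the metadata matches the input matrix.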



Arthurhe/Lightbulb documentation built on April 13, 2020, 5:12 p.m.